Probable race condition in Radiohead library

dgranger

Well-known member
Environment: Win7, Arduino 1.8.1, Teensyduino 1.35, Radiohead v1.62 or v1.67 or Teensy RH version

I’ve been trying to get the example sketch on Adafruit’s RFM9x page: https://learn.adafruit.com/radio-featherwing/using-the-rfm-9x-radio
to run with my Teensy 3.2 feather board. The example Tx sketch is basically the rf95_client example code included with the Radiohead library.

The sketch would simply loop twice and on the second “rf95.waitPacketSent()” would hang indefinitely. I tried substituting RH v1.67 for the v1.62 library linked to the Adafruit page and saw exactly the same behavior. The current (v1.69) RH library is available here: http://www.airspayce.com/mikem/arduino/RadioHead/

The Adafruit Tx sketch:

Code:
    // Feather9x_TX
    // -*- mode: C++ -*-
    // Example sketch showing how to create a simple messaging client (transmitter)
    // with the RH_RF95 class. RH_RF95 class does not provide for addressing or
    // reliability, so you should only use RH_RF95 if you do not need the higher
    // level messaging abilities.
    // It is designed to work with the other example Feather9x_RX
     
    #include <SPI.h>
    #include <RH_RF95.h>
     
    /* for feather32u4 
    #define RFM95_CS 8
    #define RFM95_RST 4
    #define RFM95_INT 7
     */
	 
    /* for feather m0  
    #define RFM95_CS 8
    #define RFM95_RST 4
    #define RFM95_INT 3
    */
     
    /* for shield 
    #define RFM95_CS 10
    #define RFM95_RST 9
    #define RFM95_INT 7
    */
     
     
    /* for ESP w/featherwing 
    #define RFM95_CS  2    // "E"
    #define RFM95_RST 16   // "D"
    #define RFM95_INT 15   // "B"
    */
     
    /* Feather 32u4 w/wing
    #define RFM95_RST     11   // "A"
    #define RFM95_CS      10   // "B"
    #define RFM95_INT     2    // "SDA" (only SDA/SCL/RX/TX have IRQ!)
    */
     
    /* Feather m0 w/wing 
    #define RFM95_RST     11   // "A"
    #define RFM95_CS      10   // "B"
    #define RFM95_INT     6    // "D"
    */
     
    /* Teensy 3.x w/wing */
    #define RFM95_RST     9   // "A"
    #define RFM95_CS      10   // "B"
    #define RFM95_INT     4    // "C"
    
     
    // Change to 434.0 or other frequency, must match RX's freq!
    #define RF95_FREQ 433.0
     
    // Singleton instance of the radio driver
    RH_RF95 rf95(RFM95_CS, RFM95_INT);
     
    void setup() 
    {
      pinMode(RFM95_RST, OUTPUT);
      digitalWrite(RFM95_RST, HIGH);
     
      while (!Serial);
      Serial.begin(9600);
      delay(100);
     
      Serial.println("Feather LoRa TX Test!");
     
      // manual reset
      digitalWrite(RFM95_RST, LOW);
      delay(10);
      digitalWrite(RFM95_RST, HIGH);
      delay(10);
     
      while (!rf95.init()) {
        Serial.println("LoRa radio init failed");
        while (1);
      }
      Serial.println("LoRa radio init OK!");
     
      // Defaults after init are 434.0MHz, modulation GFSK_Rb250Fd250, +13dbM
      if (!rf95.setFrequency(RF95_FREQ)) {
        Serial.println("setFrequency failed");
        while (1);
      }
      Serial.print("Set Freq to: "); Serial.println(RF95_FREQ);
      
      // Defaults after init are 434.0MHz, 13dBm, Bw = 125 kHz, Cr = 4/5, Sf = 128chips/symbol, CRC on
     
      // The default transmitter power is 13dBm, using PA_BOOST.
      // If you are using RFM95/96/97/98 modules which uses the PA_BOOST transmitter pin, then 
      // you can set transmitter powers from 5 to 23 dBm:
      rf95.setTxPower(23, false);
    }
     
    int16_t packetnum = 0;  // packet counter, we increment per xmission
     
    void loop()
    {
      Serial.println("Sending to rf95_server");
      // Send a message to rf95_server
      
      char radiopacket[20] = "Hello World #      ";
      itoa(packetnum++, radiopacket+13, 10);
      Serial.print("Sending "); Serial.println(radiopacket);
      radiopacket[19] = 0;
      
      Serial.println("Sending..."); delay(10);
      rf95.send((uint8_t *)radiopacket, 20);
     
      Serial.println("Waiting for packet to complete..."); delay(10);
      rf95.waitPacketSent();
      // Now wait for a reply
      uint8_t buf[RH_RF95_MAX_MESSAGE_LEN];
      uint8_t len = sizeof(buf);
     
      Serial.println("Waiting for reply..."); delay(10);
      if (rf95.waitAvailableTimeout(1000))
      { 
        // Should be a reply message for us now   
        if (rf95.recv(buf, &len))
        {
          Serial.print("Got reply: ");
          Serial.println((char*)buf);
          Serial.print("RSSI: ");
          Serial.println(rf95.lastRssi(), DEC);    
        }
        else
        {
          Serial.println("Receive failed");
        }
      }
      else
      {
        Serial.println("No reply, is there a listener around?");
      }
      delay(1000);
    }

I hung a scope probe on pin 4 of the Teensy (IRQ from RF9x module) to verify interrupts were indeed firing. I then instrumented the RH RF95 interrupt handler code to print the current _mode and IRQ flags upon entry.

Interestingly, the first send IRQ printed 0x308 (_mode = Tx, flags = TxDone), while the second send IRQ printed 0x208 (_mode = idle, flags = TxDone), after which the sketch always hung in the following call to “rf95.waitPacketSent()”.

I started instrumenting the sketch with various delays and debug prints and found a combo which consistently yielded IRQ outputs of 0x308 and the sketch no longer hung after the second send.

This led me to believe that the RH library might have a timing window where a race condition could cause it to misbehave.

I read the RH code for “RH_RF95::send” looking for where Tx might be initiated without a corresponding _mode set and zeroed in on “RH_RF95::setModeTx()”.

Code:
void RH_RF95::setModeTx()
{
    if (_mode != RHModeTx)
    {
	spiWrite(RH_RF95_REG_01_OP_MODE, RH_RF95_MODE_TX);
	spiWrite(RH_RF95_REG_40_DIO_MAPPING1, 0x40); // Interrupt on TxDone
	_mode = RHModeTx;
    }
}

As you can see, the code sets _mode after sending the Tx command to the module via SPI, leaving a timing window in which a fast interrupt can reach the handler before _mode has been set correctly.

I simply moved the _mode assignment ahead of the spiWrite calls, and the sketch then ran continuously with IRQ outputs of 0x308 (correct _mode for the IRQ), with or without the debug delays and prints.

Code:
void RH_RF95::setModeTx()
{
    if (_mode != RHModeTx)
    {
	_mode = RHModeTx;		// set mode prior to sending Tx cmd to avoid fast TxCmplt race condition
	spiWrite(RH_RF95_REG_01_OP_MODE, RH_RF95_MODE_TX);
	spiWrite(RH_RF95_REG_40_DIO_MAPPING1, 0x40); // Interrupt on TxDone
//	_mode = RHModeTx;
    }
}
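
To make the window concrete, here is a toy model that compiles on a desktop. It is not RadioHead code. It assumes the worst case, where the radio raises TxDone before the mode-setting routine has returned, and it assumes the handler only honours TxDone while _mode is Tx, which is how I read the driver. ToyDriver, writeTxCommand and wouldHang are made-up names for illustration only.

```cpp
#include <stdint.h>

// Hypothetical stand-ins for the mode values seen in the debug output
// (2 = idle, 3 = Tx); this is not the library's own enum.
enum Mode : uint8_t { ModeIdle = 2, ModeTx = 3 };

// Toy model of the driver. The "radio" raises TxDone as soon as the TX
// command is written, which models the fast-interrupt case.
struct ToyDriver {
    volatile Mode mode = ModeIdle;
    bool txDoneSeen = false;

    // TxDone is only honoured while the driver believes it is in Tx mode.
    void handleInterrupt() {
        if (mode == ModeTx) {
            txDoneSeen = true;
            mode = ModeIdle;
        }
        // Otherwise the event is silently dropped.
    }

    // Stand-in for the SPI write of the TX command; the interrupt fires
    // before this call returns.
    void writeTxCommand() { handleInterrupt(); }

    // Original ordering: command first, then the mode variable.
    void setModeTxOriginal() {
        if (mode != ModeTx) {
            writeTxCommand();
            mode = ModeTx;
        }
    }

    // Reordered: mode variable first, then the command.
    void setModeTxReordered() {
        if (mode != ModeTx) {
            mode = ModeTx;
            writeTxCommand();
        }
    }

    // A wait loop that spins while the mode is Tx would never exit
    // if this returns true and no further interrupt arrives.
    bool wouldHang() const { return mode == ModeTx; }
};
```

With the original ordering the TxDone event is dropped and the mode is left at Tx with no further interrupt coming, which matches the hang I saw. With the reordered version the handler sees Tx and puts the mode back to idle.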

I also moved the _mode set in “RH_RF95::setModeRx()” and “RH_RF95::isChannelActive()” as these also cause interrupts to fire and exhibited the same potential timing windows.

As the Radiohead lib supports numerous radio modules with different specific low level code there are sure to be other potential windows for race conditions. I only verified the fix for my specific case.

I noticed that there is a Radiohead library in the Teensyduino distro that appears to be quite a bit older. I did try it with my setup and it hung as well. Looking very briefly at that code I didn’t see any specific Teensy related changes such as spi transactions. It may be time for Paul to consider updating the library as newer versions have added various params and features.

Looking at v1.69 of the RH library, the code for “RH_RF95::setModeTx()”, “RH_RF95::setModeRx()” and “RH_RF95::isChannelActive()” is the same as v1.67 (& v1.62), so I’d expect the race condition to be in v1.69 as well.

Hopefully the above may help others who encounter similar issues using the Radiohead library.

David
 
I have pretty much the same problem with Radiohead and RFM95W radios. I used KurtE's version, which made the sketch always get stuck at rf95.waitPacketSent(). The original version also has this problem, but occurring at unpredictable intervals.
 
Looks like we need to play around some more...

Yep - I mainly added begin/end transactions at the places where it was controlling the CS pin. Also I duplicated the hardwarespi module for hardware SPI1 and hardware SPI2...

If you have a version of the program that always gets stuck, it would be good to see. I am playing around with simple stuff with the RF95s, which is currently just using RHDatagram.

And likewise, if you have fixes that work, I will be glad to pull them into my version, and then Paul can decide at some point to take in a pull request with these.

Kurt
 
Hi Kurt,

Is your version the same as what is in the Teensyduino 1.35 distribution? If so, the posted sketch does hang reliably (on a 3.2 w/ the Adafruit rf9x Feather module) on the second rf95.waitPacketSent(). If not, can you point me to your version so I can test with it? The version in the 1.35 distro is somewhat out of date and lacks a few features such as putting the rf95 into high power mode.

I only made changes to the rf95.cpp module as that's the radio module I'm using. The changes are simple and I'd be glad to post them. However, there are at least two places in the rf95 driver that will spin forever if the code misses an interrupt or otherwise gets out of sync. I'm looking at implementing some timeouts for these cases, and perhaps putting some fail safe code in the interrupt handler.
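
For what it's worth, the kind of timeout I have in mind looks roughly like this. It is a desktop-compilable sketch, not library code: waitWithTimeout is a made-up helper, and the "done" test stands in for whatever the driver would poll (for example, the mode no longer being Tx).

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper, not part of RadioHead: poll `done` until it returns
// true or `timeout` elapses. Returns true on completion, false on timeout.
bool waitWithTimeout(const std::function<bool()>& done,
                     std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!done()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // caller can reset the radio or retry
        }
    }
    return true;
}
```

On the Teensy the same shape would use millis() and an unsigned subtraction for the elapsed time, so the spin loops give up and report failure instead of waiting forever on a missed interrupt.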

Also, I'm hoping to use the RH library along with the SD library on my 3.2 project. From what I see in the current RH distribution, it doesn't look like it would play nice with other SPI users. I'm guessing that your version addresses this with SPI transactions. Without transaction support I was considering switching to a 3.5 to separate the SD from the RH SPI bus.

So yes, point me to whatever RH version you'd like to work with and I'll test with that and share any problems and fixes I come up with.

David
 
I pulled my version from your Github. Like dgranger, it always gets stuck at the second waitPacketSent(). Same with the version included in TD. If I use the latest version from mikem, it hangs intermittently. I'm using it on a T3.2 with Adafruit RFM95W breakout.
 
Hi David and Epyon - As mentioned, I am using the version that I hacked up. Epyon - have you tried making the change that David did to see if that fixes the issue?

As mentioned, this looks like a true race condition. I hate these, as the proper fix is often unclear. That is, is it possible to get a non-TX interrupt in the window between setting the flag and telling the radio to go into TX mode? If so, you just get a race in the other direction. I also hate using fixes like cli/sei to resolve these.

I will make the same fix you did in my version, as it seems more valid than the current code.
 
Moving the _mode sets is pretty safe. The IRQ flags on a RF95 are broken out so you know the source of the interrupt (RxTimeout, RxDone, RXCRC, TxDone, CadDone, etc). Unfortunately, the RF95 interrupt handler in the RH library doesn't handle interrupts whose source flags don't match the currently set _mode, which results in hangs. That's why I was looking into adding some fail safe error handling.
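
A rough idea of what I mean by fail safe handling: decide what to do from the flags first, and treat a disagreement with _mode as something to report rather than ignore. This is a desktop-compilable sketch with made-up names (dispatchIrq, IrqResult); the bit values are the ones I read from the SX127x RegIrqFlags description, so check them against the datasheet before relying on them.

```cpp
#include <stdint.h>

// Illustrative constants, not RadioHead's names.
constexpr uint8_t kRxTimeout = 0x80;
constexpr uint8_t kRxDone    = 0x40;
constexpr uint8_t kCrcError  = 0x20;
constexpr uint8_t kTxDone    = 0x08;
constexpr uint8_t kCadDone   = 0x04;

enum class Mode : uint8_t { Idle, Tx, Rx, Cad };

struct IrqResult {
    Mode nextMode;  // mode the driver should be in after the interrupt
    bool mismatch;  // true if the flags did not fit the mode on entry
};

// Decide the next mode from the interrupt source, whatever the driver
// currently believes its mode to be.
IrqResult dispatchIrq(Mode current, uint8_t flags) {
    if (flags & kTxDone) {
        return { Mode::Idle, current != Mode::Tx };
    }
    if (flags & kCadDone) {
        return { Mode::Idle, current != Mode::Cad };
    }
    if (flags & (kRxTimeout | kCrcError)) {
        return { current, current != Mode::Rx };  // keep listening
    }
    if (flags & kRxDone) {
        return { Mode::Idle, current != Mode::Rx };
    }
    return { current, false };  // nothing we recognise
}
```

Note this complements moving the _mode sets and does not replace it: if the handler runs before setModeTx() has stored Tx, the later store would still leave the driver in Tx with no interrupt pending.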

Anyway, the sketch posted in #1 on this thread always hangs for me with every RH library I've tried (1.62, 1.67, 1.69, & the one in the 1.35 Teensyduino).

I'd be happy to test w/ your (Kurt) version as SPI transactions would be a plus in the long run.
 
Sounds good. Will try your fix, which is nice and lightweight.

If there is still an issue, I was not sure whether it would make sense to use a bigger hammer, like:

Code:
void RH_RF95::setModeTx()
{
    ATOMIC_BLOCK_START;
    if (_mode != RHModeTx)
    {
	spiWrite(RH_RF95_REG_01_OP_MODE, RH_RF95_MODE_TX);
	spiWrite(RH_RF95_REG_40_DIO_MAPPING1, 0x40); // Interrupt on TxDone
	_mode = RHModeTx;
    }
    ATOMIC_BLOCK_END;
}

Do you find the same issue with setModeRx?
 
I thought about turning off interrupts around the setModes, but was worried that the SPI stuff might take long enough to disrupt other time critical interrupt handling. So I simply moved the _mode variable sets, which eliminated the problem as far as I can tell.

The original test sketch simply loops forever transmitting packets and looking for a reply. Since it hung immediately in waitPacketSent() I worked on tracking down the source on the Tx side. I didn't test the RX code, I simply noticed that all the setModes set the _mode var after sending the command to the radio module and therefore had the potential for a race condition. I believe that only Tx, Rx, and Cad commands can result in interrupts so those are the ones I changed.
 
I pushed up this change for TX and RX modes and tried out the simple server/client (TX/RX) sketches, and I don't think I am getting the hang now.
Warning: these are configured for my Well monitor board, with the RFM95 soldered directly on the board using SPI1 pins.
Client:
Code:
// Feather9x_TX
// -*- mode: C++ -*-
// Example sketch showing how to create a simple messaging client (transmitter)
// with the RH_RF95 class. RH_RF95 class does not provide for addressing or
// reliability, so you should only use RH_RF95 if you do not need the higher
// level messaging abilities.
// It is designed to work with the other example Feather9x_RX

#include <SPI.h>
#include <RH_RF95.h>
#define TRY_SPI1
#ifdef TRY_SPI1
#include <RHHardwareSPI1.h>
// MISO 1, MOSI 0, SCK 20
#define RFM95_CS 31
#define RFM95_RST 37
#define RFM95_INT 2

// SPI1 Miso=D5, Mosi=21, sck=20, CS=31
RH_RF95 rf95(RFM95_CS, RFM95_INT, hardware_spi1);

// Singleton instance of the radio driver
#else

/* for feather32u4 
#define RFM95_CS 8
#define RFM95_RST 4
#define RFM95_INT 7
*/

/* for feather m0 */
#define RFM95_CS 8
#define RFM95_RST 4
#define RFM95_INT 3

/* for shield 
#define RFM95_CS 10
#define RFM95_RST 9
#define RFM95_INT 7
*/
// Singleton instance of the radio driver
RH_RF95 rf95(RFM95_CS, RFM95_INT);
#endif

// Change to 434.0 or other frequency, must match RX's freq!
#define RF95_FREQ 915.0


void setup() 
{
  pinMode(RFM95_RST, OUTPUT);
  digitalWrite(RFM95_RST, HIGH);

  while (!Serial);
  Serial.begin(9600);
  delay(100);

  Serial.println("Feather LoRa TX Test!");

  // manual reset
  digitalWrite(RFM95_RST, LOW);
  delay(10);
  digitalWrite(RFM95_RST, HIGH);
  delay(10);

#ifdef TRY_SPI1
  SPI1.setMISO(5);
  SPI1.setMOSI(21);
  SPI1.setSCK(20); 
#endif

  while (!rf95.init()) {
    Serial.println("LoRa radio init failed");
    while (1);
  }
  Serial.println("LoRa radio init OK!");

  // Defaults after init are 434.0MHz, modulation GFSK_Rb250Fd250, +13dbM
  if (!rf95.setFrequency(RF95_FREQ)) {
    Serial.println("setFrequency failed");
    while (1);
  }
  Serial.print("Set Freq to: "); Serial.println(RF95_FREQ);
  
  // Defaults after init are 434.0MHz, 13dBm, Bw = 125 kHz, Cr = 4/5, Sf = 128chips/symbol, CRC on

  // The default transmitter power is 13dBm, using PA_BOOST.
  // If you are using RFM95/96/97/98 modules which uses the PA_BOOST transmitter pin, then 
  // you can set transmitter powers from 5 to 23 dBm:
  rf95.setTxPower(23, false);
}

int16_t packetnum = 0;  // packet counter, we increment per xmission

void loop()
{
  Serial.println("Sending to rf95_server");
  // Send a message to rf95_server
  
  char radiopacket[20] = "Hello World #      ";
  itoa(packetnum++, radiopacket+13, 10);
  Serial.print("Sending "); Serial.println(radiopacket);
  radiopacket[19] = 0;
  
  Serial.println("Sending..."); delay(10);
  rf95.send((uint8_t *)radiopacket, 20);

  Serial.println("Waiting for packet to complete..."); 
  //delay(10);
  rf95.waitPacketSent();
  // Now wait for a reply
  uint8_t buf[RH_RF95_MAX_MESSAGE_LEN];
  uint8_t len = sizeof(buf);

  Serial.println("Waiting for reply..."); delay(10);
  if (rf95.waitAvailableTimeout(1000))
  { 
    // Should be a reply message for us now   
    if (rf95.recv(buf, &len))
    {
      Serial.print("Got reply: ");
      Serial.println((char*)buf);
      Serial.print("RSSI: ");
      Serial.println(rf95.lastRssi(), DEC);    
    }
    else
    {
      Serial.println("Receive failed");
    }
  }
  else
  {
    Serial.println("No reply, is there a listener around?");
  }
  delay(1000);
}
Server:
Code:
// rf95_server.pde
// -*- mode: C++ -*-
// Example sketch showing how to create a simple messaging server
// with the RH_RF95 class. RH_RF95 class does not provide for addressing or
// reliability, so you should only use RH_RF95  if you do not need the higher
// level messaging abilities.
// It is designed to work with the other example rf95_client
// Tested with Anarduino MiniWirelessLoRa, Rocket Scream Mini Ultra Pro with
// the RFM95W, Adafruit Feather M0 with RFM95

#include <SPI.h>
#include <RH_RF95.h>
#define TRY_SPI1
#ifdef TRY_SPI1
#include <RHHardwareSPI1.h>
// MISO 1, MOSI 0, SCK 20
#define RFM95_CS 31
#define RFM95_RST 37
#define RFM95_INT 2

// SPI1 Miso=D5, Mosi=21, sck=20, CS=31
RH_RF95 rf95(RFM95_CS, RFM95_INT, hardware_spi1);

// Singleton instance of the radio driver
#else
//  RH_RF95(uint8_t slaveSelectPin = SS, uint8_t interruptPin = 2, RHGenericSPI& spi = hardware_spi);
#define RFM95_CS 10
#define RFM95_RST 9
#define RFM95_INT 2

RH_RF95 rf95(RFM95_CS, RFM95_INT);
#endif

// Change to 434.0 or other frequency, must match RX's freq!
#define RF95_FREQ 915.0


void setup() 
{
  pinMode(RFM95_RST, OUTPUT);
  digitalWrite(RFM95_RST, HIGH);

  Serial.begin(9600);
  while (!Serial) ; // Wait for serial port to be available

#ifdef TRY_SPI1
  SPI1.setMISO(5);
  SPI1.setMOSI(21);
  SPI1.setSCK(20); 
#endif

  if (!rf95.init())
    Serial.println("init failed");  
  // Defaults after init are 434.0MHz, 13dBm, Bw = 125 kHz, Cr = 4/5, Sf = 128chips/symbol, CRC on

  if (!rf95.setFrequency(RF95_FREQ)) {
    Serial.println("setFrequency failed");
    while (1);
  }
  Serial.print("Set Freq to: "); Serial.println(RF95_FREQ);
  rf95.setTxPower(23, false);

}

void loop()
{
  if (rf95.available())
  {
    // Should be a message for us now   
    uint8_t buf[RH_RF95_MAX_MESSAGE_LEN];
    uint8_t len = sizeof(buf);
    if (rf95.recv(buf, &len))
    {
//      digitalWrite(led, HIGH);
//      RH_RF95::printBuffer("request: ", buf, len);
      Serial.print("got request: ");
      Serial.println((char*)buf);
      Serial.print("RSSI: ");
      Serial.println(rf95.lastRssi(), DEC);
      
      // Send a reply
      uint8_t data[] = "And hello back to you";
      delay(5);
      bool sent = rf95.send(data, sizeof(data));
      Serial.print("Send reply ");
      Serial.println((int)sent, DEC);
      rf95.waitPacketSent();
      Serial.println("Sent a reply");
//       digitalWrite(led, LOW);
    }
    else
    {
      Serial.println("recv failed");
    }
  }
}
Both sides are showing messages in TyQt (oops I mean TyCommander) window and have been running for maybe 10 minutes or so.
 
Which fix? The ATOMIC ops or the move of the _mode var sets?

Also can I get your changes to test with? I'm very interested in transaction support for the RH lib.
 

Kurt, have you ever tried software SPI with your stuff, radio or otherwise? For that matter, is there even a proven SoftSPI library for Teensy 3.x anywhere? Sounds like it might save some conflict problems, à la "begin/end transactions".
 
No, I have not tried the software SPI. Again, so far I have mainly only used this to play with my LoRa radios on my pseudo Flex boards, where I have the radio added for a T3.5/6 setup using an alternate SPI. In my case SPI is used for the ili9341 display and SPI1 is used for the radio.
 
Kurt, aha, I was just looking at the forked RH library on your github page, and lo and behold, they actually have SoftSPI files.

I use the RFM69 radios myself, but with the Moteino library, and so far only with Mega1284 and ProMini-328 processors. Inside the Moteino ISR, they have a complete packet xfer from the radio in one shot using [h.w.] SPI, and I am thinking that soft-SPI may be just as fast overall on a T3.x, and even more so to avoid spi-transactions being used, as appears to be the standard case with RH. And also, since the RFM bitrates are only 55 Kbps or somesuch, the packet xfers to the host should still be plenty fast using soft-SPI.

A year or more ago, I had tried [and given up on] using the RFM radio on the T3.1 while sharing the single SPI bus with an LCD etc., but never had much success. I looked for a soft-SPI but hadn't found one at the time. Maybe a good time to revisit the issue.
 
The RH lib has software SPI, which can be specified in the constructor for the particular radio driver. In addition, Bill Greiman has a software SPI driver he includes in his SdFat library. I believe he measured the soft clock rate at ~4-5 MHz on a Teensy 3.2. The FIFO on a RF95 is 255 bytes, so ~0.5 ms to transfer a full-length incoming msg to a local buffer via (Greiman's) soft SPI.
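
The arithmetic behind that estimate, as a small desktop-compilable helper. spiTransferMicros is a made-up name, and it only counts clocked bits, ignoring any per-byte or chip-select overhead.

```cpp
#include <stdint.h>

// Microseconds needed to clock `bytes` bytes at `clockHz`, counting only
// the 8 data bits per byte (no per-byte or chip-select overhead).
constexpr uint32_t spiTransferMicros(uint32_t bytes, uint32_t clockHz) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(bytes) * 8u * 1000000u) / clockHz);
}
```

For a 255-byte FIFO this gives 510 us at 4 MHz and 408 us at 5 MHz, i.e. roughly half a millisecond.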

The RH code assumes that the SPI bus is exclusively its own, without regard to other users and certainly without SPI transactions. So if you need support for multiple SPI devices, you'll want to use soft SPI for the RH lib or perhaps for the other SPI device(s). Another option is to use a 3.5 or 3.6 with multiple SPI ports. In my case I only want to use the RF95 and a SD card, so I plan on using SdFat w/ Greiman's soft SPI driver, which is plenty fast for my needs. See this link for more info.
 
Oh my, I had missed the bit about SD card and radio on the T3.2, ie same SPI port. I've almost shot myself in the head trying to do that sort of thing in the past on the T3.1. SD card on a bus with "anything" else tends to be a disaster from my experience. I've been able to run LCD, SPI RAM, and Arducam together on a single T3.1 SPI bus, but never with SD. Anymore, I use the T3.6 with its 3 1/2 SPI ports for such projects, then no conflicts and no worry about spi-transactions either.

You might check this thread, and some of my comments towards the end, especially post #59, ie on how some SD card modules are built, and the bit about SD not-releasing MISO [supposedly fixed]. Just some additional info. Maybe you'll have better luck, SD just don't like me at all, :).
https://forum.pjrc.com/threads/37652-microSD-slot-on-teensy-3-6
 
KurtE: I have done some testing with your updated library and the simple client/server setup. On a first attempt, the server only received the first message from the client, responded and then got stuck again. I added some delay in the main loop, and then it worked fine. I then deleted the delay again, and it continued to function just fine for all the tests. Weird, but seems to keep working now.

Code:
#include <SPI.h>
#include <RH_RF95.h>

#define RFM95_CS 10
#define RFM95_RST 9
#define RFM95_INT 2

// Change to 434.0 or other frequency, must match RX's freq!
#define RF95_FREQ 868.0

// Singleton instance of the radio driver
RH_RF95 rf95;
//RH_RF95 rf95(5, 2); // Rocket Scream Mini Ultra Pro with the RFM95W
//RH_RF95 rf95(8, 3); // Adafruit Feather M0 with RFM95 

int led = 15;

void setup() 
{
  pinMode(RFM95_RST, OUTPUT);
  digitalWrite(RFM95_RST, HIGH);
  pinMode(15, OUTPUT);
  
  Serial.begin(9600);
  delay(3000);
  Serial.println("Arduino LoRa RX Test!");

  // manual reset
  digitalWrite(RFM95_RST, LOW);
  delay(10);
  digitalWrite(RFM95_RST, HIGH);
  delay(10);
  
  //while (!Serial) ; // Wait for serial port to be available
  if (!rf95.init()){
    Serial.println("init failed");  
  }
  else{
    Serial.println("LoRa radio init OK!");
  }

  if (!rf95.setFrequency(RF95_FREQ)) {
    Serial.println("setFrequency failed");
    while (1);
  }
  Serial.print("Set Freq to: "); Serial.println(RF95_FREQ);

  // Defaults after init are 434.0MHz, 13dBm, Bw = 125 kHz, Cr = 4/5, Sf = 128chips/symbol, CRC on
  rf95.setTxPower(23, false);
}

void loop()
{
  if (rf95.available())
  {
    // Should be a message for us now   
    uint8_t buf[RH_RF95_MAX_MESSAGE_LEN];
    uint8_t len = sizeof(buf);
    if (rf95.recv(buf, &len))
    {
      digitalWrite(led, HIGH);
      RH_RF95::printBuffer("request: ", buf, len);
      Serial.print("got request: ");
      Serial.println((char*)buf);
      Serial.print("RSSI: ");
      Serial.println(rf95.lastRssi(), DEC);
      
      // Send a reply
      uint8_t data[] = "And hello back to you";
      rf95.send(data, sizeof(data));
      rf95.waitPacketSent();
      Serial.println("Sent a reply");
      delay(1500); //just a quick hack to make the led visible longer
       digitalWrite(led, LOW);
    }
    else
    {
      Serial.println("recv failed");
    }
  }
  /*else{
    if(printdelay > millis()+2000){
      Serial.print("waiting");
      printdelay = millis();
    }
  }*/
}
 
Strange!

So far it appears to be working for me. Again I am using real basic RHDatagram...

I was running into issues and pulling my few hairs out, trying to figure out why I was not receiving valid data, until I figured out I was not properly initializing the length field before I passed in a pointer to it... funny how doing some of the simple things can drive you nutty.... It took me a while to figure it out, as the code was in a subroutine, so the variable was on the stack and it used whatever garbage happened to be there. So for a while the code worked, as the garbage value was large enough to read in the message, but then the code changed and the code paths changed, and then more often than not the garbage was a 0 and so no data bytes were read in...
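
To show the pitfall without a radio attached, here is a desktop-compilable stand-in. mockRecv is a made-up function that only imitates the in/out length convention of a recv(buf, &len) style call: on entry len is the capacity of buf, on return it is the number of bytes copied.

```cpp
#include <stdint.h>
#include <string.h>
#include <algorithm>

// Hypothetical stand-in for a recv(buf, &len) style call. `msg`/`msgLen`
// play the part of the packet waiting in the driver's buffer.
bool mockRecv(const uint8_t* msg, uint8_t msgLen, uint8_t* buf, uint8_t* len) {
    if (!buf || !len) return false;
    const uint8_t n = std::min(*len, msgLen);  // never copy more than the caller allows
    memcpy(buf, msg, n);
    *len = n;
    return true;
}
```

If len happens to be 0 on entry the call still "succeeds" but copies nothing, which is exactly what bit me. Setting len = sizeof(buf) immediately before every call avoids it.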
 
Hi Epyon,

If you're experiencing hangs or other strange behavior in the Radiohead lib it might help to see if there's a disconnect between the current RH state and the interrupt source. I ended up putting a debug print into the RF95 interrupt handler to help me identify the Tx/Rx race condition.

If your hangs reappear you might try adding an (uncommented) print to the RH_RF95::handleInterrupt() routine like in the following snippet:

Code:
void RH_RF95::handleInterrupt()
{
    // Read the interrupt register
    uint8_t irq_flags = spiRead(RH_RF95_REG_12_IRQ_FLAGS);

//	Serial.print("i ");  Serial.println(_mode<<8 | irq_flags, HEX);  // DEBUG: output _mode & irq flags

    if (_mode == RHModeRx && irq_flags & (RH_RF95_RX_TIMEOUT | RH_RF95_PAYLOAD_CRC_ERROR))

Unlike some of the other radio modules, the RF9x breaks out the source of the interrupt to individual bits in the irq_flags byte so it's easy to see if there's a mismatch. This may help to ID the source of the hang.
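
Reading the printed value is just a matter of splitting it back into its two bytes. The helper below is a desktop-compilable sketch with made-up names (DebugWord, decodeDebugWord, txDoneOutsideTx); the mode numbers 2 = idle and 3 = Tx and the 0x08 TxDone bit are taken from the 0x308 and 0x208 outputs reported earlier in this thread.

```cpp
#include <stdint.h>

struct DebugWord {
    uint8_t mode;   // high byte: _mode at handler entry
    uint8_t flags;  // low byte: IRQ flags register
};

// Split a value printed as (_mode << 8 | irq_flags) back into its parts.
inline DebugWord decodeDebugWord(uint16_t word) {
    return { static_cast<uint8_t>(word >> 8),
             static_cast<uint8_t>(word & 0xFF) };
}

// True when TxDone (0x08) arrived while the mode was not Tx (3).
inline bool txDoneOutsideTx(uint16_t word) {
    const DebugWord d = decodeDebugWord(word);
    return (d.flags & 0x08) != 0 && d.mode != 3;
}
```

So 0x308 decodes as mode 3 with TxDone set (consistent), while 0x208 decodes as mode 2 with TxDone set, which is the mismatch to look for.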

I have the same fix as found in KurtE's lib in my version of RH and it's running without issue for me. But if there is another hole in the driver it would be great to get it identified and patched.
 
Rather than using a Serial.print() inside the ISR, I should think it would be a lot better - and especially when ISR timing is critical - to capture the values (_mode<<8 | irq_flags) either to a volatile int or an indexed array, and then set a flag, also volatile, and then use the flag to trigger Serial.print() operations out in the "main" loop.
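
Something along these lines is what I have in mind. It is a desktop-compilable sketch with made-up names (logFromIsr, popInLoop), assuming a single writer (the ISR) and a single reader (the main loop).

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical single-producer (ISR) / single-consumer (loop) log.
constexpr size_t kLogSize = 16;  // must be a power of two; holds kLogSize - 1 entries
volatile uint16_t logBuf[kLogSize];
volatile uint8_t logHead = 0;  // written only by the ISR
volatile uint8_t logTail = 0;  // written only by the main loop

// Call from the interrupt handler: store the value, drop it if the log is full.
void logFromIsr(uint16_t value) {
    const uint8_t next = static_cast<uint8_t>((logHead + 1u) & (kLogSize - 1u));
    if (next == logTail) return;  // full, drop the newest entry
    logBuf[logHead] = value;
    logHead = next;
}

// Call from loop(): returns true and fills `out` if an entry was pending.
bool popInLoop(uint16_t& out) {
    if (logTail == logHead) return false;  // nothing pending
    out = logBuf[logTail];
    logTail = static_cast<uint8_t>((logTail + 1u) & (kLogSize - 1u));
    return true;
}
```

In the handler you would call logFromIsr(_mode << 8 | irq_flags), and in loop() drain with popInLoop() and print there. Of course this only helps as long as loop() is still running.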
 
While I agree with the idea of avoiding prints in ISRs, in this case the interrupt timing is non-critical. The RF9x will hold the state of an IRQ until it's read, and then not interrupt again until the flags are reset. Moreover, RF9x interrupts are in response to a command event such as Tx a buf or Rx, not some asynchronous event.

So yes, conventional wisdom says don't call outside an IRQ, especially Print, but in this case it's an okay way to debug.

Do remember that the IRQ-induced hangs often prevent the main loop from running...
 
Arduino 1.8.1, Teensy 3.2, Radiohead 1.74, Teensyduino 1.35

I am using the Adafruit RFM95 LoRa module and having the same problem as the OP using the same code. It hangs on the second “rf95.waitPacketSent()”.

Following the tip from post 8 in https://forum.pjrc.com/threads/4063...Ra-Radio-issue?p=126458&viewfull=1#post126458 I changed the attachInterrupt from RISING to HIGH, which fixed it for me. It seems to be bulletproof (I'm not saying it is correct, though).

RH_RF95.cpp

Code:
	attachInterrupt(interruptNumber, isr0, HIGH);
 