Teensy Threads, Mutex in IntervalTimer causes code to crash

Hello,

I am working on a data logger for a trackday (race track) motorcycle. The full code requires CAN bus, GPS on Serial1, etc., so below is a skeleton with the same structure. It doesn't reproduce the issue as often, but it does still crash.

Full implementation hardware is: Teensy 4.1, a NEO-M9N GPS receiver via Serial1, Adafruit Airlift ESP32 based wifi module via SPI
Test program below hardware: Teensy 4.1 + extra output on Serial4 for debugging (crash report)

In my real code I use threads and an interval timer and loop as follows:

  • loop updates an LCD (not included in the example)
  • thdGPS is a TeensyThreads thread that reads GPS data from Serial1, does some GPS processing and then sends the GPS data out over a WiFi TCP connection via the Adafruit Airlift
  • rc3TimerHandler is the handler for my interval timer, which runs at 40Hz. It gets data from a class that is populated from the CAN bus and then sends that data out over the WiFi TCP connection via the Adafruit Airlift

In my example I output to Serial where my real implementation would send strings to the TCP server. I also normally use a Mutex to control updating and reading of my bike stats class between the CAN handler and rc3TimerHandler, but did not include that in the example below.

I have a 2nd Serial connection on Serial4 to view the crash reports while I debug since Serial is scrolling with data soon after the crash report when it happens.

If I remove the Mutex call and output in rc3TimerHandler for my full implementation the program is stable for 30min but as soon as I enable my rc3 output it crashes usually within seconds.

For the example code below I get the following crash report. As you can see it sometimes crashed within 1 min, sometimes after 4 min, and I have also had it run for 15 min, so I assume some race condition?

Code:
CrashReport:
  A problem occurred at (system time) 22:3:49
  Code was executing from address 0x850
  CFSR: 40000
        (INVPC) Usage fault: invalid EXC_RETURN
  Temperature inside the chip was 50.26 °C
  Startup CPU clock speed is 600MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected

Setup Complete
Report coming ...
CrashReport:
  A problem occurred at (system time) 22:7:33
<SNIP>
Setup Complete
Report coming ...
CrashReport:
  A problem occurred at (system time) 22:8:25
<SNIP>
Setup Complete
Report coming ...
CrashReport:
  A problem occurred at (system time) 22:12:27

When I try to find which line that is using addr2line, the result isn't really helpful:

Code:
addr2line -e C:\Users\jeff\AppData\Local\Temp\arduino_build_666607\TestCrash.ino.elf -a 0x850
0x00000850
arm-none-eabi-addr2line: Dwarf Error: Can't find .debug_ranges section.
libc_a-__call_atexit.o:?

Looking in the lst file for the program it seems to be in the Teensy Threads code?

Code:
     848:	f8c2 8000 	str.w	r8, [r2]
  __enable_irq();
     84c:	b662      	cpsie	i
  __asm volatile("svc %0" : : "i"(Threads::SVC_NUMBER));
     84e:	df21      	svc	33	; 0x21
  __disable_irq();
     850:	b672      	cpsid	i
  int old_state = currentActive;
     852:	6813      	ldr	r3, [r2, #0]
  currentActive = STOPPED;
     854:	6014      	str	r4, [r2, #0]
  __enable_irq();
     856:	b662      	cpsie	i


I am wondering if there is some sort of conflict between the IntervalTimer and the interrupt/timing of Teensy Threads?

Looking for any thoughts on another approach if there is some conflict between the items I am using. I have had my real implementation working with just GPS and LCD but with my new motorcycle I have CAN bus and want to add data from it to the output that goes to the phone for actual lap timing and logging. (app called Race Chrono)

Edit: Additional details once I typed initial post

I use TyCommander to monitor the Serial output of the Teensy. If I click the Serial button to stop the Serial data scrolling and then try to reconnect by clicking the button again, the Teensy crashes within seconds without showing any Serial data. Furthermore, if I don't try to reconnect the Serial it also crashes pretty quickly. I have yet to investigate this new finding further (it's the same crash execution address in the crash report).

Suggestions/Comments Welcome
Thanks, Jeff


Code:
#include <Arduino.h>
#include <TeensyThreads.h>

IntervalTimer rc3Timer;
Threads::Mutex mutexOutput;

#define GPS_BUFFER_SIZE 150
#define RC3_HZ 40
char gpsLine[GPS_BUFFER_SIZE];
int gpsCount = 0;
char rcLine[GPS_BUFFER_SIZE];
char tempRCLine[GPS_BUFFER_SIZE];

class StringDumper : private Print
{
  public:
    StringDumper(const Printable &p)
    {
      this->println(p);
    }
    operator const char *() const
    {
      return buf.c_str();
    }

  protected:
    size_t write(uint8_t b) override
    {
      buf.append((char)b);
      return 1;
    }
    String buf;
};

void rc3TimerHandler()
{
  sprintf( rcLine, "Normally this would be logging data: %lu\r\n", millis() );
  {
    Threads::Scope scope(mutexOutput);
    Serial.print(rcLine);
  }
}

int thdIdGPS = -1;
void thdGPS()
{

  while (true)
  {
    sprintf( gpsLine, "Normally this would be GPS data: %lu\r\n", millis() );
    {
      Threads::Scope scope(mutexOutput);
      Serial.print(gpsLine);
    }

    memset(gpsLine, 0, sizeof(gpsLine));
    gpsCount = 0;
    threads.yield();
  }
}

void setup() {
  Serial.begin(115200);
  Serial4.begin(115200);
  if (CrashReport)
  {
    StringDumper report(CrashReport);
    // doing it like this because in the real program I put the crash report out to SD and Serial4 for debug
    Serial.println("Report coming ...");
    Serial4.println("Report coming ...");
    delay(10000);
    Serial.print(report);
    Serial4.print(report);
    delay(10000);
  }
  threads.setSliceMicros(10);
  Serial.println("Set default time slice");
  thdIdGPS = threads.addThread(thdGPS);
  Serial.println("Added GPS thread");
  rc3Timer.begin(rc3TimerHandler, 1000000 / RC3_HZ);
  Serial.println("Started RC3 timer");
  Serial4.println("Setup Complete");
}

void loop() {
  // put your main code here, to run repeatedly:

}
 
A simple "int func()" here the other day was CrashReport'ing because it had no 'return intX;' - I didn't see the build warning for far too long ...

I used the addr2line that was already on this computer, which used to work, and it gave a funny result, maybe the same as yours.

be sure to use the oddly named 'arm-none-eabi-addr2line.exe' installed with teensyduino tools - found here in:
...\AppData\Local\Arduino15\packages\teensy\tools\teensy-compile\11.3.1\arm\bin
and
...\AppData\Local\Arduino15\packages\teensy\tools\teensy-compile\1.56.1\arm\bin

This computer is not up to date with the IDE 2 libs ... but that may be the solution to finding the code, FWIW
<edit> IDE 2.1 and TD 1.59b2 are installed - so the addr2line in 11.3.1 should be right for the current beta ...

Bummer is ... as cool as TeensyThreads is ... the context switching causing trouble is common when code isn't built to tolerate it.
 
Generally you can't use a mutex from an interrupt handler context, which is where the IntervalTimer callback is called from. The reason is that if the mutex blocks, the program will switch to a different thread, including switching the current stack, which holds the information needed to return from the interrupt.
 
Didn't check, but wondered what priority TeensyThreads uses for what is assumed an interrupt for task switching? If that were lower, then higher PRI interrupt code would be safe to complete and exit.

If not lower then when it exits MCU would 'want' to get back to that ...
 
Whether it used an interrupt to switch threads or not, for a co-operative switch (the current thread needs to be paused because the resource it needs is owned by someone else) the stack would get switched regardless. The new active thread would start running in the interrupt context without the information to return from it.
If it tried to use a lower priority interrupt to perform the task switch, it simply wouldn't work; the current interrupt context would be higher and the interrupt simply wouldn't trigger, leaving the thread that was meant to be paused still running.
 
Just looked - seems this is #ifndef __IMXRT1062__:
Code:
IntervalTimer context_timer;
...
/*
 * Implementation strategy suggested by @tni in Teensy Forums; see
 * https://forum.pjrc.com/threads/41504-Teensy-3-x-multithreading-library-first-release
 */

  // lowest priority so we don't interrupt other interrupts
  context_timer.priority(255);

and for "#ifdef __IMXRT1062__" uses a GPT timer instead. But, the same lowest priority 255 is requested:
Code:
  if (gpt_number == 0) {
    if (! NVIC_IS_ENABLED(IRQ_GPT1)) {
      attachInterruptVector(IRQ_GPT1, &gpt1_isr);
      NVIC_SET_PRIORITY(IRQ_GPT1, 255);
      NVIC_ENABLE_IRQ(IRQ_GPT1);
      gpt_number = 1;
    }
    else if (! NVIC_IS_ENABLED(IRQ_GPT2)) {
      attachInterruptVector(IRQ_GPT2, &gpt2_isr);
      NVIC_SET_PRIORITY(IRQ_GPT2, 255);
      NVIC_ENABLE_IRQ(IRQ_GPT2);
      gpt_number = 2;
    }

<edit> crosspost with #5 - so TeensyThreads won't interrupt/task switch during any interrupt priority over 239 - only during what would be loop() code in a normal sketch.
 
What it really needs is semaphores, since they aren't "owned" by any thread and are typically safe to use in interrupt handlers (handler releases a semaphore which a regular thread is waiting on). A volatile int can do the same job if only one thread is waiting on it (and slightly less efficiently).
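
A minimal sketch of the volatile-flag idea, assuming only one thread ever waits on it. The handler just sets a flag and the thread does the actual output, so the mutex is never touched in interrupt context. The names rc3Due, onTimerTick and takeIfDue are made up for illustration; on the Teensy, onTimerTick would be the IntervalTimer callback and takeIfDue would be polled from the GPS thread. It is written as plain C++ so it can be tried on a PC.

```cpp
// Hypothetical names throughout; plain C++ stand-in for the Teensy sketch.

// Set by the timer handler, cleared by the one thread that consumes it.
volatile int rc3Due = 0;

// Stand-in for the IntervalTimer callback: no mutex, no printing, just a flag.
void onTimerTick()
{
  rc3Due = 1;
}

// Called from thread context (for example between GPS lines).
// Returns true at most once per pending request, so the caller can then
// take the mutex and send the RC3 line outside the interrupt.
bool takeIfDue()
{
  if (!rc3Due)
    return false;
  rc3Due = 0;
  return true;
}
```

Note that ticks arriving before the thread gets around to checking are coalesced into one, which should be acceptable if a perfect 40Hz is not required.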
 
Not saying mutex interaction can't cause trouble ... just that, by design, Threads won't switch during normal interrupts, which default to a mid-level PRI.

UART code is fed by interrupts, which would be safe - but it seems any read/write buffer transfer on a user thread could be interrupted by a task switch, perhaps leaving the head/tail code in an unsafe state for the duration. The same would apply to other library usage that loses context mid-process ...
 
If a thread tries to claim a mutex that isn't available, it HAS to switch. That's a co-operative switch, compared to a pre-emptive switch (triggered by end of the thread's timeslice, e.g. a timer triggering).
 
Hello,

Wow, go to sleep and wake to ideas, thanks. A bunch of these thoughts match where my head was going as I lay in bed thinking rather than sleeping. I don't need a perfect 40Hz for the extra data, so I will investigate adding it to the same loop as the GPS data. The more important part is that I need to wait for the gap between GPS lines to inject the additional CAN-based data lines. As mentioned, a volatile global might let me trigger the addition of the data in the main loop.

Seems I forgot to add the software config I am using ... just for completeness of the thread: I use PlatformIO (latest) for real dev, but used Arduino IDE 1.8.18 + Teensyduino 1.58 for the test program.

Will redesign things a bit and see if it works better without the conflict in the interrupt processing
Thanks, Jeff
 
The usual ways to synchronize with an ISR are critical sections (which disable interrupts temporarily so the ISR never runs while the resource is claimed by the rest of the program; this means the ISR effectively owns the resource and thus can always complete and exit), or queues and event counters, so the ISR never has to claim the resource and just queues stuff up for an event loop to process.
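
As a rough illustration of the queue approach, here is a single-producer/single-consumer ring buffer, assuming one ISR pushes and one thread pops. The Sample struct and SampleQueue class are hypothetical names, not part of TeensyThreads or the original sketch. The volatile indices rely on the ISR and the thread sharing one core; on a PC with real threads std::atomic would be the safer choice.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sample type; the real one would hold the CAN-derived values.
struct Sample
{
  uint32_t timeMs;
  int rpm;
};

// Single-producer / single-consumer ring buffer holding up to N-1 samples.
// Only the ISR writes `head`; only the thread writes `tail`.
template <size_t N>
class SampleQueue
{
  public:
    // ISR side: never blocks; drops the sample if the queue is full.
    bool push(const Sample &s)
    {
      size_t next = (head + 1) % N;
      if (next == tail)
        return false;
      items[head] = s;
      head = next;
      return true;
    }

    // Thread side: returns false when there is nothing to process.
    bool pop(Sample &out)
    {
      if (tail == head)
        return false;
      out = items[tail];
      tail = (tail + 1) % N;
      return true;
    }

  private:
    Sample items[N];
    volatile size_t head = 0;
    volatile size_t tail = 0;
};
```

With something like this, the timer handler would only call push(), and the thread would drain the queue with pop() and do the mutex-protected output itself.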

Or put another way, ISRs are usually the highest priority, so if they have to block for a mutex and you aren't using priority inheritance, you have deadlock. If you use priority inheritance, the ISR would then have to schedule the owning thread so it can release the mutex, and immediately regain control so it can return in a timely fashion.

I think it's easier to use critical sections, which effectively claim any resources used by any ISRs (of the relevant priority levels).

One way to run timer events without being in an ISR context is for a timer interrupt to post an event counter which is tested by a suitable event loop. This is sort of what polling millis() or micros() does, using the time value itself as the event counter.
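
A small sketch of the event-counter idea, with made-up names (tickCount, postTick, takeTicks). It assumes a single interrupt increments the counter and a single event loop consumes it, and that a 32-bit read is atomic on the target, which is the usual case on a 32-bit MCU.

```cpp
#include <cstdint>

// Incremented only by the timer interrupt (hypothetical name).
volatile uint32_t tickCount = 0;

// Stand-in for the timer ISR: posting the event is a single increment.
void postTick()
{
  tickCount = tickCount + 1;
}

// Event-loop side: remembers the last count it handled.
// Returns how many ticks occurred since the previous call, so missed
// ticks are visible instead of silently lost.
uint32_t takeTicks()
{
  static uint32_t handled = 0;
  uint32_t now = tickCount;          // one read of the shared counter
  uint32_t pending = now - handled;  // unsigned math survives wrap-around
  handled = now;
  return pending;
}
```

The event loop would call takeTicks() on each pass and run the timer work whenever the result is non-zero, in normal thread context where taking a mutex is allowed.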
 