T4.1 Intermitent Crashes with Teensy Threads

TeensyDigital · Jul 7, 2022

I have an amateur rocket flight telemetry application on a T4.1 that runs a main loop and two thread loops. The code base is (relatively) large, as it is tracking and managing over 100 variables and sampling sensors at 400 Hz. One of my thread loops (A) is for logging data to QSPI Flash using LittleFS and the other thread loop (B) is for sampling four I2C sensors (2 accelerometers, a gyro, and a barometer) and doing quaternion math on the output. The main loop is monitoring flight events, serial GPS data, serial radio data, and another dozen "housekeeping" items. Recently I have been working on the LittleFS logging and experimenting with the logging rates. In my bench testing I started to notice that the T4.1 was crashing randomly about every 90 minutes with a "Data Access Violation". After 3-4 crashes I started to get organized about isolating the cause. In the field it is rare that the device is on for over an hour, but we have seen resets in the field that we (falsely?) attributed to RFI issues. In the last week of testing I've also seen the crash happen within minutes.

I suspect the issue relates to shared variables across the sampling and logging loops that are causing instability. I have about 75 variables, organized in structs, that get populated (written) in the (B) sampling thread. The (A) logging thread "reads" these variables, constructs long strings with them and then writes the log entry to flash. There are only about 5 variables (counters) that are updated by both threads. These variables I have declared as volatile, but I do not have the other 75+ variables declared as volatile (although I tried and it didn't make a difference). I am not doing anything with pointers or writing directly to memory. The strings for logging are being constructed using strcpy, strcat, and sprintf. Most of the sensor data is different size ints and floats.

My threads are initiated with a stack size of 4096 and 2048 and are running with setMicroTimer to 5 (very small time slices for massive amounts of fast ITC reading). I've also got my code loaded with thread.yields() after every logic step in the code in all three threads.

Here is what I know, so far...

1. If I turn off thread A or thread B the device does not crash, but when they are both running it crashes at a random time from 2-120 minutes in duration.
2. In thread B (sampling), if I comment out 3 of the four sensors it still crashes, regardless of which sensor I keep active. Each sensor read also results in a lot of computation and global variables being set. There are no shared variables across the four sensor loops.
3. If I comment out all of the LittleFS code in thread A and only retain the logging string construction the board still crashes. So, it does not seem to be related at all to LittleFS.
4. I am monitoring the stack size in both threads and they are both stable and healthy.
5. If I increase the rate of logging (rate of string construction) to 150-200 times per second the crashes are more frequent (2-20 minutes). If I decrease the rate of logging (string construction) to 10X per second the crashes still happen, but I have gone as long as 8 hours without a crash.
6. I was working in TD1.55, for better LFS performance, but I moved everything to the latest TD1.56 and there was no change in behavior.

I have been experimenting with locking around the string construction functions in the logging loop, but that seemed to make it worse not better. Although, I don't feel like I fully have a handle on how locking works and the implications to the other thread. I am not sure how sensitive threading is to shared use of variables. It is obvious to me that if a variable is getting updated by two different threads it needs to be volatile, but what if one thread writes and the other thread just reads the shared variable?

I've got at least 40-50 crash reports that are all the same "Data Access Violation", but the Code Execution address and the Accessed Address are always different, event when I just recycle the power and rerun over and over without touching code.

I appreciate any help to troubleshoot further. I am going to continue to try and isolate the code to something reasonably small and frequently repeatable, but my trial and error cycle time is still 60-90 minutes.

Any insight into 1) better shared variable management with threading (e.g., how exactly does locking work?) or 2) any other diagnostics I can use to evaluate the crash reports or the environment as it is running?

Thanks,

Mike

Example Crash Reports:

Code:

CrashReport:
  A problem occurred at (system time) 0:29:48
  Code was executing from address 0x17C2A
  CFSR: 82
	(DACCVIOL) Data Access Violation
	(MMARVALID) Accessed Address: 0x343331
  Temperature inside the chip was 42.50 °C
  Startup CPU clock speed is 396MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected

CrashReport:
  A problem occurred at (system time) 0:30:28
  Code was executing from address 0x8B8A
  CFSR: 82
	(DACCVIOL) Data Access Violation
	(MMARVALID) Accessed Address: 0x39003836
  Temperature inside the chip was 41.20 °C
  Startup CPU clock speed is 396MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected

CrashReport:
  A problem occurred at (system time) 21:28:25
  Code was executing from address 0x15C8E
  CFSR: 82
	(DACCVIOL) Data Access Violation
	(MMARVALID) Accessed Address: 0x303000
  Temperature inside the chip was 41.85 °C
  Startup CPU clock speed is 396MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected

CrashReport:
  A problem occurred at (system time) 0:3:51
  Code was executing from address 0x9776
  CFSR: 82
	(DACCVIOL) Data Access Violation
	(MMARVALID) Accessed Address: 0x380000
  Temperature inside the chip was 41.85 °C
  Startup CPU clock speed is 396MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected

ps -- I love the new CrashReport capture feature on restart. That will make recovery "in the air" much easier.

defragster · Jul 7, 2022

Bummer on the elusive Crash. First thought is a pointer/array being misused or abused?
If you move to TD 1.57b3 some added 'BreadCrumbs' can be tracked with the CrashReport. This might help narrow down or locate the source/reason for the problem.

The only documentation is 'read the code' so far. But when I glanced at it, it seemed some few (6) locations were made available to pass in values that will display on restart with CrashReport"
void breadcrumb(unsigned int num, unsigned int value) {
Then on print of CrashReport:

Code:

    for (int i=0; i < 6; i++) {
      if (bc->bitmask & (1 << i)) {
        p.print("  Breadcrumb #");
        p.print(i + 1);
        p.print(" was ");
        p.print(bc->value[i]);
        p.print(" (0x");
        p.print(bc->value[i], HEX);
        p.println(")");
      }
    }

How those 6 words might help is based on your code. Perhaps set one to __LINE__ on entry to key functions - to get an idea of the function path last used. And for functions only used on one thread or the other use a unique breadcrumb. Also knowing which thread was active when the crash happened might be useful.

Also there is a thread where other NonVolatile regions of the 1062 could be used. The 1062 has 4 DWORDS that may provide faster access than the breadcrumbs that have to flush cache:
pjrc.com/threads/67449-Teensy-4-x-Battery-backed-Non-volitile-memory?p=281008&viewfull=1#post281008

Using thread.yield() when a task is done is probably a good choice, so the thread voluntarily completes a task before being 'interrupted'. Perhaps if that is well done then the time slice of the threads can be made artificially larger than 'needed', and those well behaved threads will yield unused time so they are not being interrupted?

IIRC: I looked at that code and some 'DEBUG' thread monitoring of cycles used was added. Not sure how that might help - perhaps each thread could set a variable to ZERO on start and set to getCyclesUsed() before calling .yield
> each thread could check the other thread's 'variable' and when left ZERO it would indicate it was interrupted instead of doing a graceful .yield. If that 'ZERO' were found to associate with the CRASH it may point to a problem of task switching for code not written to tolerate that?
> also reading the THREADS code and seeing that done might inspire some other DEBUG info that would be useful?
I did that much and it got added to the sources, I had other ideas 'back then' that seemed like they might help, but not in the general case.

CollinK · Jul 7, 2022

I'd say you might want to rethink where the thread.yield() calls are. Every yield is going to cause the next thread to run. Doing that too much just means you have constant thread switches that maybe you don't need. That you can make it crash more or less frequently in the ways you listed seems to suggest it's some sort of race condition and the more times the threads race to access things the worse it gets. So, fewer task switches might actually make it better. The processor is fast so maybe you could get away with letting each thread run a full loop if it can be done within the 5us window. That is, yield at the end of the loop but not before. It may or may not help.

I think everything you access between threads might want to be volatile as it can otherwise be tough for the optimizer to know when things might have changed. If side A writes and side B reads then how does side B know if the value updated? Did side A store the value in a register or in RAM? It could use a register for speed and then side B might not even see the update. To be safest you want to know that both sides are really reading and writing in the same place. This probably won't cause crashes but it might cause you to read stale values.

Lastly, have you used addr2line with your binary to see where those code execution points are found? It might offer some insight if you knew where it was crashing. I see that the addresses vary a lot but what functions are they in? If the execution points seem nonsensical then maybe you have some serious stack corruption and it is driving off into the weeds and trying to execute nonsense.

TeensyDigital · Jul 7, 2022

Thanks guys.

@collinK, that is an interesting theory. My original rationale for using a lot of yields was twofold, 1) it is better to switch threads when convenient then get caught in the middle of something on the switch and 2) The I2C bus with all the sensors is running at 400 Kilohertz and I wanted to make sure we didn't miss a pulse of processing on the bus. (e.g, go back more often).

That said, I isolated more code and sped up the string processing, so I can now get it to reliably crash <10 min. I fired up addr2line and analyzed five crashes (under 4 different compiled files). 2 of the addr2line results came back ??, but three separate (different edited compiles) came back with the code address belonging to TeensyThreads.cpp:643, which is the thread.yield() function.

I will try removing some/most of the yields and see how it impacts the crashes.

defragster · Jul 8, 2022

Wrote a sketch to abuse .yield() and not seeing any troubles - will post after running overnight if there is any Crash.
So far not seen any anomalies, except loop() reset of volatile count in thread_func() would get skipped at times {showing a double count} when the other 4 threads were disabled?
Perhaps p#4 with fewer yields worked out? Though risk of timing/function issues with task switch as in p#2, or pointer fail seem more to be expected.

Output looks like this - ending one run and starting again after fresh upload of code to follow (note the NvRam values on restart for debug):

Code:

1933088 279713 64756 6500 650 
3859937 284015 65753 6600 660 
1928313 288319 66749 6700 670 

C:\T_Drive\tCode\tThreads\ThreadYield\ThreadYield.ino Jul  8 2022 01:43:39
[B]Last NvRam: 288953 66896 6715 672[/B] 
0 0 0 0 0 
1930741 4309 992 100 10 
1932097 8614 1987 200 20 
1928947 12917 2983 300 30

It doesn't apply to the unseen code at hand: Other than .yield abuse with loop() and 5 threads, where 4 .yield (two of them twice) and showing the use of the 4 NVRAM_UINT32 values working to hold values as long as the Teensy stays powered (same as CrashReport).

The 5 threads have their own counters running on 5 us clock tick. The first from example is free running, the next .yields after each INC, the last three INC and .delay() 1, 10 or 100 ms.
Using a T_4.1 that has a PCB with an LED on all edge pins, so a 36 pin light show.
Four {11-14} are filler from loop marking seconds - I got a new SPI display - maybe tomorrow that can be integrated on those pins

Code:

#include <TeensyThreads.h>

// Accessing as 4 32 bit DWORDS of NVRAM backed data on 1062 MCU
// For T_3.x there are 7 DWORDS, the 8th is used by PJRC to track valid RTC clock
#if defined(__IMXRT1062__)
uint32_t *[B]NVRAM_UINT32 [/B]((uint32_t *)0x400D4100);
#define DWORDS 4
#else
uint32_t *NVRAM_UINT32 ((uint32_t *)0x4003E000);
#define DWORDS 7
#endif

volatile uint32_t count = 0;
volatile uint32_t countL = 10000000;

uint8_t tPins[] = { 2, 3, 4, 5, 6, 7, 8, 9,   24, 25, 26, 27, 28, 29, 30, 31,    23, 22, 21, 20, 19, 18, 17, 16,   40, 41, 39, 38, 37, 36, 35, 34  , 14, 13, 11, 12 };
void doLED( uint32_t bank, uint32_t val ) {
  val = (7 + val) % 8;
  digitalToggle( tPins[ (bank * 8) + val ] );
}

void thread_func(int inc) {
  while (1) {
    count += inc;
  }
}

void thread_func1(int inc) {
  while (1) {
    NVRAM_UINT32[0] += inc;
    uint32_t num = (countL % count);
    doLED( 3, num );
    threads.yield();
  }
}

void thread_func2(int inc) {
  while (1) {
    NVRAM_UINT32[1] += inc;
    doLED( 0, NVRAM_UINT32[1] );
    threads.delay( 1 );
    threads.yield();
  }
}

void thread_func3(int inc) {
  while (1) {
    NVRAM_UINT32[2] += inc;
    threads.delay( 5 );
    threads.yield();
    doLED( 1, NVRAM_UINT32[2] );
    threads.yield();
    threads.delay( 5 );
  }
}

void thread_func4(int inc) {
  while (1) {
    NVRAM_UINT32[3] += inc;
    doLED( 2, NVRAM_UINT32[3] );
    threads.delay( 50 );
    threads.yield();
    threads.delay( 50 );
    threads.yield();
  }
}

void setup() {
#if defined(__IMXRT1062__)
  [B]SNVS_LPCR |= (1 << 24); //Enable NVRAM - documented in SDK[/B]
#endif
  for ( uint32_t ii = 0; ii < sizeof(tPins); ii++ ) {
    pinMode( tPins[ ii ], OUTPUT );
    digitalToggle( tPins[ ii ] );
  }
  Serial.begin(115200);
  while (!Serial && millis() < 4000 );
  if ( CrashReport ) Serial.print( CrashReport );
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
  [B]Serial.printf("Last NvRam: %d %d %d %d \n", NVRAM_UINT32[0], NVRAM_UINT32[1], NVRAM_UINT32[2], NVRAM_UINT32[3] );[/B]
  NVRAM_UINT32[0] = NVRAM_UINT32[1] = NVRAM_UINT32[2] = NVRAM_UINT32[3] = 0;
  threads.setSliceMicros( 5 );
  threads.addThread(thread_func, 1);
  threads.addThread(thread_func1, 1);
  threads.addThread(thread_func2, 1);
  threads.addThread(thread_func3, 1);
  threads.addThread(thread_func4, 1);
}

void loop() {
  static uint32_t lCnt = 0;
  uint32_t myCnt = count;
  count = 0;
  countL = myCnt;
  Serial.printf("%d %d %d %d %d \n", myCnt, NVRAM_UINT32[0], NVRAM_UINT32[1], NVRAM_UINT32[2], NVRAM_UINT32[3] );
  lCnt++;
  digitalToggle( tPins[ (4 * 8) + (3 + lCnt) % 4 ] );
  threads.delay(1000);
}

mjs513 · Jul 8, 2022

In your description you described that you are reading 2 accel, a gryo and barometer plus doing quaternion math in the same thread. Then in another thread you are doing the datalogging of I am assuming you are logging GPS data + the raw sensor data + your calc data. Besides what @CollinK mentioned about volatile you might be creating a race condition where you are trying to save your data at the same time you are trying to log that data. You might want to also consider some sort of data ready flag from your sensor thread so that your logging thread can test on to log that data. This may be why you are seeing improvement in time between crashes when you log slower.

defragster · Jul 8, 2022

Still running p#5 code here, 11.5+ hours.

Made a vid - first 4 secs is real time - next 11 secs is Super SloMo 120 fps shows what the blurred LEDS are counting. Not good but playing with phone ...
ThreadsYieldLEDS

Started above, where first is count/sec of a thread - staying constant, the other four are cumulative counts since start based - where only last three are limited by thread .delay() rate 1,10,100ms:

Code:

C:\T_Drive\tCode\tThreads\ThreadYield\ThreadYield.ino Jul  8 2022 01:43:39
Last NvRam: 288953 66896 6715 672 
0 0 0 0 0 
1930741 4309 992 100 10 
1932097 8614 1987 200 20 
1928947 12917 2983 300 30
...  // Time would be :: Jul  8 2022 13:18:00
1931867 179355931 41537667 4168200 416820 
1927355 179360235 41538662 4168300 416830

So, 11.5 hours and .yield abuse is tolerated. The counts look as to be expected, except the middle 1ms delay count is a bit shorted where 41.5M should be 41.6M, but other two are closely following each other

As @mjs513 suggests there may be a race or conflict with indexing to data - maybe a pointer isn't synchronized and that is the fault/crash?

Was just confirming Threads code function - did not read into the 'user' code threads tasks as closely as Mike - but interrupting ongoing operations is suspect. And short time slice will be likely pulling during 'work'.

The addr2line pointing to .yield may be a red herring where it gets 'blamed' for what happened before it was called - it is just a short piece of code that does task switch, so having it ID'd as 'caller' of something seems odd?

TeensyDigital · Jul 8, 2022

Ugh. This has been a real pain to isolate. Like pushing jello around a table. I've tried a dozen different things to isolate the issue to specific code or configuration with no luck. In the dozen different crashes since last night there are some trends emerging in the addr2line "executing from address" data. The offending line is inconsistent and the time to crash is wildly inconsistent, but there is some pattern. The addr2line's have landed on thread.yield() at least 20% of the time. Another 25% of the time it points to ?? and when analyzing the sym file it seems that the address is used for core system functions (all different). Whenever it points back to my primary sketch the offending line is in my Main Thread (not the other two threads) -- a few times pointing to void loop() and also the last } in void loop (). It has also pointed to lines of code with global (shared and local) variables being read, but that could just be a coincidence. On my main loop I had a lot of logic that was being checked every cycle, so I bracketed all of the code in timers to only run routines at 500 Hz -- hoping that would prevent collisions or race conditions. But that did not seem to help. I also substantially increased my memory monitoring and I didn't see anything unusual - it does not appear to have anything to do with memory consumption or a memory leak that I can identify. Other things I tried were to increase the threads.setMicroTimer to 10 and 20 and 30... but no change in crashes. I tried removing large portions of code, but like with the sample thread if I remove all the code in a loop it doesn't crash, but if I remove the first half of the code it crashes or if I remove the second half of the code it crashes, so the issue is something more common or bigger than a line of code (I suspect). I'll keep isolating...

mjs513 · Jul 8, 2022

Know you said the code base is large but maybe if you post the code we can see something or suggest some modifications. We are just guessing at this point. At least for me I have put several sensors in one thread and gps in another thread and it seemed to work. Never really tried logging data though

defragster · Jul 8, 2022

Have you looked at those 4 NVRAM DWORDS that you could use to track the last run code?

Like the BreadCrumbs in new CrashReport - they are written and printed in loop - but residual value showing in p#5 setup.

breadcrumbs give 6 - but will have more overhead than those writes when doen often

defragster · Jul 8, 2022

@TeensyDigital - has TD 1.57b3 been installed?

If so, here is an example showing CrashReport.breadcrumb

I started with the PJRC example : examples/Data_Access_Violation/Data_Access_Violation.ino

Which shows what causes a "Data_Access_Violation". It is writing to an invalid pointer:

Code:

// MyFault - collection of examples to trigger fault exceptions

// Example for :  void breadcrumb(unsigned int num, unsigned int value) {
// num is a value 1 to 6 for possible unique storage entries
// 

void setup() {
  CrashReport.breadcrumb( 1, 0x5000000 | __LINE__ ); // Upper bits hold '5' perhaps indicating func() for ref, lower bits show line #
  CrashReport.breadcrumb( 2, millis() );
  Serial.begin(9600);
  CrashReport.breadcrumb( 3, millis() );
  while (!Serial && millis() < 4000 );
  CrashReport.breadcrumb( 4, millis() );
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
  CrashReport.breadcrumb( 5, millis() );
  Serial.print(CrashReport);
  CrashReport.breadcrumb( 6, 0xdeadbeef );
  *(volatile uint32_t *)0x30000000 = 0; // causes Data_Access_Violation
}

void loop() {
}

/* sample output

C:\T_Drive\tCode\T4\CrashBreadCrumbs\CrashBreadCrumbs.ino Jul  8 2022 18:22:51
CrashReport:
  A problem occurred at (system time) 18:22:54
  Code was executing from address 0x1B0
  CFSR: 82
  (DACCVIOL) Data Access Violation
  (MMARVALID) Accessed Address: 0x30000000
  Temperature inside the chip was 43.45 °C
  Startup CPU clock speed is 600MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected
  Breadcrumb #1 was 83886088 (0x5000008)
  Breadcrumb #2 was 300 (0x12C)
  Breadcrumb #3 was 367 (0x16F)
  Breadcrumb #4 was 367 (0x16F)
  Breadcrumb #5 was 367 (0x16F)
  Breadcrumb #6 was 3735928559 (0xDEADBEEF)

*/

TeensyDigital · Jul 8, 2022

Thanks guys. Here is a link to the code and a few libraries that it requires. Surprisingly, this will still compile and run on a stock Teensy (without the custom PCB and components), but I cannot get it to crash, because none of the code is actually running in the logging thread or the sample thread once the flash and sensors error out on start-up. I've forked the code 30 different times to trial-and-error isolation and solutions, but this version is pretty close to the original. The code is messy, as it migrated over three platforms the last five years (last it was on an ESP32), but it has been otherwise stable and has been launched in a rocket successfully over 75 times.

I will also start looking at the NVRAM DWORDS to see if I can use them to track the last run code. There is a lot going on simultaneously in the code, so four markers isn't much, but I'll see what I can do.

mjs513, the general approach used in this code is to have autonomous loops (threads) going in parallel with logging and processing not having to worry about sampling or sampling state. The logging is for postmortem analysis only and is being captured at 20-50 hz during launch, so if the sampling loop is midway through a calculation and one line in the logging is off it really doesn't matter much. That said, the way sampling works is to collect the sample and do the processing in local variables and then drop the value into the master global variable at the end for others to pick up. So, there really isn't a need to put locks around the sampling logic -- assuming it can write global variables while other threads read global variables.

https://github.com/AllDigital33/RocketTalk5-Crash

defragster · Jul 8, 2022

As noted, the DWORDS are sadly only 4 and have to be accessed as DWORDS

But as a quick example this can 'pack' things that fit into those 32 bits:
CrashReport.breadcrumb( 1, 0x5000000 | __LINE__ ); // Upper bits hold '5' perhaps indicating func() for ref, lower bits show line #

Also, the p#11 code - if TeensyDuino updated to 1.57 Beta3 - shows how to have those 6 DWORD values available as well.

Actually, much of the upper HALF of RAM2/DMAMEM is STATIC and unchanged when power is maintained. It takes an explicit pointer - and CACHE FLUSH like done in the CrashReport code. Paul goes to extreme to CRC and verify in that code before reporting CrashReport data.

If you pick a spot say 32KB below this pointer : struct crashreport_breadcrumbs_struct *bc = (struct crashreport_breadcrumbs_struct *)0x2027FFC0;

You can write anything there you like. But it won't survive restart ( it will only be in CACHE ) unless the area is FLUSHED:
> arm_dcache_flush((void *)bc, sizeof(struct crashreport_breadcrumbs_struct));

NOTE: The pointer could be made as DMAMEM, but the lower region of that IS overwritten by the 1062 on startup, and that is where alloc starts. And I found that order of declaration doesn't always determine which area is assigned low to high.

Hopefully that gives enough of an idea to make use of that area for extra data?

An LFS RamDrive during testing was made 'survivable' across restarts, but those were made 'start fresh' on release.

mjs513 · Jul 9, 2022

TeensyDigital said:
Thanks guys. Here is a link to the code and a few libraries that it requires. Surprisingly, this will still compile and run on a stock Teensy (without the custom PCB and components), but I cannot get it to crash, because none of the code is actually running in the logging thread or the sample thread once the flash and sensors error out on start-up. I've forked the code 30 different times to trial-and-error isolation and solutions, but this version is pretty close to the original. The code is messy, as it migrated over three platforms the last five years (last it was on an ESP32), but it has been otherwise stable and has been launched in a rocket successfully over 75 times.

I will also start looking at the NVRAM DWORDS to see if I can use them to track the last run code. There is a lot going on simultaneously in the code, so four markers isn't much, but I'll see what I can do.

mjs513, the general approach used in this code is to have autonomous loops (threads) going in parallel with logging and processing not having to worry about sampling or sampling state. The logging is for postmortem analysis only and is being captured at 20-50 hz during launch, so if the sampling loop is midway through a calculation and one line in the logging is off it really doesn't matter much. That said, the way sampling works is to collect the sample and do the processing in local variables and then drop the value into the master global variable at the end for others to pick up. So, there really isn't a need to put locks around the sampling logic -- assuming it can write global variables while other threads read global variables.

https://github.com/AllDigital33/RocketTalk5-Crash

Thanks for posting the code - think it will help us in trying to help you. I haven't done much in model rockets in years so bringing back memories. Only played with Estes/Centurion etc type rockets nothing high powered (especially since i live in NYC). So bringing back memories. Anyway I digress.

I started looking through the code and just a cursory review of the structure of the code and what you are doing in the loops I see a couple of potential challenges. Here are a few things to consider:
1. Stack size: reading through the teensythread thread I believe there is a maximum stack size of 10K and you are already using 6K in 2 threads. Never really used that option so may rethink how you are using. Remember the Loop is also considered a thread - thread 0. So you may want to set that up as well.
2. I don;t usually use setMicrotimer as I spend too much in the threads so I usually stick with something like this:

Code:

  compass = threads.addThread(getCompass);
  threads.setTimeSlice(0, 1);
  threads.setTimeSlice(compass, 100);

The compass is this case is a BNO055. Notice I set the timeslice for thread 0 as well
3. Just looking at the sampling thread, would recommend you put all the sensor config code in a simple config function called from setup and not in the thread.
4. delays in threads - don't use things like delay(10) but threads.delay(10) instead. threads.delay lets other threads uses yield and allows other threads to use the time
5. Serial.prints from threads - would avoid it unless you are using thread locks around serial.prints, If one thread is calling serial.print while another thread is you are going to have an issue - may want to read this on locking-https://github.com/ftrias/TeensyThreads#locking
6. Think you have way too many threads.yield in you sampling thread - usually I put it at the end of the while loop. May want to rethink.
7. Don't forget to use volatiles for variables.
8. I notice that you are using i2cdevlib - that one brings back memories

. Would check out a different lib for the LSM9 sensor that sticks with just using the Wire library.

Again this is just a cursory review will get into more.

Kind of have a few other things on the agenda today but going to pull out some of my old sensors and may be play a bit with it.

defragster · Jul 9, 2022

Didn't disturb the Lightshow.Yield p#5 test ... still running over 31 hours. Did power up another T_4.1 for the p#11 .breadcrumb example. Didn't get as far as Mike clicking the code link, his notes look good.
> volatile is important - but not perfect as reported above. If the volatile vars are more than one uint32_t or critical for pointer index/reference they are suspect. Not sure if a ring buffer would help pass info between threads?
I just played with info found here and it worked - though extended it to use uint32_t's:

// https://www.downtowndougbrown.com/2013/01/microcontrollers-interrupt-safe-ring-buffers/
// https://forum.pjrc.com/threads/5496...n-the-same-pin?p=301287&viewfull=1#post301287

Also incorporated @KurtE's RAMStream class to print to RAM in that example. With edits it might work to save to restart persistent DMAMEM memory {indicated above} that could be printed on restart after a fault?

Just wondering where THREADS codes runs? Not on user level lowest priority interrupt level but the task switcher must be higher ?

Example p#5 seems to be running consistently and increments and lights are as expected:

Code:

1928313 288319 66749 6700 670 

C:\T_Drive\tCode\tThreads\ThreadYield\ThreadYield.ino Jul  8 2022 01:43:39
Last NvRam: 288953 66896 6715 672 
0 0 0 0 0 
1930741 4309 992 100 10 
1932097 8614 1987 200 20 
...
1930664 492249830 114001894 11439800 1143980      // would be:: Jul 9 2022 09:30
1926413 492254133 114002891 11439900 1143990 
...
[B]1934583 736533076 170576346 17116900 1711690      // would be:: Jul 10 2022 01:16 // 47.5 hours[/B] 2 days running well, stopped to add display
1928454 736537380 170577341 17117000 1711700

That p#5 just stressing the mechanics of .yield() ... short task, exit time slice and return to code after that on next slice.
Only thread doing printing or something non-trivial is loop():thread_0
> doing multi slice device I/O on short slices

As noted in p#11 example to cause the Data_Access_Violation seen here - it was pointer abuse/trash. Not sure what is getting trashed where or why. Since a large % are pointing to yield and threads code space - maybe adding a DEBUG integrity check to the threads structures could catch it after it happens and before it causes trouble?

... gotta run

joepasquariello · Jul 9, 2022

mjs513 said:
I started looking through the code and just a cursory review of the structure of the code and what you are doing in the loops I see a couple of potential challenges.

I second Mike's suggestions, especially the question on use of Serial printing from multiple threads without locking. That has been the source of problems in quite a few forum questions related to TeensyThreads and preemptive OS generally. Wondering if you can state the requirements and strategy for sampling and logging at a very high level? For example,

sample_thread
- read ADXL every X ms
- read Baro every Y ms
- etc
logging_thread
- build log string
- write log string to file every Z ms (or N samples?)

EDIT: the reason I'm asking about timing is that the method of having a set of variables for each sensor, such as tConv/tMeasure/tDV, and advancing these values with relative steps is essentially a state-machine design within the thread, and that is sort of redundant complexity. The great advantage of multi-threading is to be able to code the application as a set of tasks, each of which is a logically sequential process, and to let the scheduler take care of the switching back and forth. In this case, the switching should generally go from sampling -> logging when the sampling thread is either (a) doing nothing, or (b) waiting for a (relatively slow) I2C transfer to complete. Likewise, switch from logging to sampling should occur when (a) logging is idle, or (b) logging is waiting for a (relatively slow) QSPI transfer to complete.

TeensyDigital · Jul 9, 2022

Thanks guys. The feedback has been really useful. I feel like I'm getting closer, but also no smoking gun.

I've got NVRAM markers in the code now. These are great, since they survive the crash, but they are dog slow so I can't use a lot of them. My approach was just to instrument each thread with one NVRAM value at each step (e.g., after init NVRAM[0]=1 then =2 at next step and so on. That worked well on main, but when I added to logging and sample threads it slowed it down so much it crashes. So I am running heavy on markers in my main loop now and in the sample loop. That has slowed the logging loop by about 30%. I only have a few crashes with markers and they are telling as the addr2line data. The NVRAM markers indicate it crashes while in the main loop() or at the thread.yield() in the main loop. So, I took out my thread.yields in the main loop and used on NVRAM marker to indicate failure in the main loop and another to indicate what was last finished. The last finished routines are not consistent, but the failure in the main loop is, even without the yield.

@mjs513, thank you for looking through the code. I really appreciate it. Here are some thoughts on each of your points:
1) yep. Monitoring each thread stack through this exercise indicates I can reduce the size of the logging and sample stacks.
2) I will try setTimeSlice to see if it makes a difference
3) I did not put the init code for the sample thread in setup, because I was trying to keep 1) all wire activity on the same thread and 2) variables as local as possible. If you start wire on setup and then use wire on a new thread do you know how the memory management works for the library?
4) yep. I only have a few delays in the code - I'll switch them to threads.delay()
5) I really never use serial.print (eg SerialDebug = true) even in my current troubleshooting. All serial output is disabled for my production compile. For my current troubleshooting I only have "debugMode" enabled so it just does serial prints from the main loop every five seconds. If I do turn on SerialDebug I expect things to break across threads.
6) Yep. I suspect I have too many, but I am not sure if it is best practice to put it at the end of the while or the end of loop(), since it would then get called hundreds of thousands of times a second. Based on Defrag's testing above that seems to be OK, but with all the crashes on thread.yield() it has made me stop and rethink.
7) Yes. Volatile might be my smoking gun. I will post a follow-up with new information I just found.
8) yep. Tech debt. using an old library. I'll upgrade it.

TeensyDigital · Jul 9, 2022

So, in my learning/testing of the NVRAM features I modified one of Defrag's sketches to include a thread and my ability to force crash the thread (see below). The main loop monitors a global variable (lClock) to see if the thread is still living. I was curious to see what would kill a thread vs. crash the whole system. When I first compiled it I was surprised that the main loop immediately reported the thread as dead (lClock = 0). So, I added a serial.print in the thread to report the value of lClock (I know, don't use serial print in threads) and oddly the main loop could now read lClock correctly. I removed the serial.print and made lClock volatile and that fixed the issue.

Previously, I had assumed that I only needed to mark variables volatile if two threads wrote to them, but it seems like they need to be volatile if there is any access at all from more than one thread. I am curious if this sketch works the same for others. I am using the GA release of TD1.56. In previous testing of threads (a year ago?) I did not have any issues updating on one thread and reading on another without the variables being volatile. In my rocket telemetry application I have hundreds of shared variables that get read. Is there overhead if I make them all volatile? This could be the smoking gun to my thread instability.

here is the example sketch. If you run it as is does it say "Sorry, but the thread is dead"? and is unable to read lClock? Thanks, Mike

Code:

  unsigned long lClock;
  //volatile unsigned long lClock;  // this fixes reading from main loop
  
  volatile byte killIt = 0;  //volatile also required to pass KillIt value to thread

  #include <TeensyThreads.h>
  int id_sampling;
  uint32_t *NVRAM_UINT32 ((uint32_t *)0x400D4100);
  elapsedMillis tSome;


void setup() {
  
  SNVS_LPCR |= (1 << 24); //Enable NVRAM - documented in SDK
  
  Serial.begin(115200);

  if(CrashReport) {
    Serial.print(CrashReport);
    Serial.println(" ");
    Serial.print("NVRAM Marker Value = ");
    Serial.println( NVRAM_UINT32[0] );
    while(1); 
  } 

  //create the thread - nothing fancy
  id_sampling = threads.addThread(sample_thread);
  delay(2000);

  tSome = 0;
  Serial.println("PRESS A KEY TO PLAY:     '1':CRASH 1         '2':CRASH 2        'k':Kill Thread    \n");
}

void loop() {
  if ( tSome >= 5000 ) {
    tSome = 0;
    Serial.println("Press a key, I'm still waiting to die... ");
    if(millis() > (lClock + 1500)) {
      Serial.print("Sorry, but the thread is dead...");
      Serial.println(lClock);
    }
  }
  
  while ( Serial.available() ) {
    char cc = Serial.read();
    if ( '1' == cc ) {
      NVRAM_UINT32[0] = 1;  // set marker value here
      int* p = nullptr;
      *p = 33;  // This will force crash and burn :-O
    }
    else if ( '2' == cc ) {
      NVRAM_UINT32[0] = 2;  // set marker marker here
      int* p = nullptr;
      *p = 33;  // This will force crash and burn :-O

    }
    else if ( 'k' == cc ) {
      killIt = 1;
    }
  }
}

void sample_thread() {

 elapsedMillis lSome;
 lSome = 0;

 while(1) {
   if (lSome > 500) {
    lClock = millis();
    lSome = 0;
   }
   if(killIt == 1) {
    killIt = 0;
    NVRAM_UINT32[0] = 3;
      int* p = nullptr;
      *p = 33;  // This will force crash and burn :-O    
    
   }
 }
}

defragster · Jul 9, 2022

If NVRAM seems slow - the breadcrumb in RAM2 will IIRC be slower with cache flush?
> notes below says it only uses 2 CPU cycles as used in that context???

I just found I wrote back in April, a hack to that called StaticTrace where I took the new CrashReport and extended it, post below from thread: Teensy-MicroMod-freeze-hang-when-using-multiple-libraries-together

defragster said:
Did a hack of CrashReport to make a StaticTrace that gives access to both the prior mention RTC or RAM2 'semi static' storage areas.

Question: When the 'soft crash' happens can the Teensy be programmed without a Button press? p#9 says "USB dies out at some point too." - so that suggests the CPU is just in a busy loop - perhaps as noted 'overwhelmed' by cascading interrupt behavior.

In making StaticTrace I see that the storage using RTC is actually much faster than RAM2 without having to write to that memory and then wait for the cache flush.

I'll post the code as it is working and self documented. YMMV as to how the available storage might serve the purpose at hand, perhaps changing bits or values as code enters or exits. No provision for reading what is written was made, though it could/should be added.

As far as SPEED the RTC writes appear to take only 2 CPU cycles (based on writing the cycle counter 3 times) and the same writes to RAM2 appear to take 85 to 100 cycles. So keeping 4 local unit32_t's and updating and writing them with StaticTrace.crumbRTC( {1-4}, VALUE ); would be optimal.

Code:

[ATTACH]28016[/ATTACH]

This is an INO sketch using the files for StaticTrace in the same folder to build.

That sketch Class code emulates Paul's new breadCrumb { StaticTrace.crumbRAM2( 5, ARM_DWT_CYCCNT); } - except access to 30 DWORDS [1,2,...,29,30] - and minimal checking and can be called even if No Fault.
And also includes the 4 RTC DWORDS [1,2,3,4] usable as { StaticTrace.crumbRTC( 2, ARM_DWT_CYCCNT); }

Edit: See the INO in that ZIP for usage example - including PRINTING in setup():

Code:

  if ( StaticTrace )
  {
    Serial.println(StaticTrace);
    StaticTrace.clear();
  }

defragster · Jul 9, 2022

TeensyDigital said:
So, in my learning/testing of the NVRAM features I modified one of Defrag's sketches to include a thread and my ability to force crash the thread (see below). The main loop monitors a global variable (lClock) to see if the thread is still living. I was curious to see what would kill a thread vs. crash the whole system. When I first compiled it I was surprised that the main loop immediately reported the thread as dead (lClock = 0). So, I added a serial.print in the thread to report the value of lClock (I know, don't use serial print in threads) and oddly the main loop could now read lClock correctly. I removed the serial.print and made lClock volatile and that fixed the issue.

Previously, I had assumed that I only needed to mark variables volatile if two threads wrote to them, but it seems like they need to be volatile if there is any access at all from more than one thread. I am curious if this sketch works the same for others. I am using the GA release of TD1.56. In previous testing of threads (a year ago?) I did not have any issues updating on one thread and reading on another without the variables being volatile. In my rocket telemetry application I have hundreds of shared variables that get read. Is there overhead if I make them all volatile? This could be the smoking gun to my thread instability.

...

Will give the sketch a try.

Volatile is needed when the compiler might choose to OPTIMIZE away the read of the current value when it doesn't see a way for it to change 'in the context of the code at hand/current compile unit'?
Adding volatile only forces the compiler to include a read of the actual variable memory, instead of assuming that any 'local copy' is already unchanged.

Feel free to add volatile - it just tells the compiler to not assume a value won't change and include a value read from actual storage.

Contrived example might be a tight loop in code [ while ( 1 ) { if (5 == xx); break; // other code } ]and if 'xx' is changed by an _isr routine it might never re-read the value xx from RAM, so it must be volatile.

TeensyDigital · Jul 10, 2022

Since this issue is not proving easy to isolate, I've decided to step way back. I have too many changed variables in the environment. I am testing a brand new design of my PCB, on a new Mac M1, running a newer version of the TD IDE, with a dozen code changes for the new board (new GPS, etc.) and the changes to LFS. So last night I pulled out my old Mac running TD1.54B10 and compiled my old v4.6 code on my old v4.6 PCB. That ran overnight with no crashes (I had to put in code to trap a crash). I then took my v4.6 code and did minimum edits for my v5.0 PCB and compiled on TD1.54B10. That also ran overnight with no crashes. So, I now have a known good baseline to start introducing changes. I am currently testing my old v4.6 PCB with the v4.6 code compiled under TD1.56 on the new Mac. I had to make one change to the new LittleFS.h library, but otherwise everything is the same. So far, it hasn't crashed -- if that holds true then it is something in my code and not in the environment (or a change in my code in the new environment). The cycle time on these tests are very long, but I will keep adding and testing, until I get parity with the code that crashes. Stay tuned...

defragster · Jul 10, 2022

Interesting - nice if it holds up and allows progress and function!

mjs513 · Jul 11, 2022

TeensyDigital said:
Since this issue is not proving easy to isolate, I've decided to step way back. I have too many changed variables in the environment. I am testing a brand new design of my PCB, on a new Mac M1, running a newer version of the TD IDE, with a dozen code changes for the new board (new GPS, etc.) and the changes to LFS. So last night I pulled out my old Mac running TD1.54B10 and compiled my old v4.6 code on my old v4.6 PCB. That ran overnight with no crashes (I had to put in code to trap a crash). I then took my v4.6 code and did minimum edits for my v5.0 PCB and compiled on TD1.54B10. That also ran overnight with no crashes. So, I now have a known good baseline to start introducing changes. I am currently testing my old v4.6 PCB with the v4.6 code compiled under TD1.56 on the new Mac. I had to make one change to the new LittleFS.h library, but otherwise everything is the same. So far, it hasn't crashed -- if that holds true then it is something in my code and not in the environment (or a change in my code in the new environment). The cycle time on these tests are very long, but I will keep adding and testing, until I get parity with the code that crashes. Stay tuned...

Glad you got it semi working. Was beginning to dig into the code yesterday but got myself confused with the way its all set up. Is interesting when I was looking at some existing designs it looked like the one from Apogee. Have you ever thought of the replacing the sensors with a single IMU like the MPU-9250? May keep looking at it though. Its fun remembering.

TeensyDigital · Jul 11, 2022

Ok... I finally found the source of my woes. It seems to be sprintf() when used simultaneously across threads. The biggest change from my earlier (stable) v46 code to v50 code was 1) an increase in logging speed and 2) addition variables being logged. In my logging thread I am using sprintf to format and then concatenate 75+ variables into a comma delimited log line. When I isolated the code down to only creating the string log line (no LittleFS or other business) it still crashed. If I commented out half the code it didn't crash. If I commented out the other half it didn't crash. If I slowed down logging it crashed less often.

So, I wrote my own lightweight routines to convert floats, ints, and longs to strings and replaced all the log string creation code. Ran it for eight hours and no crashing

Then I created a short sketch to simulate a lot of sprintf activity across two threads. That sprintf thread test crashed the same as my code when run fast enough. If I only did sprintf on one thread it did not crash.

Do we know if sprintf is thread safe? It would seem not and/or I am consuming some memory or resource without knowing it. I know that sprintf can be heavy, especially with floats, but I would have though I'd see it consume more stack.

Here is the sprint f test code that causes a crash (sometimes takes a few minutes). This was run in TD1.56 (GA) on a T4.1 using 600Mhz and "faster".

Code:

  // Teensy Threading and sprintf crash code
  // Compiled on a T4.1, 600 Mhz, Faster


   #include <TeensyThreads.h>
   int thread_a, thread_b;
 
   elapsedMicros tClock1;
   elapsedMillis tClock2;

   int mCounter = 0;
   int aCounter = 0;
   int bCounter = 0;

   char logString1[500];
   char logString2[500];
   char logString3[500];



void setup() {

  Serial.begin(115200);
  if(CrashReport) {
    Serial.print(CrashReport);
    while(1); 
  } 

    thread_a = threads.addThread(aThread);
    threads.setMicroTimer(10);  
    delay(200);
    thread_b = threads.addThread(bThread);
    threads.setMicroTimer(10);      
    delay(200);

}

void loop() {

 if(tClock2 > 5000) {
   tClock2 = 0;
   Serial.print("I am still running...");
   Serial.println(millis());
   Serial.print("Main:  ");Serial.print(mCounter);Serial.print("    Thread A: ");Serial.print(aCounter);
   Serial.print("     ThreadB: ");Serial.println(bCounter);
   mCounter = 0;
   aCounter = 0;
   bCounter = 0;
 }

 if(tClock1 > 233) {  //5000 Hz
    tClock1 = 0;
    mCounter++;

    // set a bunch of variable types
    float theFloat1 = 0.0; 
    float theFloat2 = 0.0;
    float theFloat3 = 0.0;
    float theFloat4 = 0.0;
    float theFloat5 = 0.0;
    float theFloat6 = 0.0;
    float theFloat7 = 0.0;
    float theFloat8 = 0.0;
    float theFloat9 = 0.0;
    float theFloat10 = 0.0;
    theFloat1 = millis() * (float) 0.5;
    theFloat2 = millis() * (float) 0.4;
    theFloat3 = millis() * (float) 0.3;
    theFloat4 = millis() * (float) 0.2;
    theFloat5 = millis() * (float) 0.1;
    theFloat6 = millis() * (float) 0.6;
    theFloat7 = millis() * (float) 0.7;
    theFloat8 = millis() * (float) 0.8;
    theFloat9 = millis() * (float) 0.9;
    theFloat10 = millis() * (float) 0.99;
    unsigned long timer1 = 0;
    unsigned long timer2 = 0;
    unsigned long timer3 = 0;
    unsigned long timer4 = 0;
    unsigned long timer5 = 0;
    timer1 = millis();
    timer2 = millis();
    timer3 = millis();
    timer4 = millis();
    timer5 = millis();
    int theInt1 = 1;
    int theInt2 = 2;
    int theInt3 = 3;
    int theInt4 = 4;
    int theInt5 = 5;

    // now create a log line from a bunch of sprintf calls

      char strTemp[20];
      char comma[3];
      strcpy(comma,",");
      strcpy(logString1, "");
      strcpy(strTemp, ""); sprintf (strTemp, "%1.5f" , theFloat1);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.8f" , theFloat2);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.3f" , theFloat3);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat4);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.6f" , theFloat5);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.2f" , theFloat6);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.1f" , theFloat7);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.7f" , theFloat8);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat9);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat10);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer1);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer2);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer3);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer4);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer5);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt1);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt2);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt3);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt4);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt5);strcat(logString1, strTemp);strcat(logString1, comma);

      strcpy(strTemp, ""); sprintf (strTemp, "%1.5f" , theFloat1);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.8f" , theFloat2);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.3f" , theFloat3);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat4);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.6f" , theFloat5);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.2f" , theFloat6);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.1f" , theFloat7);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.7f" , theFloat8);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat9);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat10);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer1);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer2);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer3);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer4);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer5);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt1);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt2);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt3);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt4);strcat(logString1, strTemp);strcat(logString1, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt5);strcat(logString1, strTemp);strcat(logString1, comma);

      
 }

}


void aThread() {

  elapsedMicros tClock3;

  while(1) {

  if(tClock3 > 144) {  //5000 Hz
    tClock3 = 0;
    aCounter++;


    // set a bunch of variable types
    float theFloat1 = 0.0; 
    float theFloat2 = 0.0;
    float theFloat3 = 0.0;
    float theFloat4 = 0.0;
    float theFloat5 = 0.0;
    float theFloat6 = 0.0;
    float theFloat7 = 0.0;
    float theFloat8 = 0.0;
    float theFloat9 = 0.0;
    float theFloat10 = 0.0;
    theFloat1 = millis() * (float) 0.5;
    theFloat2 = millis() * (float) 0.4;
    theFloat3 = millis() * (float) 0.3;
    theFloat4 = millis() * (float) 0.2;
    theFloat5 = millis() * (float) 0.1;
    theFloat6 = millis() * (float) 0.6;
    theFloat7 = millis() * (float) 0.7;
    theFloat8 = millis() * (float) 0.8;
    theFloat9 = millis() * (float) 0.9;
    theFloat10 = millis() * (float) 0.99;
    unsigned long timer1 = 0;
    unsigned long timer2 = 0;
    unsigned long timer3 = 0;
    unsigned long timer4 = 0;
    unsigned long timer5 = 0;
    timer1 = millis();
    timer2 = millis();
    timer3 = millis();
    timer4 = millis();
    timer5 = millis();
    int theInt1 = 1;
    int theInt2 = 2;
    int theInt3 = 3;
    int theInt4 = 4;
    int theInt5 = 5;


    // now create a log line from a bunch of sprintf calls

      char strTemp[20];
      char comma[3];
      strcpy(comma,",");
      strcpy(logString2, "");
      strcpy(strTemp, ""); sprintf (strTemp, "%1.5f" , theFloat1);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.8f" , theFloat2);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.3f" , theFloat3);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat4);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.6f" , theFloat5);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.2f" , theFloat6);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.1f" , theFloat7);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.7f" , theFloat8);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat9);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat10);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer1);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer2);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer3);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer4);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer5);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt1);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt2);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt3);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt4);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt5);strcat(logString2, strTemp);strcat(logString2, comma);


      strcpy(strTemp, ""); sprintf (strTemp, "%1.5f" , theFloat1);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.8f" , theFloat2);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.3f" , theFloat3);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat4);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.6f" , theFloat5);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.2f" , theFloat6);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.1f" , theFloat7);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.7f" , theFloat8);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat9);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%1.4f" , theFloat10);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer1);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer2);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer3);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer4);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%lu" , timer5);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt1);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt2);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt3);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt4);strcat(logString2, strTemp);strcat(logString2, comma);
      strcpy(strTemp, ""); sprintf (strTemp, "%d" , theInt5);strcat(logString2, strTemp);strcat(logString2, comma);




    
  }

    
  }
}


void bThread() {

  elapsedMicros tClock4;

  while(1) {

  if(tClock4 > 433) {  //5000 Hz
    tClock4 = 0;
    bCounter++;
  }

    
  }
}

mjs513 · Jul 11, 2022

Short answer is no. Sprintf goes to serial and as I mentioned should be wrapped in a serial mutex

T4.1 Intermitent Crashes with Teensy Threads

Well-known member

Senior Member+

Well-known member

Well-known member

Senior Member+

Senior Member+

Senior Member+

Well-known member

Senior Member+

Senior Member+

Senior Member+

Well-known member

Senior Member+

Senior Member+

Senior Member+

Well-known member

Well-known member

Well-known member

Senior Member+

Senior Member+

Well-known member

Senior Member+

Senior Member+

Well-known member

Senior Member+