t4.1 detect reboot type

Frankthetech

Active member
I'm looking for help detecting whether the Teensy was powered off and on or went through a software reset.
This is so I can keep track of the power cycles. I tried just incrementing a counter in setup(), but I
need to know whether the unit was unplugged (hard reset) and keep a count of that too.
 
Let me know what you think of this
as a test setup to start with.
I know this only gives me a hardware/software vs. watchdog flag,
but since I don't have any software reboots in my code, this is a workaround... maybe.
The only question: is wdt.expired() reset on a power cycle?

Code:
//test 4.1 for reboot h/w vs s/w

#include <Watchdog_t4.h>  // WDT_T4 library

WDT_T4<WDT1> wdt;
WDT_timings_t config;


void setup() {
  // WATCHDOG
  config.timeout = 5;
  wdt.begin(config);
 
  Serial.begin(115200);
  while (!Serial && millis() < 3000) {}  // wait up to 3 s for the serial monitor
 
  if(!wdt.expired()){    //restart was not the result of watchdog timeout
    //must have been a power off/on or software reboot
    //force a wdt timeout here then next restart will be true
    //inc counter & save to eeprom
    delay(6000);
  }
  //continue with setup
 
}

void loop() {
  wdt.feed();

}
 
Frank, if you look at file cores\Teensy4\CrashReport.cpp, you will find the code shown below, which reads and interprets the bits of register SRC_SRSR. Bits 0-8 indicate the cause of reset. You can copy and modify this code for your own use. The SRC (system reset control) is described in section 21 of the IMXRT reference manual.

Code:
  uint32_t SRSR = SRC_SRSR;
  if (SRSR & SRC_SRSR_LOCKUP_SYSRESETREQ) {
    // use SRC_GPR5 to distinguish cases.  See pages 1290 & 1294 in ref manual
    uint32_t gpr5 = SRC_GPR5;
    if (gpr5 == 0x0BAD00F1) {
      p.println("  Reboot was caused by auto reboot after fault or bad interrupt detected");
    } else {
      p.println("  Reboot was caused by software write to SCB_AIRCR or CPU lockup");
    }
  }
  if (SRSR & SRC_SRSR_CSU_RESET_B) {
    p.println("  Reboot was caused by security monitor");
  }
  if (SRSR & SRC_SRSR_IPP_USER_RESET_B) {
    // This case probably can't occur on Teensy 4.x
    // because the bootloader chip monitors 3.3V power
    // and manages DCDC_PSWITCH and RESET, causing the
    // power on event to appear as a normal reset.
    p.println("  Reboot was caused by power on/off button");
  }
  if (SRSR & SRC_SRSR_WDOG_RST_B) {
    p.println("  Reboot was caused by watchdog 1 or 2");
  }
  if (SRSR & SRC_SRSR_JTAG_RST_B) {
    p.println("  Reboot was caused by JTAG boundary scan");
  }
  if (SRSR & SRC_SRSR_JTAG_SW_RST) {
    p.println("  Reboot was caused by JTAG debug");
  }
  if (SRSR & SRC_SRSR_WDOG3_RST_B) {
    p.println("  Reboot was caused by watchdog 3");
  }
  if (SRSR & SRC_SRSR_TEMPSENSE_RST_B) {
    p.println("  Reboot was caused by temperature sensor");
      SRC_SRSR &= ~0x100u; /* Write 0 to clear. */
      p.println("Panic Temp Exceeded Shutting Down");
      p.println("Can be caused by Overclocking w/o Heatsink or other unknown reason");
      IOMUXC_GPR_GPR16 = 0x00000007;
      SNVS_LPCR |= SNVS_LPCR_TOP; //Switch off now
      asm volatile ("dsb":::"memory");
      while (1) asm ("wfi");
  }
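
As a rough sketch of the "copy and modify" idea, the bit tests above can be folded into a small table-driven helper that takes the raw SRSR value and returns a short description. The masks are assumed to match the SRC_SRSR_* definitions in the Teensy 4 core (bits 0-8), and the bit 0 label (reset pin or power-up) is an assumption, since the excerpt does not handle that bit. `describeReset` and `kResetCauses` are hypothetical names, not part of the core.

```cpp
#include <cstdint>
#include <string>

// Bit masks assumed to match the SRC_SRSR_* definitions in the Teensy 4 core
// (imxrt.h); verify them against your installed core before relying on them.
struct ResetCause {
  uint32_t mask;
  const char* label;
};

constexpr ResetCause kResetCauses[] = {
  {1u << 0, "reset pin or power-up"},         // IPP_RESET_B (assumed meaning)
  {1u << 1, "software reset or CPU lockup"},  // LOCKUP_SYSRESETREQ
  {1u << 2, "security monitor"},              // CSU_RESET_B
  {1u << 3, "power on/off button"},           // IPP_USER_RESET_B
  {1u << 4, "watchdog 1 or 2"},               // WDOG_RST_B
  {1u << 5, "JTAG boundary scan"},            // JTAG_RST_B
  {1u << 6, "JTAG debug"},                    // JTAG_SW_RST
  {1u << 7, "watchdog 3"},                    // WDOG3_RST_B
  {1u << 8, "temperature sensor"},            // TEMPSENSE_RST_B
};

// Hypothetical helper: joins the labels of every cause bit set in 'srsr'.
std::string describeReset(uint32_t srsr) {
  std::string out;
  for (const ResetCause& c : kResetCauses) {
    if (srsr & c.mask) {
      if (!out.empty()) out += ", ";
      out += c.label;
    }
  }
  return out.empty() ? "no cause bits set" : out;
}
```

On the Teensy itself this would presumably be called as `describeReset(SRC_SRSR)` early in setup(), with the result printed or logged. Keeping the decoding in a pure function also makes it easy to check on a PC.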
 
Thanks for that. I tried that code, but SRSR is always 1 after a power off/on, so it won't work in this case with the T4.x.

Code:
if (SRSR & SRC_SRSR_IPP_USER_RESET_B) {
  // This case probably can't occur on Teensy 4.x
  // because the bootloader chip monitors 3.3V power
  // and manages DCDC_PSWITCH and RESET, causing the
  // power on event to appear as a normal reset.
  p.println("  Reboot was caused by power on/off button");
}
 
I unfortunately had a reset while in operation of my lathe while it was feeding, being controlled by a Teensy4.1. I need to find out if this was due to some bug that was introduced recently, or it was "merely" a loosely fitting USB connector. This matters a lot to me, as I have one of these systems in the field that I need to support. If the reset occurred during a threading operation, it could break things on the lathe, including the lathe, hence my concern.

@joepasquariello Can you tell me how to read out the crash report? Is the crash report automatically generated, or do I need to include CrashReport.cpp in my project?
 
@clinker8: Add a line to your setup() function something like this (which will wait for a maximum of 3 seconds for the serial monitor to be connected, then proceed): while (!Serial && millis() < 3000) {}.

Then, add the following code in the setup() function, right after the while (!Serial && millis() < 3000) {} line, which will print out any crash information:

Code:
if (CrashReport) {
   Serial.print(CrashReport);
}

Note that if a crash occurs, the crash information will be printed in the Serial Monitor after the Teensy restarts. If no PC is normally attached, you may instead need to store that info somewhere you can retrieve it from after the fact: EEPROM, SD card, etc.

The report will include an address where the crash was recorded. Paul has also created a CrashReport() <webpage> with useful details. You can also check the entry in the unofficial Teensy wiki <here> for links to descriptions of where to find the addr2line utility for both the old (1.8.x) & new (2.3.x) Arduino IDE, as well as detailed descriptions of how to use it.

Good luck . . .

Mark J Culross
KD5RXT
 
Thanks for the tips. I eventually found the CrashReport link and saved it. I will check out the addr2line utility as well.

At the moment, I have no idea whether this was a loose connector problem (mechanical) or a crash due to code. The day before, I ran it for 10 "feed to stop" passes at speeds effectively 17x faster and had no problems, so I'm puzzled why the code would crash at this much, much lower speed. Since the platform had been running in "normal" mode for over 18 months error free, this will be a tough debug. I'll probably write the report to an SD card, or something like that, since a PC isn't usually attached to read out the serial diagnostics. This is when one wants an RTC, so files have valid save dates, but at the moment, any saved crash data could be useful.
 
While I am asking, how many bread crumbs can one place? In other words, what is the maximum number?
Code:
#pragma once

#include <Printable.h>
#include <WString.h>

class CrashReportClass: public Printable {
public:
    virtual size_t printTo(Print& p) const;
    void clear();
    operator bool();
    void breadcrumb(unsigned int num, unsigned int value) {
        // crashreport_breadcrumbs_struct occupies exactly 1 cache row
        volatile struct crashreport_breadcrumbs_struct *bc =
            (struct crashreport_breadcrumbs_struct *)0x2027FFC0;
        if (num >= 1 && num <= 6) {
            num--;
            bc->value[num] = value;
            bc->bitmask |= (1 << num);
            arm_dcache_flush((void *)bc, sizeof(struct crashreport_breadcrumbs_struct));
        }
    }
};

extern CrashReportClass CrashReport;
The above code implies 6 are allowed. Is this because of memory constraints of the Teensy 4.1? I'm trying to understand what I have to work with.

What is a reasonable strategy for bread crumb placement? Just before a block of code? After? What about the strategy for ISRs? At entrance? Exit?
 
Up to 6 breadcrumbs are supported. This limit of 6 is documented on the CrashReport page.

However, each of the 6 can be set from any number of places in your program. Each location can assign its own 32-bit number, which is how you discover which location executed last before the crash occurred.

The limit of only 6 is due to allocating the breadcrumbs memory within a single CPU cache row.
 
What is a reasonable strategy for bread crumb placement?

Often the best strategy is to keep an open mind. Planning a strategy depends on knowing things, but the nature of debugging is you don't yet really know what's wrong. You don't know in advance what action will yield new info that leads to a solution. Best to focus on just increasing the amount of stuff you can observe, expecting most things will show expected & unhelpful results but with some luck you might see new info that's wrong and gives insight about the problem.

But as a general guideline, usually you would dedicate 1 of the 6 to a particular library or subsystem or functionally distinct group of code within your project. Set it to a unique 32-bit number just before each important block of code or functional thing the code does.

Just remember to cast a wide net and consider the problem is probably something you don't expect, otherwise you would have already solved it without breadcrumbs.
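
To make the guideline concrete, here is a minimal host-side sketch of "one crumb per subsystem, a unique value before each block". `MockCrumbs` is a stand-in written for this example that only imitates the bookkeeping of `CrashReport.breadcrumb()`; the slot and step names are hypothetical, and the early `return` merely simulates a fault stopping execution mid-loop.

```cpp
#include <cstdint>

// Host-side stand-in for the breadcrumb storage (assumption: it only needs
// to remember the last value per slot). On a Teensy 4.x you would call
// CrashReport.breadcrumb(num, value) instead.
struct MockCrumbs {
  uint32_t bitmask = 0;
  uint32_t value[6] = {};
  void breadcrumb(unsigned num, uint32_t v) {
    if (num >= 1 && num <= 6) {
      value[num - 1] = v;
      bitmask |= (1u << (num - 1));
    }
  }
};

// Hypothetical convention: one slot per subsystem.
enum Slot : unsigned { kSlotMainLoop = 1, kSlotEncoderIsr = 2 };

// Hypothetical markers: one unique value per important block in the loop.
enum Step : uint32_t {
  kStepNone = 0,
  kStepReadInputs = 1,
  kStepStateMachine = 2,
  kStepUpdateDisplay = 3
};

// Simulates one pass of a main loop. 'failAt' names the block that "crashes";
// returning early stands in for the fault that would reboot the board.
void loopPass(MockCrumbs& c, Step failAt) {
  c.breadcrumb(kSlotMainLoop, kStepReadInputs);
  if (failAt == kStepReadInputs) return;

  c.breadcrumb(kSlotMainLoop, kStepStateMachine);
  if (failAt == kStepStateMachine) return;

  c.breadcrumb(kSlotMainLoop, kStepUpdateDisplay);
}
```

After `loopPass(c, kStepStateMachine)`, slot 1 holds `kStepStateMachine`, the marker written just before the block that failed, which is the information the crash report would then show for that breadcrumb.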
 
Thanks for the insight. Hadn't appreciated the differing value settings to a single bread crumb. That's helpful.

Since this is the first instance of this reset happening to me on code that has been running for 18 months, it's going to be interesting trying to trap this. Due to the infrequent nature of the fault, I will have to save the data to somewhere, as the system isn't always hooked up to a console. Ordinarily, the code is only connected to an ILI9341, as this is designed to operate in a (machine) shop, not an office. When this event happened, there was no console attached, I was machining at the time. The lathe and the Teensy are on different breaker panel circuits so it's not likely a glitch in lathe power affected the power supply for the Teensy.

It's likely that bugs have been introduced recently, but perhaps that might not be the case. Have to keep an open mind. Seems like I have my work cut out for me. The net will be cast as wide as I can imagine, maybe I will catch something, hope so.
 
I interspersed some bread crumbs, all at level 2 with differing values, in my loop code. So I have bread crumb (2,1) through (2,10), and I'm trying to interpret the result. I forced an error in the main loop using elapsedMillis and then dereferencing a null pointer. Sure enough, the Teensy rebooted. That's good. But I don't understand the report. Yes, it's reporting a null pointer fault. But the fault should have been just after bread crumb (2,10). Why is there reporting on bread crumb 6? I never defined that.

Here's a snippet of the bread crumbs.
C:
  CrashReport.breadcrumb(2, 6);
  // this section runs the state machine... button presses cause us to change states
  doStateMachine();
  CrashReport.breadcrumb(2, 7);
  /* The following code requires a hardware modification of the ELS board to work.  If the
     mod is not made, there will be no Serial print output!  As this is diagnostic only, it
     only matters for logging stepper rpm and ztogo.  It does not affect operation.
     You must connect physical pin 7 (logical 5 IN2) PUL to physical pin 25 (33 MCLK2) at the
     end of part next to the SD card
  */   
  if (freq1.available()) {
    sum1 = sum1 + freq1.read();
    count1 += 1;
  }
  //Serial.printf("freq1.available() = %i\n", freq1.available());
  CrashReport.breadcrumb(2, 8);
  if (pulsetimeout > 250) {
    if ((count1 > 0) && stepperactive) {
      step_pps = freq1.countToFrequency(sum1 / count1);
      updaterate = step_pps * 25.4f/(12.0f * 800.0f) * 1000.0f; // approx display call rate!
      // NB: @ 400 RPM, 0.2mm feed, about 1333 counts/sec
      #if defined(USB_TRIPLE_SERIAL)
      SerialUSB2.printf("%3.10lf, %3.3f, %3.3f\n", mysec(), step_pps, ztogo);  // keep step_pps to preserve datalog.
      #endif
    }
    sum1 = 0; count1 = 0; pulsetimeout = 0;
  }
  // end code requiring PCB modification
  CrashReport.breadcrumb(2, 9);
  blub = mysec(); // dummy call, ensures mysec is properly updated.  Only needs an update once every 7 sec.

  #ifdef forceerror
  CrashReport.breadcrumb(2, 10);
  if (errortimer > 20000) { // force a CrashReport after 20 seconds since start, but since we reset we get whacked every 20 seconds
    volatile byte *p = nullptr; /* using this pointer will crash */
    *p = 5;
  }
  #endif
}
errortimer is an instance of elapsedMillis, so after running for 20 seconds the Teensy faults on the null pointer write. Here is the captured CrashReport. It actually resets every 20 seconds, which wasn't my intent, as it interferes with my testing.
Code:
CrashReport:
  A problem occurred at (system time) 15:42:38
  Code was executing from address 0x67FE
  CFSR: 82
    (DACCVIOL) Data Access Violation
    (MMARVALID) Accessed Address: 0x0 (nullptr)
      Check code at 0x67FE - very likely a bug!
      Run "addr2line -e mysketch.ino.elf 0x67FE" for filename & line number.
  Temperature inside the chip was 48.13 °C
  Startup CPU clock speed is 600MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected
  Breadcrumb #2 was 10 (0xA)
 (0x8950973D)
  Breadcrumb #2 was 10 (0xA)
  Breadcrumb #3 was 17580490 (0x10C41CA)
  Breadcrumb #6 was 3794188452 (0xE226B8A4)
Does this mean it crashed at (2,10)? That's where I introduced the fault. Why is there a Breadcrumb #3 and #6 mentioned? I haven't ever defined them, or used them, just CrashReport.breadcrumb(2, 1) to CrashReport.breadcrumb(2, 10). Are those breadcrumbs a consequence of the null pointer access?
 
The breadcrumb memory isn't cleared at startup (because that would potentially wipe any breadcrumbs from a previous run). As a result uninitialized data may be interpreted as valid breadcrumbs. Just ignore them if they're not ones that you explicitly set.
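
A small host-side simulation may help show why this happens. The struct layout below is an assumption inferred from the `breadcrumb()` method quoted earlier in the thread (a `bitmask` plus six values); the garbage contents reuse the stray #3 and #6 values from the report above. `reported` and `reportedAndUsed` are hypothetical helpers, and whether the core applies any further validation is not shown in this thread.

```cpp
#include <cstdint>
#include <vector>

// Assumed layout, inferred from the quoted breadcrumb() method.
struct Crumbs {
  uint32_t bitmask;
  uint32_t value[6];
};

// Mirrors the quoted breadcrumb(): stores one value and sets its bit.
// Note that it never clears bits it did not set.
void setCrumb(Crumbs& bc, unsigned num, uint32_t v) {
  if (num >= 1 && num <= 6) {
    bc.value[num - 1] = v;
    bc.bitmask |= (1u << (num - 1));
  }
}

// Crumb numbers (1..6) a report would list: every bit set in the bitmask.
std::vector<unsigned> reported(const Crumbs& bc) {
  std::vector<unsigned> out;
  for (unsigned n = 1; n <= 6; ++n) {
    if (bc.bitmask & (1u << (n - 1))) out.push_back(n);
  }
  return out;
}

// Keeps only crumbs the sketch is known to set. 'usedMask' has bit (n-1)
// set for every crumb number n that the code actually uses.
std::vector<unsigned> reportedAndUsed(const Crumbs& bc, uint32_t usedMask) {
  std::vector<unsigned> out;
  for (unsigned n : reported(bc)) {
    if (usedMask & (1u << (n - 1))) out.push_back(n);
  }
  return out;
}
```

Starting from memory that happens to contain `bitmask = 0x24` and then calling `setCrumb(bc, 2, 10)` yields crumbs 2, 3 and 6 in the listing, matching the report above; filtering with a mask of the crumbs the sketch really uses leaves only crumb 2.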
 
Grand experiment starting. I have littered my code with breadcrumbs, each one wrapped in #if defined(mycrumbs) ... #endif around a breadcrumb(n, x) call.

On the one hand, I hope I capture something so I have a clue where to look, and on the other hand, I hope there's nothing to capture and nothing goes wrong! It was quite distressing to be in the middle of a machining operation and the Teensy reset, so I truly hope I find it.
 
Well got another error during high speed operation. Fastest lathe speed, and fastest feed rate.
The output is different than I've seen before. I know where the breadcrumbs are.
[Attached image: PXL_20250407_001217859.jpg]
(2,8) main loop after using an instance of elapsedMillis
(3,4) EncoderTool callback ISR - processes DRO value change
(4,100) TeensyTimerTool callback ISR - only two LOC, a breadcrumb, and digitalWriteFast to turn off an IO pin.
(5,400) Inside state machine at the next to last if statement checking if there's a crash report to show.

So I need to use
Run "addr2line -e mysketch.ino.elf 0x2179E" for filename & line number?
 
@defragster
Code:
[arm/bin] $ ./arm-none-eabi-addr2line -e ~/Library/Caches/arduino/sketches/C798958FCBDCA6AB7A9C28AB1D2FD52A/teensy-electronic-lead-screw.ino.elf 0x2179E
??:?
[arm/bin] $
This is not enlightening, is it? ??:? what kind of answer is that? Do I need an absolute path for the library?

As for 0xE0E565AC, I have no idea what that is. About the only thing I can think of is I use an elapsedMillis value of 250ms for FreqMeasMulti. But as an address, haven't a clue. One of the breadcrumbs shown was near the FreqMeas area, and it used 250. I've only recently started to use this library. Strictly speaking, it isn't necessary for my controller to work, I used it for data gathering and insight to whether I synthesized an S-curve for the stepper motor velocity. So I could comment out the entire function and things would still work.

Is there a way to access the linker map, to see if it's something I did? So I can learn where everything is on this build?

For the record, on a Mac, the path to the addr2line is:
~/Library/Arduino15/packages/teensy/tools/teensy-compile/11.3.1/arm/bin/arm-none-eabi-addr2line

Both the executable and the elf file are pretty well hidden under layers and layers of directories. Keeps it out of the way of dumb users, but sure makes it hard to uncover, if you really need it!
 
That output usually means the access violation occurred in a library function; those don’t tend to have debug information available. You should find a .sym file in the same folder as the .elf. With a bit of digging around, you should be able to figure out which function has code at 0x2179E.
 
The sym file is unsorted, kind of hard to make sense of it. I found the instruction in the lst file, which is readable, but I don't know where in the code (whose library) it is happening.

Is there a way to make a sorted sym file? It's seriously crazy mixed up.
 
I did an objdump, which is somewhat useful. Set demangling, disassembly, line number and source flags.
Code:
[arm/bin] $ ./arm-none-eabi-objdump -C -d -l -S ~/Library/Caches/arduino/sketches/C798958FCBDCA6AB7A9C28AB1D2FD52A/teensy-electronic-lead-screw.ino.elf > ~/Documents/mydump.txt

I'll poke through that for a while. I'll see if I can use nm or objdump to list just the symbols in address order.
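
For the "which function contains this address" step, a small host-side helper can do the sorting and lookup. This is only a sketch: it assumes a text listing in the GNU nm layout ("hex address, type letter, name" per line), and `parseNm` and `findSymbol` are hypothetical names. It picks the nearest preceding symbol, which is a heuristic; data symbols or section markers can sit between functions.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

struct Symbol {
  uint32_t addr;
  std::string name;
};

// Parses lines of the form "<hex address> <type letter> <name>" and returns
// the symbols sorted by address. Lines that do not match (for example
// undefined symbols, which have no address) are skipped.
std::vector<Symbol> parseNm(const std::string& text) {
  std::vector<Symbol> syms;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream ls(line);
    std::string addrHex, type, name;
    if (!(ls >> addrHex >> type)) continue;
    if (!std::getline(ls >> std::ws, name) || type.size() != 1) continue;
    char* end = nullptr;
    unsigned long a = std::strtoul(addrHex.c_str(), &end, 16);
    if (end == addrHex.c_str() || *end != '\0') continue;  // not a hex number
    syms.push_back({static_cast<uint32_t>(a), name});
  }
  std::sort(syms.begin(), syms.end(),
            [](const Symbol& x, const Symbol& y) { return x.addr < y.addr; });
  return syms;
}

// Returns the symbol with the highest address <= target, or nullptr if the
// target lies below every symbol. 'sorted' must be sorted by address.
const Symbol* findSymbol(const std::vector<Symbol>& sorted, uint32_t target) {
  auto it = std::upper_bound(
      sorted.begin(), sorted.end(), target,
      [](uint32_t t, const Symbol& s) { return t < s.addr; });
  if (it == sorted.begin()) return nullptr;
  return &*(it - 1);
}
```

Usage would be to dump the symbols of the .elf to a text file, read it into a string, and call `findSymbol(parseNm(text), 0x2179E)`. GNU nm can also sort its own output by address with `-n`, which may be enough on its own for browsing by eye.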
 
Even if my logic is faulty, I may have stumbled on an answer. When I searched the objdump file, I see that the closest human readable descriptor indicates it may be related to _svfprintf_r. Whether that is true or not, it triggered a thought that I'm logging data from within an ISR.

So I turned off my data logging (USBSerial1.printf) by recompiling with just Teensy Serial. I had a lot of printf stuff pouring out, for diagnostics, and function verification. A good deal of it was in the linear encoder ISR. Just tested it and so far, I am not generating faults, or crashes. Hope this continues.
 
I generally drop it into a spreadsheet, ensuring the address columns are set to text (mayhem ensues otherwise), then filter and sort as needed. It’s not easy, though :mad:
Yes, definitely avoid serial output from within an ISR. Even sprintf could be problematic, as I think it uses the heap. There are various ring buffer utilities around designed to be used to transfer information from ISR to your foreground code. I’ve not used any myself, but someone will be along with a recommendation shortly, I expect.
 
Having the serial output did prove to me the ISR was working as designed, at least at lower data rates. But pushing 4MB through the interface at high speeds did prove to be its undoing. I've tried not to put printf statements in ISRs, for good practice, but I really needed to know if my crazy idea was working.

My idea works, as long as one doesn't try to stuff 30kg of stuff into a 1kg pipe. To be fair, it worked *most* of the time, even at the limits of the system. But I need it to work all the time, even if that means de-rating the system performance. With my new compile, the pressure on the USB pipe has been reduced by over 1000x.
 
Well … I may be misunderstanding what you’re saying, but … The Rules are: don’t use any sort of Serial access in an ISR. If you use “less”, all you’ve done is to increase the interval between crashes. The next one will probably happen when you’re doing something expensive, on a Friday evening, when you can’t get any spares…

Debugging ISRs is not easy. On a Teensy a ring buffer and serial printing (in foreground code only) is perhaps the simplest way. You can log to an SD card, with care, which means you don’t need a PC connected all the time. If you have an oscilloscope and spare pins you can check timings, but generating a log is not typically possible.
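
As an illustration of the ring buffer approach, here is a minimal single-producer/single-consumer sketch: the ISR only pushes a small fixed-size record, and the foreground code pops records and does the printing. `IsrLog` and `LogEvent` are hypothetical names written for this example, not an existing Teensy library, and the sketch assumes exactly one ISR writes and only loop() reads.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// One log record: small and fixed-size so copying it in an ISR is cheap.
struct LogEvent {
  uint32_t tag;    // which event (meaning is up to the application)
  int32_t value;   // associated reading
};

// Single-producer (ISR) / single-consumer (foreground) ring buffer.
template <size_t N>
class IsrLog {
  static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

 public:
  // Call from the ISR: constant time, never blocks.
  // Returns false and counts a drop if the buffer is full.
  bool push(const LogEvent& e) {
    const size_t head = head_.load(std::memory_order_relaxed);
    const size_t tail = tail_.load(std::memory_order_acquire);
    if (head - tail == N) {
      dropped_.fetch_add(1, std::memory_order_relaxed);
      return false;
    }
    buf_[head & (N - 1)] = e;
    head_.store(head + 1, std::memory_order_release);
    return true;
  }

  // Call from foreground code: returns false when nothing is queued.
  bool pop(LogEvent& out) {
    const size_t tail = tail_.load(std::memory_order_relaxed);
    const size_t head = head_.load(std::memory_order_acquire);
    if (head == tail) return false;
    out = buf_[tail & (N - 1)];
    tail_.store(tail + 1, std::memory_order_release);
    return true;
  }

  // Number of events discarded because the buffer was full.
  uint32_t dropped() const { return dropped_.load(std::memory_order_relaxed); }

 private:
  LogEvent buf_[N];
  std::atomic<size_t> head_{0};       // written only by the producer
  std::atomic<size_t> tail_{0};       // written only by the consumer
  std::atomic<uint32_t> dropped_{0};
};
```

In a sketch, the ISR would call something like `log.push({tag, value})`, and loop() would drain it with `while (log.pop(e)) { ... print e ... }`. When the foreground cannot keep up, events are dropped and counted rather than stalling the ISR, so the drop counter shows when the output pipe is too small.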
 
With my current build, there are no serial outputs in an ISR except for a serious fault condition, (missing a count) which has never occurred in 18 months. So now, there are no serial outputs in an ISR under normal operation. All my serial printing is in the foreground code. Missing a count is serious, as now there's an ever increasing error, which is why I allow that to be communicated. The Teensy has never missed a count, such a message has never been output.

If a big fault occurs (a reset-able one) a crash report is generated. The crash report is logged to Program Memory and can be output to both the serial console and the ILI9341 display. I have to log this, otherwise there's no record of what happened, or a way to get access to the data. Most of the time, I'm using my lathe to make things, not developing code on it. Laptops shouldn't be near machines which can throw metal chips in the air and land on them. It's not good to have your laptop fry. So generally, only the Teensy and the ILI9341 are used, with no PC.

In general, I don't need to log at all for the application. It's only for the development and testing phase, or for adding new features. My main branch has been stable for 18 months. This year, I have added a feature (and new branch), which is why I had "way too many" logging activities. I think this feature is basically done, save for some clean up. Hope to add a new feature in the near future. It seems I need to be more careful about how I pass the information to the foreground.
 