Debugging unexpected reboots on Teensy 4.1

Status
Not open for further replies.

was-ja

Hello,

I have strange behavior in my Teensy 4.1 program that I have not been able to fix for a week. I suspect there is a bug, an out-of-memory condition, or some illegal operation somewhere in my program. The program uses multiple interrupts at about 100 kHz, 4 DMA channels, and external interrupts.

The problem occurs only at 600 MHz. If I overclock to 816 MHz, everything runs stably.

I am not using USB Serial for communication, so I use it for debugging. However, I have been unable to find the problem: when I print a lot of debugging information, the program seems to follow a different path and the error never occurs.

The behavior is simple: after a few minutes of running, the board suddenly reboots and starts from the very beginning. At 816 MHz, I ran my system overnight and it never rebooted.

Power consumption of my Teensy 4.1 (with SD card) is about 1.2 W at 600 MHz (CPU temperature 51 °C) and about 1.4 W at 816 MHz (CPU temperature 63 °C).

My question is the following: can you please tell me if it is possible to save the instruction / stack or any other information on reboot? Or maybe there is some kind of interrupt handler that is triggered by an illegal instruction or memory access, so that I can dump the memory to the SD card myself in such a situation, or at least write to USB Serial.

Thank you!
 
My question is the following: can you please tell me if it is possible to save the instruction / stack or any other information on reboot?

Yes. If you're using Teensyduino 1.54 or later, the default fault handler already does this. You can use CrashReport to see the info about what went wrong before rebooting.

Code:
void setup() {
  while (!Serial) ; // wait for the serial monitor to open
  if (CrashReport) {
    Serial.print(CrashReport); // print details of the previous crash, if any
    delay(5000);
  }
  // ... rest of setup()
}
 
Thank you very much, PaulStoffregen, for your kind advice! I have placed it in my sketch and am now waiting for the failure.
 
Finally got the problem:

Code:
CrashReport:
  A problem occurred at (system time) 22:59:22
  Code was executing from address 0x2E52
  CFSR: 82
	(DACCVIOL) Data Access Violation
	(MMARVALID) Accessed Address: 0x71000000
  Temperature inside the chip was 47.07 °C
  Startup CPU clock speed is 600MHz
  Reboot was caused by auto reboot after fault or bad interrupt detected

I have follow-up questions; please help me with them:

1. I found that there is a tool, addr2line, to figure out where my error occurs (in my case 0x2E52). How do I call this tool?
2. How should I interpret CFSR: 82?

Thank you!
 
This case is easy: you're reading (or writing) beyond the end of PSRAM (0x71000000).
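
To see why 0x71000000 is out of range, here is a minimal sketch of the address arithmetic. It assumes the Teensy 4.1 external memory region starts at 0x70000000 and that 16 MB is fitted (two 8 MB PSRAM chips); with a single 8 MB chip the end would be lower. The names kExtmemBase, kExtmemSize and in_extmem are hypothetical helpers, not part of the Teensy core.

```cpp
#include <cstdint>

// Assumption: external memory (EXTMEM) is mapped starting at 0x70000000.
constexpr uint32_t kExtmemBase = 0x70000000u;
// Assumption: 16 MB fitted (two 8 MB PSRAM chips). Adjust for your board.
constexpr uint32_t kExtmemSize = 16u * 1024u * 1024u;
// First address past the end of the region: 0x71000000 under these assumptions.
constexpr uint32_t kExtmemEnd = kExtmemBase + kExtmemSize;

// Hypothetical helper: true if [addr, addr + len) lies entirely inside EXTMEM.
// Written so that addr + len is never computed and cannot overflow.
constexpr bool in_extmem(uint32_t addr, uint32_t len) {
  return addr >= kExtmemBase && addr <= kExtmemEnd && len <= kExtmemEnd - addr;
}
```

Under these assumptions the last valid byte is at 0x70FFFFFF, so a buffer index that runs one element too far lands exactly on the address from the crash report. A check like this in front of a suspicious pointer access may catch the overrun before the fault handler does.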

1) The tool is in \Arduino\hardware\tools\arm\bin, and you need to run it from the command line.
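
As for question 2, CFSR is the Cortex-M Configurable Fault Status Register, and the value is printed in hexadecimal. Each set bit is one fault flag: 0x82 has bit 1 (DACCVIOL) and bit 7 (MMARVALID) set, which is exactly what CrashReport printed underneath. A rough sketch of the decoding follows, using the bit names from the ARMv7-M architecture; decode_cfsr and the table are illustrative helpers, not Teensy API.

```cpp
#include <cstdint>
#include <string>

// One CFSR flag: its bit mask and its architectural name.
struct CfsrBit {
  uint32_t mask;
  const char *name;
};

// ARMv7-M CFSR flags, lowest bit first.
static const CfsrBit kCfsrBits[] = {
  // MemManage faults (bits 0-7)
  {1u << 0, "IACCVIOL"},   {1u << 1, "DACCVIOL"},    {1u << 3, "MUNSTKERR"},
  {1u << 4, "MSTKERR"},    {1u << 5, "MLSPERR"},     {1u << 7, "MMARVALID"},
  // Bus faults (bits 8-15)
  {1u << 8, "IBUSERR"},    {1u << 9, "PRECISERR"},   {1u << 10, "IMPRECISERR"},
  {1u << 11, "UNSTKERR"},  {1u << 12, "STKERR"},     {1u << 13, "LSPERR"},
  {1u << 15, "BFARVALID"},
  // Usage faults (bits 16-31)
  {1u << 16, "UNDEFINSTR"}, {1u << 17, "INVSTATE"},  {1u << 18, "INVPC"},
  {1u << 19, "NOCP"},       {1u << 24, "UNALIGNED"}, {1u << 25, "DIVBYZERO"},
};

// Hypothetical helper: return the names of all set flags, separated by spaces.
std::string decode_cfsr(uint32_t cfsr) {
  std::string out;
  for (const CfsrBit &b : kCfsrBits) {
    if (cfsr & b.mask) {
      if (!out.empty()) out += ' ';
      out += b.name;
    }
  }
  return out;
}
```

MMARVALID means the "Accessed Address" value is valid, so the 0x71000000 in the report can be trusted as the address that caused the data access violation.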
 
Note: I have had a few cases where memory was corrupted badly enough that the CrashReport itself was corrupted and did not print...

So for those I modified the cores code to use its debug output code. I added the following:

Code:
#ifdef PRINT_DEBUG_STUFF
	printf("\n >>>>> unused_interrupt_vector <<<<<\n");
	printf("  Code was executing from address 0x%x\n", info->ret);
	printf("  CFSR: %x\n", info->cfsr);
	if (((info->cfsr & (0x80)) >> 7) == 1) printf("\t(MMARVALID) Accessed Address: 0x%x\n", info->mmfar);
	printf("  XPSR: %x\n", info->xpsr);
	printf("  HFSR: %x\n", info->hfsr);
	printf("  STACK: %x\n", (uint32_t)stack);
	for (uint16_t i = 0; i < 32; i+=4) printf("\t %x %x %x %x\n", stack[-i],stack[-i-1], stack[-i-2],stack[-i-3]);
#endif
This goes into void unused_interrupt_vector(void), right before the arm_dcache_flush_delete...

To enable this, you need to edit the file: teensy4\debug\printf.h

And uncomment the line: //#define PRINT_DEBUG_STUFF

This makes the startup code configure Serial4 at 115200 baud to print the debug output. I have either hooked up a USB-to-serial adapter or used another Teensy connected to Serial4.

There is a second define that in certain cases will allow debug stuff to be output over USB instead of Serial4...
//#define PRINT_DEBUG_USING_USB // if both defined will try to direct stuff out USB Serial or SEREMU
But my gut says this may not work in a crash case... It did not for the one I tried.

Wonder if I should PR this in to core?
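
For anyone reading the dumped values: on a Cortex-M fault the hardware pushes eight words onto the stack before the handler runs, and the "Code was executing from address" value is the saved PC from that frame. Below is a minimal sketch of the layout, assuming the captured stack pointer points at the start of the hardware-saved frame (no extra words pushed in between). ExceptionFrame and read_frame are hypothetical names for illustration, not part of the core.

```cpp
#include <cstdint>

// Basic Cortex-M exception frame: eight words stacked by hardware on
// exception entry, lowest address first.
struct ExceptionFrame {
  uint32_t r0, r1, r2, r3, r12;
  uint32_t lr;    // link register of the interrupted code
  uint32_t pc;    // return address: where the faulting code was executing
  uint32_t xpsr;  // program status register at the time of the fault
};

// Hypothetical helper: interpret the eight words at `stack` as a frame.
// Assumes `stack` points at the start of the hardware-saved frame.
inline ExceptionFrame read_frame(const uint32_t *stack) {
  return ExceptionFrame{stack[0], stack[1], stack[2], stack[3],
                        stack[4], stack[5], stack[6], stack[7]};
}
```

Besides the PC, the saved LR is often worth feeding to addr2line as well, since it usually points back into the caller of the function that faulted.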
 
Thank you very much, Frank B,

Yep, I searched for addr2line, but the tool is actually named arm-none-eabi-addr2line. Got it, and found the line where the fault occurs. I hope to find the cause now.

Thank you very much for the very helpful tools!
 
Thank you very much, KurtE for your kind suggestion.

So for those I modified the cores code to use its debug output code. I added the following:

I edited the appropriate files, ran make in ~/arduino-1.8.19/hardware/teensy/avr/cores/teensy4, and recompiled my sketch, but nothing seems to happen. It seems that I need to recompile startup.c with printf.h. Could you suggest how to do this?

Thank you!
 
If you are using the IDE and those source files have changed, they will be rebuilt as needed.

As noted, any added output will appear on UART Serial4 at 115200 baud, so the Rx pin of some device, with GND connected, needs to be wired to the Serial4 Tx pin (pin 17) to collect that output and get it somewhere it can be seen. One option is another Teensy running something like {local install}\examples\Teensy\Serial\EchoBoth\EchoBoth.pde, which will show the output on another USB serial monitor. For multiple Teensy serial monitors, the suggestion as used here is TyCommander.
 
Thank you very much, defragster, for the info. I uncommented the USB version, since my Serial4 pins are in use as GPIO. I hope to get more info now, but it will not be easy: the program does not crash anymore :( So it seems that this bug really depends on data allocation, interrupt timing, etc.

BTW, please advise: is there any way to print the current list of interrupt vectors with as much information as possible (for example, priority, last occurrence, etc.)? I will try to dump it in different cases and figure out the cause. (Given the previous CrashReport info about PSRAM, it seems that the memory corruption must already have happened somewhere well before the fault.)
 