Teensy 3 fault handler demonstration



cmason
11-11-2012, 01:22 AM
Mostly as a learning exercise, I've been experimenting with installing a fault handler in the Teensy 3 interrupt vector table. This interrupt routine is called whenever a processor fault occurs (such as a memory or instruction error), which may help with debugging. It is mostly based on "The Definitive Guide to the ARM Cortex-M3" (http://www.amazon.com/Definitive-Guide-Cortex-M3-Second-Edition/dp/185617963X), but this blog post was also somewhat helpful (http://blog.frankvh.com/2011/12/07/cortex-m3-m4-hard-fault-handler/).

In Arduino.app/Contents/Resources/Java/hardware/teensy/cores/teensy3/mk20dx128.c, modify fault_isr so that it's a weak symbol:



void __attribute__((weak)) fault_isr(void)
{
    while (1); // die
}


(Perhaps Paul can make this change permanent? It might also be useful to have a separate hard_fault_isr handler.)

Use this main.cpp:



#include <unistd.h>
#include "usb_serial.h"
#include "core_pins.h"

// These could go in mk20dx128.h

#define SCB_SHCSR_USGFAULTENA ((uint32_t)1 << 18)
#define SCB_SHCSR_BUSFAULTENA ((uint32_t)1 << 17)
#define SCB_SHCSR_MEMFAULTENA ((uint32_t)1 << 16)

// Byte-wide priority fields of SHPR1, which starts at 0xE000ED18:
// MemManage at +0, BusFault at +1, UsageFault at +2.
#define SCB_SHPR1_USGFAULTPRI (*(volatile uint8_t *)0xE000ED1A)
#define SCB_SHPR1_BUSFAULTPRI (*(volatile uint8_t *)0xE000ED19)
#define SCB_SHPR1_MEMFAULTPRI (*(volatile uint8_t *)0xE000ED18)

usb_serial_class Serial;

void flash() {
    digitalWrite(13, HIGH);
    delay(100);
    digitalWrite(13, LOW);
    delay(100);
}
extern "C" {

void __attribute__((naked)) fault_isr () {
    // Caution: GCC documents that only basic asm is reliably supported inside a
    // naked function; the C code below worked for this demo but is not guaranteed.
    uint32_t* sp=0;
    // this is from "Definitive Guide to the Cortex M3" pg 423
    asm volatile ( "TST LR, #0x4\n\t"     // Test EXC_RETURN number in LR bit 2
                   "ITE EQ\n\t"           // if zero (equal) then
                   "MRSEQ %0, MSP\n\t"    // Main Stack was used, put MSP in sp
                   "MRSNE %0, PSP\n\t"    // else Process stack was used, put PSP in sp
                   : "=r" (sp) : : "cc");

    // The stacked frame is r0-r3, r12, lr, pc, xPSR, so sp[5] is lr and sp[6] is pc.
    Serial.print("!!!! Crashed at pc=0x");
    Serial.print(sp[6], 16);
    Serial.print(", lr=0x");
    Serial.print(sp[5], 16);
    Serial.println(".");

    Serial.flush();

    // allow USB interrupts to preempt us:
    SCB_SHPR1_BUSFAULTPRI = (uint8_t)255;
    SCB_SHPR1_USGFAULTPRI = (uint8_t)255;
    SCB_SHPR1_MEMFAULTPRI = (uint8_t)255;

    while (1) {
        flash();
        asm volatile (
            "WFI" // Wait For Interrupt.
        );
    }
}
}

int main() {
    pinMode(13, OUTPUT);

    while(!usb_configuration) {
        flash();
    }

    // enable bus, usage, and mem fault handlers.
    SCB_SHCSR |= SCB_SHCSR_BUSFAULTENA | SCB_SHCSR_USGFAULTENA | SCB_SHCSR_MEMFAULTENA;

    Serial.println("Hello, world.");
    Serial.flush();

    digitalWrite(13, HIGH);
    delay(2000);
    digitalWrite(13, LOW);

    // crash
    *((int*)0x0) = 1;

    // shouldn't get here:
    Serial.println("Success.");
}


When run, this outputs:


Hello, world.
!!!! Crashed at pc=0x490, lr=0x55B.

You can then use addr2line to figure out where these addresses are.


arm-none-eabi-addr2line -s -f -C -e main.elf 0x490 0x55B
Print:: println(char const*)
Print.h:47
main
main_crash.cpp:86

(The Makefile seems to like to delete the .elf file so you may need to do: make main.elf)

The first address (pc) is where execution had reached when the fault was taken, and the second address (lr) is the return address of the current function, so it points into the caller (here, main).
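
The handler's stack selection and its sp[5]/sp[6] indexing can be sketched in host-side C++ so the logic can be checked without hardware. This is only an illustration, not part of the original program: the names StackedFrame, select_stack, and decode_frame are hypothetical, and the layout assumed is the basic eight-word Cortex-M exception frame with no floating-point state.

```cpp
#include <cstdint>

// Hypothetical helper type: the eight words the core pushes on exception entry
// (basic frame, no floating-point state).
struct StackedFrame {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

// Mirrors "TST LR, #0x4": bit 2 of EXC_RETURN clear means the main stack (MSP)
// was in use, set means the process stack (PSP) was.
inline uint32_t select_stack(uint32_t exc_return, uint32_t msp, uint32_t psp) {
    return (exc_return & 0x4u) ? psp : msp;
}

// Copies the stacked words into named fields; sp[5] is lr and sp[6] is pc.
inline StackedFrame decode_frame(const uint32_t* sp) {
    return StackedFrame{sp[0], sp[1], sp[2], sp[3], sp[4], sp[5], sp[6], sp[7]};
}
```

Fed the frame from the crash above, decode_frame would report pc as 0x490 and lr as 0x55B, the same two values passed to addr2line.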

As you can see from the fact that the reported pc is inside the next Serial.println() call, certain faults are "imprecise": they are not reported until some time after the offending instruction has executed. So the reported address may be somewhat past where the mistake occurred.
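
One way to tell whether a bus fault was imprecise is to look at the Configurable Fault Status Register (CFSR, at 0xE000ED28 on Cortex-M3/M4), which the handler above does not read. The sketch below is an addition rather than something from the original program: the function names are hypothetical, and only the bit positions are taken from the ARMv7-M architecture, where the bus fault status occupies CFSR bits 8 to 15.

```cpp
#include <cstdint>

// Bus fault status bits within CFSR (ARMv7-M).
constexpr uint32_t CFSR_PRECISERR   = 1u << 9;   // stacked pc points at the faulting instruction
constexpr uint32_t CFSR_IMPRECISERR = 1u << 10;  // fault was reported late; stacked pc has moved on
constexpr uint32_t CFSR_BFARVALID   = 1u << 15;  // BFAR holds the faulting address

// True when the bus fault was imprecise.
inline bool is_imprecise_bus_fault(uint32_t cfsr) {
    return (cfsr & CFSR_IMPRECISERR) != 0;
}

// True when the Bus Fault Address Register can be trusted.
inline bool bfar_is_valid(uint32_t cfsr) {
    return (cfsr & CFSR_BFARVALID) != 0;
}
```

In a handler, the argument would come from reading *(volatile uint32_t *)0xE000ED28 before printing. If the imprecise bit is set, the bad access happened somewhere before the reported pc.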

The part used on the Teensy3 unfortunately lacks a memory protection unit (MPU), so there are classes of errors that you could catch on a desktop CPU that will go unnoticed (and silently corrupt your program's execution) on the Teensy.

I've uploaded this to my teensy mercurial repo (https://bitbucket.org/cmason/teensy3). Type make main_crash.hex to build and upload.

Next, I'd like to try to turn this fault ISR into a mini debug console (or even GDB stub).

Hope this is useful,

-c

Bill Greiman
11-11-2012, 12:43 PM
This is just what I needed. I am porting various RTOSs and schedulers to Teensy 3.0 and I have a usage fault in one RTOS.

I was about to try to put something like this together, but this is much better than what I was planning.

Also, "Definitive Guide to the Arm Cortex M3" is a great resource.

cmason
11-11-2012, 04:55 PM
Glad to help. There were really three significant parts to this besides the assembly from "Definitive Guide":
* __attribute__((naked)) is necessary because otherwise gcc does "push {r4, lr}" overwriting the previous values. Most other projects I saw were using a separate .s file which complicates the tool chain and wouldn't integrate well with Arduino.
* Enabling the faults themselves -- just didn't realize they were off by default.
* Lowering the priority of the fault ISRs. I would think in an RTOS setting you would just kill the offending task and then return from the ISR and not change any priorities. In this setting I wanted to make sure we never returned to program code, but were still able to send/receive USB via its interrupt.
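
The second and third points come down to a few register values, sketched here as host-side C++ so the arithmetic can be checked without hardware. The names are hypothetical. The bit positions and addresses assumed are the ARMv7-M ones: SHCSR enable bits 16 to 18, and SHPR1 starting at 0xE000ED18 with one priority byte each for exceptions 4, 5, and 6.

```cpp
#include <cstdint>

// ARMv7-M exception numbers for the three configurable fault handlers.
enum FaultException : uint32_t { MemManage = 4, BusFault = 5, UsageFault = 6 };

constexpr uint32_t SHPR1_BASE = 0xE000ED18u;

// Enable bit for one fault in SHCSR: MemManage is bit 16, BusFault 17, UsageFault 18.
constexpr uint32_t shcsr_enable_bit(FaultException e) {
    return 1u << (16u + (e - MemManage));
}

// Address of the byte-wide priority field for one fault within SHPR1.
constexpr uint32_t shpr1_byte_address(FaultException e) {
    return SHPR1_BASE + (e - MemManage);
}

// Mask that enables all three fault handlers at once.
constexpr uint32_t all_fault_enables() {
    return shcsr_enable_bit(MemManage) | shcsr_enable_bit(BusFault) |
           shcsr_enable_bit(UsageFault);
}
```

Writing 255 to each priority byte requests the lowest priority. How many of those bits the part actually implements is device-specific.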

I also hadn't done any significant inline assembly. This page (http://www.ethernut.de/en/documents/arm-inline-asm.html) was very helpful, particularly about "clobbers."

I'd really like to be able to produce a full stack trace, but I'm not sure I'm going to be able to do that easily. Googling reveals a wealth of interesting details about various approaches. I'll post about this separately.

Also, I was hoping to help y'all out with that usb_isr bug, but didn't realize that this model of the Freescale part lacked an MPU. One other thought I had around that bug specifically was to use a data watchpoint on that address on the heap. I would think this would cause a debug interrupt on or near the instruction that was doing the erroneous write. This would be a good learning exercise for me towards my debug stub goal. I'll post about that separately, too.

Once you get a set of RTOSs working, it would be awesome if you could post a quick getting started guide / links for the one you'd recommend for beginners. RTOS is something I'd love to try but wouldn't really even know where to start.

Best,

-c

Bill Greiman
11-11-2012, 05:16 PM
The usb_isr bug was really a linker script problem (http://forum.pjrc.com/threads/219-Bug-Teensy-3-0-linker-script-causes-bss-to-overlap-heap).

Boy I hope Paul fixes the linker script! This was a killer to find.

I found the usage fault with your fault handler. I had the wrong symbol in the vector for PendableSrvReq. So the fault occurred in the Supervisor call isr. I fixed the PendableSrvReq vector to point to the correct isr and the scheduler worked.
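
For anyone checking their own vector table against the same kind of mix-up, here is a small sketch of where those two entries live. It is an illustration only: the names are hypothetical, and the exception numbers assumed are the ARMv7-M ones (11 for the supervisor call, 14 for the pendable service request).

```cpp
#include <cstdint>

// Exception numbers defined by ARMv7-M for the two handlers mentioned above.
constexpr uint32_t EXC_SVCALL = 11;  // supervisor call
constexpr uint32_t EXC_PENDSV = 14;  // pendable service request

// Each vector table entry is one 32-bit word; entry 0 holds the initial stack pointer.
constexpr uint32_t vector_offset(uint32_t exception_number) {
    return exception_number * 4u;
}
```

With these numbers the supervisor call vector sits at byte offset 0x2C and the pendable service request vector at 0x38 from the start of the table.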