Thread: Re-enable bootloader interrupts after a hard fault [Teensy 4.1]

  1. #1
    Junior Member
    Join Date
    Jun 2020
    Posts
    19

    Re-enable bootloader interrupts after a hard fault [Teensy 4.1]

    Hi! I'm not sure if this is the right question to ask, but I've implemented a hard fault handler that does a (WIP) crash trace dump. After the dump, I'd like to re-enable USB interrupts so I can re-flash the Teensy while it sits in my unending while loop.

    Another interesting, but less important question if anyone knows: my handler isn't called for every exception. Sometimes the Teensy crashes (as indicated by teensyduino no longer being able to upload/re-flash) and the handler isn't called. I'm not sure if there's another interrupt I need to handle (the documentation is a little hard to decipher) or if my handler is crashing (less likely; it succeeds on some crashes).

    I'm currently overriding the HardFault_HandlerC function (when TeensyDebug isn't compiled in, anyway) and have my screen show an appropriate blue screen with useful-ish info (I'd like to eventually try and generate a trace/dump file to the SD card, would be neat!)

    Thanks!

  2. #2
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    8,230
    Very interesting thing - I'd like to see your code?
    I made something similar, but don't get much useful info.

    My code just resets after a hardfault.

  3. #3
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    @MeBoop - indeed - would be great to see your 'crash trace dump' code.

    FrankB added a great T_4.x solution using the RAM2 area for Fault data storage. RAM2 is not zeroed on restart and if power is maintained that data area then survives for detection on the next 'restart' for display when the USB and system is healthy and not in a faulted state.

    I did something using the included T_3.x Reg dump code and found some updated code for the T_4.x that is in the cores commented out - but nothing beyond that.

    That captured the faults, and many would still function for USB printing. I even added a call from the fault handler to a weak userDebugDump() that could be placed in the sketch ( @FrankB - you might try this before restart? - note below ). That could print out known sketch-specific stuff as desired that might lead to a fault solution or show the sketch state at that time.

    That was not always reliable for USB printing, and the sketch could not return to normal function because, once in the faulted state, it won't exit properly to resume after the fault.

    @FrankB: re userDebugDump()
    ->> On Fault: userDebugDump( void *yourRam2, bool bFaultedNow==true )
    > before restarting you might call this sketch function with a pointer and a flag this is when a FAULT happened.
    > The user could put a Structure or other data there for display on restart

    ->> On Restart after Fault call: userDebugDump( void *yourRam2, bool bFaultedNow==false )
    > the user could then use *yourRam2 to extract the saved data for display
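    A minimal sketch of how such a two-phase hook might look from the sketch side. Everything except the userDebugDump( void *, bool ) shape is an assumption: the FaultNote layout, the magic value and the g_persist / g_shown buffers are hypothetical stand-ins so the idea can be tried off-target. On a real T_4.x the core would hold an empty weak default and g_persist would live in DMAMEM or EXTMEM.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical record layout; the thread does not define one.
struct FaultNote {
    uint32_t magic;     // marks the buffer as holding a note from a fault
    char     text[60];  // sketch-specific state captured at fault time
};

constexpr uint32_t kNoteMagic = 0x4E4F5445u;  // arbitrary marker value

// Stand-in for memory that survives a warm reset (RAM2 / PSRAM on hardware).
static FaultNote g_persist;

// Stand-in for "display on restart": the text is copied here instead of printed.
static char g_shown[sizeof(FaultNote::text)];

// Sketch-side definition of the hook. Called twice by the (assumed) core code:
// once with bFaultedNow == true at fault time, once with false after restart.
void userDebugDump(void *yourRam2, bool bFaultedNow) {
    assert(yourRam2 != nullptr);
    FaultNote *note = static_cast<FaultNote *>(yourRam2);
    if (bFaultedNow) {
        // Fault path: keep it short, no allocation, no complex library calls.
        std::strncpy(note->text, "menu=3 page=7", sizeof(note->text) - 1);
        note->text[sizeof(note->text) - 1] = '\0';
        note->magic = kNoteMagic;
    } else if (note->magic == kNoteMagic) {
        // Restart path: the system is healthy again, so show the saved state.
        std::memcpy(g_shown, note->text, sizeof(g_shown));
        note->magic = 0;  // consume the note so it is reported only once
    }
}
```

    The fault path only copies bytes; anything slow or stateful (printing, SD access) is left for the restart path, when USB and the rest of the system are healthy.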

  4. #4
    Junior Member
    Join Date
    Jun 2020
    Posts
    19
    It's still a WIP -- I'm actively working on hammering away at it (my ARM asm knowledge is very slim, but I'm working on it!) but will definitely share once it gets in a more usable state. If I do want to use SD card to shove dump files, I'll have to make sure it's not in an inconsistent state.

    I'm actually having success with USB printing during faults. I remember having issues with it awhile ago, but it seems to be working well now (I only print some basic info ATM.)

    For stack traces, the current plan is to generate a .h based on the general memory layout of everything else around it -- plan B would be, if the SD card is in a consistent state, reading the ELF directly from the SD card and doing some black magics to coerce it into user readable data by diving into the symbol/dwarf sections.
    The SD card may never be in an inconsistent state if the fault never falls within the SD library (I'm using SDFat for SDIO), in theory; and I think it's friendly enough to the stack to not overflow the handler, but definitely more experimentation required.

    A big part of this is being able to re-image the teensy after a fault (so I don't have to reset it when the issue could be somewhere early on -- an infinite loop! not to mention that you could put the CPU in a locked state accidentally if you fault in your fault handler). Anyone have any ideas about this? I'm not really sure what to re-enable that the bootloader hooks (and given the bootloader is proprietary, it's not exactly easy to reverse engineer -- nor would I want to out of respect for Paul's excellent hard work).

    The RAM2 is a good idea, but my project is very allocate-heavy (the bulk of it is a fancy GUI). Using a SPI FLASH/SRAM chip would be interesting, though; not sure if the SPI RAM would be cleared on startup (I don't think it is) -- but I've gotta order a new teensy 4/some sram chips from PJRC as I accidentally killed some of the pins needed (oops, avoid my thread history lol)

    (a side note, I like to use visualmicro because I'm lazy, but its gdbstub debugger is very buggy and doesn't seem to work, which is kinda what inspired this)
    ----
    In the teensy 4 core, the HardFault_HandlerC is declared weak, so you can easily override it and do whatever voodoo you'd like in it -- https://github.com/PaulStoffregen/co...startup.c#L532
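    For anyone overriding the handler: a rough sketch of reading the registers the Cortex-M core pushes on exception entry. The eight-word order (r0-r3, r12, lr, pc, xPSR) is the architectural basic frame; the readFrame helper and the idea that the handler receives a pointer to that frame are assumptions for illustration, and an extended FPU frame is not handled here.

```cpp
#include <cassert>
#include <cstdint>

// Registers stacked automatically on exception entry, in memory order
// starting at the stack pointer that was active when the fault hit.
struct ExceptionFrame {
    uint32_t r0, r1, r2, r3, r12;
    uint32_t lr;    // link register of the interrupted code
    uint32_t pc;    // return address; for many faults, the faulting instruction
    uint32_t xpsr;  // program status register
};

static_assert(sizeof(ExceptionFrame) == 32, "basic frame is eight words");

// Hypothetical helper: copies the stacked words into a named structure.
inline ExceptionFrame readFrame(const uint32_t *sp) {
    assert(sp != nullptr);
    return ExceptionFrame{sp[0], sp[1], sp[2], sp[3], sp[4], sp[5], sp[6], sp[7]};
}
```

    The pc field is the value that shows up as "Return Address" style output in a fault report; lr gives one extra level of caller information for free.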

  5. #5
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    @MeBoop: Writing to the SD in faulted state is suspect? Lots of overhead and moving parts have to be working for that? As noted - some faults allow USB Serial.print() others do not - but returning from fault handler leaves the Teensy locked/hung IIRC.

    The Bootloader is an external processor that only exerts real control during programming request, though on T_4.x it does help with startup timing AFAIK. But it only talks to the 1062 when in Program mode, and once in program mode the Teensy can only restart with or without reprogramming done.

    The SPI PSRAM is NOT cleared on startup - but again it may or may not function in a faulted state.

    Frank_B and I have both been working in startup.c with the fault handler to some degree.

    Frank_B has an outstanding pull request for his Fault >> RAM2 : Restart >> READ RAM2 to present fault info. before setup().

    @Frank - I got this working here in your hardfaults.cpp code - I'll email for review and finishing touches if you desire.
    Code:
    Hardfault.
    Return Address: 0x8FE
    	(DACCVIOL) Data Access Violation
    	(MMARVALID) Accessed Address: 0x0 (nullptr)
    
    FAULT RECOVERY :: userHFDebugDump() in hardfaults.cpp ___ 
    Hardfault.
    Return Address: 0x8FE
    	(DACCVIOL) Data Access Violation
    	(MMARVALID) Accessed Address: 0x0 (nullptr)
    
    FAULT RECOVERY :: userHFDebugDump() in hardfaults.cpp ___ 
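    The flag names in that output (DACCVIOL, MMARVALID) come from the Cortex-M Configurable Fault Status Register. As a rough illustration, here is one way the bits could be turned into names after restart. The bit positions are the architectural ones; describeCfsr and the table are hypothetical, not the code in hardfaults.cpp, and std::string makes this suitable for the restart path or a PC-side tool rather than for use inside the fault handler.

```cpp
#include <cstdint>
#include <string>

struct CfsrBit {
    uint32_t    mask;
    const char *name;
};

// CFSR layout: bits 0-7 MemManage, bits 8-15 BusFault, bits 16-31 UsageFault.
static const CfsrBit kCfsrBits[] = {
    {1u << 0,  "IACCVIOL"},  {1u << 1,  "DACCVIOL"},    {1u << 3,  "MUNSTKERR"},
    {1u << 4,  "MSTKERR"},   {1u << 5,  "MLSPERR"},     {1u << 7,  "MMARVALID"},
    {1u << 8,  "IBUSERR"},   {1u << 9,  "PRECISERR"},   {1u << 10, "IMPRECISERR"},
    {1u << 11, "UNSTKERR"},  {1u << 12, "STKERR"},      {1u << 13, "LSPERR"},
    {1u << 15, "BFARVALID"}, {1u << 16, "UNDEFINSTR"},  {1u << 17, "INVSTATE"},
    {1u << 18, "INVPC"},     {1u << 19, "NOCP"},        {1u << 24, "UNALIGNED"},
    {1u << 25, "DIVBYZERO"},
};

// Returns the names of all set fault flags, separated by single spaces.
std::string describeCfsr(uint32_t cfsr) {
    std::string out;
    for (const CfsrBit &bit : kCfsrBits) {
        if (cfsr & bit.mask) {
            if (!out.empty()) out += ' ';
            out += bit.name;
        }
    }
    return out;
}
```

    A null pointer read like the one above would typically set bit 1 and bit 7 together, i.e. a CFSR value of 0x82.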

  6. #6
    Junior Member
    Join Date
    Jun 2020
    Posts
    19
    There are a lot of moving parts, which is why I agree it could be suspect. However, I've had quite a bit of success using SPI with several of the fault handlers (and even USB serial, even though I'm still unable to engage the USB program mode -- weird!). The display I show the error data on (RA8875) works over SPI.
    Unless the fault is related to the hardware SPI (in which case, a potential idea is to use software SPI for this? hmmmm) I don't think there will be any issues using it, but it just needs a lot of testing to confirm.

    I never intend to return from the fault handler. The reason returning from the fault handler leaves it hung is because the PC never changes, so it goes right back to whatever caused the fault, causing it to fault again.
    You could technically skip that instruction, but... that's probably way more dangerous than it's worth.
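    Purely to illustrate what "skipping that instruction" would involve (and why it is fiddly): Thumb-2 instructions are either 16 or 32 bits wide, so the handler would have to inspect the first halfword at the stacked PC before it could advance it. The helper names below are hypothetical, and this says nothing about whether resuming is safe, which, as noted above, it probably isn't.

```cpp
#include <cstdint>

// A Thumb-2 instruction is 32 bits wide when bits [15:11] of its first
// halfword are 0b11101, 0b11110 or 0b11111; otherwise it is 16 bits wide.
inline uint32_t thumbInstructionSize(uint16_t firstHalfword) {
    const uint16_t top5 = static_cast<uint16_t>(firstHalfword >> 11);
    return (top5 == 0x1D || top5 == 0x1E || top5 == 0x1F) ? 4u : 2u;
}

// Hypothetical helper: the value that would have to be written back into the
// stacked PC slot so that exception return lands on the next instruction.
inline uint32_t pcAfterFaultingInstruction(uint32_t stackedPc, uint16_t firstHalfword) {
    return stackedPc + thumbInstructionSize(firstHalfword);
}
```

    Even with the PC advanced, the destination register of the skipped load holds garbage, so the code that follows is running on undefined data.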

    Dedicating a block of memory to holding fault information is a great idea as a more "provable" fault handling but I have some concerns that there will be data omitted, such as a stack trace.
    I wonder if the ETM (embedded trace macrocell) is usable with the Teensy, as it is available on the MCU. This is only about 32 bytes or so, so it'd be relatively cheap to store.

    My issue w/r/t the bootloader, is that when I fault, it no longer responds to programming requests. I'm not really sure why but I could just be blind to something obvious.

  7. #7
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    The Bootloader responds to the button by a hardware signal on the board.

    The request over the USB wire requires a functional USB stack. Getting output is one thing, and that somehow often works.

    But the USB jump to Bootloader requires CODE in the USB stack to be running, so it can detect the PC's request ( made via a baud rate change ) and then have the Teensy tell the Bootloader chip to take over.
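    To make that dependency concrete, here is a host-side sketch of what a fault-time wait loop would need to do itself, since normal interrupts do not preempt a hard fault handler. Assumptions, labeled loudly: the loader's request is taken to be a CDC line-coding change to 134 baud, and the two hooks are hypothetical stand-ins (on hardware, enterBootloader might call the core's _reboot_Teensyduino_(), and currentBaud would have to be fed by manually servicing the USB controller).

```cpp
#include <cassert>
#include <cstdint>

// Assumption: baud rate the Teensy loader sets to ask for program mode.
constexpr uint32_t kRebootRequestBaud = 134;

// Hypothetical hooks so the loop can be exercised without hardware.
struct FaultLoopHooks {
    uint32_t (*currentBaud)();  // baud rate most recently set by the host
    void (*enterBootloader)();  // hands control to the bootloader chip
};

// Polls for a programming request up to maxPolls times.
// Returns true if the request was seen and enterBootloader was invoked.
inline bool pollForProgramRequest(const FaultLoopHooks &hooks, unsigned maxPolls) {
    assert(hooks.currentBaud != nullptr && hooks.enterBootloader != nullptr);
    for (unsigned i = 0; i < maxPolls; ++i) {
        if (hooks.currentBaud() == kRebootRequestBaud) {
            hooks.enterBootloader();
            return true;
        }
    }
    return false;
}
```

    If polling the USB hardware from inside the handler proves unreliable, the fallback remains the one described above: the program button, which reaches the bootloader chip through a hardware signal.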


    I've modified the FrankB HardFaults.cpp code to write on FAULT and READ on restart from both DMAMEM ( RAM2 ) and a PSRAM.
    This latest version running puts a string into PSRAM on the FAULT - then the Teensy RESETS, detects it was restarted because of a Fault, Enters the indicated Code and prints out the saved string.
    Code:
    Hardfault.
    Return Address: 0x942
    	(DACCVIOL) Data Access Violation
    	(MMARVALID) Accessed Address: 0x0 (nullptr)
    
    FAULT RECOVERY :: userHFDebugDump() in hardfaults.cpp ___ 
    You Are Here - via PSRAM EXTMEM! :( 
    Frank - going to email this .cpp to you now for your consideration ...
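    Since RAM2 and PSRAM are not zeroed on restart, they hold arbitrary contents after a cold power-up, so the restart side needs some way to tell a real saved fault from garbage. One possible approach, sketched under assumptions: the SavedFault layout, the magic value and the FNV-1a checksum are all hypothetical choices, not what hardfaults.cpp does. On hardware, cached memory would likely also need flushing before the reset; that is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical persistent record; 52 bytes with no padding.
struct SavedFault {
    uint32_t magic;
    uint32_t returnAddress;
    char     message[40];
    uint32_t checksum;  // FNV-1a over every byte before this field
};

constexpr uint32_t kSavedMagic = 0x46415554u;  // arbitrary marker value

inline uint32_t fnv1a(const uint8_t *data, size_t len) {
    uint32_t hash = 2166136261u;
    for (size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        hash *= 16777619u;
    }
    return hash;
}

inline uint32_t recordChecksum(const SavedFault &r) {
    return fnv1a(reinterpret_cast<const uint8_t *>(&r), offsetof(SavedFault, checksum));
}

// Fault side: fill the slot and seal it with the checksum.
inline void saveFault(SavedFault &slot, uint32_t returnAddress, const char *msg) {
    assert(msg != nullptr);
    std::memset(&slot, 0, sizeof(slot));
    slot.magic = kSavedMagic;
    slot.returnAddress = returnAddress;
    std::strncpy(slot.message, msg, sizeof(slot.message) - 1);
    slot.checksum = recordChecksum(slot);
}

// Restart side: copy the record out if it is valid, then invalidate the slot
// so a later ordinary reboot does not report a stale fault.
inline bool loadFault(SavedFault &slot, SavedFault &out) {
    const bool valid = slot.magic == kSavedMagic && slot.checksum == recordChecksum(slot);
    if (valid) out = slot;
    slot.magic = 0;
    return valid;
}
```

    The magic word alone catches most cold-boot garbage; the checksum additionally rejects a record that was only partly written when the reset hit.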

  8. #8
    Junior Member
    Join Date
    Jun 2020
    Posts
    19
    I have other priorities ATM but if I get some time this weekend I'll see about hammering away at generating dump files. A shame about the USB issue, though. Would be a massive help when rapid fire debugging (but I ended up just moving my teensy closer so pushing the button isn't that bad haha)

  9. #9
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    FrankB's code isn't in a PJRC release yet - just a Pull Request on github.

    I made some edits for his review that allow a call to the user sketch on a FAULT to save data to DMAMEM or EXTMEM for viewing after restart. This happens before entry to setup() { but after waiting for USB to be online } - at this point that data could be used to alter how the program runs this next time based on the data saved.

  10. #10
    Junior Member
    Join Date
    Jun 2020
    Posts
    19
    I know, but I may expand on it when I get time (need to do some experiments on the ITM/ETM)

  11. #11
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    Good luck with whatever is working to trace back.

    If you have code that can walk back and uncover the stack in any way please post.

    The code edit just forwarded to FrankB could perhaps capture that onto PSRAM or RAM2. Then, per your idea, once the system is stable on the next restart, it could open files on the SD card to parse the ELF or other data. Right now, as you know, the Fault type, address and register values are captured, and that is all.

  12. #12
    Junior Member
    Join Date
    Jun 2020
    Posts
    19
    You have the stack pointer and you can use the compiler option -funwind-tables -- this was the working plan. More than that requires a disassembler and dwarf info on elf.
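    Short of real unwinding, a crude fallback that works from the stack pointer alone is to scan the stack for words that look like Thumb return addresses (bit 0 set, target inside the code region). This is a heuristic, not the -funwind-tables approach: it reports stale values and function pointers as well as real return addresses. The function name and the code-range parameters are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scans `words` stack entries and collects values that could be Thumb return
// addresses: bit 0 set, and the address with bit 0 cleared inside
// [codeStart, codeEnd). Returns how many candidates were written to `out`.
inline size_t findReturnCandidates(const uint32_t *stack, size_t words,
                                   uint32_t codeStart, uint32_t codeEnd,
                                   uint32_t *out, size_t maxOut) {
    assert(stack != nullptr && out != nullptr);
    size_t found = 0;
    for (size_t i = 0; i < words && found < maxOut; ++i) {
        const uint32_t value = stack[i];
        const uint32_t addr = value & ~1u;
        if ((value & 1u) != 0 && addr >= codeStart && addr < codeEnd) {
            out[found++] = addr;
        }
    }
    return found;
}
```

    The resulting addresses can be looked up on the PC against the ELF to get function names, which sidesteps parsing symbols on the Teensy itself.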

  13. #13
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    Okay, thought maybe you had the math and logic to find and walk back the call stack.

    Frank has the PR to PJRC CORES for his code, and I put a PR on his PowerButton repo with an updated copy of the hardfaults.cpp that does the work, with an edit to call a user func as shown. That works, but I've not heard back whether he thinks it is a good idea.

    It would work for your case to push data to PSRAM for instance to recover/display or act on with restart. That would allow capturing variable values/state when the fault happened in case it explains why it went wrong.

    Paul may be ready to take the PR or have his own idea that he hasn't presented yet. But having general interest in needing it and being ready to test it might help carry the issue.

  14. #14
    Junior Member
    Join Date
    Jun 2020
    Posts
    19
    Without DWARF/unwind tables, you're going to have a very hard time creating a backtrace, I'm afraid. You could back up the entirety of the stack and send that over on next reboot for analysis on the PC, but other than that, it's mostly wishful thinking

    Going to keep investigating though, may find something useful.
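    The "back up the entirety of the stack" idea could be as small as a bounded copy from the faulting stack pointer up to the top of the stack region. A sketch under assumptions: snapshotStack is a hypothetical name, the destination would be a buffer in memory that survives reset, and the caller is assumed to know where the stack region ends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies the in-use part of a full-descending stack (from sp up to, but not
// including, stackTop) into dst, never writing more than dstWords words.
// Returns the number of words copied.
inline size_t snapshotStack(const uint32_t *sp, const uint32_t *stackTop,
                            uint32_t *dst, size_t dstWords) {
    assert(sp != nullptr && stackTop != nullptr && dst != nullptr);
    if (sp >= stackTop) return 0;  // nothing on the stack, or a corrupt sp
    const size_t inUse = static_cast<size_t>(stackTop - sp);
    const size_t count = inUse < dstWords ? inUse : dstWords;
    for (size_t i = 0; i < count; ++i) {
        dst[i] = sp[i];
    }
    return count;
}
```

    Truncating at dstWords keeps the most recent frames, which are usually the ones closest to the fault.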

  15. #15
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    8,230
    @Defragster, wasn't that thing I did with a DIE() macro?
    @MeBoop - Yes, I played with unwind-tables, too, not very successfully.
    I hope you have more luck!

  16. #16
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    DIE() is the other way around: the code is running along, sees a problem, and causes a reset.

    The userHFDebugDump() is:
    > Fault happens - as it does log return address
    > Call empty weak userHFDebugDump() : unless USER code has one in the sketch
    ->> userHFDebugDump(, true) can do nothing or push any desired sketch state variables or info into PSRAM/DMAMEM
    > Teensy reset()

    > Teensy restarts
    > Fault info is displayed - as it does
    > Call empty weak userHFDebugDump() : unless USER code has one in the sketch
    ->> userHFDebugDump(, false) can do nothing or display any sketch state variables or info from PSRAM/DMAMEM
    > Teensy setup()

    Just like trapping Faults - it doesn't happen very often. But some sketches run for some hours or days and then just die. If the death is by Fault this would give a chance to track anything the sketch can record from global variables or other system state.

    Question: Do watchdogs exit through a Fault mechanism - or just reset?

  17. #17
    Senior Member
    Join Date
    Dec 2016
    Location
    Montreal, Canada
    Posts
    3,667
    With the watchdog you can decide what to do in its own callback. I think only 1 of the 4 watchdogs doesn't reset unless you tell it to.
    Last edited by tonton81; 02-28-2021 at 10:47 PM.

  18. #18
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    Thanks TonTon81 - I did go look at your code and saw that watchdogs have their own exit path not through Fault. Of course by default no watchdogs are enabled.

    @FrankB and @MeBoop : bummer there isn't a clear way to unwind for a stack trace. The searches I did didn't turn up an easy way either.

  19. #19
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    8,230
    @Tim, can you make a normal pullrequest to the T4_Powerbutton?

  20. #20
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    Probably not

    It seemed a new folder was needed - not sure if I can do that from the web?

    If I had write permission - or you created a new git for that?

    Will go try ...

  21. #21
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    I made a forked copy and put that folder and file on this system.

    Then made that PR - not sure how it isn't normal.

    When I go to the git website, clicking Pull Request wants to know from where. If the file existed I could edit and offer that - but since it is a new folder and file ???

  22. #22
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,769
    I added a file to examples? Not where it should be ... but ...
