Non-volatile data for debugging RTWDOG reset?

Status
Not open for further replies.

damiend

Member
I'm trying to debug the causes of an RTWDOG reset by saving some state to non-volatile memory.
A single bit would be useful, 13 bytes would be perfect.

Does the Teensy4 have any memory that
- would persist after a RTWDOG reset
- is fast enough to write within the 255 cycles of the interrupt grace period?

or:
- would persist after a RTWDOG reset
- has enough endurance to be continually updated 1000 times per second?

Note that surviving the reset is enough (I can persist it somewhere else on reboot), it does not need to survive a power supply disconnection.
 
Why not write to EEPROM? The watchdog allows a callback before the reset occurs, and you can program the time between the watchdog callback and the reset so that you can complete tasks safely...

Also, why 1000 times per second? Update it once during the callback, that's it.
 
I'm going to try this, but I'm worried that the delay is too short for EEPROM writes — this is RTWDOG/WDOG3, so the interval between interrupt and reset is fixed at 255 bus clock cycles.

1000 times per second is for the alternative option of not using the RTWDOG interrupt, just continuously updating the debug values. Something like the SRC_SRSR register that tells you why the chip reset, but for user data.
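For a sense of scale, the fixed 255-cycle window can be converted to time. This is only a minimal sketch: it assumes a 150 MHz bus clock and a 600 MHz CPU clock (believed to be the Teensy 4 defaults, but not confirmed in this thread), and `graceNanoseconds` and `graceCpuCycles` are hypothetical helper names.

```cpp
#include <cstdint>

// Time between the RTWDOG interrupt and the reset, in nanoseconds.
// busHz is the bus clock that the 255-cycle delay is counted in
// (assumed to be 150 MHz in the usage note below).
constexpr uint64_t graceNanoseconds(uint64_t cycles, uint64_t busHz) {
  return cycles * 1000000000ULL / busHz;
}

// The same window expressed in CPU core cycles for a given core clock.
constexpr uint64_t graceCpuCycles(uint64_t cycles, uint64_t busHz, uint64_t cpuHz) {
  return cycles * cpuHz / busHz;
}
```

Under those assumptions, `graceNanoseconds(255, 150000000)` gives 1700 ns and `graceCpuCycles(255, 150000000, 600000000)` gives 1020 core cycles. That is plausibly enough for a few RAM stores, and likely far too short for a write to the flash-emulated EEPROM.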
 
yeah WDT3 doesn't care even if your callback ain't finished it'll reboot :)

the other ones WDT1/2 you can set the time before reset, like 1 second or so if wanted

Use the best of both worlds? have the WDT1/2 trigger a callback before the WDT3 callback occurs, and feed both normally

you can feed them both in a void function, and just call that function in your loop when you want to feed both
 
Yes, Teensy 4.x has memory that persists across WARM restarts - even Programming.

It is the 512KB of RAM2, which can be allocated at compile time like this: DMAMEM byte myWarmRam[13];
EDIT - Since the chip will restart ASAP, arm_dcache_flush() is required for the data to actually be saved. It needs the address of the memory and the size of the area to flush:
Code:
arm_dcache_flush(myWarmRam, sizeof(myWarmRam));

That area of memory will be 'reserved' in a fixed location (for a given build/compile). The bits stored there are not initialized at startup, so they persist across restarts for as long as the board stays powered.

Teensyduino 1.54, just released, uses this for the CrashReport fault reporting. That code also has detection for watchdogs that trigger and might be instructive.

See : github.com/PaulStoffregen/cores/blob/master/teensy4/CrashReport.cpp#L179
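Because the DMAMEM area is not initialized, it holds arbitrary bits after a cold power-up, so code reading it on boot needs a way to tell a real record from garbage. One possible approach is a magic number plus a checksum. This is a sketch only: `DebugRecord`, `kMagic`, `storeRecord` and `recordIsValid` are hypothetical names, not part of Teensyduino or CrashReport.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 13-byte debug payload plus two validity fields.
struct DebugRecord {
  uint32_t magic;        // equals kMagic once storeRecord() has run
  uint8_t  payload[13];  // user debug state
  uint8_t  checksum;     // XOR of all payload bytes
};

constexpr uint32_t kMagic = 0x57444F47;  // arbitrary marker ("WDOG" in ASCII)

inline uint8_t xorChecksum(const uint8_t *data, std::size_t len) {
  uint8_t sum = 0;
  for (std::size_t i = 0; i < len; ++i) sum ^= data[i];
  return sum;
}

// Fill the record. On a Teensy this would be followed by
// arm_dcache_flush(&rec, sizeof(rec)) so the data reaches RAM2.
inline void storeRecord(DebugRecord &rec, const uint8_t (&data)[13]) {
  std::memcpy(rec.payload, data, sizeof(rec.payload));
  rec.checksum = xorChecksum(rec.payload, sizeof(rec.payload));
  rec.magic = kMagic;
}

// True only if the record looks like one written by storeRecord().
inline bool recordIsValid(const DebugRecord &rec) {
  return rec.magic == kMagic &&
         rec.checksum == xorChecksum(rec.payload, sizeof(rec.payload));
}
```

In a sketch the record would be declared as `DMAMEM DebugRecord rec;`, checked with `recordIsValid(rec)` at the top of setup(), and then invalidated (for example by clearing `magic` and flushing) so a stale record is not reported twice.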
 
My application requires a watchdog timeout of 5 milliseconds, so I can't use the other ones.
It's part of the control loop for a micro-hot plate that runs at 1 kHz.
 
Strange: the DMAMEM trick worked for a while, and now not any more.
Even with a minimal, clean sketch the value resets every time:

Code:
DMAMEM unsigned int counter;

void setup() {  
  // put your setup code here, to run once:
  while (!Serial);
  Serial.println(counter++);
  Serial.println((unsigned long)&counter, HEX);
}

void loop() {
  // put your main code here, to run repeatedly:

}

Output:
Code:
3746047971
20200000
 
I'm not using the watchdog interrupt to set the DMAMEM variable.

I solved my problem -- turns out I needed to call
Code:
arm_dcache_flush(&counter, sizeof(counter));
for the value to persist. I have no idea why it worked before! Some other part of my code must have been triggering a flush in mysterious ways.
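A likely explanation for the on-and-off behaviour is that RAM2 is accessed through the data cache: a write can sit in a cache line until that line happens to be evicted or is explicitly flushed. The sketch below shows how many cache lines a given variable spans. It assumes a 32-byte line, which I believe is the Cortex-M7 line size, and `cacheLinesTouched` is a hypothetical helper.

```cpp
#include <cstdint>

constexpr uint32_t kCacheLine = 32;  // assumed D-cache line size in bytes

// Number of cache lines overlapped by the byte range [addr, addr + size).
constexpr uint32_t cacheLinesTouched(uint32_t addr, uint32_t size) {
  return size == 0
             ? 0
             : ((addr + size - 1) / kCacheLine) - (addr / kCacheLine) + 1;
}
```

A 4-byte counter at 0x20200000 sits in one line, while the same counter at 0x2020001E would straddle two. Keeping everything that must survive the reset in one 32-byte-aligned struct (for example with `alignas(32)`) should let a single flush cover it, which matters when the flush runs inside a very short interrupt window.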
 
This is working well now for most of my state that I set from the control loop.

There's just one variable that I'd like to write from the RTWDOG interrupt, and somehow can't seem to set -- either the interrupt isn't triggered, or the arm_dcache_flush doesn't complete in time.

If I understand correctly:
(a) DMAMEM variables persist not because of DMA, but because the .dmabuffers section happens to be placed in Bank0-Bank15 of FlexRAM and these are in power domain PDRAM0
(b) this layout is configured in cores/teensy4/imxrt1062_t41.ld
(c) technically, it would be possible to configure the chip to have some DTCM in PDRAM0.

Would it be easy to do (c), or would that wreak havoc on the rest of Teensyduino?
 
Glad I linked the PJRC CrashReport code. I did indeed forget to mention arm_dcache_flush(), which must be called or the changes won't be committed to physical memory.
> Oops: except that it is shown on the SAVE end >> github.com/PaulStoffregen/cores/blob/master/teensy4/startup.c#L555

The RAM1 is the TCM memory at processor speed for ITCM and DTCM as mapped by PJRC in the .ld.

RAM2 is OCRAM (on-chip RAM), which is given the name DMAMEM to distinguish it. It isn't zeroed on startup init because it is uniquely situated and powered. The association with DMA is just that it is well suited to it: AFAIK it doesn't fight for the TCM bus/addressing lines, which allows DMA to run without generally blocking the MCU processing.
I wouldn't expect the remapping suggested in p#11 to work or be good. The memory is either physically linked differently or would go in odd-sized blocks that would diminish memory utility.

There are 4 NVRAM DWORDS in the 1062 - watchdog may be limited in time or function once it fires? See this post and see if the NVRAM write completes properly maybe: Teensy-4-x-Battery-backed-Non-volitile-memory
 
For reference, I've managed to make the RTWDOG interrupt work.

My mistake was that I had forgotten to call
Code:
NVIC_ENABLE_IRQ(IRQ_RTWDOG);

With this enabled, the interrupt fires and I can confirm there is enough time to write 4 bytes to DMAMEM and flush.
 
No I'm not -- I thought about using it, but there was no documentation and I wasn't sure what the units for the timeouts should be. There was also quite a lot of code in there and I got worried that I didn't understand all of it.

In my application the RTWDOG monitors a really critical safety check function which is there to protect the hardware from being destroyed by a software bug.

I appreciate that your library provides a unified interface to all the watchdogs, but I wanted exactly the opposite: something small and simple that only does RTWDOG and where I understand every single line.

In the end I came up with this: https://gist.github.com/damiendr/18b3336c7293f8b52e482a37d6d12d7d
 
I guess you're not using WDT_T4; it's enabled there.

@tonton81 - could the use of DMAMEM (post #5) be integrated into your examples or documented in some way?
Code:
void myCallback() {
  Serial.println("FEED THE DOG SOON, OR RESET!");
}

and

void myCallback() {
  Serial.println("YOU DIDNT FEED THE DOG, 255 CYCLES TILL RESET...");
}

Maybe this use of DMAMEM persistence is a T_4.x/MMod debug tip worthy of the @luni github.com/TeensyUser/doc/wiki/

<EDIT>: this same would generally apply to TD 1.54 CrashReport - as long as the CACHE of the DMAMEM gets flushed before the fault. On restart entry to setup() those 'static' DMAMEM values would hold any info stored there for debug usage or restart status.
 
I have never used DMA so I wouldn't know, but I don't see why not.
It probably only needs to be put in the callback itself, or if there's any housework involved before user code it can be put in as part of the callback loader.
 
yeah i can see something like that in the callback itself in the sketch, where a user can write or read, or....

we could add a method as well to auto store and retrieve the data, but how do we know other applications do not use that specific region? and how much data can be stored or details from the event does the user require?

like what kind of data would the user be able to store or retrieve?
 
Here is a modified version of :: WDT_T4-master\examples\watchdog3_demo\watchdog3_demo.ino

CrashReport is included to see whether the watchdog restart gets trapped by it, and it does not.
Code:
#include "Watchdog_t4.h"
WDT_T4<WDT3> wdt;

// DMAMEM causes allocation in 'static' RAM2 on Teensy using 1062 processor
// It is not initialized and will retain value while powered
// BUT - it writes through a cache that must be flushed to assure it is current
// This works for DMAMEM allocation of any variable type or structure
DMAMEM uint32_t loopCnt;
DMAMEM uint32_t feedCnt;

void myCallback() {
  arm_dcache_flush(&loopCnt, sizeof(loopCnt));
  arm_dcache_flush(&feedCnt, sizeof(feedCnt));
  Serial.println("YOU DIDNT FEED THE DOG, 255 CYCLES TILL RESET...");
}

void setup() {
  Serial.begin(1);
  delay(600);
  Serial.println("Begin......");
  Serial.print("\tPRIOR Loop Count = ");
  Serial.println(loopCnt);
  Serial.print("\tPRIOR Feed Count = ");
  Serial.println(feedCnt);
  loopCnt = 0;
  feedCnt = 0;
  arm_dcache_flush(&loopCnt, sizeof(loopCnt));
  arm_dcache_flush(&feedCnt, sizeof(feedCnt));
  if ( Serial && CrashReport )
  { // Make sure Serial is alive and there is a CrashReport stored.
    Serial.print(CrashReport); // Once called any crash data is cleared
    // In this case USB Serial is used - but any Stream capable output will work : SD Card or other UART Serial
  }

  WDT_timings_t config;
  config.window = 3000; /* in milliseconds (32 ms to 522.232 s), must be smaller than timeout */
  config.timeout = 10000; /* in milliseconds (32 ms to 522.232 s) */
  config.callback = myCallback;
  wdt.begin(config);
}

void loop() {
  static uint32_t feed = millis();
  loopCnt++;

  /* set to below 3000 to demonstrate windowMode effect for feeding the dog too fast */
  /* set to 3100 to demonstrate proper processing */
  /* set to 12000 to demonstrate an actual non-feeding reset */

  if ( millis() - feed > 3100 ) {
    feed = millis();
    wdt.feed();
    feedCnt++;
  }
  if ( Serial.available() ) { // Any Serial input will HALT the loop() here causing WD Timeout
    Serial.println("Hungry Dog ahead ......");
    while ( Serial.available() );
  }
}
 
Here is that example with a bit more of a show of it working.

This is the output after sending this text to the running sketch: "Mary Had a Little Lamb"
Code:
Begin......
	PRIOR Loop Count = 0
	PRIOR Feed Count = 0
Hungry Dog ahead ......
YOU DIDNT FEED THE DOG, 255 CYCLES TILL RESET...
Begin......
	PRIOR Loop Count = 70931712
	PRIOR Feed Count = 2
	Last User input text : Mary Had a Little Lamb

The code:
Code:
#include "Watchdog_t4.h"
WDT_T4<WDT3> wdt;


// DMAMEM causes allocation in 'static' RAM2 on Teensy using 1062 processor
// It is not initialized and will retain value while powered
// BUT - it writes through a cache that must be flushed to assure it is current
// This works for any DMAMEM allocation of any variable type or structure
DMAMEM uint32_t loopCnt;
DMAMEM uint32_t feedCnt;
DMAMEM char szLast[32];

void myCallback() {
  arm_dcache_flush(&loopCnt, sizeof(loopCnt));
  arm_dcache_flush(&feedCnt, sizeof(feedCnt));
  uint32_t ii = 1; // index 0 is reserved for the validity marker
  while ( ii < (sizeof(szLast) - 1) && Serial.available() ) {
    szLast[ii++] = Serial.read();
  }
  szLast[ii] = 0;
  szLast[0] = 42;
  arm_dcache_flush(szLast, sizeof(szLast));
  Serial.println("YOU DIDNT FEED THE DOG, 255 CYCLES TILL RESET...");
}

void setup() {
  Serial.begin(1);
  delay(600);
  Serial.println("Begin......");
  Serial.print("\tPRIOR Loop Count = ");
  Serial.println(loopCnt);
  Serial.print("\tPRIOR Feed Count = ");
  Serial.println(feedCnt);
  if ( szLast[0] == 42 && szLast[1] != 0 ) {
    Serial.print("\tLast User input text : ");
    Serial.println(&szLast[1]);
  }
  loopCnt = 0;
  feedCnt = 0;
  szLast[0] = 0;
  arm_dcache_flush(&loopCnt, sizeof(loopCnt));
  arm_dcache_flush(&feedCnt, sizeof(feedCnt));
  arm_dcache_flush(szLast, sizeof(szLast));
  if ( Serial && CrashReport )
  { // Make sure Serial is alive and there is a CrashReport stored.
    Serial.print(CrashReport); // Once called any crash data is cleared
    // In this case USB Serial is used - but any Stream capable output will work : SD Card or other UART Serial
  }

  WDT_timings_t config;
  config.window = 3000; /* in milliseconds (32 ms to 522.232 s), must be smaller than timeout */
  config.timeout = 10000; /* in milliseconds (32 ms to 522.232 s) */
  config.callback = myCallback;
  wdt.begin(config);
}

void loop() {
  static uint32_t feed = millis();
  loopCnt++;

  /* set to below 3000 to demonstrate windowMode effect for feeding the dog too fast */
  /* set to 3100 to demonstrate proper processing */
  /* set to 12000 to demonstrate an actual non-feeding reset */

  if ( millis() - feed > 3100 ) {
    feed = millis();
    wdt.feed();
    feedCnt++;
  }
  if ( Serial.available() ) { // Any Serial input will HALT the loop() here causing WD Timeout
    Serial.println("Hungry Dog ahead ......");
    while ( Serial.available() );
  }
}
 
yeah so it is sketch based, i guess for a demo that could be one :)

Yes, all in the user Sketch code. And you can see the killer of the watchdog update is stopping on Serial.available() - then the CallBack is where the string input is actually read into the 'static' memory. Not a sensible/practical example - but after the p#21 sketch it seemed a natural proof of concept for p#22 to actually get the input stored to show on restart.

Though it might be? If loop() needs to cycle to read Serial, Serial#, or other input but has stalled, knowing what comes next might show whether processing up to that point caused the failure. I have seen a few hangs where loop() won't re-enter but interrupts are still ticking. It might be enough to record the value returned by Serial.available() or Serial.availableForWrite().
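That idea amounts to leaving a breadcrumb at each stage of loop(), so the last one written shows where execution stalled. A minimal sketch follows, with hypothetical names throughout (`Stage`, `Breadcrumb`, `mark`). On a Teensy the breadcrumb would live in DMAMEM and be flushed with arm_dcache_flush() in the watchdog callback.

```cpp
#include <cstdint>

// Hypothetical checkpoint IDs for the stages of loop().
enum class Stage : uint8_t { Idle = 0, ReadInput = 1, Control = 2, Output = 3 };

struct Breadcrumb {
  uint8_t  stage;      // last stage that was entered
  uint32_t available;  // e.g. the value Serial.available() returned there
};

// Record the stage about to run; volatile so the stores are not optimized away.
inline void mark(volatile Breadcrumb &b, Stage s, uint32_t available) {
  b.available = available;
  b.stage = static_cast<uint8_t>(s);
}
```

Calling `mark(crumb, Stage::ReadInput, Serial.available())` just before each stage means that, after a watchdog restart, setup() can print `crumb.stage` and `crumb.available` to show which stage never completed.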
 