Teensy 4.1 & FlasherX with large project hangs in flash_move()

klaasdc

I am integrating the FlasherX code into my own project to be able to flash over CANbus.
For a small blinker application, this works reliably, but in my final application weird things start to happen.

As a test, I try to flash the blinker firmware while the larger application is running. After the new firmware is transferred into the buffer, flash_move() is called and the Teensy seems to hang. The serial connection is dropped in the PlatformIO monitor, and the built-in LED is dim (in between on and off). Pressing the program button makes the red LED blink extremely fast (almost dim).
On top of that, I cannot reprogram the Teensy until I have restored it to the built-in example, left it without power for a few minutes *and* rebooted the host PC. Otherwise it is not detected by TeensyLoader.

To track the issue, I have added crude print statements inside flash_move(). Basically, inside FlashTxx.c I have:
C:
  // move is complete. if the source buffer (src) is in FLASH, erase the buffer
  // by erasing all sectors from top of new program to bottom of FLASH_RESERVE,
  // which leaves FLASH in same state as if code was loaded using TeensyDuino.
  // For KINETIS, this erase cannot include FSEC, so erase uses aFSEC=0.
  if (IN_FLASH(src)) {
    sSerialPrint("2 ");
    while (offset < (FLASH_SIZE - FLASH_RESERVE) && error == 0) {
      addr = dst + offset;

      if ((addr & (FLASH_SECTOR_SIZE - 1)) == 0) {
        if (flash_sector_not_erased( addr )) {
          #if defined(__IMXRT1062__)

            sSerialPrint("eepromemu_flash_erase_sector addr = ");
            sSerialPrintNb(addr);
            sSerialPrint("\n");

            eepromemu_flash_erase_sector( (void*)addr );
          #else
            error |= flash_erase_sector( addr, 0 );
          #endif
          }
      }
      offset += FLASH_WRITE_SIZE;

      sSerialPrint("3 ");
    }
  }

Eventually, flash_move() seems to stop with these messages:

Code:
eepromemu_flash_erase_sector addr = 6003D000
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 eepromemu_flash_erase_sector addr = 6003E000

It is clearly related to the larger firmware that is hosting the FlasherX code, since a minimal example works. Why does it hang on that address? Is there something else I can test?

Edit: another time it went up to "eepromemu_flash_erase_sector addr = 6003F000". So it's not specific to that address.
 
Thanks for doing the testing and trouble-shooting. That makes it a lot easier for me to try to help.

Let's assume you are trying to install a new "copy" of the same firmware that is already in flash. Do you know the size of the program? It cannot be more than half of the non-reserved flash, so the maximum size of a T4.1 program that FlasherX can write over itself is a little less than 4 MB. That would explain addresses 6003D000 and 6003F000, which are approaching that limit. You can get around this limit by buffering the new firmware in PSRAM. That lets you go all the way up to nearly the full 8 MB of flash. If you don't have PSRAM, and your program is larger than the largest possible buffer, you'll have to either reduce the program size, or create a minimal program that contains FlasherX capability and do your updates in two steps: step 1 is to install the small program, and step 2 is to install the new version of the larger program.
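To make the "half of the non-reserved flash" rule concrete, here is a minimal sketch of the arithmetic. The function name `max_self_update_size` is hypothetical (it is not part of FlasherX), the 8 MB flash size and 4 KB sector size are assumptions for a Teensy 4.1, and the result is only an upper bound because it ignores sector rounding and the exact buffer placement.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_SECTOR_SIZE 0x1000u    /* assumed 4 KB erase sector */
#define FLASH_SIZE        0x800000u  /* assumed 8 MB program flash (Teensy 4.1) */

/* Hypothetical helper: upper bound on the size of an image that can be
   buffered in flash next to the running program. The reserved area at the
   top of flash is excluded, and the remainder is shared between the
   running program and the buffer, hence the division by two. */
static uint32_t max_self_update_size(uint32_t reserve_sectors) {
    assert(reserve_sectors < FLASH_SIZE / FLASH_SECTOR_SIZE);
    uint32_t usable = FLASH_SIZE - reserve_sectors * FLASH_SECTOR_SIZE;
    return usable / 2u;
}
```

With a reserve of 64 sectors (256 KB), `max_self_update_size(64)` evaluates to 4063232 bytes, which is the "little less than 4 MB" figure.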
 
How hot is it getting with these continual erases?
I didn't notice it getting any hotter, but I configured it to run at 16 MHz with set_arm_clock().

The blinker test is 23552 bytes (upload) and the summary after compilation is:
Code:
teensy_size: Memory Usage on Teensy 4.1:
teensy_size:   FLASH: code:10532, data:4040, headers:8976   free for files:8102916
teensy_size:    RAM1: variables:4832, code:7840, padding:24928   free for local variables:486688
teensy_size:    RAM2: variables:12416  free for malloc/new:511872

and the 'larger' one (656384 bytes upload) gives me:
Code:
teensy_size: Memory Usage on Teensy 4.1:
teensy_size:   FLASH: code:556912, data:90484, headers:8984   free for files:7470084
teensy_size:    RAM1: variables:47744, code:166380, padding:30228   free for local variables:279936
teensy_size:    RAM2: variables:12416  free for malloc/new:511872
 
Your sketch is running at 16 MHz? Is this true when you update via FlasherX?

Based on the size, it doesn't seem like there should be a problem. Are you using EEPROM and/or LittleFS? If so, you need to specify to FlasherX how much additional flash it should reserve for them.

Assuming you're not using EEPROM or LittleFS, please try the following:
  • do a "full reset" of T4.1 via the 15-second hold on the program button
  • install the larger sketch via Teensy loader
  • try to re-install larger sketch via FlasherX
Does this work? The purpose of this test is to make sure that all of the flash between the end of the sketch and the beginning of reserved flash is erased when you try to update via FlasherX.
 
Yes, would that make a difference? I haven't tried it at the default 600 MHz, will do.

I'm using the EEPROM, but increased FLASH_RESERVE to (64*FLASH_SECTOR_SIZE), as I read that it then stays out of the top 256K used by the EEPROM emulation.

I'll try your suggestion with full reset/upload/flasher.
 
The note says a 16 MHz ARM clock? IIRC PJRC says that USB is only expected to work down to 24 MHz.

Yes, the FlasherX update goes over CAN, but might that clock prevent the Teensy from being active on USB after the restart?
 
Yes, would that make a difference? I haven't tried it at the default 600 MHz, will do.
Not sure, but I was surprised to see that number. I didn't think T4 could run that slow.
I'm using the EEPROM, but increased FLASH_RESERVE to (64*FLASH_SECTOR_SIZE), as I read that it then stays out of the top 256K used by the EEPROM emulation.
Okay, good to know.
I'll try your suggestion with full reset/upload/flasher.
Yes, the reason for this test is that a few years ago, the PJRC bootloader was changed to not do a full erase before writing the new program, so if you install a large program and then a smaller program using TeensyLoader, you can have un-erased flash that messes up FlasherX. If you start with a fully erased flash, then use TeensyLoader for first install, and then only use FlasherX for updates, that issue can be avoided.
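The requirement that the flash between the end of the program and the reserved area is fully erased can be illustrated with a small check. This is only a sketch: `region_is_erased` is a hypothetical helper, not a FlasherX function, and it assumes that erased NOR flash reads back as 0xFF in every byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if every byte in [p, p + len) reads 0xFF
   (the assumed erased state of NOR flash), otherwise 0. On a real board,
   p would point into memory-mapped flash; here it is any byte buffer. */
static int region_is_erased(const uint8_t *p, size_t len) {
    assert(p != NULL);
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xFFu) {
            return 0;  /* leftover data from an earlier, larger program */
        }
    }
    return 1;
}
```

Running a check like this over the intended buffer area before starting an update would show whether leftovers from a previously installed, larger program are still present.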
 
To be fair, I never checked if it actually clocks to below 24 MHz. But now I need to know :p

I also commented out the part that erases the buffer in FlashTxx.c and on first test with the blinker sketch that worked.
I'll report back tomorrow.
 
I don't understand this. You commented out the code that erases the buffer? Please do your test with your large program and the unmodified FlasherX files, with the exception of your FLASH_RESERVE setting.
 
I did that, but then I get the behaviour in my first post (freezes at address). I meant that I later commented out the part where the source buffer is erased (below "//move is complete"). Then the Teensy reboots into the new program fine every time, but the orange LED stayed on half-lit until the next power cycle.
 
After some initial success, I also get unreliable behavior after commenting out the buffer erase part. The Teensy hangs and, on power cycle, the red LED blinks 9 times.
 
Why are you commenting out the buffer erase? FlasherX assumes that there will be available ERASED flash to create a buffer. If you do flash, and don't erase the buffer, the next attempt will fail. You shouldn't be commenting out anything!
 
Because without disabling it, the result was always the same: the Teensy freezes and is not recognized by TeensyLoader until after some "cool-down" time. I tried it in 2 different large programs and got the same result. By commenting it out, I did manage to flash several times in a row, hence I continued down that route.

I don't know what direction to try further. I can try to rework the program to flash from SD card instead, although I'm not convinced the CAN transfer is the issue.

The note says a 16 MHz ARM clock? IIRC PJRC says that USB is only expected to work down to 24 MHz.

Yes, the FlasherX update goes over CAN, but might that clock prevent the Teensy from being active on USB after the restart?
I disabled the clock setting during runtime and left it at 80 MHz, but still got the same result.
 
How are you confirming that the transfer of the HEX file is complete and error-free? For example, when I load new firmware via UART, I use a packet protocol with CRC. On the host side, I parse the HEX file and create a binary image, then send the image to T4 in blocks, confirming each one. If that process completes without error, I know for sure that the image in the T4 buffer is correct, and only then do I call move() to erase the existing firmware and write the new version over it.
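As a rough illustration of the "blocks with CRC" idea, the sketch below checks one received block against a CRC-32 before it is accepted into the buffer. The block layout `fw_block_t` and the function names are hypothetical, not the protocol actually used in this thread; the CRC is the common reflected CRC-32 with polynomial 0xEDB88320.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320). Pass 0 as the
   initial value, or a previous result to continue over more data. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len) {
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* XOR in the polynomial only when the low bit is set */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical block format: offset into the image, payload length,
   payload bytes, and the sender's CRC-32 over the payload. */
typedef struct {
    uint32_t offset;
    uint16_t length;
    uint8_t  payload[256];
    uint32_t crc;
} fw_block_t;

/* Returns 1 if the block is well-formed and its CRC matches, else 0. */
static int fw_block_ok(const fw_block_t *b) {
    assert(b != NULL);
    if (b->length > sizeof b->payload) {
        return 0;
    }
    return crc32_update(0u, b->payload, b->length) == b->crc;
}
```

The receiver would write a block into the buffer and acknowledge it only when `fw_block_ok` returns 1, and would call the move routine only after every block has been confirmed.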
 
I rely on the CAN protocol for the message integrity. CAN has a built-in CRC check and will automatically retransmit a message if needed. On top of that, I keep track of each message, as the Teensy replies to every message with the next address location. The host can then verify that the target is "on the same page" before continuing. Finally, the host sends a final message to confirm the end of data. I'm pretty confident the transfer is reliable, and if the transfer were corrupted, I would expect it to fail more "randomly". But by flashing from SD card I could 100% exclude that...
I can clean up and share my Python host code and a Teensy example on GitHub if that helps.
 
In my sample app, there is some conditional code near the top to increase the size of the hex file by defining an arbitrarily large const array. Since your blink application works, could you try using that to increase its size and see if that alone causes it to fail?
 
I experimented with large const arrays but that was all working fine...

Then I realized that the large programs also have the watchdog timer (WDT1) active. I disabled that and it seems to complete fine now.
So now I modified the flash_move function to take a function pointer that is called during the move part. That keeps updating the watchdog timer.

I'll run a few more tests but I believe it is solved :) Thanks for the help!
 
Glad you got it. That makes sense and would explain why it was okay with small programs but not large ones. What is your WDT timeout?
 
It's at 8 seconds, and the callback disables some output pins. I suppose the callback function also crashes since everything is being overwritten.
 