'Over the Air' firmware updates, changes for flashing Teensy 3.5 & 3.6

dahollen

I, too, am interested in performing 'Over the Air' firmware updates of the Teensy - except replace 'Over the Air' with CANBus, since I am placing the Teensy 3.6 in an automotive environment.

I began with the flash code provided by Jon Zeeff for Teensy 3.0/3.1/3.2. Thank you Jon! Refer to:
https://forum.pjrc.com/threads/29607-Over-the-air-updates

However, this code did not run on the Teensy 3.5 or 3.6.

After poring over the Kinetis manuals, I discovered that the Teensy 3.5/3.6 use a different flash type (FTFE, not FTFA) and require the flash to be programmed in 8-byte chunks,
using a different flash command code. After making this change, the example flash code attached here seems to be working on the Teensy 3.5.
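
To make the difference in write granularity concrete, here is a minimal sketch that picks the program command and write size per board family and checks address alignment. The command codes 0x06 (Program Longword) and 0x07 (Program Phrase) are my reading of the Kinetis reference manuals and should be verified for your chip. The names FlashKind, write_size, program_command and is_write_aligned are made up for this example and are not taken from the attached sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not from the attached sketch.
enum class FlashKind { Longword32, Phrase64 };  // T3.0/3.1/3.2 vs T3.5/3.6

// Command codes as I read them in the Kinetis manuals (assumption: verify for your chip).
constexpr uint8_t CMD_PROGRAM_LONGWORD = 0x06;  // programs 4 bytes per command
constexpr uint8_t CMD_PROGRAM_PHRASE   = 0x07;  // programs 8 bytes per command

// Number of bytes written by one program command.
constexpr uint32_t write_size(FlashKind k) {
    return k == FlashKind::Phrase64 ? 8u : 4u;
}

// Flash command code to load for one program operation.
constexpr uint8_t program_command(FlashKind k) {
    return k == FlashKind::Phrase64 ? CMD_PROGRAM_PHRASE : CMD_PROGRAM_LONGWORD;
}

// A program command is only valid if the address is a multiple of the write size.
bool is_write_aligned(FlashKind k, uint32_t address) {
    return (address % write_size(k)) == 0;
}
```

An address such as 0x80004 is fine for a 4-byte write but would be rejected for an 8-byte phrase write.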

However, there are more challenges ahead for the Teensy 3.6. As is the case when writing to EEPROM, the T3.6 has issues writing to program flash whenever it runs faster than 120 MHz.
Thankfully, Paul has added core routines kinetis_hsrun_enable/kinetis_hsrun_disable, which are easily called before issuing flash commands. Thanks Paul!

This got the flash write to, uh, sort of work on the Teensy 3.6. If the bytes written into program flash are read back using the 'Program Check' command, all seems well. However, if read back using simple reads via the processor, as you might read a data variable, the results are -usually- incorrect.

Some reasons why this might happen (though I have not figured out why they might be the case in the example):
1) The cache is not being cleared.
2) It is illegal to read while flashing/erasing the same section.
3) The sector is in execute only mode.

Has anyone else successfully re-flashed the program flash on the Teensy 3.6?
Does anyone see an issue with the code or have other suggestions? I'd be happy to try them out. And, I will continue to try to figure out what's going on...

WARNING: FLASHING YOUR PROGRAM CODE CAN BRICK YOUR TEENSY. ESPECIALLY addresses 0x400-0x40F.
The test program attempts to write at the program memory midpoint.
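
One way to reduce the risk named in the warning is to refuse any write that overlaps 0x400-0x40F before a flash command is issued. The sketch below is only an illustration of that idea. The helper names touches_flash_config and flash_midpoint are hypothetical, and the code runs on a PC without touching any flash registers.

```cpp
#include <cassert>
#include <cstdint>

// The range named in the warning above: 0x400 through 0x40F inclusive.
constexpr uint32_t FLASH_CONFIG_START = 0x400;
constexpr uint32_t FLASH_CONFIG_END   = 0x40F;

// Hypothetical helper: true if [address, address + length) overlaps the protected bytes.
// Assumes address + length does not wrap around 32 bits.
bool touches_flash_config(uint32_t address, uint32_t length) {
    if (length == 0) return false;
    const uint32_t last = address + length - 1;   // last byte that would be written
    return address <= FLASH_CONFIG_END && last >= FLASH_CONFIG_START;
}

// Hypothetical helper: the midpoint of program flash, where the test program writes.
constexpr uint32_t flash_midpoint(uint32_t flash_size_bytes) {
    return flash_size_bytes / 2;
}
```

An erase affects a whole sector, so the same check would need to be applied to the full erased range, not just to the bytes being programmed.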

Thanks!
 

Attachments

  • flashT35notT36.ino
    18.7 KB · Views: 261
I have a clue as to why the read back of flashed code is incorrect. It is a cache issue. In the teensy core ResetHandler routine (in the mk20dx128.c file), called at powerup, is the following code:
#if defined(__MK66FX1M0__)
LMEM_PCCCR = 0x85000003;
#endif

If I comment out this Teensy 3.6 specific line, the newly flashed area reads back okay. I need to read more about the cache registers to understand what a proper fix might be...more later...
 
So, to fix the erroneous data being read back after flashing a Teensy 3.6, the code bus cache must be cleared. The following lines were added after flashing a phrase with a 3.6:

//Invalidate any cached lines in the CODE bus cache
LMEM_PCCCR |= LMEM_PCCCR_GO+LMEM_PCCCR_INVW1+LMEM_PCCCR_INVW0; //issue invalidate cmds, leave lower settings intact
while ((LMEM_PCCCR & LMEM_PCCCR_GO) == LMEM_PCCCR_GO) // wait for invalidate to complete
{
};

The invalidate commands are OR'ed in to preserve the settings that exist in the lower bits of the LMEM_PCCCR register.
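
The effect of OR-ing the command bits in can be checked on a PC with a plain variable standing in for the register. The bit values below are assumptions chosen to be consistent with the 0x85000003 written in ResetHandler; the authoritative definitions are in kinetis.h. A plain variable does not model the hardware clearing the GO bit, so the wait loop is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit values, consistent with 0x85000003 (check kinetis.h before relying on them).
constexpr uint32_t PCCCR_GO      = 0x80000000u;  // start the cache command
constexpr uint32_t PCCCR_INVW1   = 0x04000000u;  // invalidate way 1
constexpr uint32_t PCCCR_INVW0   = 0x01000000u;  // invalidate way 0
constexpr uint32_t PCCCR_LOWBITS = 0x00000003u;  // low bits set at startup

// Returns the value that "reg |= GO + INVW1 + INVW0" would write back.
// This is a simulation on a plain value, not a register access.
uint32_t with_invalidate_request(uint32_t reg) {
    return reg | (PCCCR_GO | PCCCR_INVW1 | PCCCR_INVW0);
}
```

Starting from a register value of 0x00000003, the request yields 0x85000003, so the low bits survive. A plain assignment of the three command bits would have cleared them.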

A new copy of flash example code is attached.
 

Attachments

  • flashT35T36.ino
    16.6 KB · Views: 264
Sounds interesting. Thank you for your research! What I'd be interested in is reading a hex file from an SD card and flashing the Teensy with it.

The main program would contain a routine which would check the SD card for a file, e.g. fw000002.dat, and if the number (000002 in this example) is bigger than the current software version running (e.g. 000001), it would read that dat file and flash it into the program memory, replacing the current firmware. I wonder if this is only a dream or if this can be achieved...
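
The version check described here could look roughly like the sketch below. It only covers the file name comparison, not the SD card access or the flashing. The function names firmware_version_from_name and should_update are hypothetical, and the fixed "fw" + six digits + ".dat" layout is assumed from the example name above.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Hypothetical helper: parse "fwNNNNNN.dat" and return NNNNNN, or -1 if the name is malformed.
long firmware_version_from_name(const char *name) {
    if (name == nullptr || std::strlen(name) != 12) return -1;   // "fw" + 6 digits + ".dat"
    if (std::strncmp(name, "fw", 2) != 0) return -1;
    if (std::strcmp(name + 8, ".dat") != 0) return -1;
    long version = 0;
    for (int i = 2; i < 8; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(name[i]))) return -1;
        version = version * 10 + (name[i] - '0');
    }
    return version;
}

// Hypothetical helper: true only if the file carries a newer version than the one running.
// Assumes running_version is not negative, so a malformed name (-1) never triggers an update.
bool should_update(const char *file_name, long running_version) {
    return firmware_version_from_name(file_name) > running_version;
}
```

With version 1 running, fw000002.dat would trigger an update and fw000001.dat would not.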
 
Sounds like a very useful way to perform field upgrades. I do not know much about the SD side, but it is certainly possible to perform 'partial' code replacements on the Teensy. Anything I write (to perform field upgrades over CANBus) will eventually find its way onto github. If there is interest, I will post here when that happens.
 
Nice work. The original code has "FMC_PFB0CR |= 0xF << 20; // flush cache", but I'm not surprised that 3.5/3.6 needed different/additional command(s) to accomplish this.

Hopefully someone can integrate your additions back into the complete flasher program (ie, all the way from .hex lines to new 3.6 firmware).
 
Hi,
I'm working on a WiFi version of your idea so I can update the Teensy 3.6 remotely. Are there any updates or progress? And can you share the code? :)
 
I'm a retired programmer, so I take long periods away from projects. I left this one in a 'page of design notes and some test code' state. The test code is included in this thread. I will pick the project up again in November, and anything I come up with will be share-able. If you have any ideas you would like to bounce around, I would be glad to give my 2 cents. Good luck.
 
Nice work indeed. I'm working on means to update Teensy 3.x application in the field (actually: in the garden) via CAN bus.

But I'm currently extremely puzzled about the endianness involved. According to the datasheets, the FTFL program longword command (and likewise the FTFE program phrase command) writes byte 0 to the supplied address, byte 1 to byte address +1, and so forth. However, in the flash_word code the word_value 0xFFFFFFFE is written to address 0x40C with byte 0 as word_value>>24, byte 1 as word_value>>16, ..., and byte 3 (FTFL_FCCOB7) as word_value, i.e. the 0xFE goes into byte 3. Why does byte 3 end up at 0x40C and not at 0x40F?

Kind regards, Sebastian
 
The datasheet is, ahem, utterly misleading IMHO. See also https://community.nxp.com/thread/325090. To interpret it correctly, one seemingly has to know that on a little-endian processor, when the datasheet specifies "Byte 0 data is written to the supplied address ('start')", "Byte 0 data" in fact refers to "byte 3 program value", i.e. FCCOB7!
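
The mapping discussed in these two posts can be modelled on a PC. The sketch below is a simulation of the behaviour as reported in this thread, not hardware access and not a statement of what the datasheet guarantees. The names LongwordData, load_fccob, program_longword and read_le32 are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Simulated data registers: index i stands for FCCOB(4 + i), so index 3 is FCCOB7.
struct LongwordData { uint8_t fccob4_7[4]; };

// Load the registers the way flash_word is described above:
// FCCOB4 = value>>24, FCCOB5 = value>>16, FCCOB6 = value>>8, FCCOB7 = value.
LongwordData load_fccob(uint32_t word_value) {
    LongwordData d;
    d.fccob4_7[0] = static_cast<uint8_t>(word_value >> 24);
    d.fccob4_7[1] = static_cast<uint8_t>(word_value >> 16);
    d.fccob4_7[2] = static_cast<uint8_t>(word_value >> 8);
    d.fccob4_7[3] = static_cast<uint8_t>(word_value);
    return d;
}

// Model of the behaviour reported in this thread: FCCOB7 lands at the lowest
// flash address and FCCOB4 at address + 3.
void program_longword(uint8_t *flash, uint32_t address, const LongwordData &d) {
    for (int i = 0; i < 4; ++i) flash[address + i] = d.fccob4_7[3 - i];
}

// Read back as a little-endian CPU would: lowest address is the least significant byte.
uint32_t read_le32(const uint8_t *flash, uint32_t address) {
    return  static_cast<uint32_t>(flash[address])
         | (static_cast<uint32_t>(flash[address + 1]) << 8)
         | (static_cast<uint32_t>(flash[address + 2]) << 16)
         | (static_cast<uint32_t>(flash[address + 3]) << 24);
}
```

Under this model, writing 0xFFFFFFFE to 0x40C puts 0xFE at 0x40C, and a normal 32-bit read of 0x40C returns 0xFFFFFFFE again, which is why the flash_word code works as written.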
 
jonr, I'm working on integrating dahollen's flash routines for T35 and T36 with your flasher program, and I have a few questions.

(1) In flasher, you specify RESERVE_FLASH for T31 and T32. Is this meant to be an option you provide for someone who wants to reserve some space at the top of flash, and is there any reason it can't be set to 0? My understanding is T31, T32, T35, T36 allow the entire main flash (128/256/512/1024K) to be used for program storage, and use a separate small flash area for EEPROM, whereas T40 (2MB) and T41 (8MB) reserve 64K at the top of external serial flash for EEPROM and the recovery app.

(2) In flasher, you have global variable leave_interrupts_disabled, which provides a way to prevent flash_word() and flash_erase() from re-enabling interrupts at the end of their flash operation. Is this necessary because the interrupt vector table in RAM may point into just-erased flash, as opposed to being specifically related to setting the FSEC fields after the 0th sector is erased?

(3) In flasher4, there is no leave_interrupts_disabled variable, so interrupts will sometimes be enabled during the flash erase and the writing of the new program to flash. Is this okay because any interrupts that may occur are guaranteed to be vectored to code already in RAM?

Thanks very much.
 
With interrupts, it's critical to be clear about which stage you mean. The first stage is when data is being moved from input to upper flash, and the second is when upper flash is being moved down to lower flash. There were some problems with reboot(), but now that it is working, it would be better (with t4) to leave interrupts off between the completion of stage 2 and the reboot. There is nothing special about FSEC; I just want a clean reboot.

I don't recall the details of the reserved space option.
 
Okay, thanks. The original flasher keeps interrupts disabled from the beginning of stage 2 (moving upper to lower) through the reboot. I'll keep it that way.
 
Something a little different about T35 and T36. The simple test of erase/write/read at the beginning of the program works fine, but the hex file transfer via TeraTerm seems to be failing. There are no error messages related to bad records even though the hex files do contain records that contain 4 or 12 bytes. Those records should be detected and flagged as invalid within flash_block(), but are not. Do you guys use TeraTerm?
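
For reference, the kind of check that should flag 4-byte or 12-byte records can be sketched as below. This is not the code from flash_block(). It is a PC-runnable illustration with hypothetical names (HexRecord, parse_hex_line, ok_for_phrase_write). It assumes the line has no trailing newline and looks only at the record's 16-bit offset, which is enough for an 8-byte alignment test if the base address from the addressing records is itself a multiple of 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct HexRecord {
    uint8_t  length;     // number of data bytes in the record
    uint16_t address;    // 16-bit offset field of the record
    uint8_t  type;       // 0 = data, 1 = end of file, others = addressing records
    uint8_t  data[255];
};

static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Parse one Intel HEX line (":LLAAAATT<data>CC"). Returns false on bad syntax or checksum.
bool parse_hex_line(const char *line, HexRecord &rec) {
    const size_t n = std::strlen(line);
    if (n < 11 || line[0] != ':' || (n - 1) % 2 != 0) return false;
    const size_t count = (n - 1) / 2;            // bytes encoded after the ':'
    if (count > 260) return false;               // 255 data bytes + 5 overhead bytes
    uint8_t bytes[260];
    uint8_t sum = 0;
    for (size_t i = 0; i < count; ++i) {
        const int hi = hex_nibble(line[1 + 2 * i]);
        const int lo = hex_nibble(line[2 + 2 * i]);
        if (hi < 0 || lo < 0) return false;
        bytes[i] = static_cast<uint8_t>((hi << 4) | lo);
        sum = static_cast<uint8_t>(sum + bytes[i]);
    }
    if (sum != 0) return false;                  // all bytes incl. checksum sum to 0 mod 256
    if (count != static_cast<size_t>(bytes[0]) + 5) return false;  // length field must match
    rec.length  = bytes[0];
    rec.address = static_cast<uint16_t>((bytes[1] << 8) | bytes[2]);
    rec.type    = bytes[3];
    std::memcpy(rec.data, &bytes[4], rec.length);
    return true;
}

// True if the record can be written directly as whole 64-bit phrases.
bool ok_for_phrase_write(const HexRecord &rec) {
    if (rec.type != 0) return true;              // only data records are written to flash
    return rec.length % 8 == 0 && rec.address % 8 == 0;
}
```

A record such as :0400080001020304EA parses correctly but fails ok_for_phrase_write() because it carries only 4 data bytes.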
 
Continuing to work on Flasher for T35/T36. As dahollen mentioned, flash operations are not reliable with high-speed mode enabled. The NXP reference manuals are clear that high-speed mode is not supported for flash operations. She added hsrun_disable/enable to all of the flash routines, and also delays, with a comment about the delays being necessary to prevent hanging the serial monitor, but I'm finding problems with this approach. Flash writes work some of the time, but not all of the time, and serial file transfer via USB doesn't work at all. I tested the reliability of flash writes by (a) erasing upper flash, then (b) copying all of lower flash to upper flash via flash_write_phrase(), and (c) comparing lower and upper flash. This always fails with hsrun_disable/enable inside the flash functions, but works reliably with hsrun_disable/enable bracketing the process. This also eliminated problems with the file transfer and serial monitor. The next issue that needs to be solved is that records in the intel hex files seem to be guaranteed to be 32-bit aligned and at least 32 bits in size, but are not always 64-bit aligned or 64 bits in size. This means the approach of writing to flash for each hex line works for T32, but not for T35/T36. There has to be some additional buffering and filling before writing to flash.
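
The structure of the reliability test, with the high-speed-run switch bracketing the whole process instead of each write, can be sketched on a PC as follows. Everything here is a stand-in: the fake_ functions are hypothetical substitutes for kinetis_hsrun_disable(), kinetis_hsrun_enable() and flash_write_phrase(), and "flash" is just a RAM buffer, so the sketch shows the control flow only and says nothing about real timing or reliability.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-ins so the sketch runs on a PC; they only count how often the mode is switched.
static int hsrun_transitions = 0;
static void fake_hsrun_disable() { ++hsrun_transitions; }
static void fake_hsrun_enable()  { ++hsrun_transitions; }

// Hypothetical stand-in for flash_write_phrase(): copies 8 bytes into a RAM "flash" image.
static void fake_write_phrase(std::vector<uint8_t> &flash, uint32_t address, const uint8_t *src) {
    assert(address % 8 == 0 && address + 8 <= flash.size());
    std::memcpy(&flash[address], src, 8);
}

// (a) erase upper half, (b) copy lower half to upper half, (c) compare.
// HSRUN is switched once before and once after the whole process.
bool copy_lower_to_upper_and_verify(std::vector<uint8_t> &flash) {
    const uint32_t half = static_cast<uint32_t>(flash.size() / 2);
    assert(half > 0 && half % 8 == 0);
    fake_hsrun_disable();                                    // once, before any flash operation
    std::memset(&flash[half], 0xFF, half);                   // (a) simulated erase
    for (uint32_t offset = 0; offset < half; offset += 8) {  // (b) copy phrase by phrase
        uint8_t phrase[8];
        std::memcpy(phrase, &flash[offset], 8);
        fake_write_phrase(flash, half + offset, phrase);
    }
    fake_hsrun_enable();                                     // once, after the last operation
    return std::memcmp(&flash[0], &flash[half], half) == 0;  // (c) compare the halves
}
```

With the switch inside each flash function, the counter would grow by two per phrase. With bracketing it is two for the whole transfer.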
 
The attached file Flasher3a.zip contains an update to jonr's "flasher" for T32, with support for T35 and T36. The primary change relative to flasher is addition of a flash_phrase() function to write to flash in 64-bit "phrases" for T35/T36. After a lot of testing with dahollen's flash routines for T35/T36, I based flash_phrase() on jonr's flash_word(), with modifications for 64-bit, and added dahollen's logic for clearing the LMEM code cache for T36. The clearing of the FMC and code cache are necessary for the immediate read-back done within flash_word() and flash_phrase(), and it's included in flash_erase(), too. I didn't include it in this program, but I have confirmed that program_check() can be used to verify flash writes without clearing the caches.

High-speed mode must be disabled to do flash operations on T35/T36, and I think should also be disabled for T32. I tried using dahollen's flash routines, which perform hsrun_disable/enable for each flash write/erase, but I found that was not reliable when looping rapidly through a large number of flash writes. I'm not sure if this is related to having high-speed run disabled, but I found that the hex file transfer was only reliable if the target Teensy writes to standard output as each Intel hex record is received, so you'll see all of the hex records echoed to whatever terminal you use to send the file. I'm using TeraTerm on Windows 7/10.

All Teensy intel hex files seem to contain records that are not 64-bit aligned, so they are okay for T32, but not for T35/T36 with the existing logic to parse/write each hex line as it is received. I plan to add some buffer/fill logic so that any Teensy hex file can be uploaded, but for test purposes I have included hex files for a blink program for T32/T35/T36 that are all 64-bit-aligned. To test, build and load Flasher3a to T32/T35/T36, and then send the appropriate BlinkT3x.hex file. I used a free program "hex2bin" to load the 32-bit aligned hex files and write them back out as equivalent 64-bit aligned hex files.
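
The buffer/fill idea could take roughly the following shape. This is a sketch under assumptions, not the planned Flasher code: the names Phrase and PhraseAssembler are hypothetical, records are assumed to arrive in ascending address order, and unfilled bytes are set to 0xFF on the assumption that this matches erased flash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Phrase {
    uint32_t address;   // always a multiple of 8
    uint8_t  bytes[8];  // positions never supplied stay 0xFF
};

// Collects bytes from hex records and emits whole 64-bit-aligned phrases.
class PhraseAssembler {
public:
    // Feed the data of one record; completed phrases are appended to out.
    void add(uint32_t address, const uint8_t *data, size_t length, std::vector<Phrase> &out) {
        for (size_t i = 0; i < length; ++i) {
            const uint32_t a = address + static_cast<uint32_t>(i);
            const uint32_t base = a & ~7u;                          // start of the containing phrase
            if (active_ && base != current_.address) flush(out);    // moved on to another phrase
            if (!active_) start(base);
            current_.bytes[a - base] = data[i];
        }
    }
    // Emit the pending phrase, if any. Call once after the last record.
    void flush(std::vector<Phrase> &out) {
        if (active_) { out.push_back(current_); active_ = false; }
    }
private:
    void start(uint32_t base) {
        current_.address = base;
        std::memset(current_.bytes, 0xFF, sizeof(current_.bytes));
        active_ = true;
    }
    Phrase current_{};
    bool active_ = false;
};
```

A 4-byte record at 0x1000 followed by a 12-byte record at 0x1004 comes out as two full phrases at 0x1000 and 0x1008, each of which could then be handed to a 64-bit flash write.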
 

Attachments

  • Flasher3a.zip
    45.5 KB · Views: 141
Indeed probably best to compile any T_3.6 flash writing code at 120 MHz to avoid any issues with Flash write at HSRUN high voltage - or confusion in running at altered speed.

There are some devices that use the common clock so it might disturb I/O.
 
Yes, or at least make it clear that firmware upload requires a "shutdown" of the application. Can you comment on whether disabling HSRUN might affect file transfer via the USB serial?
 
Not sure. It's been some time since I did much with that during the T_3.6 beta, but it would be an easy test. Start a sketch compiled at a faster speed, and in setup() with Serial active, do a .print(), then call kinetis_hsrun_disable() and do another set of prints, perhaps with Serial.read() as well.

Had not thought about this upload being just a subroutine within a running sketch - so build at 120 MHz not a good option :(

Indeed nothing else should expect to or try to run or work during the flashing process once it is determined that new code is to be loaded.
 
Flasher3b.zip is attached. It improves on Flasher3a and has been 100% reliable over many updates of T32/T35/T36. I was having trouble with T3.6, and after a lot of trial and error, added code to disable the LMEM code cache rather than invalidate it after every flash write. For some reason, this makes the transfer of the hex file more reliable, perhaps simply because it's a little slower? The result is that the flash functions in Flasher3b are very similar to those in jonr's original flasher, with the addition of flash_phrase() for 64-bit writes for T35/T36. The other important changes relative to the original flasher are disabling hsrun, which is necessary for writing to flash on all T3.x, disabling the LMEM code cache for T3.6, and modifying several functions to handle data in 32-bit (T3.2) or 64-bit (T3.5/T3.6) units as appropriate.

Note that hex files must have all records 64-bit aligned for T35/T36. Teensyduino's hex files are 32-bit-aligned and can be converted to 64-bit-aligned with program IntelHex.exe included in the ZIP. Now that the upload seems to be very reliable, I plan to implement a packetized transfer of some type. I'll also try to modify the hex line parsing to handle 32-bit aligned TeensyDuino hex files directly for all T3.x.
 

Attachments

  • Flasher3b.zip
    244.1 KB · Views: 196