Targeting Teensy 4.1 uses 5x as much dynamic memory as Teensy 3.6

Status
Not open for further replies.
@Defragster: Do you have any idea why my alternative linkage wasn't wanted?
It could have been one click to get Johan's program working. A sample player does not need extremely high speed.

Edit: Do we have a WIKI page about FLASHMEM/PROGMEM and memory layout?
 
Do you have any idea why my alternative linkage wasn't wanted?

Please don't take this so personally Frank.

I hope you can understand the timing wasn't ideal. We've been struggling to keep PJRC running since the pandemic hit, and during that time we released Teensy 4.1 (which had been mostly designed before February - but actually doing a commercial release is always the other 90% of the work). Substantial changes to the linker file were not desired at that time, because of the new product and difficulty just keeping the business going. Supporting a 2nd memory model was (and still is) anticipated to bring many small issues, likely increasing the tech support load, and also create a need for rewriting the web pages, which today only just barely manage to document the 1 memory model we're supporting. We just barely managed to get Teensy 4.1 released in May, and months later I *still* haven't managed to fully create the back side of the pinout card. It's simply the non-ideal reality of where we're at today.

Long-term, offering a RAM-conserving memory model does make sense. Maybe we'll do it in 1.54, but more likely it will go into 1.55 or 1.56.

Please, I want you to understand this sort of alternative memory model comes with a substantial support cost. It's more like a free puppy than a free beer! It does have a substantial benefit and it is worthwhile to implement. But we're limited on how fast things can happen, especially with the pandemic forcing PJRC to run with 1 less employee. Robin & I are working long hours every day just to keep everything going. I'm also trying to prioritize the long-needed File & FS addition, and SdFat & LittleFS, as the main development goal for version 1.54. There are only so many high-risk software changes we can make under these not-so-ideal conditions (and a memory model change is high risk - very likely to expose subtle bugs or other issues in the many libraries we support). Even just taking the time to write forum messages like this one is a challenge right now.
 
C:\Users\johan\AppData\Local\Temp\arduino_build_903646\sketch\src\samples\bpb606bd04.cpp:6:1: error: 'PROGMEM' does not name a type
PROGMEM const unsigned int bpb606bd04[10561] = {
^

Did you include Arduino.h? It's only automatically included for .ino files.

You're compiling a .cpp file. Like all special names & symbols which aren't defined on the command line, you need to include the appropriate header which defines the feature you wish to use.
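
A minimal sketch of what such a sample file could look like once the header is included. The names sampleTable and sampleAt are made up for illustration, the table is a 4-entry placeholder rather than the real 10561-entry array, and the empty PROGMEM fallback is only an assumption so the file also builds with a desktop compiler:

```cpp
// Sample data in its own .cpp file (not an .ino, so nothing is included for us).
#if defined(ARDUINO)
#include <Arduino.h>   // defines PROGMEM; only .ino files get this automatically
#endif
#include <stddef.h>

#ifndef PROGMEM
#define PROGMEM        // assumption: empty fallback for non-Arduino builds
#endif

// Placeholder contents; the array from the error message has 10561 entries.
PROGMEM const unsigned int sampleTable[4] = {0x0000, 0x1234, 0x5678, 0x9ABC};

// Hypothetical accessor so other files can read the table.
unsigned int sampleAt(size_t index) {
    return sampleTable[index];
}
```

On Teensy the table can be read with a plain array access, as sampleAt does here.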
 
@luni the best ref to date is the pjrc.com/store/teensy40.html - and the thread that it came from - though it is long and spread out.

A functional example or two would be great.

Paul regularly uses FLASHMEM in the cores for CODE - PROGMEM still works for DATA - but as noted, outside of a sketch .ino file (a header, a .cpp or other file) the file has to include Arduino.h to know what they are.

And FLASHMEM was created because PROGMEM on CODE conflicted in the linker when PROGMEM was used on data in the same unit.

And const or static alone are not enough to get stuff kept on FLASH - it takes those linker segmentation commands.
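
Here is a small sketch of how the two keywords are typically split, assuming the Teensy 4.x core macros. The names lookup and sumLookup are invented for the example, and the empty fallback defines are an assumption so it also compiles where the macros do not exist:

```cpp
#if defined(ARDUINO)
#include <Arduino.h>   // provides PROGMEM and FLASHMEM on Teensy 4.x
#endif
#include <stdint.h>

// Assumption: empty fallbacks for builds where the core does not define these.
#ifndef PROGMEM
#define PROGMEM
#endif
#ifndef FLASHMEM
#define FLASHMEM
#endif

// DATA: PROGMEM asks the linker to leave the table in flash.
PROGMEM const uint16_t lookup[4] = {10, 20, 30, 40};

// CODE: FLASHMEM asks the linker to leave the function in flash
// instead of copying it to fast RAM at startup.
FLASHMEM uint32_t sumLookup() {
    uint32_t total = 0;
    for (uint16_t value : lookup) {
        total += value;
    }
    return total;
}
```

This kind of split suits rarely used code such as setup or configuration routines, where the slower flash access should not matter much.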
 
@Paul, this is not the right place or medium for an answer to you. Just this much: I think you sometimes underestimate the users here. I deliberately don't mention any names here because I might forget someone. Some of them spend an incredible amount of time here in the forum, writing code or debugging. Or write a wiki article or just delete spam. They catch a lot of the work. They do it because they enjoy it.
 
Yes, that stuff about the memory keywords probably needs to become a dedicated page with examples. Also needed is info about using const with pointers.
 
I have not included Arduino.h, I will try that, thanks. However, I still cannot understand how 8Mbyte Flash is insufficient when the Flash in T3.5 was enough to load my samples?

It is sufficient. You just don't use it* :)



*correctly


The compiler tries to copy your arrays to the RAM. With PROGMEM you tell it: Don't do that, please.
On Teensy 3.x this was the default.
 
The reason is because the linker scripts are designed differently on Teensy 4 than they were on Teensy 3. As Frank mentioned, the reason for the different design is memory performance.

To explain a bit more, both have flash which is slower than the CPU. Both also have cache / buffers between the flash and CPU. There is no extra delay in many cases where data is in the cache. But the cost of a cache miss is very different. On Teensy 3.5 the worst case is waiting only 5 cycles. On Teensy 4.x, a cache miss can take hundreds of cycles, because the flash memory is external and accessed over only a 4-bit interface running at a relatively slow clock (compared to 600 MHz inside the chip).

Because of the huge difference in cache miss performance, on Teensy 4.x the linker script is designed to put "const" variables in the fast DTCM RAM. On Teensy 3.x, "const" puts the variables in flash. To get the same behavior on Teensy 4.x, you need to add "PROGMEM", as we've tried to explain.

There is indeed a good reason for this inconsistency between the boards. It's one of so many reasons why these new boards are called version 4, rather than Teensy 3.7 & 3.8. Many things about the memory architecture are very different from the older Teensy 3.x boards. We try to make most code compatible, but this situation where PROGMEM was optional on 3.x but required on 4.x is one of those places where the software support differs, because the hardware has very different performance trade-offs.
 

I started a WIKI article about this stuff here: https://github.com/TeensyUser/doc/wiki/Memory-Mapping. It currently lives in the Basics section. It is still a bit sketchy but good enough for a first review.

While I wrote the article I thought a simple tool to do experiments with various memory locations and attributes might be fun. Here are a few examples of what you can do with it. (Please note, it currently only works with T3.x. I'll extend it to T4.x when I find some time):

Code:
int i;
const double x = 42;  // <== constant

void setup(){
    static int k;

    while (!Serial) {}
    printMemoryInfo(i);
    printMemoryInfo(x);
    printMemoryInfo(k);
}

void loop(){
}

Which prints:
Code:
i
  Start address: 0x1FFF'11E4
  End address:   0x1FFF'11E7
  Size:          4 Bytes
  Location:      RAM (not initialized)

x
  Start address: 0x0001'08C8
  End address:   0x0001'08CF
  Size:          8 Bytes
  Location:      FLASH

k
  Start address: 0x1FFF'11E8
  End address:   0x1FFF'11EB
  Size:          4 Bytes
  Location:      RAM (not initialized)
More examples here: https://github.com/TeensyUser/doc/wiki/Memory-Mapping#experiments-with-the-memorytool
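
For readers wondering how a "Location" line like the ones above could be derived, here is a rough sketch of the idea. It is not the code of the tool. The region bounds are my assumption for a Teensy 3.6 (1 MB flash starting at 0, 256 KB RAM starting at 0x1FFF0000), and classifyAddress is a made-up helper. It does not try to distinguish initialized from uninitialized RAM the way the tool does:

```cpp
#include <stdint.h>

enum class Location { Flash, Ram, Unknown };

struct Region {
    uint32_t start;   // first address of the region
    uint32_t end;     // last address of the region (inclusive)
    Location where;
};

// Assumed Teensy 3.6 memory map; check the linker script of your board.
static const Region kRegions[] = {
    {0x00000000u, 0x000FFFFFu, Location::Flash},
    {0x1FFF0000u, 0x2002FFFFu, Location::Ram},
};

// Hypothetical helper: report which region an address falls into.
Location classifyAddress(uint32_t address) {
    for (const Region& region : kRegions) {
        if (address >= region.start && address <= region.end) {
            return region.where;
        }
    }
    return Location::Unknown;
}
```

With these bounds the addresses printed above for i and k fall into the RAM region, and the address of x falls into FLASH.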


And const or static alone are not enough to get stuff kept on FLASH - it takes those linker segmentation commands.
I wasn't able to reproduce that. Can you give an example where a const variable doesn't end up in FLASH?
 
Great tool, luni :)
You may want to take a look at the .sym file, that gets generated on a build, too.
(In the wiki there is a page which describes a way for a better *.sym generation)

On Teensy 3, consts are located in flash by default. This is the reason why PROGMEM, for the case above, was not needed before Teensy 4.
With a look at the linker files you can find many interesting details.
For example, the interrupt table was located in the flash years ago. Then I suggested copying it to the RAM, where it is now. This was an unusual concept (for Arduino) at that time.
 

Sure, when you added the info about nm to the WIKI some months ago, I added a call to it in VisualTeensy. I use it quite often to check things.
 
Just saw this coming back for another post. Below is confirmation that what Paul said in his posts is correct - the Wiki should get an update:

Using ",--print-memory-usage" in boards.txt:
teensy41.build.flags.ld=-Wl,--print-memory-usage,--gc-sections,--relax "-T{build.core.path}/imxrt1062_t41.ld"
Code:
// BEFORE
Memory region         Used Size  Region Size  %age Used
            ITCM:         64 KB       512 KB     12.50%
            DTCM:       17088 B       512 KB      3.26%
             RAM:       12384 B       512 KB      2.36%
           FLASH:       88356 B      7936 KB      1.09%
            ERAM:          0 GB        16 MB      0.00%

// Added :: const uint32_t myArr[30000]={0};   (DTCM and FLASH both grow)
Memory region         Used Size  Region Size  %age Used
            ITCM:         64 KB       512 KB     12.50%
            DTCM:      139968 B       512 KB     26.70%
             RAM:       12384 B       512 KB      2.36%
           FLASH:      208548 B      7936 KB      2.57%
            ERAM:          0 GB        16 MB      0.00%

// Edited to :: PROGMEM const uint32_t myArr[30000]={0};   (DTCM back to the BEFORE value)
Memory region         Used Size  Region Size  %age Used
            ITCM:         64 KB       512 KB     12.50%
            DTCM:       17088 B       512 KB      3.26%
             RAM:       12384 B       512 KB      2.36%
           FLASH:      208548 B      7936 KB      2.57%
            ERAM:          0 GB        16 MB      0.00%

Added this code in setup so the compiler didn't optimize the allocation away:
Code:
	int vvv=0;
	for ( int uuu=0; uuu<30000; uuu++) {
		vvv+=myArr[uuu];
	}
	Serial.println(vvv);
	Serial.println(myArr[29000]);
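
As a quick sanity check of the numbers in the report above, the growth can be compared with the raw size of the array. The constant names are made up for this sketch. The differences come out a bit larger than the array itself. I assume that is padding or alignment plus the added loop code, but I have not verified the cause:

```cpp
#include <stddef.h>
#include <stdint.h>

const uint32_t myArr[30000] = {0};

// Raw size of the array: 30000 elements of 4 bytes each.
constexpr size_t kArrayBytes = sizeof(myArr);
static_assert(kArrayBytes == 120000, "30000 * sizeof(uint32_t)");

// Differences between the BEFORE report and the plain const report.
constexpr size_t kDtcmGrowth  = 139968 - 17088;   // bytes added to DTCM
constexpr size_t kFlashGrowth = 208548 - 88356;   // bytes added to FLASH

// Both regions grow by at least the size of the array.
static_assert(kDtcmGrowth >= kArrayBytes, "DTCM holds the whole array");
static_assert(kFlashGrowth >= kArrayBytes, "FLASH holds the initial values");
```

With PROGMEM the FLASH growth stays the same while DTCM returns to its BEFORE value, which matches the array living only in flash.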
 
Mea culpa, as so often, I got distracted and never finalized this article. I'll add the missing info about PROGMEM asap.
 
Cool, just came back here myself and saw there was a question.

That ',--print-memory-usage' is nice to have turned on, I created a boards.local.txt to keep it around. I like the IMXRT-size thing - but not everyone can do that so easily.
 
Great!

Should we mention the cache and the three core cache-functions from imxrt.h?
arm_dcache_flush();
arm_dcache_delete();
arm_dcache_flush_delete();
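
If it helps the wiki page, here is a rough sketch of how these calls are commonly used around a DMA buffer on Teensy 4.x. Each function takes an address and a size in bytes. The buffer and the two helper functions are invented for the example, and the empty stand-ins in the #else branch are only an assumption so the sketch compiles off-target; they do nothing:

```cpp
#include <stddef.h>
#include <stdint.h>

#if defined(ARDUINO) && defined(__IMXRT1062__)
#include <Arduino.h>   // pulls in imxrt.h with the arm_dcache_* functions
#else
// Off-target stand-ins (assumption), no cache to maintain here.
#define DMAMEM
static inline void arm_dcache_flush(void*, uint32_t) {}
static inline void arm_dcache_delete(void*, uint32_t) {}
static inline void arm_dcache_flush_delete(void*, uint32_t) {}
#endif

// Buffer in the cached RAM2 region, aligned to the 32-byte cache line size.
DMAMEM static uint8_t dmaBuffer[64] __attribute__((aligned(32)));

// Before DMA reads the buffer: write the CPU's cached changes out to memory.
void prepareForDmaRead(uint8_t fill) {
    for (uint8_t& byte : dmaBuffer) {
        byte = fill;
    }
    arm_dcache_flush(dmaBuffer, sizeof(dmaBuffer));
}

// Before DMA writes the buffer: drop cached lines so the CPU
// later reads what the DMA stored, not stale cache contents.
void prepareForDmaWrite() {
    arm_dcache_delete(dmaBuffer, sizeof(dmaBuffer));
}
```

arm_dcache_flush_delete() combines both steps. Variables in DTCM are not cached, so this should only matter for RAM2, flash and external memory.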
 