T4.0 Memory - trying to make sense of the different regions

KurtE

Senior Member+
While trying to debug some issues, where a program would cause USB to not work in cases, but did in others, I decided to try to understand more on the memory organization of new Teensy T4.0.
I put a lot of this up on the Teensy 4 First beta thread, maybe starting with the post: https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=213113&viewfull=1#post213113

Also during this time frame, it became clear we probably need to enhance the data reporting at the end of the builds. For example, I built a sample application containing the binary data for three full-size bitmaps to show on an ILI9341 display, and by default the build tries to put all of that data in the DTCM segment. The build report said I had used under 50% of memory, but in actuality all of it was trying to fit into the DTCM segment, and the sketch failed to load.

It was suggested that others might find some of this useful, especially if they did not have to dig through something like 170 pages of a forum thread. Hopefully over time we can clean up these descriptions and maybe transfer some of this to a more appropriate spot like a Wiki or... Until then, those with moderator privileges, feel free to clean up some of this... Including probably removing some of this intro:

During the T4 beta, I kept seeing and reading about different terms like FlexRam, ITCM, DTCM, OCRAM, ...? Sometimes I hate 3 or 4 letter acronyms! What do all of these mean, and what impact does each of them have on me:

For example the product page says:

1024K RAM (512K is tightly coupled)
2048K Flash (64K reserved for recovery & EEPROM emulation)

So again what does this mean? I know I can search the web and the like and I have...

First: 1024KB of RAM with 512KB tightly coupled: The memory of the T4 is divided into two main pieces, both of which are 512KB in size.

FlexRam:
The first part, which the documents call FlexRam, has 16 banks of 32KB each. Each of these banks can be configured as one of the types ITCM, DTCM, OCRAM, or not used. In our setup we only use the types ITCM (Instruction Tightly Coupled Memory) and DTCM (Data Tightly Coupled Memory). I will describe each of these in more detail below. Basically, the build process takes some or all of your code and allocates enough 32KB banks to hold it, and the startup code copies that code into those banks. The remaining FlexRam banks are marked as DTCM. So, for example, if you have more than 32KB and no more than 64KB of code, it will take 2 banks for ITCM and leave you with 14 banks for data.
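
To make the bank arithmetic concrete, here is a minimal sketch of the rule just described. It is not the actual build tool: the names kBankBytes, kBankCount, itcmBanksFor and dtcmBytesFor are made up for illustration, and it assumes a sketch with no ITCM code at all would take zero banks.

```cpp
#include <cassert>
#include <cstdint>

// FlexRam geometry as described above: 16 banks of 32 KB each (512 KB total).
const uint32_t kBankBytes = 32 * 1024;
const uint32_t kBankCount = 16;

// Number of 32 KB banks needed to hold `codeBytes` of ITCM code (rounded up).
inline uint32_t itcmBanksFor(uint32_t codeBytes) {
  uint32_t banks = (codeBytes + kBankBytes - 1) / kBankBytes;
  assert(banks <= kBankCount);  // more than 512 KB of code cannot fit
  return banks;
}

// Bytes left over for DTCM (globals plus stack) once the ITCM banks are taken.
inline uint32_t dtcmBytesFor(uint32_t codeBytes) {
  return (kBankCount - itcmBanksFor(codeBytes)) * kBankBytes;
}
```

With these, 40000 bytes of code would take 2 banks and leave 14 banks (448 KB) for data, which is the case described above.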

OCRAM: On Chip RAM
This is the second 512KB of memory.

1984KB Flash memory (the 2048KB total minus the 64KB reserved) - This is where your program is stored (maybe). As I mentioned above about ITCM, a lot of your code may be copied into ITCM at startup.

Now to describe some of these sections in more detail. During the early part of the T4 beta, I believe it was @FrankB who first developed a tool that you could hook into the build process (through an updated platform.txt) and that gives more information than the standard build does. I know others have also done parts of this, like @mjs513, @defragster, ... Over the last couple of days I decided to try to update it from the T4B1 (IMXRT 1052) to the current T4 (IMXRT 1062). It is still a work in progress: I want to clean up some of the output and add some error checking, so that, for example, if you run out of room in DTCM the tool reports this and returns an error status.

I am not sure yet, if I should post the code again, or put into github or if someone already has a github project... But here is an example output, for a real simple sketch which can blink any pin...

Code:
cmd /c "D:\\arduino-1.8.9\\hardware\\teensy\\..\\tools\\arm\\bin\\arm-none-eabi-gcc-nm -n C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_683341\\Blink_any_pin.ino.elf | C:\\Users\\kurte\\source\\repos\\imxrt-size\\Debug\\imxrt-size.exe"
flexRam Config : aaaaaaab
ITCM :  22816 B	(69.63% of   32 KB)
DTCM :  12992 B	( 2.64% of  480 KB)
Stack Size : 478528
OCRAM:      0 B	( 0.00% of  512 KB)
Flash:  32672 B	( 1.61% of 1984 KB)
"D:\\arduino-1.8.9\\hardware\\teensy/../tools/arm/bin/arm-none-eabi-size" -A "C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_683341/Blink_any_pin.ino.elf"
Sketch uses 32672 bytes (1%) of program storage space. Maximum is 2031616 bytes.
Global variables use 35808 bytes (3%) of dynamic memory, leaving 1012768 bytes for local variables. Maximum is 1048576 bytes.

Now suppose I add one simple line like: Serial.println("Defaults to pin 13");
These numbers change:
Code:
flexRam Config : aaaaaaab
ITCM :  22880 B	(69.82% of   32 KB)
DTCM :  12992 B	( 2.64% of  480 KB)
Stack Size : 478528
OCRAM:      0 B	( 0.00% of  512 KB)
Flash:  32752 B	( 1.61% of 1984 KB)

So again what does this all imply? What are the differences between these sections and how do I control where things are placed?
Code:
Note: A lot of this is very different than it was on T3.x

As mentioned earlier ITCM+DTCM=512KB, which is controlled by the flexRam config register...
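
As a rough illustration of how that register value maps to banks: the config value the tool prints (for example aaaaaaab) holds one 2-bit field per 32KB bank. Below is a minimal decoder, assuming the field encoding 0 = not used, 1 = OCRAM, 2 = DTCM, 3 = ITCM. The struct and function names are mine, and the encoder is only a guess that reproduces the values seen in this thread (ITCM banks first, everything else DTCM).

```cpp
#include <cassert>
#include <cstdint>

struct FlexRamSplit {
  uint32_t itcmBytes;
  uint32_t dtcmBytes;
  uint32_t ocramBytes;
};

// Decode a FlexRam bank config word: 16 fields of 2 bits, one per 32 KB bank.
inline FlexRamSplit decodeFlexRamConfig(uint32_t config) {
  FlexRamSplit split = {0, 0, 0};
  for (int bank = 0; bank < 16; ++bank) {
    switch ((config >> (2 * bank)) & 0x3) {
      case 0x1: split.ocramBytes += 32768; break;
      case 0x2: split.dtcmBytes += 32768; break;
      case 0x3: split.itcmBytes += 32768; break;
      default: break;  // 0x0: bank not used
    }
  }
  return split;
}

// Assumed inverse: the lowest `itcmBanks` banks are ITCM, the rest DTCM.
inline uint32_t encodeFlexRamConfig(uint32_t itcmBanks) {
  assert(itcmBanks <= 16);
  uint64_t itcmMask = (1ULL << (2 * itcmBanks)) - 1;  // low fields become 0b11
  return 0xAAAAAAAAu | static_cast<uint32_t>(itcmMask);
}
```

Decoding aaaaaaab this way gives one ITCM bank (32 KB) and fifteen DTCM banks (480 KB).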

ARM Memory Ranges:
If you look at the address of a variable or function pointer, the address gives you a hint about what type of memory it is in. That is:
Code:
Chapter 2: Shows the Arm Platform Memory map: Things like:
0-0x7FFFF - ITCM (512KB)
0x20000000 - 0x2007FFFF - DTCM (512KB)
0x20200000 - 0x2027FFFF - OCRAM2 (512KB)
0x60000000 - 0x6FFFFFFF - FLEXSPI ...
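
A small helper can turn those ranges into a lookup. This is only a sketch built from the four ranges listed above; the enum and function names are invented here, and anything outside those ranges is simply reported as Other.

```cpp
#include <cstdint>

enum class MemRegion { ITCM, DTCM, OCRAM, Flash, Other };

// Map an address to the region it falls in, using the ranges listed above.
constexpr MemRegion regionOf(uint32_t addr) {
  return (addr <= 0x0007FFFFu)                          ? MemRegion::ITCM
       : (addr >= 0x20000000u && addr <= 0x2007FFFFu)   ? MemRegion::DTCM
       : (addr >= 0x20200000u && addr <= 0x2027FFFFu)   ? MemRegion::OCRAM
       : (addr >= 0x60000000u && addr <= 0x6FFFFFFFu)   ? MemRegion::Flash
                                                        : MemRegion::Other;
}
```

On the Teensy you could pass something like (uint32_t)&someVariable to it to see where a variable or function ended up.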

### Code ###: What goes into ITCM versus what goes into Flash?
I believe the simple answer is that, by default, all code will try to be placed into the ITCM section. As you can see, just adding the Serial.println increased this section's size.

I believe the way to leave code in the flash memory is to use the keyword: FLASHMEM
Yes, the T4 is different from the T3.x and TLC in that FLASHMEM means something again. So, for example, if I change my sketch's setup function to be defined like:

Code:
void FLASHMEM setup()  {
  // Blink any pin.  Note: I put pin 13 as input to see if we can
  // jumper to it to see if we can find the pin...
  while (!Serial && millis() < 5000);
  DBGSerial.begin(115200);
  delay (250);
  DBGSerial.println("Find Pin by blinking");
  DBGSerial.println("Enter pin number to blink");
  DBGSerial.println("Defaults to pin 13");
  pinMode(13, OUTPUT);
}
The size of ITCM shrank
Code:
flexRam Config : aaaaaaab
ITCM :  22752 B	(69.43% of   32 KB)
DTCM :  12992 B	( 2.64% of  480 KB)
Stack Size : 478528
OCRAM:      0 B	( 0.00% of  512 KB)
Flash:  32768 B	( 1.61% of 1984 KB)
Not by much, but then again there is not much code there.

### DATA ### What goes into DTCM, versus OCRAM, versus stays in Flash?

DTCM - By default I believe just about everything goes here? This includes all of your global variables, both initialized and uninitialized.

Unlike the T3.x, variables such as arrays that are defined as const, will not stay in Flash, but instead will be copied at startup time into DTCM. So some programs that for example work on T3.6 may run into issues of running out of RAM.

FLASH - as I mentioned under DTCM, const data is by default not left in Flash, but instead moved into DTCM. You can tell the system to leave some specific const structures in flash, by using the PROGMEM keyword. Like:

Code:
const unsigned short teensy40_front[76800] PROGMEM={...};

Also, I am not sure, but the earlier example might imply that strings you pass to things like Serial.println may also be left in Flash?

OCRAM - Again the other 512KB of memory...

So far I have found only two ways to use this memory. You can define a variable with the attribute: DMAMEM
Or you use malloc/new to allocate the memory.

So far I have not found any way to have a program put any initialized structures up in this region of memory.

The memory in the OCRAM section is configured as cached WBWA (write-back, write-allocate), which can really screw up DMA. That is, DMA operations talk to the underlying memory, whereas normal instructions go through the cache, and the two may or may not match.

It is bad enough using it for DMA buffers, and I am not sure how to get it to work with things like DMASettings. At least I did not get them to work at all, especially when replaceOnCompletion semantics are involved.
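
For plain DMA buffers, the usual workaround is explicit cache maintenance around each transfer. The sketch below assumes the arm_dcache_flush and arm_dcache_delete functions from the Teensy core (they come up later in this thread) and the 32-byte data cache line of the Cortex-M7. The helper names and the txBuf buffer are made up, and the Teensy-only part is compiled only when __IMXRT1062__ is defined.

```cpp
#include <cstdint>

// Cortex-M7 data cache lines are 32 bytes; cache maintenance works on whole lines.
constexpr uint32_t kCacheLineBytes = 32;

constexpr uint32_t lineAlignDown(uint32_t addr) {
  return addr & ~(kCacheLineBytes - 1);
}
constexpr uint32_t lineAlignUp(uint32_t addr) {
  return (addr + kCacheLineBytes - 1) & ~(kCacheLineBytes - 1);
}

// True when [addr, addr + size) starts and ends on line boundaries, so flushing
// or deleting it cannot touch neighbouring variables that share a cache line.
constexpr bool ownsWholeLines(uint32_t addr, uint32_t size) {
  return (addr % kCacheLineBytes) == 0 && (size % kCacheLineBytes) == 0;
}

#if defined(__IMXRT1062__)
#include <Arduino.h>

// A DMA buffer in OCRAM, aligned and sized to whole cache lines.
DMAMEM static uint8_t txBuf[512] __attribute__((aligned(32)));

void beforeDmaTransmit() {
  arm_dcache_flush(txBuf, sizeof(txBuf));   // write cached CPU data out to RAM
}
void beforeCpuReadsDmaData() {
  arm_dcache_delete(txBuf, sizeof(txBuf));  // drop cached copy so reads hit RAM
}
#endif
```

This does not solve the DMASettings case, it only covers simple transmit and receive buffers.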

Will add more soon, but tired of typing!

Again those with moderator access, feel free to correct, add, ...

Also let me know if there is some additional things I should add/remove/modify.

Thanks
Kurt

Update: Paul has added more details about the T4 memory layout on the T4 Product page:
I've added a "Memory Layout" section to the Teensy 4.0 product page.

Update: PROGMEM for code is now FLASHMEM, as having PROGMEM on both code and data causes link issues.
 
As per earlier request, I put my current version of imxrt-size up on github: https://github.com/KurtE/imxrt-size
I believe this includes the .exe for Windows. Sorry, I have not tried building a version for Linux or Mac...

Not sure if there are other versions up there or not and or what is the best location for such a tool...

I have made a few changes to the output since the earlier stuff.

There are probably cleaner ways to set this up, but currently I just added a line to the platform.txt file, similar to what I think @FrankB did earlier, that for my current setup looks like:
Code:
recipe.hooks.postbuild.4.pattern.windows=cmd /c "{runtime.hardware.path}\..\tools\arm\bin\arm-none-eabi-gcc-nm -n {build.path}\{build.project_name}.elf | D:\GITHUB\imxrt-size\Debug\imxrt-size.exe"

Note: the above line is probably screwing up some other error/warning message as you will see in my error case, where there were multiple versions of libraries...

Current output for example from compiling my Blink_any_pin program looks like:
Code:
cmd /c "D:\\arduino-1.8.9\\hardware\\teensy\\..\\tools\\arm\\bin\\arm-none-eabi-gcc-nm -n C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_683341\\Blink_any_pin.ino.elf | C:\\Users\\kurte\\source\\repos\\imxrt-size\\Debug\\imxrt-size.exe"

FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaab
    ITCM :  22752 B	(69.43% of   32 KB)
    DTCM :  12992 B	( 2.64% of  480 KB)
    Stack Size : 478528
OCRAM: 512KB

    DMAMEM:      0 B	( 0.00% of  512 KB)
    Free Heap Space: 524288 B	(100.00% of  512 KB)
Flash:  32768 B	( 1.61% of 1984 KB)
"D:\\arduino-1.8.9\\hardware\\teensy/../tools/arm/bin/arm-none-eabi-size" -A "C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_683341/Blink_any_pin.ino.elf"
Sketch uses 32768 bytes (1%) of program storage space. Maximum is 2031616 bytes.
Global variables use 35744 bytes (3%) of dynamic memory, leaving 1012832 bytes for local variables. Maximum is 1048576 bytes.

And now the program will error out if it finds that the FlexRam section ran out of space, like when I tried to put three images into main memory. Looks now like:
Code:
FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaaf
    ITCM :  35488 B	(54.15% of   64 KB)
    DTCM : 475840 B	(103.72% of  448 KB)
>>>>> Error FlexRAM Filled no room for Stack: -17088 <<<<<
OCRAM: 512KB
    DMAMEM:      0 B	( 0.00% of  512 KB)
    Available for Heap: 524288 B	(100.00% of  512 KB)
Flash: 507952 B	(25.00% of 1984 KB)
===warn ||| Multiple libraries were found for "{0}" ||| [ILI9341_t3.h]
 Used: C:\Users\kurte\Documents\Arduino\libraries\ILI9341_t3
 Not used: D:\arduino-1.8.9\hardware\teensy\avr\libraries\ILI9341_t3
Using library SPI at version 1.0 in folder: D:\arduino-1.8.9\hardware\teensy\avr\libraries\SPI 
Using library ILI9341_t3 at version 1.0 in folder: C:\Users\kurte\Documents\Arduino\libraries\ILI9341_t3 
exit status -1
Error compiling for board Teensy 4.0.
And as I mentioned the warning message for multiple libraries is messed up, but it does now show that I ran out of memory and aborts.
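
The check behind that error line can be sketched as follows. This is not the imxrt-size source, just the arithmetic it appears to do: round the ITCM usage up to whole 32KB banks, give the rest of the 512KB to DTCM, and see what is left for the stack. The function name is made up.

```cpp
#include <cassert>
#include <cstdint>

// Bytes left for the stack in DTCM; a negative result means FlexRam overflowed.
inline int32_t stackHeadroom(uint32_t itcmUsed, uint32_t dtcmUsed) {
  uint32_t itcmBanks = (itcmUsed + 32767u) / 32768u;  // whole 32 KB banks
  assert(itcmBanks <= 16);
  int32_t dtcmTotal = static_cast<int32_t>((16u - itcmBanks) * 32768u);
  return dtcmTotal - static_cast<int32_t>(dtcmUsed);
}
```

Feeding in the numbers from the failing build (ITCM 35488, DTCM 475840) gives -17088, the same value the tool reports.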
 
@KurtE

Thanks for putting this together. There were a couple of things I had to do,
  1. In your recipe I had to change the path for imxrt-size.exe - put it back the way Frank had it. Hoping Paul makes it permanent
  2. Also had to recompile for my Windows 10 x64 machine.

Once I did that it worked like a charm. Since I had the arducam shield sketch open I used it for my test case:
Code:
Opening Teensy Loader...

cmd /c "F:\\arduino-1.8.9\\hardware\\teensy\\..\\tools\\arm\\bin\\arm-none-eabi-gcc-nm -n C:\\Users\\Merli\\AppData\\Local\\Temp\\arduino_build_31145\\ArduCAM_Shield_V2_Camera_Playback.ino.elf | F:\\arduino-1.8.9\\hardware\\teensy\\..\\tools\\imxrt-size"

FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaab
    ITCM :  25664 B	(78.32% of   32 KB)
    DTCM :  17088 B	( 3.48% of  480 KB)
    Available for Stack: 474432
OCRAM: 512KB
    DMAMEM:      0 B	( 0.00% of  512 KB)
    Available for Heap: 524288 B	(100.00% of  512 KB)
Flash:  39120 B	( 1.93% of 1984 KB)
===info ||| Multiple libraries were found for "{0}" ||| [SD.h]
 
@KurtE - Very good indeed. Amazing how much added info comes from those few lines of code {after reading your posts :) }! Indeed that info does a great deal to explain the T4's memory.

I like the added DMAMEM line showing where that resides when used - would it help to add label for PROGMEM on the FLASH?

I copied the RAW CPP from github into my VS 2017 source file and it works - for my build I had to include the PCH.H or it wouldn't build.

Code:
FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaab
    ITCM :   6864 B	(20.95% of   32 KB)
    DTCM :   8896 B	( 1.81% of  480 KB)
    Available for Stack: 482624
OCRAM: 512KB
    DMAMEM:      0 B	( 0.00% of  512 KB)
    Available for Heap: 524288 B	(100.00% of  512 KB)
Flash:  12896 B	( 0.63% of 1984 KB)

As noted I have it working in TSET - though I suppose that needs an update. When I created my exe the underbar not hyphen was used:
"%arduino%\hardware\tools\arm\bin\arm-none-eabi-gcc-nm.exe" -n "%temp1%\%sketchname%.elf" | "%tools%\imxrt_size.exe"
 
I am not sure how interesting this part might be, but with uncannyEyes it appears that at least some of the issues we were running into are caused by uninitialized members in the st7735_t3 code, which you will only ever see if you do a new of the display class...

But while debugging some of this stuff, I did add to my version of the sketch a couple of functions, to help me debug some stuff:
Code:
// from the linker
//  extern unsigned long _stextload;
extern unsigned long _stext;
extern unsigned long _etext;
//  extern unsigned long _sdataload;
extern unsigned long _sdata;
extern unsigned long _edata;
extern unsigned long _sbss;
extern unsigned long _ebss;
//  extern unsigned long _flexram_bank_config;
extern unsigned long _estack;

void DumpMemoryInfo() {
#if defined(__IMXRT1062__)
  uint32_t flexram_config = IOMUXC_GPR_GPR17;
  Serial.printf("IOMUXC_GPR_GPR17:%x IOMUXC_GPR_GPR16:%x IOMUXC_GPR_GPR14:%x\n",
                flexram_config, IOMUXC_GPR_GPR16, IOMUXC_GPR_GPR14);
  Serial.printf("Initial Stack pointer: %x\n", &_estack);
  uint32_t dtcm_size = 0;
  uint32_t itcm_size = 0;
  for (; flexram_config; flexram_config >>= 2) {
    if ((flexram_config & 0x3) == 0x2) dtcm_size += 32768;
    else if ((flexram_config & 0x3) == 0x3) itcm_size += 32768;
  }
  Serial.printf("ITCM allocated: %u  DTCM allocated: %u\n", itcm_size, dtcm_size);
  Serial.printf("ITCM init range: %x - %x Count: %u\n", &_stext, &_etext, (uint32_t)&_etext - (uint32_t)&_stext);
  Serial.printf("DTCM init range: %x - %x Count: %u\n", &_sdata, &_edata, (uint32_t)&_edata - (uint32_t)&_sdata);
  Serial.printf("DTCM cleared range: %x - %x Count: %u\n", &_sbss, &_ebss, (uint32_t)&_ebss - (uint32_t)&_sbss);
  Serial.println("Now fill rest of DTCM with known pattern"); Serial.flush(); //
  // Guess of where it is safe to fill memory... Maybe address of last variable we have defined - some slop...
  for (uint32_t *pfill = (&_ebss + 32); pfill < (&itcm_size - 10); pfill++) {
    *pfill = 0x01020304;  // some random value
  }
#endif
}
void EstimateStackUsage() {
#if defined(__IMXRT1062__)
  uint32_t *pmem = (&_ebss + 32);
  while (*pmem == 0x01020304) pmem++;
  Serial.printf("Estimated max stack usage: %d\n", (uint32_t)&_estack-(uint32_t)pmem);
#endif
}
I call the first one early on in setup, and the second one whenever I print out how many frames per second are output...
Code:
IOMUXC_GPR_GPR17:aaaaaaaf IOMUXC_GPR_GPR16:7 IOMUXC_GPR_GPR14:aa0000
Initial Stack pointer: 20070000
ITCM allocated: 65536  DTCM allocated: 458752
ITCM init range: 0 - 9c70 Count: 40048
DTCM init range: 20000000 - 20001ad0 Count: 6864
DTCM cleared range: 20001ad0 - 200062c0 Count: 18416
Now fill rest of DTCM with known pattern

C:\Users\kurte\Documents\Arduino\uncannyEyes_async_st7735\uncannyEyes_async_st7735.ino Aug 24 2019 08:49:51
Init
Create display #0
Create display #1
ST7789_t3::init mode: 0
Init ST77xx display #0
Rotate
ST7789_t3::init mode: 0
Init ST77xx display #1
Rotate
done
Display logo
$0: Using Frame buffer
$1: Using Frame buffer
36
Estimated max stack usage: 1616
36
Estimated max stack usage: 1616
36
Estimated max stack usage: 1616
36
Estimated max stack usage: 1616
36
Estimated max stack usage: 1616

Note: the sketch has now run for maybe an hour and the estimated Max size is now up to 1696.

This information as well as the information I printed as part of this build:
Code:
FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaaf
    ITCM :  40048 B	(61.11% of   64 KB)
    DTCM :  25280 B	( 5.51% of  448 KB)
    Available for Stack: 433472
OCRAM: 512KB
    DMAMEM:      0 B	( 0.00% of  512 KB)
    Available for Heap: 524288 B	(100.00% of  512 KB)
Flash: 212352 B	(10.45% of 1984 KB)

Taken together, and if we assume the stack never needs more than about 2K, these imply we have over 400K of lower memory that we might want to make available for other uses.

For example, in the ST7789_t3 DMA update code, I currently have the displays malloc their frame buffer. Yes, I also have the option to allocate this myself and tell the display to use my own buffer... But if the user's code does not do that, maybe the library should give preference to having the frame buffer in DTCM, where you don't have DMA cache issues...

Likewise, I currently define a structure with some smaller buffers as well as DMASetting and DMAChannel structures, and I define a set of three static ones for this class on the off chance the user will do a new of this display class. It might be good if, again, we could simply allocate lower memory on the fly when we need it.

Maybe it does not need to be anything more than just allocate (i.e. maybe don't support free or realloc...)
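
Something like the following allocate-only ("bump") allocator is what I have in mind. It is only a sketch, not anything that exists in the core: the BumpArena class, the lowMemPool array and its 4096-byte size are all invented for illustration. The assumption is that a plain static array like this lands in DTCM by default, since it is not marked DMAMEM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Allocate-only arena: hands out aligned chunks and never frees them.
class BumpArena {
 public:
  BumpArena(uint8_t *base, size_t size) : base_(base), size_(size), used_(0) {}

  // Returns `bytes` of storage aligned to `align` (a power of two), or nullptr
  // when the arena is exhausted. There is deliberately no free() or realloc().
  void *allocate(size_t bytes, size_t align = 4) {
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t start = reinterpret_cast<uintptr_t>(base_) + used_;
    uintptr_t aligned = (start + align - 1) & ~(static_cast<uintptr_t>(align) - 1);
    size_t padding = static_cast<size_t>(aligned - start);
    size_t left = size_ - used_;
    if (padding > left || bytes > left - padding) return nullptr;
    used_ += padding + bytes;
    return reinterpret_cast<void *>(aligned);
  }

  size_t remaining() const { return size_ - used_; }

 private:
  uint8_t *base_;
  size_t size_;
  size_t used_;
};

// Hypothetical pool; on a Teensy 4.0 this would sit in DTCM (.bss) by default.
alignas(32) static uint8_t lowMemPool[4096];
static BumpArena lowMem(lowMemPool, sizeof(lowMemPool));
```

A library could then ask for its DMASetting-sized structures or small buffers with lowMem.allocate(size, 32) and fall back to malloc when it returns nullptr.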

Thoughts?
 
Question for self and others...

I remember during the beta cycle that you could mark either code or data as PROGMEM.

I showed that with data in an earlier example, but you can also do it with code:
Code:
uint8_t dtcm_array[10];
uint8_t DMAMEM ocram_array[10];
const uint8_t const_array[] = "ABCD";
//uint8_t const_progmem_array[] PROGMEM = "XYZ";

void PROGMEM function_in_upper_mem() {
  Serial.printf("Address function_in_upper_mem: %x\n", (uint32_t)&function_in_upper_mem);
}

void setup() {
  while (!Serial && millis() < 4000) ;
  Serial.begin(115200);
  uint8_t *heap_mem = (uint8_t *)malloc(16);
  pinMode(13, OUTPUT);

  Serial.printf("Address dtcm_array: %x\n", dtcm_array);
  Serial.printf("Address ocram_array: %x\n", ocram_array);
  Serial.printf("Address const_array: %x\n", const_array);
  //Serial.printf("Address const_progmem_array: %x\n", const_progmem_array);
  Serial.printf("Address heap_mem: %x\n", heap_mem);

  Serial.printf("Address setup: %x\n", (uint32_t)&setup);

  function_in_upper_mem();


}
void loop() {
  digitalWrite(13, !digitalRead(13));
  delay(250);
}
And it will show that the function
Code:
void PROGMEM function_in_upper_mem() {
  Serial.printf("Address function_in_upper_mem: %x\n", (uint32_t)&function_in_upper_mem);
}
Will show that it is indeed up in the other memory region and not copied into ITCM...

However it appears like you can not have both. That is if you uncomment the line: uint8_t const_progmem_array[] PROGMEM = "XYZ";

You will get a compiler error that it conflicts with the data...

Have we found a way to resolve this?
 
The only way to do it is to have two different names for the sections, one name for sections that contain functions, and another name for sections that contain data, and then modify the linker script so it links both adjacent to each other. This will obviously require having two macros.

I believe this is due to the data .sections not wanting to set the executable bit in the ELF section information, and wanting to set that bit for functions.
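
As a rough sketch of the two-macro idea: the macro names MY_FLASH_DATA and MY_FLASH_CODE and the section names below are placeholders I made up, not anything in the Teensy core, and on the Teensy the linker script would need matching entries to place both sections in flash. The table and function are only there to have something to place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two separate section names, so data and code never share one ELF section.
#if defined(__GNUC__) && defined(__ELF__)
#define MY_FLASH_DATA __attribute__((section(".my_flash_data")))
#define MY_FLASH_CODE __attribute__((section(".my_flash_code")))
#else
#define MY_FLASH_DATA  // other toolchains: macros expand to nothing
#define MY_FLASH_CODE
#endif

// Read-only data goes in the data section (no executable bit needed)...
MY_FLASH_DATA const uint16_t kGainTable[4] = {1, 2, 4, 8};

// ...and the function goes in the code section (executable bit set).
MY_FLASH_CODE uint32_t applyGain(uint32_t value, size_t step) {
  assert(step < 4);
  return value * kGainTable[step];
}
```

Using a single macro for both declarations is what makes the compiler complain, because one section cannot carry both sets of flags.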
 
Yes, FrankB discovered this during the T4 beta: using PROGMEM on data and code in the same compile unit caused a failure. He tried a counting macro to change the name progressively and that failed, maybe for the reason MichaelMeissner noted.

No alternative was ever implemented, but one seems to be needed.
 
Thanks @MichaelMeissner and @defragster - that is what I remembered as well, but was not sure if anyone had come up with any other ways to do it.

Like maybe a build option: that says leave all code up in this memory, except for those functions marked as FASTRUN.

Actually is there anything like Fast Run? Or are we sort of making all code run this way?
 
I've been considering making an alternate linker script which would default code and const variables into slow-but-cached flash. This and the one we have now need user friendly names. I've been thinking something like "Use RAM to improve speed" and "Save RAM for variables".

But before that, I guess we need another name like PROGMEM to be used on functions. Anyone care to suggest words to be forever commandeered.... and create thorny conflicts if anyone uses that word in libraries or programs?
 
That makes sense. I do think we should also consider adding something like the more detailed output, like I show above like in post #6 to the standard setup, as to help make it clear when you might actually be running out of specific types of memory. But as you mentioned we probably need better names.

I agree that you probably need a different term than PROGMEM for program space, hopefully someone will come up with good names for things. Names are not my best thing. I would probably tend to call it something like SLOWRUN but that might not be the best :lol:
 
Since you asked: how about just the corollary, FUNCMEM? I don't think I have ever seen that in any of the libraries I looked at. No, it's not all that much fun, but I'm too tired this morning. :)
 
'RAMming Speed' comes to mind for the FAST compile - but that wasn't the question.

For code to FLASH how about :: PROGFLASH
> of course PROGMEM - was generally to put Data to Flash, but that ship has sailed.

Can code run from upper 512KB? The DMAMEM area? Seems I saw that cannot be done.

Can there be something like a '#define UPPERRAM' that would allocate compile time global objects there?

Is there a fixed initial Stack size? Or a known minimum functional stack size? Users picturing 1024KB RAM can get a hanging Teensy at 49% RAM usage.

Would be nice to have ( perhaps verbose? ) display of imxrt-size like number in the build for memory map awareness.
 
This info greatly helped me understand why my rather large RAM memory project crashes and burns ..
I used static const prefixes to set up ROM images and static RAM buffers, but kept running out of RAM space (crash/hang).

Will try to fiddle with PROGMEM and DMAMEM to separate the spaces.
 
DMAMEM works fine, but PROGMEM keeps getting me in trouble. I declared some data blocks in separate header files (varying in size from a few kB to 128kB) as const static uint8_t datablock[] PROGMEM = { .... }; and sometimes the pointers return strange data. Add or remove a line here or there in the program and it changes. If it works, it works consistently, and if it does not, it also fails consistently, making fault finding very cumbersome.

I even changed from the Arduino IDE to PlatformIO, but it seems to use the Arduino framework for the Teensy.

Any other ways to make sure the data is put into flash and the pointer to it is given the right value?
 
Probably a dumb question, but I have a constant defined in a separate .cpp file as output from wav2sketch. It looks like: const unsigned int AudioSample[188161] = {.......}

If I try to use PROGMEM (because if I don't I'm getting a "`.data' will not fit in region `DTCM'"), I get "expected initializer before 'PROGMEM'" when the line looks like : const unsigned int AudioSample[188161] PROGMEM = {.......}

Obviously my syntax is messed up. What syntax or other things am I missing, and for that matter, what is the syntax for using the DMAMEM keyword?

Thank you to anyone that can point out what I'm doing wrong! (I'm using 1.8.10 with teensyDuino 1.48 with "Teensy 4.0" selected as the board)
 
This post: forum.pjrc.com/threads/57459-Errors-quot-not-within-region-DTCM

Suggests this presentation - not noted if it was effective:
Not sure what the samples look like for definition or size? But declaring them PROGMEM in the header files will keep them on FLASH, otherwise they will reside in RAM which the DTCM portion is some part of 512KB.

I found this in the tree so something like this should work:
Code:
PROGMEM
const unsigned int AudioSampleCashregister[5809] = {
// ...
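
To spell out the syntax for both keywords, here is a small self-contained sketch. The array names and sizes are invented. It assumes that including Arduino.h on the Teensy 4.0 defines PROGMEM and DMAMEM, and it defines them as empty when built anywhere else so the declarations can be tried on a PC. It also assumes PROGMEM data can be read directly, since the flash is memory mapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__IMXRT1062__)
#include <Arduino.h>  // on Teensy 4.0 this brings in PROGMEM and DMAMEM
#else
#define PROGMEM       // host build: the placement macros expand to nothing
#define DMAMEM
#endif

// Either position works once PROGMEM is actually defined in this file.
PROGMEM const uint32_t kSampleA[4] = {1, 2, 3, 4};
const uint32_t kSampleB[4] PROGMEM = {5, 6, 7, 8};

// DMAMEM only reserves space in OCRAM, so it takes no initializer.
DMAMEM uint8_t dmaScratch[64];

// The tables are read like any other array: sum the first `count` entries of each.
uint32_t sumSamples(size_t count) {
  assert(count <= 4);
  uint32_t sum = 0;
  for (size_t i = 0; i < count; ++i) sum += kSampleA[i] + kSampleB[i];
  return sum;
}
```

The "expected initializer before 'PROGMEM'" error is what you get when the macro is not defined in that file, so the compiler sees PROGMEM as an unknown word rather than an attribute.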
 
+1 for interest in this and how to place and access large arrays in Flash.

I'm using PlatformIO, verbose build categorises the memory usage as:
Code:
Memory Usage -> http://bit.ly/pio-memory-usage
DATA:    [===       ]  28.0% (used 293680 bytes from 1048576 bytes)
PROGRAM: [          ]   3.9% (used 79280 bytes from 2031616 bytes)
.pio/build/teensy40/firmware.elf  :
section                            size         addr
.text.progmem                     10176   1610612736
.text.itcm                        43280            0
.fini                                 4        43280
.ARM.exidx.text.__aeabi_atexit        8        43284
.data                             25824    536870912
.bss                              44512    536896736
.bss.dma                         180064    538968064
.debug_frame                       1180            0
.ARM.attributes                      46            0
.comment                            110            0
Total                            305204

I'm assuming in above that :
.text.progmem and .text.itcm are what ends up in FLASH
.data, .bss and bss.dma are all RAM,
- is that correct?

If using PROGMEM do we need to use pgm_read_word etc. to retrieve the data? (@gertk - is that what you are doing or are you trying to use the variables directly?), Cheers Paul
 
The compile fails with: 'PROGMEM' does not name a type

If the file is a .ino, you should not see this error. Please share the file (and all other required files) we can try to reproduce the problem.

If the file is a .c or .cpp, you need to add #include <avr/pgmspace.h>. Yes, I know that "avr" probably seems strange, but we're intentionally mimicking AVR's semantics, so that's the include you need to get PROGMEM defined.
 
Using this example: "...\hardware\teensy\avr\libraries\Audio\examples\Synthesis\Wavetable\SimpleWavetable\SimpleWavetable.ino"

It includes this CPP file with no Include of <avr/pgmspace.h>:: "...\hardware\teensy\avr\libraries\Audio\examples\Synthesis\Wavetable\SimpleWavetable\Flute_100kbyte_samples.cpp"
PROGMEM
static const uint32_t sample_0_Flute_100kbyte_FluteD4[7936] = {
0xfbfffc4b,0xfba4fbac,0xfbbffbb2,0xfc00fbdc,0xfc3afc16,0xfcb8fc57,0xfcf9fcf3,0xfc87fcad,
0xfcabfc77,0xfd8cfd0c,0xfe7dfe0c,0xff24fee0,0xffc4ff61,0x006a002d,0x002d004c,0xffe90000,
...

Built as installed - with PROGMEM - compiles it shows as WORKING :: >> Global variables use 43360 bytes
Code:
Sketch uses 131968 bytes (6%) of program storage space. Maximum is 2031616 bytes.
Global variables use 43360 bytes (4%) of dynamic memory, leaving 1005216 bytes for local variables. Maximum is 1048576 bytes.

Building after removing the PROGMEM:: >> Global variables use 72032 bytes
Code:
Sketch uses 131968 bytes (6%) of program storage space. Maximum is 2031616 bytes.
Global variables use 72032 bytes (6%) of dynamic memory, leaving 976544 bytes for local variables. Maximum is 1048576 bytes.

The other examples tried were with the allocation in an included HEADER file to the INO file.
 
I'm working on this issue today. Or really 3 issues:

1: Clear & easy to understand documentation is needed
2: Arduino's size info doesn't fit the memory model
3: PROGMEM can't be used for both code & data in the same file

Right now, I'll get this unanswered question...

Can code run from upper 512KB? The DMAMEM area? Seems I saw that cannot be done.

The short answer is no, you can't execute code from there.

The reason why is in startup.c, which configures the MPU to disallow code execution from the data and peripheral spaces.

Code:
        SCB_MPU_RBAR = 0x00000000 | REGION(0); // ITCM
        SCB_MPU_RASR = MEM_NOCACHE | READWRITE | SIZE_512K;

        SCB_MPU_RBAR = 0x00200000 | REGION(1); // Boot ROM
        SCB_MPU_RASR = MEM_CACHE_WT | READONLY | SIZE_128K;

        SCB_MPU_RBAR = 0x20000000 | REGION(2); // DTCM
        SCB_MPU_RASR = MEM_NOCACHE | READWRITE | NOEXEC | SIZE_512K;

        SCB_MPU_RBAR = 0x20200000 | REGION(3); // RAM (AXI bus)
        SCB_MPU_RASR = MEM_CACHE_WBWA | READWRITE | NOEXEC | SIZE_1M;

        SCB_MPU_RBAR = 0x40000000 | REGION(4); // Peripherals
        SCB_MPU_RASR = DEV_NOCACHE | READWRITE | NOEXEC | SIZE_64M;

        SCB_MPU_RBAR = 0x60000000 | REGION(5); // QSPI Flash
        SCB_MPU_RASR = MEM_CACHE_WBWA | READONLY | SIZE_16M;

If you edit startup.c to delete the NOEXEC parameter for region 3, then you would be able to execute code from the DMAMEM or malloc() controlled areas.

I decided early in the beta test to put NOEXEC on these regions, mainly as a security measure. If we have a future networking library or project with a buffer overflow, disallowing code execution from the areas used for variables greatly reduces the odds it could be exploited as a security vulnerability. It's also much easier to later loosen memory access than it is to tighten, so it seemed like a good idea to use NOEXEC from the beginning.

Whether ITCM should be READWRITE is debatable.
 
@PaulStoffregen - That would be great. I started this thread, as at the time it was unclear to me, how all of these memory sections worked, so I thought I might try to keep a few notes...

I then played around with an alternative to the size program as, I was hit with a few cases of size saying I used something like 50%, but it failed. Which is when I hacked up the program mentioned earlier in this thread (which I still update the platform.txt to run).

Things that confuse and/or maybe would be great if we had alternatives include:

a) FlexRam area is confusing: You have 512KB, but most if not all of the program's code goes into this region, rounded up in size to 32KB blocks, and only what is left is available for data (and stack).

a1) Currently there is no way to use the remaining FlexRam memory (except for the stack), as all heap allocations are done out of the OCRAM. I wish there was an easy method to at least be able to allocate blocks out of here...

b) OCRAM: Only way you can use any of this is by using the heap through either malloc or new ...
Or by using the DMAMEM keyword - But I don't believe you can have your program have this area initialized, it simply allocates memory space.

b1) Unless you edit startup.c - All of this memory is using memory cache, which causes several issues with using DMA... In many cases you can use functions like: arm_dcache_flush and arm_dcache_delete to get DMA to mostly work. We need some documentation on how/when we should use these functions.

b2) Is there any way to say disable the cache for portions of this region of memory? Currently for some of the display libraries that have DMA updates, especially if we enable continuous updates, I have resorted to having one or more buffers in DTCM, where the DMA operations, copy a chunk of the frame buffer down to these buffers using the interrupt at completions, as to have the actual things I wrote out to the screen in the library actually make it to the screen...

c) FLASH/PROGMEM: Currently the only way to keep your sketch's code (as well as library code) from being allocated in the FlexRam (ITCM) region of memory, which is copied down at startup, is to use the keyword PROGMEM. Likewise, if you have large data tables that you do not wish to take up space in the DTCM segment, you need to mark those variable definitions with the PROGMEM keyword. Currently a sketch can not do both. That is, if you have a function with the PROGMEM keyword and a variable also with this keyword, you will end up with a link error.

c1) Might be good to have build option? with alternative linker scripts? Which by default leave program up in the FLASH memory and only copy those parts down which are marked with something like FASTRUN?

Again your three things mentioned above would sure go a long ways toward this!
 