malloc/free for EXTMEM and DTCM

Unfortunately it calls RAM2 "RAM" and RAM "DTCM".
That's because these names are used in the .ld files. The max sizes are not correct either (see ITCM and DTCM).

Code:
Memory region         Used Size  Region Size  %age Used
            ITCM:         32 KB       512 KB      6.25%
            DTCM:        5312 B       512 KB      1.01%
             RAM:       12384 B       512 KB      2.36%
           FLASH:       24152 B      1984 KB      1.19%



So, I fear, this would lead to even more questions.
Better would be to get the output of the size-tool right:
Code:
Global variables use 38076 bytes (7%) of dynamic memory, leaving 486212 bytes for local variables. Maximum is 524288 bytes.
Same program as above... Every number is wrong :-(

Even better than this would be a third option:
We have a working solution, simple C code that just needs to be compiled for Windows, Mac and Linux ...
 


Yes, this probably should go into 1.54-beta6.

Please start a new thread with a link to the latest code. Last time I looked, it had parsing problems and initialized variables to zero rather than to an illegal value, which could catch parsing errors.

Has anyone actually compiled and used it on Mac & Linux yet?
 
Only seen it on Windows. And it was at first written to ONLY work on 1062's.

Not sure what changes it would take for pre-1062's if expanded to work on them.
 
I can try that. Maybe it just needs a boards.txt entry and a little edit to platform.txt so that it is used for T4 only.
 
The main problem seems to be that there is no way to switch off these lines:
Code:
Sketch uses 0 bytes (0%) of program storage space. Maximum is 2031616 bytes.
Global variables use 0 bytes (0%) of dynamic memory, leaving 524288 bytes for local variables. Maximum is 524288 bytes.
 

Ha! I clicked on "submit reply" and 5 seconds later I had the solution.
I'm working on it.
 
So, first information:

https://github.com/FrankBoesing/imxrt-size

1. I've converted it back to plain C
2. commented out some unneeded features
3. added Teensy MM

release/windows contains a Windows executable.
To compile the source, use gcc imxrt-size.c -o imxrt-size(.exe)

Copy the executable to Arduino/tools

Then, remove
teensy40.upload.maximum_size=...
teensy41.upload.maximum_size=...
teensyMM.upload.maximum_size=...
teensy40.build.command.size=...
teensy41.build.command.size=...
teensyMM.build.command.size=...

from boards.txt. This disables the default size printing.

Next, add to platform.txt:

Code:
recipe.hooks.postbuild.4.pattern.windows=cmd /c "{runtime.hardware.path}\..\tools\arm\bin\arm-none-eabi-gcc-nm -n {build.path}\{build.project_name}.elf | {runtime.hardware.path}\..\tools\imxrt-size"
recipe.hooks.postbuild.4.pattern.linux=bash -c "{runtime.hardware.path}/../tools/arm/bin/arm-none-eabi-gcc-nm -n {build.path}/{build.project_name}.elf | {runtime.hardware.path}/../tools/imxrt-size"
recipe.hooks.postbuild.4.pattern.macosx=bash -c "....

So, it's too late to do more now. Maybe tomorrow.
I have to install a new Linux VM to try Linux; I don't want to use my existing one. But I don't see a reason why it shouldn't work on other systems. The code is very simple C and takes its data from STDIN.

The postbuild pattern needs a change, of course.

Edit: Works with Linux, postbuild pattern added.
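
For readers who have not opened the source: the recipe pipes the output of `arm-none-eabi-gcc-nm -n` into the tool, so the core job is parsing lines of the form `<hex address> <type> <name>` from STDIN. The following is only a minimal sketch of that idea, not the code from imxrt-size.c. The function names and the `SYM_UNSET` sentinel are made up for illustration, and the sentinel follows the earlier request to initialize values to an illegal value instead of zero so a failed parse is noticed.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed sentinel: an address no linker symbol is expected to have. */
#define SYM_UNSET 0xFFFFFFFFu

/* Parse one line of `nm -n` output ("<hex address> <type> <name>").
   Returns 1 and fills the outputs on success, 0 if the line does not
   match (for example undefined symbols, which have no address). */
int parse_nm_line(const char *line, uint32_t *addr, char *type,
                  char *name, size_t name_len)
{
    unsigned long a;
    char t;
    char n[256];

    if (sscanf(line, "%lx %c %255s", &a, &t, n) != 3)
        return 0;
    if (strlen(n) >= name_len)
        return 0;                      /* name would not fit */
    *addr = (uint32_t)a;
    *type = t;
    strcpy(name, n);
    return 1;
}

/* Scan a stream of nm lines for one symbol name.
   Returns its address, or SYM_UNSET if the symbol never appeared. */
uint32_t lookup_symbol(FILE *in, const char *wanted)
{
    char line[512], name[256], type;
    uint32_t addr, result = SYM_UNSET;

    while (fgets(line, sizeof line, in)) {
        if (parse_nm_line(line, &addr, &type, name, sizeof name) &&
            strcmp(name, wanted) == 0)
            result = addr;
    }
    return result;
}
```

A caller would pass `stdin` to `lookup_symbol` and refuse to print a report if any required symbol still equals `SYM_UNSET`. Which symbol names to look up depends on the linker script and is not shown here.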
 
Works as replacement here for prior version.

Here is the result from LFSintegrity build:
Code:
FlexRAM section ITCM+DTCM = 512 KB
	ITCM :  67288 B	(68.45% of   96 KB)
	DTCM :  21184 B	( 4.97% of  416 KB)
	Available for Stack: 404800 B
OCRAM: 512KB
	DMAMEM:  12384 B	( 2.36% of  512 KB)
	Available for Heap: 511904 B	(97.64% of  512 KB)
Flash:  90240 B	( 1.11% of 7936 KB)


Versus the current build text:
Code:
Sketch uses 87160 bytes (1%) of program storage space. Maximum is 8126464 bytes.
Global variables use 119476 bytes (22%) of dynamic memory, leaving 404812 bytes for local variables. Maximum is 524288 bytes.

And the other option from build:
Code:
Memory region         Used Size  Region Size  %age Used
            ITCM:         96 KB       512 KB     18.75%
            DTCM:       21184 B       512 KB      4.04%
             RAM:       12384 B       512 KB      2.36%
           FLASH:       90244 B      7936 KB      1.11%
            ERAM:          0 GB        16 MB      0.00%

The last option also included ERAM, which could be helpful to add when that is used.
 
The table is easier to read. Should I use this format?

I suppose Paul will have an opinion about that?

Maybe non-verbose could emulate the IDE style, and verbose could show both?

I like the table, as noted, because it presents the info in an easier-to-understand format without having to parse the text.
 
Code:
FlexRAM section ITCM+DTCM = 512 KB
	ITCM :  67288 B	(68.45% of   96 KB)
	DTCM :  21184 B	( 4.97% of  416 KB)
	Available for Stack: 404800 B
OCRAM: 512KB
	DMAMEM:  12384 B	( 2.36% of  512 KB)
	Available for Heap: 511904 B	(97.64% of  512 KB)
Flash:  90240 B	( 1.11% of 7936 KB)

A few hopefully constructive remarks:

While the percentages for ITCM and DTCM are technically correct, they might be misleading since they only show the usage relative to the current ITCM/DTCM split. I suggest showing only the used bytes and adding a total percentage to the FlexRAM line, i.e.

Code:
FlexRAM section ITCM+DTCM (17% from 512 KB)
	ITCM :  67288 B	
	DTCM :  21184 B	
	Available for Stack: 404800 B

It might be good to keep the nomenclature from the memory graphic on the PJRC page, i.e. RAM1 instead of FlexRAM and RAM2 instead of OCRAM. I also wonder how many average users understand, or even want to understand, the details of the memory layout (try reading your table without knowing what ITCM, OCRAM etc. mean). It might be more informative to have something like:
Code:
RAM1: 17% from 512 KB used
      Code (ITCM):                    67.3 kB	
      Variables (DTCM)                21.2 kB
      Available for local variables:   404 kB

RAM2: 2.4% of 512 KB used
      Variables (DMAMEM)              12.4 kB	
      Available for Heap:              512 kB	

Flash:  1.1% of 7936 KB used
      Code and Constants:             90.2 kB
 
Thanks Luni,

this looks good!


Good news, it compiles and runs on Linux, as expected.
I'll edit post #85 above for the "recipe".
 
Okay,

I've modified it to use Luni's variant.
Code:
RAM1:  7.28% of 512 kB used.
   Code (ITCM):               5.89 kB
   Variables (DTCM):          5.25 kB
   Available for Variables: 474.75 kB

RAM2:  2.36% of 512 kB used.
   Variables (DMAMEM):       12.09 kB
   Available for Heap:      499.91 kB

EXTMEM: 15.16 kB used.

FLASH:  0.22% of 7936 kB used.
   Code and Constants:      17.75 kB

EXTMEM is printed only for Teensy 4.1.

The Windows .exe is 32-bit (for those "Windows XP" users),
Linux has a 32-bit and a 64-bit variant.

Done.
Mac can be added easily, I think; the "recipe" should be identical or very similar to Linux.


@Paul, just fork it and modify it as needed.


Edit: the 7.28% above looks a bit... illogical. You need to know that ITCM is allocated in 32 KB blocks to understand the 7.28%. I see many questions on the horizon... Any ideas?
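
The 7.28% can be reproduced with a small sketch, assuming ITCM is reserved in whole 32 KB blocks out of the 512 KB of RAM1 and the percentage counts the reserved blocks plus the DTCM bytes. The function names are illustrative and not taken from imxrt-size.c, and the byte counts in the note below are the 5.89 kB and 5.25 kB from the output above rounded to whole bytes.

```c
#include <stdint.h>

#define RAM1_BYTES       (512u * 1024u)  /* ITCM + DTCM together */
#define ITCM_BLOCK_BYTES (32u * 1024u)   /* ITCM is reserved in 32 KB blocks */

/* Number of 32 KB blocks needed to hold the code. */
uint32_t itcm_blocks(uint32_t code_bytes)
{
    return (code_bytes + ITCM_BLOCK_BYTES - 1u) / ITCM_BLOCK_BYTES;
}

/* RAM1 usage in percent, counting whole ITCM blocks plus DTCM bytes. */
double ram1_percent_used(uint32_t code_bytes, uint32_t data_bytes)
{
    uint32_t reserved = itcm_blocks(code_bytes) * ITCM_BLOCK_BYTES;
    return 100.0 * (double)(reserved + data_bytes) / (double)RAM1_BYTES;
}
```

With about 6031 B of code and 5376 B of variables, one block is reserved and `ram1_percent_used` gives roughly 7.28%, while simply summing both byte counts would give only about 2.18%.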
 
Good Morning,

@Frank B - glad you have it coming along on the different platforms.

Wondering if it currently outputs anything if the user's sketch uses extended RAM variables. I personally want to know that, especially if I wonder why my sketch is not working on a fresh T4.1 that does not have any additional memory.

Also, does it still return an error condition if it thinks the program is too big (i.e. stack space <= 0)?

@luni - I mostly agree except we still have some strange math, that the user may or may not need to know about.
Example: If I grow my 67 KB of code by another 10 or 18 KB or so, I still have the same free space for local variables, but if I add anything to variables it goes down.
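
This "strange math" can be made concrete with a short sketch, assuming (as in the LFSintegrity output above) that DTCM gets whatever is left of the 512 KB after ITCM has been rounded up to whole 32 KB blocks. The function name is hypothetical.

```c
#include <stdint.h>

#define RAM1_BYTES       (512u * 1024u)
#define ITCM_BLOCK_BYTES (32u * 1024u)

/* Bytes left in DTCM for local variables.
   A negative result means code plus variables do not fit into RAM1. */
int64_t free_for_locals(uint32_t code_bytes, uint32_t data_bytes)
{
    uint32_t blocks = (code_bytes + ITCM_BLOCK_BYTES - 1u) / ITCM_BLOCK_BYTES;
    int64_t dtcm_size = (int64_t)RAM1_BYTES - (int64_t)blocks * ITCM_BLOCK_BYTES;
    return dtcm_size - (int64_t)data_bytes;
}
```

For 67288 B of code and 21184 B of variables this returns 404800, matching "Available for Stack" above. Adding 10 KB or 18 KB of code still fits in the same three blocks, so the result does not change, whereas every extra byte of variables reduces it directly.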

And I agree that we should try to get the same nomenclature between the Teensy page and this. The interesting thing is what wording works. At times we are just substituting one acronym for another.
The good news is that we will get more of the information.

@Paul and others - In many of these cases it helped cement some of these acronyms for me to actually see what they mean. The stuff is mostly there, but for example: DTCM (Data Tightly Coupled Memory), ITCM (Instruction Tightly ...), OCRAM (On-Chip RAM; slower, uses cache)

Sorry I keep picking on DMA, but I think it shows up 11 times on the T4.1 product page. Personally, I still don't know how RAM2 is optimized for DMA. It always felt to me that on the T4.x the DMAMEM modifier was simply a convenient repurposing of the T3.x modifier, a way to get (mostly large, uninitialized) buffers into that other memory space. I also think the paragraph that describes what DMA is:
Direct Memory Access (DMA)
Teensy 4.1 has a general purpose 32 channel DMA controller.
might be expanded slightly to say what it is used for, for example with some of the information from the start of the manual's eDMA chapter (probably reworded):
The enhanced direct memory access (eDMA) controller is a second-generation module
capable of performing complex data transfers with minimal intervention from a host
processor. ...


But again it is great that a lot more of the information is available.

EDIT: I see that EXTMEM is now printed for T4.1 (always? Or only if something is defined?)
 
Yes, it prints errors if Flash or RAM1 overflows (but I think the linker does that anyway). It still prints the EXTMEM, too, as in your variant.
OCRAM is "optimized" for DMA because it is less blocked by RAM accesses than ITCM/DTCM. Maybe it has a higher priority than the CPU (cache), too (I'm not sure about that).
ITCM/DTCM has high traffic.
Normally, Memory can't be accessed by two "devices" at the same time. That would need special dual port RAM.
 
Just think of "DMAMEM" as an alias for "OCRAM".
(It would be good if it was 32-byte aligned... if I remember correctly this is on Paul's "todo" list. I think the name "DMAMEM" is a little inconsistent if it's not aligned.)
 
Thanks,

I believe the linker would only fail if ITCM or DTCM alone exceeded 512 KB, not when the combined total does... I put that check in earlier when I had a sketch (I think it was uncanny eyes, or it might have been a picture viewer sketch) that compiled but would not run...
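
A combined check of this kind could look like the following sketch. It assumes the same 32 KB block rounding for ITCM and treats "no stack space left" (stack space <= 0) as an overflow; the function name is hypothetical and this is not the check from the actual tool.

```c
#include <stdint.h>

#define RAM1_BYTES       (512u * 1024u)
#define ITCM_BLOCK_BYTES (32u * 1024u)

/* Returns 1 if the ITCM blocks plus the DTCM variables leave no room
   in RAM1 (stack space <= 0), 0 otherwise. A tool could use the result
   as its exit status so that the build fails. */
int ram1_overflows(uint32_t code_bytes, uint32_t data_bytes)
{
    uint64_t blocks = ((uint64_t)code_bytes + ITCM_BLOCK_BYTES - 1u)
                      / ITCM_BLOCK_BYTES;
    uint64_t needed = blocks * ITCM_BLOCK_BYTES + data_bytes;
    return needed >= RAM1_BYTES ? 1 : 0;
}
```

For example, 300 KB of code and 250 KB of variables are each below 512 KB, yet the code needs ten blocks (320 KB), so together they need 570 KB and `ram1_overflows` returns 1.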

As you mentioned, maybe a different memory bus... But personally I find doing DMA far more convenient from RAM1 than RAM2, where you don't have to worry about cache value versus memory value...
Again, some of this could maybe be addressed by different cache settings, but then you are probably trading off performance of your non-DMA usage... Always trade-offs!
 
Frank B said:
Edit: the 7.28% above looks a bit... illogical. You need to know that ITCM is allocated in 32 KB blocks to understand the 7.28%. I see many questions on the horizon... Any ideas?

I wouldn't do that, I'd just sum both and calculate the percentage. The deviation is relatively small and probably won't disturb anyone. Never confuse users if you want to avoid support (I'm heading a service department, as you might have noticed :)). After all, it is just about finding out how much memory is available. You could also print a link to a description of the values (at least those using more modern IDEs can simply click on it to open the browser).

KurtE said:
@luni - I mostly agree except we still have some strange math, that the user may or may not need to know about.
Example: If I grow my 67 KB of code by another 10 or 18 KB or so, I still have the same free space for local variables, but if I add anything to variables it goes down.

Yes, of course, this is why I think the best is to simply show the total used memory without the ITCM/DTCM split thing. Anyway, if you have so little memory left that the 32 kB max error makes any difference, you will probably run into issues with the stack anyway.

It would be cool to issue some warning if stack space gets too small. Maybe something like: "Your free space for local variables is very small, consider moving some items from RAM1 to FLASH or RAM2, see http://some_nice_explanation.org for details".
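
Such a warning could be sketched as below. The 8 KB threshold is purely a placeholder, since nobody in the thread has named a value, and the function name and message text are likewise only illustrative.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical threshold; the thread does not propose a number. */
#define LOW_STACK_BYTES (8u * 1024u)

/* Writes a warning into buf and returns 1 if free_bytes is below the
   threshold; otherwise leaves buf empty and returns 0. */
int low_stack_warning(uint32_t free_bytes, char *buf, size_t buf_len)
{
    if (buf_len > 0)
        buf[0] = '\0';
    if (free_bytes >= LOW_STACK_BYTES)
        return 0;
    snprintf(buf, buf_len,
             "Warning: only %lu B left for local variables; "
             "consider moving items from RAM1 to FLASH or RAM2.",
             (unsigned long)free_bytes);
    return 1;
}
```

The caller would print the buffer after the normal report, so a build with plenty of free RAM1 shows nothing extra.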
 
Several users have had code ported from T_3.6 FAIL on T_4.x. And some T_4.x code fails to run right: it either does not compile or the stack won't function.

>> Too Much Info is easy to ignore or explain when it works, not enough info when it fails is a real failure.

Having true info on the RAM1 TCM memory is important; that is, 33KB of CODE really takes 64KB of ITCM, and that should not be hidden.
RAM1: 7.28% of 512 kB used.
Code (ITCM): 32 KB (5.89 kB)
Variables (DTCM): 5.25 kB
Available for Variables: 474.75 kB

Does the 'Verbose' compile flag extend to a 'local' variable in the build? Perhaps non-verbose could do a short version, but in Verbose nothing should be hidden if it leads to a solution or understanding of the problem.
 
A)
Code:
RAM1:  7.28% of 512 kB used.
   Code (ITCM, 32 kB Blocks):   5.89 kB
   Variables (DTCM):            5.25 kB
   Available for Variables:   474.75 kB

RAM2:  2.36% of 512 kB used.
   Variables (DMAMEM):         12.09 kB
   Available for Heap:        499.91 kB

EXTMEM: 15.16 kB used.

FLASH:  0.22% of 7936 kB used.
   Code and Constants:         17.75 kB

Is this better?

or this?
B)
Code:
RAM1:  7.28% of 512 kB used.
   Code (ITCM):               5.89 kB (1x32 kB Block)
   Variables (DTCM):          5.25 kB
   Available for Variables: 474.75 kB

RAM2:  2.36% of 512 kB used.
   Variables (DMAMEM):       12.09 kB
   Available for Heap:      499.91 kB

EXTMEM: 15.16 kB used.

FLASH:  0.22% of 7936 kB used.
   Code and Constants:      17.75 kB
 