Using RAM1 & RAM2

gonzales

Hi!
Please explain how to use the memory on the Teensy 4.1 correctly. I have a very large program with a lot of libraries, and memory usage has become a problem.
My typical situation is:
Memory Usage on Teensy 4.1:
FLASH: code:620644, data:104436, headers:9124 free for files:7392260
RAM1: variables:116320, code:377656, padding:15560 free for local variables:3541
RAM2: variables:66752 free for malloc/new:457536

As you can see, there is very little free space for local variables, so I want to make more use of RAM2 and FLASH.
I used FLASHMEM for all my functions and DMAMEM for global variables, but that's not enough: the code still takes up a lot of space. What else can I do?

I see that the included libraries also use RAM1.
Here is a simple example:
C++:
#include <SPI.h>
//#include <Ethernet.h>
//#include <NativeEthernet.h>
#include <QNEthernet.h>
using namespace qindesign::network;

EthernetClient g_ethClient1;

void setup() {
  // put your setup code here, to run once:

}

void loop() {
  // put your main code here, to run repeatedly:

}

If I use Ethernet.h, memory usage is:
FLASH: code:19584, data:3016, headers:9140 free for files:8094724
RAM1: variables:3936, code:16880, padding:15888 free for local variables:487584
RAM2: variables:12416 free for malloc/new:511872

With NativeEthernet.h it is:
FLASH: code:209928, data:73160, headers:8748 free for files:7834628
RAM1: variables:29056, code:71560, padding:26744 free for local variables:396928
RAM2: variables:12448 free for malloc/new:511840

And with QNEthernet.h:
FLASH: code:137572, data:20424, headers:8912 free for files:7959556
RAM1: variables:24896, code:133044, padding:30796 free for local variables:335552
RAM2: variables:55616 free for malloc/new:468672

Are there any memory allocation options?
Thanks a lot for answers!!
 
Yes, this makes a significant difference, but I've always been wary of using optimization, as there might be problems with the functioning of the entire code.
 
Optimisation is already on by default; all that option does is optimise differently, trading code speed for size. With the Teensy, speed isn't normally an issue.
The entire point of the optimisation process is that the end result should be functionally identical. Normally the only time optimisation affects the functionality of the code is when the code already has an underlying issue but just happens to work right now: some timing dependency that isn't correctly dealt with but happens to be OK, or something that should be flagged as volatile but isn't. Keeping optimisation off because of this sort of thing is asking for those issues to come back and bite you in the future. Aside from that, optimisation shouldn't change how your code operates.
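
To illustrate the "should be flagged as volatile" case, here is a minimal sketch. The names `g_tickPending`, `g_tickCount`, `onTimerTick` and `consumeTick` are hypothetical, and it is assumed that `onTimerTick` is attached to an interrupt somewhere else in the program.

```cpp
#include <stdint.h>

// Hypothetical flag and counter shared between an interrupt handler and
// the main loop. Without 'volatile' the optimiser is allowed to keep the
// value in a register, so a polling loop may never see the update.
volatile bool g_tickPending = false;
volatile uint32_t g_tickCount = 0;

// Would be called from an interrupt (assumption: attached elsewhere).
void onTimerTick() {
  g_tickCount = g_tickCount + 1;
  g_tickPending = true;
}

// Called from loop(): reports whether a tick arrived and clears the flag.
bool consumeTick() {
  if (!g_tickPending) return false;
  g_tickPending = false;
  return true;
}
```

Code written like this should behave the same at every optimisation level, which is the point being made above.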

You can get occasional weird issues. I've had one where the compiler optimised a memcpy of 8 bytes into a couple of 32-bit load and store operations, which then failed if the memory addresses being copied weren't aligned on a 32-bit boundary. But that sort of thing is generally very rare.
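
One common way to stay clear of this class of problem is to copy through byte pointers and never cast a possibly unaligned address to a wider pointer type. The sketch below shows that general technique only; it is not a reconstruction of the specific bug described above, and the helper name `readU64At` is hypothetical.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: read 8 bytes starting at any byte offset.
// The source stays a uint8_t pointer, so the compiler has no reason to
// assume 32-bit alignment when it expands the memcpy inline.
uint64_t readU64At(const uint8_t *buf, size_t offset) {
  uint64_t value;
  memcpy(&value, buf + offset, sizeof(value));
  return value;
}
```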
 
The usual behaviour of any of the optimization levels besides "smallest code" is to replicate the same code multiple times to avoid branching, so it blows up the size quite significantly. It also uses a much bulkier version of newlib to provide all the basic C functions, rather than newlib-nano.
 
Well, in addition to the RAM1 and RAM2 that normally come with the Teensy 4.1, you have the option of soldering 1-2 memory chips to the Teensy:
  • You can solder 1 or 2 PSRAM chips
  • You can solder 1 flash memory chip
  • You can solder 1 flash memory chip and 1 PSRAM chip
The PSRAM chips are fixed at 8 megabytes each and are volatile (i.e. their contents are lost when you power cycle the Teensy). RAM1 and RAM2 each have 512 kilobytes of memory, so with 1 or 2 PSRAM chips you get 16 or 32 times the size of one of those banks for data.

The flash memory chips are non-volatile (i.e. their contents persist across power cycles). Typically flash memory is used to host a file system, similar to an SD card.

You can put non-initialized static/global arrays into the 8 or 16 megabytes provided by the PSRAM chips using the 'EXTMEM' macro. You should use 'memset' to explicitly clear these arrays at program start. The 'extmem_malloc' function can be used to dynamically allocate memory on the PSRAM chips.
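
A minimal sketch of both approaches follows. The names `g_samples`, `clearSamples` and `allocScratch` and the buffer size are hypothetical. The `#else` branch is only an assumption so the example also compiles off-target; on a Teensy 4.1 the core supplies `EXTMEM`, `extmem_malloc` and `extmem_free`.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#if defined(ARDUINO_TEENSY41)
#include <Arduino.h>   // provides EXTMEM, extmem_malloc() and extmem_free()
#else
// Assumption for building off-target only: stand-ins so this compiles.
#define EXTMEM
static void *extmem_malloc(size_t n) { return malloc(n); }
static void extmem_free(void *p) { free(p); }
#endif

// Hypothetical 1 MB sample buffer placed in PSRAM. EXTMEM variables are
// not initialized, so the array is cleared by hand before first use.
EXTMEM uint8_t g_samples[1024 * 1024];

void clearSamples() {
  memset(g_samples, 0, sizeof(g_samples));
}

// Allocate a zeroed scratch buffer with extmem_malloc().
// Returns nullptr if the allocation fails.
uint8_t *allocScratch(size_t bytes) {
  uint8_t *p = static_cast<uint8_t *>(extmem_malloc(bytes));
  if (p != nullptr) memset(p, 0, bytes);
  return p;
}
```

Calling `clearSamples()` once from `setup()` and releasing scratch buffers with `extmem_free()` when they are no longer needed would be the typical usage.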

I believe the memory in PSRAM is cached like normal memory. If the memory system has to bring the data into the cache, it is somewhat slower than main memory. It depends on the program whether this is an issue or not. But even if it is slower, it acts like normal memory. You can pass the pointer around and use it. You don't have to use special functions to copy the data from PSRAM to main memory to act upon it.

There are some special rules about memory that is accessed via DMA. If you are coding with DMA, be sure to read the fine print for RAM1, RAM2, and PSRAM on when you need to explicitly flush the cache.

If your soldering skills are not up to soldering the 2 SMT chips to the Teensy, the company protosupplies.com offers Teensy 4.1s with various options for expansion memory already soldered on:
  • A single 8 megabyte PSRAM chip and a single 16 megabyte flash chip
  • A single 8 megabyte PSRAM chip and a single 128 megabyte flash chip
  • A single 8 megabyte PSRAM chip and a single 256 megabyte flash chip
  • A pair of 8 megabyte PSRAM chips
I've ordered several Teensy 4.1s from Protosupplies.com and I have been happy with them.

In terms of flash file systems, there is a protocol called MTP that lets a host computer access your file systems (on the flash memory chip, on SD cards, and in the flash memory on the Teensy chip) over USB. Note, MTP is not currently part of the base Teensy software, but it is around. This would allow easy transfer of data between your host computer and the Teensy. On the Teensy side you would still have to open, read, write, and close the files just like you would with an SD card.
 
Thanks for the answers, but my question was: what is the best way to use the almost free RAM2 memory? Why would I complicate the device with external memory when my RAM2 is almost free? In my case it is more than 400 kB.
 
Well, it is only free until your program is running. When you call malloc/new, the memory is allocated out of RAM2.
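
As a sketch of how that can be used deliberately: a large buffer that would otherwise be a global array can be allocated once at startup, so it comes out of the heap instead. The names `kLogBufferSize`, `g_logBuffer` and `initLogBuffer` and the 64 kB size are hypothetical.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical size. A plain global 'uint8_t g_logBuffer[64 * 1024];'
// would be counted under RAM1 "variables"; allocating it at startup
// takes the space from the malloc/new heap instead.
constexpr size_t kLogBufferSize = 64 * 1024;
uint8_t *g_logBuffer = nullptr;

// Call once from setup(). Returns false if the heap is exhausted.
bool initLogBuffer() {
  if (g_logBuffer == nullptr) {
    g_logBuffer = static_cast<uint8_t *>(malloc(kLogBufferSize));
    if (g_logBuffer != nullptr) memset(g_logBuffer, 0, kLogBufferSize);
  }
  return g_logBuffer != nullptr;
}
```

Allocating once and never freeing avoids heap fragmentation, which tends to matter more on a microcontroller than the allocation itself.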

For static/global variables you can use 'DMAMEM' and the variable is put into RAM2. Note, these variables cannot be initialized. I don't recall if they are guaranteed to be zeroed.

If you have large static/global read-only tables, you can declare them 'PROGMEM' and the variables will only be located in flash memory and won't be copied to RAM1 at program startup.

You can declare functions 'FLASHMEM', and they will only be in flash memory, and not copied to RAM1.
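
Putting the three macros together might look like the sketch below. The names `g_frameBuffer`, `kSquareTable`, `initBuffers` and `squareLookup` are hypothetical, and the `#else` branch is only an assumption so the example also compiles off-target, where the macros expand to nothing.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(ARDUINO_TEENSY41)
#include <Arduino.h>   // provides DMAMEM, PROGMEM and FLASHMEM
#else
// Assumption for building off-target only: the macros expand to nothing.
#define DMAMEM
#define PROGMEM
#define FLASHMEM
#endif

// Large working buffer placed in RAM2. It cannot have an initializer,
// so it is cleared explicitly in initBuffers().
DMAMEM uint8_t g_frameBuffer[32 * 1024];

// Read-only table that stays in flash instead of being copied to RAM1.
PROGMEM const uint16_t kSquareTable[8] = {0, 1, 4, 9, 16, 25, 36, 49};

// Rarely called setup code: runs from flash, freeing RAM1 code space.
FLASHMEM void initBuffers() {
  memset(g_frameBuffer, 0, sizeof(g_frameBuffer));
}

// Hot-path code is left unmarked so it still runs from RAM1.
uint16_t squareLookup(uint8_t index) {
  return kSquareTable[index & 7];  // index is wrapped into the table
}
```

The design choice here is to mark only rarely executed code as FLASHMEM and leave time-critical functions alone, since code in flash is slower to run.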

Note, stuff in flash memory is slower to access than RAM1/RAM2. This is all documented in
 