Reading and Writing from RAM on Teensy 4.1

RobertJFreeman

New member
Hello, I am quite new to Teensy projects, but I had a simple question I was hoping to get some help on. I am trying to test how quickly I can write and read from RAM on my Teensy, but am unable to find information on how to write and read from RAM in the first place. I understand that all data and variables used during runtime are stored in RAM and that RAM gets wiped clean with every reboot or power cut. However, for my purposes I would like to know how quickly this RAM can fill up (with say random numbers) and do not know how I could go about testing that. Any and all help is appreciated, thank you!

(Ultimately, I would like to read image data from a camera I will connect to the Teensy and write that data to RAM. Then, read that data from RAM and write it to the USB port to be saved on my laptop. Speed is very important in this project, thus I am interested in how quickly this process can occur.)
 
It will depend on whether the storage is in RAM1 or RAM2. See the Teensy 4.1 product page if you need the details.
Both are 512KB, but RAM1 also holds code and the program's default data, and it runs at processor speed. RAM2 is typically unused but runs at 1/4 of processor speed.

This PSRAM memory test could be a starting point: PaulStoffregen/teensy41_psram_memtest/blob/master/teensy41_psram_memtest.ino

Allocate a RAM1 or RAM2 array as large as desired, then change the begin and end pointer settings in that sketch to point at it.

It uses slower timers and only measures overall speed, but the cycle counter "ARM_DWT_CYCCNT" can give very accurate timings.
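As a rough sketch of that idea (not taken from the memtest sketch itself), something like the following could fill one buffer in RAM1 and one in RAM2 with pseudo-random numbers and time each fill with the cycle counter. The names ticks_now, xorshift32, timed_fill, ram1_buf and ram2_buf are made up for illustration, and the buffer size is arbitrary. It assumes the Teensy 4.x core has already enabled the cycle counter, and it falls back to a nanosecond clock when built off-target. The measured time includes the cost of generating the numbers, and RAM2 writes pass through the data cache, so treat the results as a starting point only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(ARDUINO)
#include <Arduino.h>
#endif
#if !defined(ARM_DWT_CYCCNT)
#include <chrono>  // host fallback only (assumption: not building for Teensy)
#endif
#if !defined(DMAMEM)
#define DMAMEM     // host fallback: no special RAM2 section
#endif

constexpr size_t kWords = 4096;           // 16 KB per buffer (arbitrary size)
static uint32_t ram1_buf[kWords];         // ordinary global: RAM1 on Teensy 4.x
static DMAMEM uint32_t ram2_buf[kWords];  // DMAMEM: RAM2 on Teensy 4.x

// Hypothetical helper: CPU cycles on Teensy, nanoseconds elsewhere.
static inline uint32_t ticks_now() {
#if defined(ARM_DWT_CYCCNT)
  return ARM_DWT_CYCCNT;
#else
  using namespace std::chrono;
  return static_cast<uint32_t>(
      duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
#endif
}

// Small pseudo-random generator (xorshift32); seed must be nonzero.
static inline uint32_t xorshift32(uint32_t &state) {
  state ^= state << 13;
  state ^= state >> 17;
  state ^= state << 5;
  return state;
}

// Fills buf with pseudo-random words and returns the elapsed ticks.
// volatile keeps the compiler from dropping or merging the stores.
uint32_t timed_fill(volatile uint32_t *buf, size_t words, uint32_t seed) {
  assert(buf != nullptr && seed != 0);
  uint32_t state = seed;
  const uint32_t start = ticks_now();
  for (size_t i = 0; i < words; i++) {
    buf[i] = xorshift32(state);
  }
  return ticks_now() - start;  // unsigned subtraction tolerates one wraparound
}

#if defined(ARDUINO)
void setup() {
  Serial.begin(9600);
  while (!Serial) {}  // wait for the serial monitor
  Serial.print("RAM1 ticks: ");
  Serial.println(timed_fill(ram1_buf, kWords, 1u));
  Serial.print("RAM2 ticks: ");
  Serial.println(timed_fill(ram2_buf, kWords, 1u));
}
void loop() {}
#endif
```

Dividing the tick count by the number of words gives an approximate cycles-per-word figure on the Teensy.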
 
The raw speed of internal RAM, either tightly coupled RAM1 or normal RAM2, isn't going to be the limiting factor for this sort of application. Plenty of other aspects of how your code is written will be far more important.

But to try answering your question anyway, to read and write memory, you would simply create a variable or array and code which reads or writes it. For example:

Code:
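char mybuffer[1024];  // global array, so it lives in RAM (RAM1 by default on Teensy 4.1)

void dosomething() {
  mybuffer[5] = 100;  // write one byte to RAM
  mybuffer[6] = 42;   // write another byte
}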

However, many tricky hardware and compiler optimizations are in play that make your code (usually) run faster, at the cost of making it much harder to understand when actual memory access occurs. If you wanted to benchmark the actual raw speed of RAM access, you could use "volatile" to force the compiler to forgo optimizations and always access RAM immediately, as your code says to do. But if your variables are in RAM2, even with volatile your reads and writes go to the ARM processor's cache. You would also need to call the cache flush functions if you want to force actual memory access. Usually that's only done when using DMA. While it might be interesting for special hardware testing, it's pretty pointless from a code speed perspective: you want compiler optimizations and caching to make your code run fast.
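For anyone who does want to try that kind of hardware test, here is a minimal sketch of the volatile-plus-cache-flush idea. It assumes the Teensy 4.x core's arm_dcache_flush_delete() and the DMAMEM placement macro; the names push_out_of_cache, write_flush_read and ram2_buf are hypothetical, and off-target the cache call compiles to a no-op. It only shows the mechanics and does no timing of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(ARDUINO)
#include <Arduino.h>
#endif
#if !defined(DMAMEM)
#define DMAMEM  // host fallback: no special RAM2 section
#endif

constexpr size_t kWords = 256;  // 1 KB, a whole number of 32-byte cache lines
// Placed in RAM2 on Teensy 4.x and aligned to the cache line size.
static DMAMEM uint32_t ram2_buf[kWords] __attribute__((aligned(32)));

// Hypothetical wrapper: write cached data back to RAM and drop it from the
// cache, so the next read has to go to the actual memory.
static inline void push_out_of_cache(uint32_t *addr, size_t bytes) {
#if defined(__IMXRT1062__)
  arm_dcache_flush_delete(addr, bytes);
#else
  (void)addr;   // assumption: nothing to do when not running on a Teensy 4.x
  (void)bytes;
#endif
}

// Writes every word, forces the data out of the cache, then reads it back.
uint32_t write_flush_read(uint32_t *buf, size_t words) {
  assert(reinterpret_cast<uintptr_t>(buf) % 32 == 0);  // cache-line aligned
  volatile uint32_t *p = buf;  // volatile: every access below really happens

  for (size_t i = 0; i < words; i++) {
    p[i] = static_cast<uint32_t>(i) * 3u;
  }

  push_out_of_cache(buf, words * sizeof(uint32_t));

  uint32_t sum = 0;
  for (size_t i = 0; i < words; i++) {
    sum += p[i];  // on Teensy these reads now miss the cache
  }
  return sum;
}
```

Reading the cycle counter before and after each loop would turn this into a benchmark, though as noted it measures the hardware rather than how fast normal code will run.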

There are also a number of ways you can write your code which affect speed. The processor is 32 bits, so accessing 4 bytes as a 32 bit integer is the same speed as accessing a single 8 bit char (assuming the integer is aligned, as the compiler will do by default). The processor also has many tricky hardware optimizations. For example, the DTCM bus can actually perform two 32 bit accesses in the same cycle, but only under certain circumstances. So for certain high performance code, you might go to some trouble to align to 8 byte boundaries in RAM1, or 32 byte boundaries in RAM2 (the cache line size is 32 bytes), and structure your code to take advantage of these hardware tricks.
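To make the byte versus word point concrete, here is a small sketch. The names buffer, fill_bytes and fill_words are invented for this example. It fills the same 32-byte-aligned buffer once with 8 bit stores and once with 32 bit stores, and returns how many store operations each approach needed. It does not measure time; it only shows that the word version touches memory a quarter as often for the same amount of data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kWords = 256;                // 1 KB of data
alignas(32) static uint32_t buffer[kWords];   // aligned to the 32-byte cache line

// Fills the buffer one byte at a time; returns the number of stores issued.
size_t fill_bytes(uint32_t *buf, size_t words, uint8_t value) {
  volatile uint8_t *b = reinterpret_cast<uint8_t *>(buf);
  size_t stores = 0;
  for (size_t i = 0; i < words * sizeof(uint32_t); i++) {
    b[i] = value;
    stores++;
  }
  return stores;
}

// Fills the buffer one aligned 32 bit word at a time; returns the store count.
size_t fill_words(uint32_t *buf, size_t words, uint8_t value) {
  assert(reinterpret_cast<uintptr_t>(buf) % sizeof(uint32_t) == 0);
  const uint32_t pattern = value * 0x01010101u;  // copy the byte into all 4 lanes
  volatile uint32_t *w = buf;
  size_t stores = 0;
  for (size_t i = 0; i < words; i++) {
    w[i] = pattern;
    stores++;
  }
  return stores;
}
```

Calling fill_bytes(buffer, kWords, 0x11) issues 1024 stores, while fill_words(buffer, kWords, 0xAB) issues 256 for the same 1 KB.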

It's a very deep rabbit hole of low-level details and optimizations.

But I'm pretty sure you'll find other factors than the RAM speed end up being far more important for overall performance.
 