external PSRAM usage advice on the Teensy 4.1

samm_flynn

I've just installed two 8MB PSRAM chips from the Teensy store and ran the test sketch — everything appears to be working fine.

I don’t come from a programming background, but in my current project, I have a time-sensitive Timer ISR running at 1 kHz. It reads about 256 bytes from a float array that's part of a circular buffer currently residing in RAM2.

I’m considering moving this buffer to EXTMEM (PSRAM) to free up internal memory, and I’d really appreciate some advice from those with experience using external memory on the Teensy.

As a specific example, I have a Packet class that uses dynamically allocated arrays. I send over 1,000 of these packets per second via USBSerial. I'm wondering if PSRAM is a good place for those dynamically allocated arrays, or if that would lead to performance issues.

So more broadly, I’m looking for guidance on:

Is PSRAM suitable for heap-allocated objects?

When do PSRAM’s speed or latency limitations actually become a problem?

Are there best practices for things like atomic access or memory alignment when working with EXTMEM?

Any insight, especially "gotchas", would be incredibly helpful. Thanks!
 
Usually PSRAM is useful for buffers to hold data you write and then read once, or occasionally. It's especially useful for network applications, like accessing a web server which might transmit a large file, or collecting and transmitting data to a web server where you want to be able to buffer quite a lot of data in case it's slow to respond.

The theoretical bandwidth is about 40 Mbyte/sec if using the default 88 MHz clock, or about 60 Mbyte/sec if you edit the startup code to use 132 MHz. So for your application of 1000 packets/sec, it really depends on the packet size. If they're 10K each, that's about 10 Mbyte/sec written into PSRAM and another 10 Mbyte/sec read back out, so you're talking about using about half the bandwidth, which is probably the area where I'd start to be concerned. If the packets are relatively small, like only a few hundred bytes, you'll use only a small fraction of the bandwidth.
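To put numbers on that estimate, here is a minimal sketch of the arithmetic. The function and constant names are made up for illustration and are not part of the Teensy core. It assumes each byte crosses the PSRAM bus twice (written once, read back once), which is one way to arrive at the "about half" figure for 10K packets; if your data only crosses once, pass 1 instead.

```cpp
// Rough estimate of how much PSRAM bandwidth a packet stream uses.
// All names here are illustrative; none come from the Teensy core.
constexpr double psramLoadFraction(double packetsPerSec,
                                   double bytesPerPacket,
                                   double passesPerByte,       // assumption: 2 = write + read back
                                   double bandwidthBytesPerSec) {
    return packetsPerSec * bytesPerPacket * passesPerByte / bandwidthBytesPerSec;
}

// Approximate theoretical figures quoted in the post above.
constexpr double kBandwidth88MHz  = 40e6;  // default clock
constexpr double kBandwidth132MHz = 60e6;  // after editing the startup code

// 1000 packets/sec of 10K each: about half of the default bandwidth.
constexpr double kLargePackets = psramLoadFraction(1000, 10000, 2, kBandwidth88MHz);

// 1000 packets/sec of 300 bytes each: a small fraction of the bandwidth.
constexpr double kSmallPackets = psramLoadFraction(1000, 300, 2, kBandwidth88MHz);
```

These are theoretical ceilings, so real throughput will be lower; treat anything approaching 0.5 as a warning sign rather than a hard limit.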

PSRAM also works pretty well for complex processing, as long as the data you're working on fits within the Cortex-M7's 32K cache. For example, if you're doing matrix multiplies, FIR filters, or direct convolution reverb, expect to run into problems if any one data set your program handles at a time grows larger than 32K.
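One way to keep an eye on that 32K limit is to compute the working set at compile time. The sketch below does this for a block FIR filter. The tap count, block size, and helper name are hypothetical, and it assumes the working set is the coefficients plus the input window plus the output block, all stored as float.

```cpp
#include <cstddef>

// Cache size quoted in the post above.
constexpr std::size_t kDCacheBytes = 32 * 1024;

// Hypothetical block FIR: 'taps' coefficients, 'block' output samples per call.
// Working set = coefficients + input window (block + taps - 1) + output block.
constexpr std::size_t firWorkingSetBytes(std::size_t taps, std::size_t block) {
    return (taps + (block + taps - 1) + block) * sizeof(float);
}

// Illustrative sizes, not taken from the thread.
constexpr std::size_t kTaps  = 2048;
constexpr std::size_t kBlock = 1024;

// Fails the build if the data touched per call would no longer fit in the cache.
static_assert(firWorkingSetBytes(kTaps, kBlock) <= kDCacheBytes,
              "FIR working set exceeds the 32K cache; expect PSRAM stalls");
```

Other code and data compete for the same cache, so in practice it is worth leaving some headroom below 32K.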

PSRAM can also be accessed by DMA, which doesn't use the cache, so total bandwidth is the issue there. It works well for 320x240 or even 320x480 displays, but for HD video there just isn't enough bandwidth.
 
I would add that PSRAM tends to perform much better with sequential access than with random access. For example, if you have the choice, it is better to use array-style storage than something like a linked list.
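As a rough illustration of array-style storage, here is a minimal ring buffer over one contiguous float array, similar in spirit to the circular buffer described in the first post. The class and variable names are hypothetical. EXTMEM is the Teensy core's keyword for placing a variable in PSRAM; the empty fallback definition is only there so the sketch also compiles off-target.

```cpp
#include <cstddef>

#ifndef EXTMEM
#define EXTMEM  // off-target fallback; on a Teensy 4.1 the core defines this
#endif

constexpr std::size_t kCapacity = 4096;   // illustrative size
EXTMEM float ringStorage[kCapacity];      // one contiguous block, read in order

class FloatRing {
public:
    FloatRing(float* storage, std::size_t capacity)
        : data_(storage), capacity_(capacity) {}

    // Appends one sample; returns false if the buffer is full.
    bool push(float v) {
        if (count_ == capacity_) return false;
        data_[(head_ + count_) % capacity_] = v;
        ++count_;
        return true;
    }

    // Copies up to n of the oldest samples into out, in order.
    // Returns how many were copied.
    std::size_t popBlock(float* out, std::size_t n) {
        std::size_t copied = 0;
        while (copied < n && count_ > 0) {
            out[copied++] = data_[head_];
            head_ = (head_ + 1) % capacity_;
            --count_;
        }
        return copied;
    }

    std::size_t size() const { return count_; }

private:
    float* data_;
    std::size_t capacity_;
    std::size_t head_ = 0;   // index of the oldest sample
    std::size_t count_ = 0;  // number of stored samples
};
```

A 256-byte read from an ISR would correspond to popBlock(out, 64). Note that this sketch does nothing to protect against push and popBlock running concurrently, so sharing it between an ISR and the main loop would need additional care.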