T4 - Is it possible to use more than the 1/2 of the SRAM that is tightly coupled?

Status
Not open for further replies.

bmillier

In my audio-based program, I use a large RAM array of 32-bit integers. When I dimension it at more than about 95,000 elements (= 380K bytes), the combination of program code (~80K), audio library buffers (80), miscellaneous other variables, and this large array exceeds 50% of total SRAM. The linker then throws the error ".bss is not within region DTCM" at the end of the linking process. Actually, the program fails to run as memory usage approaches this 50% point, even before the linker error appears at the 50% value. And when this happens, the T4 bootloader must be re-initialized with the 15-second button push.
I see that the SRAM is split into 2 segments, only 1/2 of which is "tightly coupled", and I know that the program in Flash is transferred to SRAM before execution. But how does one make use of the upper half of the SRAM? I could post my code (long) if helpful, but I think it's only the combined size that is the determining factor.
Thanks
 
Yes the linker scripts and the like don't do a very good job of reporting the memory usage.

I tried to make some sense of it in the thread: https://forum.pjrc.com/threads/57326-T4-0-Memory-trying-to-make-sense-of-the-different-regions

In that thread there is a program, which I currently run on my PC, that I hacked into the protocol.ini to report more information.

An example output from my build looks like:
Code:
cmd /c "D:\\arduino-1.8.10\\hardware\\teensy\\..\\tools\\arm\\bin\\arm-none-eabi-gcc-nm -n C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_517702\\st7735_t3_simpletest_FB.ino.elf | D:\\GITHUB\\imxrt-size\\Debug\\imxrt-size.exe"

FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaaf
    ITCM :  48496 B	(74.00% of   64 KB)
    DTCM :  33472 B	( 7.30% of  448 KB)
    Available for Stack: 425280
OCRAM: 512KB
    DMAMEM:      0 B	( 0.00% of  512 KB)
    Available for Heap: 524288 B	(100.00% of  512 KB)
Flash:  75536 B	( 3.72% of 1984 KB)
===info ||| Multiple libraries were found for "{0}" ||| [Adafruit_GFX.h]
 Used: D:\arduino-1.8.10\hardware\teensy\avr\libraries\Adafruit_GFX
 Not used: C:\Users\kurte\Documents\Arduino\libraries\Adafruit_GFX_Library
Multiple libraries were found for "ST7735_t3.h"
 Used: C:\Users\kurte\Documents\Arduino\libraries\ST7735_t3
 Not used: D:\arduino-1.8.10\hardware\teensy\avr\libraries\ST7735_t3
Multiple libraries were found for "SPI.h"
 Used: D:\arduino-1.8.10\hardware\teensy\avr\libraries\SPI
Using library Adafruit_GFX at version 1.5.6 in folder: D:\arduino-1.8.10\hardware\teensy\avr\libraries\Adafruit_GFX 
Using library ST7735_t3 at version 1.0.0 in folder: C:\Users\kurte\Documents\Arduino\libraries\ST7735_t3 
Using library SPI at version 1.0 in folder: D:\arduino-1.8.10\hardware\teensy\avr\libraries\SPI 
"D:\\arduino-1.8.10\\hardware\\teensy/../tools/arm/bin/arm-none-eabi-size" -A "C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_517702/st7735_t3_simpletest_FB.ino.elf"
Sketch uses 75536 bytes (3%) of program storage space. Maximum is 2031616 bytes.
Global variables use 81968 bytes (7%) of dynamic memory, leaving 966608 bytes for local variables. Maximum is 1048576 bytes.

What is sort of interesting is the area that describes the FlexRAM (sorry about using the internal names). It is 512 KB, and currently almost all code goes in there (ITCM), rounded up to 32 KB sections. Any 32 KB sections left over go to DTCM, and this is where your program variables go. Whatever is left after that is used for stack space.

The OCRAM is the other 512 KB, and the only things that go in there are DMAMEM and heap (malloc/new).
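
The arithmetic behind the report above can be sketched as follows. This is only an illustration of the 32 KB rounding described in this post, not the actual startup or linker code; the names `flexram_split`, `FlexRamSplit`, `kFlexRamBytes` and `kBankBytes` are made up for the example.

```cpp
#include <cstdint>

// Sizes taken from the description in this thread. The function models the
// split as described here; it is not the real linker/startup logic.
constexpr uint32_t kFlexRamBytes = 512 * 1024;  // ITCM + DTCM combined
constexpr uint32_t kBankBytes    = 32 * 1024;   // FlexRAM is handed out in 32 KB sections

struct FlexRamSplit {
  uint32_t itcm_bytes;   // sections given to code
  uint32_t dtcm_bytes;   // sections left for variables and stack
  uint32_t stack_bytes;  // DTCM left over after the variables
};

// code_bytes: code copied into ITCM; data_bytes: variables placed in DTCM.
// Assumes data_bytes fits in the remaining DTCM (no overflow check here).
FlexRamSplit flexram_split(uint32_t code_bytes, uint32_t data_bytes) {
  uint32_t banks = (code_bytes + kBankBytes - 1) / kBankBytes;  // round up
  FlexRamSplit s;
  s.itcm_bytes  = banks * kBankBytes;
  s.dtcm_bytes  = kFlexRamBytes - s.itcm_bytes;
  s.stack_bytes = s.dtcm_bytes - data_bytes;
  return s;
}
```

With the numbers from the report (48496 bytes of code, 33472 bytes of variables), this gives 64 KB of ITCM, 448 KB of DTCM and 425280 bytes for the stack, which matches the output shown.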
 
Thanks KurtE for the lightning-fast response. I'm now looking through the thread you posted above (I do remember seeing something in the T4 Beta thread, but that is soooo long!). Your explanation in this thread is very good.
I'm working on a wavetable synth program. There is a very good Teensy Audio library object for this, but it places the wavetable in Flash, so you can only have 1 or 2 fixed voices. I've modified it so that it works from SRAM, and you can change the voices by loading in various files from an SD card. But the wavetable array must be contiguous, so I see a problem there. I guess I could go from a 100K integer array up to about 125K if I moved it into the OCRAM region. To be honest, I'm not an expert in C, and I generally use statically defined arrays, not dynamic ones using malloc, but I might try it. I just pass a pointer to the array to the wavetable library routine, so it might be OK getting a pointer into the OCRAM (??)
The 32K chunks that the program uses when it is transferred from Flash to SRAM are something I didn't know about, so I'll have to ensure that my program doesn't grow from its ~70K to beyond 96K, or I'll lose another whole 32K chunk.
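
For the malloc route mentioned above, a minimal sketch might look like this. It assumes, as described earlier in this thread, that the heap lives in OCRAM on the T4; `alloc_wavetable` and `kWavetableSamples` are hypothetical names, and 125000 is just the size floated above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical size for illustration: 125000 samples * 4 bytes = 500000 bytes.
constexpr std::size_t kWavetableSamples = 125000;

// Returns one contiguous block of 32-bit samples, or nullptr if the heap
// cannot supply it. The caller must check for nullptr before use.
int32_t *alloc_wavetable() {
  return static_cast<int32_t *>(
      std::malloc(kWavetableSamples * sizeof(int32_t)));
}
```

The returned pointer can be passed around like the address of a static array. The block is contiguous, so the wavetable requirement above should still hold.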
 
If these tables are truly fixed, then you might move them to PROGMEM.

For example with the Uncanny Eyes stuff we were playing with, we had some large tables, that we left in program memory:
Like:
Code:
#define SCLERA_WIDTH  375
#define SCLERA_HEIGHT 375

const uint16_t sclera[SCLERA_HEIGHT][SCLERA_WIDTH] PROGMEM = {
  0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901,
  0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901,
  0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901, 0x6901,
...
This leaves the table in Program space. That is space that is not copied down into ITCM, but left in flash.

Unlike T3.x, PROGMEM actually means something.
 
@KurtE: I finished reading your thread and decided to just put DMAMEM in the declaration line for my large SAMPLE_BUFFER array, increasing the array size from 95,000 to 120,000. Using just the standard linker report, I see that 58% of the total SRAM is being used (I don't see your DTCM and OCRAM area breakdowns in the report, of course). But no compilation/linker errors are thrown, and the program seems to be working fine, whereas it would die if I went beyond about 95,000 before. Many thanks!
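
For anyone finding this later, the declaration described above would look roughly like the sketch below. The `#if` guard and the empty `DMAMEM` fallback are additions so the snippet also compiles on a PC; `SAMPLE_BUFFER_LEN` and `clear_sample_buffer` are hypothetical names. As far as I know, DMAMEM variables are not initialized at startup on the T4, so the sketch clears the array explicitly.

```cpp
#include <cstdint>

#if defined(__IMXRT1062__)
#include <Arduino.h>  // provides DMAMEM on Teensy 4.x
#else
#define DMAMEM        // host build: plain global so the sketch still compiles
#endif

constexpr uint32_t SAMPLE_BUFFER_LEN = 120000;

// Placed in OCRAM instead of DTCM when built for the T4.
DMAMEM int32_t SAMPLE_BUFFER[SAMPLE_BUFFER_LEN];

// 120000 * 4 bytes = 480000 bytes, which must fit in the 512 KB OCRAM.
static_assert(sizeof(SAMPLE_BUFFER) <= 512 * 1024,
              "SAMPLE_BUFFER does not fit in OCRAM");

// Assumption: DMAMEM contents are not zeroed at startup, so clear before use.
void clear_sample_buffer() {
  for (uint32_t i = 0; i < SAMPLE_BUFFER_LEN; i++) SAMPLE_BUFFER[i] = 0;
}
```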
 
@KurtE: Re your post #4, that came in while I was writing the post above. But no, these wavetables are NOT fixed. They are totally different depending upon which voice I load in from the SD card. The original Teensy wavetable object in the audio library was using fixed wavetables (in Flash), but that was not what I, or most others, really need.
 
Glad it is working. Again, I hope that at some point soon we will get something like what I did in the other thread into the official builds. For now I manually hack it in, using the instructions I mentioned.

I should mention that IF you use DMAMEM (or malloc/new memory) and your code uses DMA, then you need to be careful. Suppose your code sets values using normal operations like: for (int i = 0; i < 500; i++) my_high_memory[i] = i;
and then does a DMA write from this array. You will very likely NOT get the values you just set: you may get some and not others. Why? Because the system may not have flushed the memory cache out to physical memory, and the DMA works off of physical memory...

Likewise for reads... If you do a DMA read into an array and then have code that accesses that array, that code may or may not see the data that was retrieved, and may instead use whatever was in the cache...

There are functions to tell a portion of the cache to flush itself out (which you need to do when you are about to do DMA writes), and likewise there is a call to delete the contents of the cache for a certain range, forcing the data to be reloaded from physical memory...

I have been bitten a few times by this.
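
A rough sketch of the flush and delete calls, assuming the Teensy 4 core's `arm_dcache_flush()` and `arm_dcache_delete()` (both take an address and a size in bytes). The host stand-ins in the `#else` branch do nothing and exist only so the snippet compiles on a PC. The function names `fill_before_dma_write` and `invalidate_for_dma_read` and the constant `kCount` are made up for the example, and the DMA setup itself is not shown. The buffer is 32-byte aligned and sized as a multiple of 32 bytes as a precaution, so the cache operations cover whole cache lines.

```cpp
#include <cstdint>

#if defined(__IMXRT1062__)
#include <Arduino.h>  // DMAMEM, arm_dcache_flush(), arm_dcache_delete()
#else
// Host stand-ins so the sketch compiles off-target; they do nothing.
#define DMAMEM
static inline void arm_dcache_flush(void *, uint32_t) {}
static inline void arm_dcache_delete(void *, uint32_t) {}
#endif

constexpr uint32_t kCount = 512;  // 2048 bytes, a multiple of 32

DMAMEM __attribute__((aligned(32))) static uint32_t my_high_memory[kCount];

// Fill the buffer with the CPU, then push the cached values out to
// physical memory so a DMA write from this array sees them.
void fill_before_dma_write() {
  for (uint32_t i = 0; i < kCount; i++) my_high_memory[i] = i;
  arm_dcache_flush(my_high_memory, sizeof(my_high_memory));
  // ... start the DMA transfer from my_high_memory here (not shown) ...
}

// Discard any cached copy of the buffer so the CPU reloads what DMA put
// into physical memory rather than using stale cache contents.
void invalidate_for_dma_read() {
  arm_dcache_delete(my_high_memory, sizeof(my_high_memory));
}
```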
 
The wavetable synth code reads from those tables using ordinary instructions. Unless it's been changed to somehow use DMA, the M7's cache behavior shouldn't be an issue.
 
Yes, I didn't change the wavetable audio object's code. I just point the routine to my SRAM array instead of the constant array in Flash that the object normally uses (in the examples). While the update routine uses DMA to move buffers around a lot, the wavetable array only gets loaded when one changes the voice, and never while audio is actually playing.
 