Teensy 4.0 stops working and USB is not detected when more than 50% of RAM is used

yongbum jung

New member
(First of all, I'm Korean and my English is not good, so I'm not sure I can explain clearly what I want to know.)

I use every Teensy version.
On the Teensy 3.6 I use the RAM (not the EEPROM) to store data for motor position control.

On the Teensy 3.6 I declare the arrays like this:

short rec_motor_1[15500]; unsigned short rec_motor_time_1[15500];
short rec_motor_2[15500]; unsigned short rec_motor_time_2[15500];
short rec_motor_3[15500]; unsigned short rec_motor_time_3[15500];
short rec_motor_4[15500]; unsigned short rec_motor_time_4[15500];

The compiler reports about 95% RAM usage, and the sketch uploads and works fine on the Teensy 3.6.

Then I saw the new Teensy 4.0 on the Teensy website and found that it has more RAM available, so I bought one and am testing it now. On the Teensy 4.0 I declare the arrays like this:
short rec_motor_1[23000]; unsigned short rec_motor_time_1[23000];
short rec_motor_2[23000]; unsigned short rec_motor_time_2[23000];
short rec_motor_3[23000]; unsigned short rec_motor_time_3[23000];
short rec_motor_4[23000]; unsigned short rec_motor_time_4[23000];

The compiler reports about 45% RAM usage.
But if I use more RAM, like this:


short rec_motor_1[23000]; unsigned short rec_motor_time_1[23000];
short rec_motor_2[23000]; unsigned short rec_motor_time_2[23000];
short rec_motor_3[23000]; unsigned short rec_motor_time_3[23000];
short rec_motor_4[23000]; unsigned short rec_motor_time_4[23000];
short rec_motor_5[23000]; unsigned short rec_motor_time_5[23000];
....

The compiler reports about 51% RAM usage, and then the Teensy does not work and the computer cannot find the USB device.
If I bring the RAM usage back under 50%, it works again.

Can you help me? Is there something wrong in my code, or in the Teensy 4.0?

Thanks a lot,
Best regards,
Yong.
 
I’m still confused after reading most of the linked thread. Is the top half of RAM usable for things like larger arrays of logged data?
 
My take on it is that one half is more optimized than the other, allowing for faster code execution/memory access. That’s probably wrong on all counts, but that’s how I choose to understand it right now.
 
I’m still confused after reading most of the linked thread. Is the top half of RAM usable for things like larger arrays of logged data?

Easy to get confused and harder to get clear answers or full use …

Hopefully good questions on that thread will evolve good answers for general clarity.

Teensy 4.0 has two separate blocks of 512KB, where one is usable at compile time - it holds data and code not marked PROGMEM. Code can run from this RAM area, and using FASTRUN ensures it is placed there.

All addressing is FLAT - one pointer can get anywhere - but there are address jumps/cusps between the two blocks.

The second 512KB 'other'/DMAMEM block is where the heap and dynamic allocations come from at runtime, code cannot execute from this area.
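
For the arrays in the original question, one way to reach that second block is to tag some of them with DMAMEM so they stop competing for space in the first 512KB. This is only a minimal sketch: the count of six array pairs is an assumption (the question only shows "...."), and the `#else` branch is a host-side stand-in so the file also compiles off the Teensy.

```cpp
#include <cstddef>

#if defined(ARDUINO)
#include <Arduino.h>  // the Teensy core defines DMAMEM
#else
#define DMAMEM        // host-side stand-in: no special placement off-target
#endif

constexpr std::size_t kSamples = 23000;  // entries per array, from the question

// Pairs 1-4 stay in the first 512KB block (placed at compile time).
short rec_motor_1[kSamples];  unsigned short rec_motor_time_1[kSamples];
short rec_motor_2[kSamples];  unsigned short rec_motor_time_2[kSamples];
short rec_motor_3[kSamples];  unsigned short rec_motor_time_3[kSamples];
short rec_motor_4[kSamples];  unsigned short rec_motor_time_4[kSamples];

// Pairs 5-6 (assumed count) are moved to the second 512KB block.
DMAMEM short rec_motor_5[kSamples];  DMAMEM unsigned short rec_motor_time_5[kSamples];
DMAMEM short rec_motor_6[kSamples];  DMAMEM unsigned short rec_motor_time_6[kSamples];

// All arrays have the same size, so the totals are simple multiples.
constexpr std::size_t kBlockBytes       = 512 * 1024;
constexpr std::size_t kFirstBlockBytes  = 8 * sizeof(rec_motor_1);
constexpr std::size_t kSecondBlockBytes = 4 * sizeof(rec_motor_5);

// Fail the build early if either group cannot fit in one block.
// The first block is also shared with code and the stack, so the real
// limit there is lower than 512KB.
static_assert(kFirstBlockBytes  < kBlockBytes, "first block over budget");
static_assert(kSecondBlockBytes < kBlockBytes, "second block over budget");
```

DMAMEM arrays do not get initial values at startup, so fill them before reading from them.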

And that area has a magic aura of TCM - tightly coupled memory - which relates to cache or access improvements - so it can act funny with DMA access doing unseen updates behind the cache - or doing writes from stale RAM where the cache holds updated info. That AFAIK only affects DMA 'direct memory access' that bypasses the cache which needs to be flushed/purged to prevent those conflicts. But otherwise should be normally usable - just not until runtime …

Currently AFAIK the Teensy CORE code doesn't rely on any DMA buffer or area in that DMAMEM area - but when USB updates to DMA processing that may change.

For now, IIRC the creator of the one DOS BOX emulator allocated all of that second 512KB DMAMEM block as the majority of the 640KB memory space, together with another 128KB from the other area, and it works.


… that summarizes my understanding - not sure it adds anything … and hopefully it is generally valid and complete as far as it goes.
 
I've been looking a lot at the T4 memory structure lately as I've been having problems with cache coherency.

@defragster, perhaps you can clear up any misconceptions.

I'm pretty sure the TCM (FlexRAM) is not cached. The "tightly coupled" part means it's single-cycle access, so no need to cache it. Let me know if I got anything below wrong.

- The 1024KB is composed of 512KB of tightly coupled FlexRAM and 512KB of OCRAM.
- The current T4 linker scripts assign a portion of the FlexRAM to instruction TCM (128KB), and the remainder to data TCM. None is allocated to general purpose RAM.
- Program code is loaded into ITCM; compile-time constants and statics are loaded into DTCM. The stack also goes in DTCM.
- Tightly coupled memory is single cycle access (hence the tightly coupled) and is not cached. There are no DMA cache coherency issues with the TCM.
- The second half of the 1024KB is a dedicated OCRAM (general purpose RAM) and not tightly coupled. This contains the heap. It is cached, and thus cache coherency must be maintained by using appropriate cache maintenance operations.
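
The cache maintenance mentioned in the last bullet could look roughly like this for a DMA buffer placed in OCRAM. It is a sketch, not tested on hardware. It assumes the Teensy core helpers `arm_dcache_flush()` and `arm_dcache_delete()` and a 32-byte Cortex-M7 cache line; `dmaBuffer`, `kPayloadBytes`, and the two wrapper functions are hypothetical names. The `#else` branch only provides stand-ins so the file compiles off-target.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(ARDUINO)
#include <Arduino.h>  // Teensy core: DMAMEM, arm_dcache_flush, arm_dcache_delete
#else
#define DMAMEM        // host-side stand-ins, they do nothing
static inline void arm_dcache_flush(void *, std::uint32_t) {}
static inline void arm_dcache_delete(void *, std::uint32_t) {}
#endif

constexpr std::size_t kCacheLine = 32;  // assumed Cortex-M7 data cache line size

// Round a byte count up to a whole number of cache lines, so cache
// operations on the buffer never touch a neighbouring variable.
constexpr std::size_t roundUpToCacheLine(std::size_t bytes) {
    return (bytes + kCacheLine - 1) / kCacheLine * kCacheLine;
}

constexpr std::size_t kPayloadBytes = 1000;  // hypothetical transfer size

// Buffer in OCRAM, aligned to and padded to full cache lines.
DMAMEM static std::uint8_t dmaBuffer[roundUpToCacheLine(kPayloadBytes)]
    __attribute__((aligned(32)));

// Call before starting a DMA transfer that reads FROM dmaBuffer:
// writes back CPU data still sitting in the cache so DMA sees it.
void beforeDmaReadsBuffer() {
    arm_dcache_flush(dmaBuffer, sizeof(dmaBuffer));
}

// Call before the CPU reads data that DMA wrote INTO dmaBuffer:
// drops stale cached copies so the CPU fetches from RAM.
void beforeCpuReadsDmaData() {
    arm_dcache_delete(dmaBuffer, sizeof(dmaBuffer));
}
```

None of this is needed for buffers kept in DTCM, since that memory is not cached.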
 
@blackaddr - I think you are reasonably close... As mentioned in the other thread, I hacked up an earlier hacked up program (I believe by @FrankB) which prints out more stuff for the T4...

Also note: That @Paul has recently added more information about the memory up on the main PJRC website, details shown in other thread:
https://forum.pjrc.com/threads/5732...ferent-regions?p=219686&viewfull=1#post219686

Example from: an updated sketch from @defragster on the other thread, my output shows:
Code:
cmd /c "D:\\arduino-1.8.10\\hardware\\teensy\\..\\tools\\arm\\bin\\arm-none-eabi-gcc-nm -n C:\\Users\\kurte\\AppData\\Local\\Temp\\arduino_build_434165\\bar.ino.elf | D:\\GITHUB\\imxrt-size\\Debug\\imxrt-size.exe"

FlexRAM section ITCM+DTCM = 512 KB
    Config : aaaaaaab
    ITCM :  23008 B	(70.21% of   32 KB)
    DTCM :  12992 B	( 2.64% of  480 KB)
    Available for Stack: 478528
OCRAM: 512KB
    DMAMEM:   8272 B	( 1.58% of  512 KB)
    Available for Heap: 516016 B	(98.42% of  512 KB)
Flash:  32528 B	( 1.60% of 1984 KB)
I keep meaning to update my program to output some friendlier data, like someone else suggested on the thread.

But in particular: the FlexRAM has a total size of 512KB, made up of 16 banks of 32KB. The FlexRAM config register can assign each of these 32KB banks to either ITCM or DTCM (or to neither). Our link process assigns most of the code to be copied down to ITCM (Instruction Tightly Coupled Memory), and as many 32KB banks are assigned to it as the code needs to fit. The remaining banks are then assigned to DTCM (Data Tightly Coupled Memory).
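
The `Config : aaaaaaab` line in the output can be decoded with a few lines of code. This sketch assumes the bank encoding described in the i.MX RT1060 reference manual (two bits per bank, lowest bits first: 00 unused, 01 OCRAM, 10 DTCM, 11 ITCM), so check it against the manual before relying on it. All names here are made up for illustration.

```cpp
#include <cstdint>

// Assumed meaning of each 2-bit field in the FlexRAM bank config value.
enum class BankUse : std::uint32_t { Unused = 0, Ocram = 1, Dtcm = 2, Itcm = 3 };

constexpr unsigned kBankCount = 16;  // 16 banks x 32KB = 512KB
constexpr unsigned kBankKB    = 32;

// Count how many of the 16 banks are assigned to the given use.
constexpr unsigned countBanks(std::uint32_t config, BankUse use) {
    unsigned count = 0;
    for (unsigned bank = 0; bank < kBankCount; ++bank) {
        const std::uint32_t field = (config >> (2 * bank)) & 0x3u;
        if (field == static_cast<std::uint32_t>(use)) {
            ++count;
        }
    }
    return count;
}

// Size in KB of everything assigned to the given use.
constexpr unsigned kilobytesFor(std::uint32_t config, BankUse use) {
    return countBanks(config, use) * kBankKB;
}

// The value printed by the tool in this thread.
constexpr std::uint32_t kConfigFromThread = 0xAAAAAAABu;
```

With `kConfigFromThread`, one bank decodes as ITCM and fifteen as DTCM, which lines up with the 32 KB and 480 KB figures printed by the tool.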

By default, const data such as arrays goes into DTCM. I believe this also includes string literals, like: Serial.printf("This is a string");

Stack goes at the end of DTCM and yes the heap as well as anything defined as DMAMEM go into OCRAM... And yes our startup code enables the caching of this region of memory, which is done in the function: FLASHMEM void configure_cache(void)
Which is part of startup.c
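
The "Available for Stack" and "Available for Heap" figures in the output follow from simple arithmetic, sketched here. It assumes ITCM is rounded up to whole 32KB banks and that everything left in the 512KB FlexRAM becomes DTCM; the function names are hypothetical and this is not the actual imxrt-size source.

```cpp
#include <cstdint>

constexpr std::uint32_t kBankBytes    = 32 * 1024;
constexpr std::uint32_t kFlexRamBytes = 512 * 1024;
constexpr std::uint32_t kOcramBytes   = 512 * 1024;

// ITCM gets as many whole 32KB banks as the code needs.
constexpr std::uint32_t itcmBytes(std::uint32_t codeBytes) {
    return (codeBytes + kBankBytes - 1) / kBankBytes * kBankBytes;
}

// Whatever FlexRAM is left after ITCM is treated as DTCM.
constexpr std::uint32_t dtcmBytes(std::uint32_t codeBytes) {
    return kFlexRamBytes - itcmBytes(codeBytes);
}

// The stack sits at the end of DTCM, above the static data.
constexpr std::uint32_t stackBytes(std::uint32_t codeBytes, std::uint32_t dtcmUsed) {
    return dtcmBytes(codeBytes) - dtcmUsed;
}

// The heap gets the part of OCRAM not taken by DMAMEM variables.
constexpr std::uint32_t heapBytes(std::uint32_t dmamemUsed) {
    return kOcramBytes - dmamemUsed;
}
```

Plugging in the numbers from the output (23008 bytes of code, 12992 bytes of DTCM data, 8272 bytes of DMAMEM) gives 478528 bytes for the stack and 516016 bytes for the heap, matching the tool. It also suggests why ordinary global arrays totalling more than about 480KB cannot fit: they all land in DTCM.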

During the T4 beta, at times I would reconfigure the setup for this region to disable the cache, which for example fixed several DMA issues I was having. But it was sort of throwing the baby out with the bath water... Then again, if you are only using this region to do DMA...

Note: In a posting a few down from the one I linked to, @Paul in his current github code has added some new defines to allow you to keep some of your code in Flash instead of being copied down to ITCM.

So now you can keep both data and code from taking space away from that 512KB.
That is, you can keep data up in flash by doing the old-style stuff like:
Code:
const uint8_t myData[] PROGMEM = {....};
Note the word PROGMEM is important here, unlike T3.x which ignores it.

Now (with Paul's core), code can stay up in flash as well:
Code:
FLASHMEM void MyFunction() {...}

And I believe that you can now also define strings to stay in flash, like you did on AVR. It has been a while since I have done that, but I think it is something like:
Code:
Serial.printf(F("This is a string"));

There are of course still other qualifiers. For example, to make sure a function runs in fast memory:
Code:
FASTRUN void MyFunction() {...}

Allocate some memory in the upper 512KB block
Code:
DMAMEM uint8_t my_screen_buffer[320*240*2];
Side note on DMAMEM - I don't think you can have an initialized block here...
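
Since a DMAMEM block cannot carry initial values, one workaround is to fill it by hand at startup, for example from a table kept in flash. A minimal sketch under these assumptions: `kInitialPattern` and `initScreenBuffer()` are hypothetical names, the pattern bytes are arbitrary, and on the T4 flash is assumed to be readable through an ordinary pointer so that `memcpy` works. The `#else` branch is only a host-side stand-in.

```cpp
#include <cstdint>
#include <cstring>

#if defined(ARDUINO)
#include <Arduino.h>  // the Teensy core defines DMAMEM and PROGMEM
#else
#define DMAMEM        // host-side stand-ins: no special placement off-target
#define PROGMEM
#endif

// Hypothetical start-up data, kept in flash so it costs no RAM.
const std::uint8_t kInitialPattern[] PROGMEM = {0x00, 0x1F, 0xF8, 0x00};

// Same buffer as in the post: no initializer is allowed here.
DMAMEM std::uint8_t my_screen_buffer[320 * 240 * 2];

// Call once from setup(), before anything reads the buffer.
void initScreenBuffer() {
    // Start from a known state, since the contents are undefined at boot.
    std::memset(my_screen_buffer, 0, sizeof(my_screen_buffer));
    // Then copy the flash-resident pattern into the start of the buffer.
    std::memcpy(my_screen_buffer, kInitialPattern, sizeof(kInitialPattern));
}
```

If the buffer is later handed to DMA, the cache for that region has to be kept in sync as well, as discussed earlier in the thread.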
 
Good looking answer KurtE - indeed no DMAMEM is initialized. Paul's write-up on T4 memory is linked below, but it can also be seen in the startup code, where code and initialized data blocks are copied from flash but nothing is copied for the upper 512KB region.

>> https://www.pjrc.com/store/teensy40.html

Snippets:
RAM1 is accessed by 2 extremely high speed 64 bit buses, ITCM for running code, and DTCM for accessing data. For the highest possible performance, place code and variables in RAM1. Caching is not used with RAM1, because all location in RAM1 are accessed at the same speed as the M7 processor's caches.

String constants may be placed only in the flash using F("string") syntax.

DMAMEM Variables - Variables defined with "DMAMEM" are placed at the beginning of RAM2. Normally buffers and large arrays are placed here. These variables can not be initialized.
 