
Thread: Teensy 3.6 RAM memory use very high

  1. #1

    Teensy 3.6 RAM memory use very high

    I have been working for the past six years on a big project with a lot of code.
    Recently I have moved from Teensy 3.6 to Teensy 4.1.

    Now when I compile it, Arduino shows:
    Code:
    Sketch uses 307840 bytes (3%) of program storage space. Maximum is 8126464 bytes.
    Global variables use 345108 bytes (65%) of dynamic memory, leaving 179180 bytes for local variables. Maximum is 524288 bytes.
    How can the global variables suddenly become so large? On the Teensy 3.6 they use less than 10k. Here they are larger than the program code itself. Where can I start looking? I use a lot of PROGMEM arrays and also a derived class structure in my code.

    To get an idea of my code:
    https://github.com/sixeight7/VController_v3/

  2. #2
    PaulStoffregen (Senior Member)
    Join Date: Nov 2012
    Posts: 24,282
    Quote (originally posted by sixeight):
    How can the global variables suddenly become so large?
    RAM is being used for code. If you install 1.54-beta7, the memory usage info is improved to show how the memory is being used. 1.53 uses Arduino's default memory summary, which can't show more detail than everything being either flash or RAM.

  3. #3
    Quote (originally posted by PaulStoffregen):
    RAM is being used for code.
    So technically a Teensy 3.6 can have more code than a Teensy 4? Or is there a way around it? I did read something about FLASHMEM (https://forum.pjrc.com/threads/57326...ferent-regions), but to add that to every bit of the code seems impractical. Or does adding RAM help?

    I already added DMAMEM to the largest memory buffers.
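A minimal sketch of what moving a large buffer to RAM2 with DMAMEM can look like. The buffer name and size are hypothetical and not taken from the VController code. The empty fallback macro is only there so the file also compiles with a desktop compiler. As far as I know, DMAMEM variables are not initialized at startup on Teensy 4.x, so the sketch clears the buffer explicitly.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(ARDUINO)
#include <Arduino.h>  // Teensy core headers define DMAMEM
#endif
#ifndef DMAMEM
#define DMAMEM        // fallback for off-target builds only
#endif

// Hypothetical buffer size, for illustration only.
constexpr size_t kDisplayBufferSize = 16384;

// Placed in RAM2 on Teensy 4.x instead of RAM1 (ITCM/DTCM).
DMAMEM uint8_t displayBuffer[kDisplayBufferSize];

// Assumption: DMAMEM data is not initialized at startup,
// so write the initial contents before first use.
void clearDisplayBuffer() {
    memset(displayBuffer, 0, sizeof(displayBuffer));
}

// Helper so callers do not need to know the array's declared size.
size_t displayBufferSize() {
    return sizeof(displayBuffer);
}
```

Calling `clearDisplayBuffer()` once from `setup()` would be the natural place for the explicit initialization.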

    Really want to know, so I can make a future proof product. I sell these and would not like to run out of program space.

  4. #4
    defragster (Senior Member+)
    Join Date: Feb 2015
    Posts: 14,191
    There is a way around the appearance that "Teensy 3.6 can have more code than a Teensy 4".

    See FLASHMEM here - and other memory details: pjrc.com/store/teensy41.html#memory
    Code:
    FLASHMEM - Functions defined with "FLASHMEM" executed directly from Flash. If the Cortex-M7 cache is not already holding a copy of the function, a delay results while the Flash memory is read into the M7's cache. FLASHMEM should be used on startup code and other functions where speed is not important.
    All of the larger flash on the Teensy 4.x boards can hold code (or const data), but any code not in RAM1/ITCM is subject to longer load times when it isn't in the 32KB instruction cache.

    As noted, Teensyduino 1.54 beta 7 (and later releases) has an improved memory usage display for RAM1 [ITCM and DTCM] and RAM2 [DMAMEM], with details shown on the linked page.
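To illustrate the FLASHMEM advice, here is a small sketch that marks a startup-only function with FLASHMEM and leaves a frequently called function unmarked, which on Teensy 4.x should place it in RAM1 by default. The `Settings` struct and both function names are hypothetical. The empty fallback macro only exists so the example also builds off-target.

```cpp
#include <cstdint>

#if defined(ARDUINO)
#include <Arduino.h>  // Teensy 4.x core headers define FLASHMEM
#endif
#ifndef FLASHMEM
#define FLASHMEM      // fallback for off-target builds only
#endif

// Hypothetical settings structure, for illustration only.
struct Settings {
    uint8_t midiChannel;
    uint8_t brightness;
};

// Runs once at startup, so speed does not matter:
// keep it in flash and save RAM1 for code that needs it.
FLASHMEM void loadDefaultSettings(Settings &s) {
    s.midiChannel = 1;
    s.brightness = 128;
}

// No attribute: on Teensy 4.x the default placement for code is RAM1 (ITCM).
uint8_t scaleBrightness(const Settings &s, uint8_t percent) {
    return static_cast<uint8_t>((s.brightness * percent) / 100);
}
```

Only the definition needs the attribute, so existing call sites stay unchanged.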

  5. #5
    Frank B (Senior Member+)
    Join Date: Apr 2014
    Location: Germany
    Posts: 8,348
    Quote (originally posted by sixeight):
    Or is there a way around it?
    You could edit the linker script.
    Quote (originally posted by sixeight):
    I did read something about FLASHMEM [...] but to add that to every bit of the code seems impractical.
    Come on, even with 500 functions, with ctrl-c / ctrl-v it's done in a few minutes.

  6. #6
    So, how does it compare speedwise? Running code from flash on the Teensy 3.6 compared to running code from flash on the Teensy 4.1? The Teensy 3.6 has a much higher bandwidth - 411 vs 66 on the Teensy 4.1. Or is that not a relevant specification in this case?

  7. #7
    PaulStoffregen (Senior Member)
    Join Date: Nov 2012
    Posts: 24,282
    It really depends on how well (or poorly) the code makes use of the cache.

    Teensy 3.6 has an 8K cache. Teensy 4.1 has a pair of 32K caches, but cache misses have a much larger impact on performance.

    Code running from ITCM RAM doesn't use the cache, because all 512K of RAM that can be configured as ITCM / DTCM is as fast as the 32K caches.

  8. #8
    Frank B (Senior Member+)
    Join Date: Apr 2014
    Location: Germany
    Posts: 8,348
    ...and you can use both.
    If you keep your "hot" functions in RAM (ITCM size is at least 32kB anyway), they will be fast. Then the additional 32kB instruction cache (Teensy 4) for the flash does a pretty good job.
    Of course, uncached code will be slower. Just use it for initialization, settings, etc.: code that does not need to be fast. Remember, if there is a loop there, it will be in cache after the first run.

    I used a special linker script to keep everything in flash. Some benchmarks ran as fast as from ITCM. No difference at all.
    Last edited by Frank B; 05-07-2021 at 09:33 PM.
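One way to check the RAM versus flash question for your own code is to time the same loop from both locations. This is a rough sketch, not the benchmark mentioned in the thread: the function names are hypothetical, `micros()` is replaced by a `std::chrono` stand-in when building off-target, and the `noinline` attribute assumes GCC or Clang.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(ARDUINO)
#include <Arduino.h>  // Teensy core: FASTRUN, FLASHMEM, micros()
#else
#include <chrono>
// Desktop stand-in for Arduino's micros(), so the sketch builds off-target.
static uint32_t micros() {
    using namespace std::chrono;
    return static_cast<uint32_t>(
        duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count());
}
#endif
#ifndef FASTRUN
#define FASTRUN       // fallback for off-target builds only
#endif
#ifndef FLASHMEM
#define FLASHMEM      // fallback for off-target builds only
#endif

// The same loop twice; only the section attribute differs.
// noinline keeps the compiler from folding the body into the caller.
FASTRUN __attribute__((noinline))
uint32_t sumFromRam(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i) sum += data[i];
    return sum;
}

FLASHMEM __attribute__((noinline))
uint32_t sumFromFlash(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i) sum += data[i];
    return sum;
}

using SumFn = uint32_t (*)(const uint8_t *, size_t);

// Calls fn `repeats` times, stores the last result in *out,
// and returns the elapsed time in microseconds.
uint32_t timeCalls(SumFn fn, const uint8_t *data, size_t len,
                   int repeats, uint32_t *out) {
    volatile uint32_t sink = 0;  // volatile so the calls are not optimized away
    uint32_t start = micros();
    for (int r = 0; r < repeats; ++r) sink = fn(data, len);
    uint32_t elapsed = micros() - start;
    *out = sink;
    return elapsed;
}
```

On a Teensy 4.x, printing the two elapsed times from `setup()` for a realistic buffer and repeat count should show whether the flash copy stays in the instruction cache for your workload. A first call that misses the cache will be slower than later ones.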
