Thread: Targeting Teensy 4.1 uses 5x as much dynamic memory as Teensy 3.6

  1. #26
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,478
    Extending what FrankB has noted ... Referencing this page - See FLASHMEM :: pjrc.com/store/teensy40.html

    Any code not marked 'FLASHMEM' will be copied into the primary 512KB of RAM1.

    The 1062 is generally more powerful in all respects but one: it has no onboard flash directly connected to the MCU, and instead uses a somewhat slower external flash part for code/storage. That 'somewhat' becomes an even bigger factor when the core is running at 600 MHz rather than the 120 MHz of the T_3.5.

    So preloading code into RAM1 allows faster code execution, but it is allocated on 32KB boundaries and takes runtime RAM away from the sketch.

    Commenting out code leaves it unusable. Putting FLASHMEM on that code keeps it active but stored in FLASH, where it has to be read before execution if it is not already in the code cache from prior use.

    If there are rarely run or startup-only pieces of code, mark them as FLASHMEM and they will not be pulled into RAM on startup.

    It looks like this in use:
    Code:
    FLASHMEM
    bool LittleFS_Program::begin(uint32_t size)
    {
    ...
    Putting this line in boards.txt will show the memory breakdown:
    Code:
    teensy41.build.flags.ld=-Wl,--print-memory-usage,--gc-sections,--relax "-T{build.core.path}/imxrt1062_t41.ld"
    The linked page on memory layout also covers DMAMEM in RAM2. It runs at 25% of CPU speed but gives access to another 512KB of usable RAM. Using it takes explicit placement and handling in the sketch.
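
    To put numbers on the 32KB point, here is a small sketch of the arithmetic. It assumes RAM1 is 512KB handed out in 32KB blocks, with whatever the code does not claim left over for data. The names `itcmReserved` and `ram1LeftForData` are made up for illustration and are not part of any Teensy API.

```cpp
#include <cstdint>

constexpr uint32_t kRam1Bytes  = 512u * 1024u;  // primary RAM1, per the post above
constexpr uint32_t kBlockBytes = 32u * 1024u;   // allocation granularity, per the post above

// RAM1 bytes claimed by RAM-resident code, rounded up to whole 32KB blocks.
constexpr uint32_t itcmReserved(uint32_t codeBytes) {
    return ((codeBytes + kBlockBytes - 1u) / kBlockBytes) * kBlockBytes;
}

// RAM1 bytes left for variables once the code blocks are taken out.
// Returns 0 if the code alone would not fit.
constexpr uint32_t ram1LeftForData(uint32_t codeBytes) {
    return itcmReserved(codeBytes) >= kRam1Bytes
               ? 0u
               : kRam1Bytes - itcmReserved(codeBytes);
}
```

    For example, 40000 bytes of RAM-resident code would claim 65536 bytes (two blocks) and leave 458752 bytes of RAM1 for variables. Moving enough startup-only functions to FLASHMEM to drop below a 32KB boundary hands a whole block back to the sketch.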

  2. #27
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    @johan: yes, just write
    Code:
    PROGMEM const unsigned int bpb606bd04[10561] = {
    And you'll see NXP is not guilty

  3. #28
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    @Defragster: Do you have any Idea why my alternative linkage wasn't wanted?
    It could have been one click to get Johans program working. A sample-player does not need extreme high speed.

    Edit: Do we have a WIKI page about FLASHMEM/PROGMEM and memory layout?

  4. #29
    Quote Originally Posted by Frank B View Post
    @johan: yes, just write
    Code:
    PROGMEM const unsigned int bpb606bd04[10561] = {
    And you'll see NXP is not guilty
    Unfortunately I get this message:
    C:\Users\johan\AppData\Local\Temp\arduino_build_903646\sketch\src\samples\bpb606bd04.cpp:6:1: error: 'PROGMEM' does not name a type
    PROGMEM const unsigned int bpb606bd04[10561] = {
    ^

  5. #30
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,746
    Quote Originally Posted by Frank B View Post
    Do you have any Idea why my alternative linkage wasn't wanted?
    Please don't take this so personally Frank.

    I hope you can understand the timing wasn't ideal. We've been struggling to keep PJRC running since the pandemic hit, and during that time we released Teensy 4.1 (which had been mostly designed before February - but actually doing a commercial release is always the other 90% of the work). Substantial changes to the linker file were not desired at that time, because of the new product and difficulty just keeping the business going. Supporting a 2nd memory model was (and still is) anticipated to bring many small issues, likely increasing the tech support load, and also create a need for rewriting the web pages, which today only just barely manage to document the 1 memory model we're supporting. We just barely managed to get Teensy 4.1 released in May, and months later I *still* haven't managed to fully create the back side of the pinout card. It's simply the non-ideal reality of where we're at today.

    Long-term, offering a RAM-conserving memory model does make sense. Maybe we'll do it in 1.54, but more likely it will go into 1.55 or 1.56.

    Please, I want you to understand this sort of alternative memory model comes with a substantial support cost. It's more like a free puppy than a free beer! It does have a substantial benefit and it is worthwhile to implement. But we're limited on how fast things can happen, especially with the pandemic forcing PJRC to run with 1 less employee. Robin & I are working long hours every day just to keep everything going. I'm also trying to prioritize the long-needed File & FS addition, and SdFat & LittleFS as the main development goal for version 1.54. There are only so many high-risk software changes we can make under these not-so-ideal conditions (and a memory model change is high risk - very likely to expose subtle bugs or other issues in the many libraries we support). Even just taking the time to write forum messages like this one is a challenge right now.

  6. #31
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,746
    Quote Originally Posted by johanbilen View Post
    C:\Users\johan\AppData\Local\Temp\arduino_build_903646\sketch\src\samples\bpb606bd04.cpp:6:1: error: 'PROGMEM' does not name a type
    PROGMEM const unsigned int bpb606bd04[10561] = {
    ^
    Did you include Arduino.h? It's only automatically included for .ino files.

    You're compiling a .cpp file. Like all special names & symbols which aren't defined on the command line, you need to include the appropriate header which defines the feature you wish to use.
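
    A minimal sketch of what such a .cpp data file could look like. The array name, its four placeholder values and the `extern` declaration (which would normally live in a matching header) are assumptions for illustration, not johan's actual file. The empty `PROGMEM` fallback is only there so the file also compiles off-target.

```cpp
// samples.cpp - compiled as plain C++, so nothing is included automatically
#include <stddef.h>

#if defined(ARDUINO)
#include <Arduino.h>   // defines PROGMEM; .ino files get this for free, .cpp files do not
#else
#define PROGMEM        // assumption: empty stand-in for builds outside the Arduino toolchain
#endif

// Normally in samples.h, so the sketch can see the table.
extern const unsigned int sampleTable[4];
extern const size_t sampleTableLength;

// Placeholder data; a real sample table would be much longer.
PROGMEM const unsigned int sampleTable[4] = { 0x0001, 0x0002, 0x0003, 0x0004 };
const size_t sampleTableLength = sizeof(sampleTable) / sizeof(sampleTable[0]);
```

    Without the include, the compiler sees `PROGMEM` as an unknown identifier in the position of a type, which is exactly the "does not name a type" error quoted above.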

  7. #32
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,369
    Quote Originally Posted by Frank B View Post
    Edit: Do we have a WIKI page about FLASHMEM/PROGMEM and memory layout?
    Don't think so. I can write something up but since I don't use those often a pointer to some detailed info would be great.

  8. #33
    Member
    Join Date
    Feb 2017
    Location
    Chicago, IL
    Posts
    27
    Quote Originally Posted by Frank B View Post
    @Defragster: Do you have any Idea why my alternative linkage wasn't wanted?
    I wanted it; still do. I've fudged mine for now to work for me, but it's ugly.

  9. #34
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,478
    @luni the best ref to date is the pjrc.com/store/teensy40.html - and the thread that it came from - though it is long and spread out.

    A functional example or two would be great.

    Paul regularly uses FLASHMEM in the cores for CODE, and PROGMEM still works for DATA. But as noted, when used outside the sketch .ino (in a .cpp or other file), the file has to include Arduino.h for the compiler to know what it is.

    And FLASHMEM was created because PROGMEM on CODE conflicted in the linker when PROGMEM was used on data in the same unit.

    And const or static alone are not enough to get stuff kept on FLASH - it takes those linker segmentation commands.

  10. #35
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    @Paul, this is not the right place or medium for an answer to you. Just this much: I think you sometimes underestimate the users here. I deliberately don't mention any names here because I might forget someone. Some of them spend an incredible amount of time here in the forum, writing code or debugging. Or write a wiki article or just delete spam. They catch a lot of the work. They do it because they enjoy it.

  11. #36
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,746
    Yes, that stuff about the memory keywords probably needs to become a dedicated page with examples. Also needed is info about using const with pointers.

  12. #37
    Quote Originally Posted by PaulStoffregen View Post
    Did you include Arduino.h? It's only automatically included for .ino files.

    You're compiling a .cpp file. Like all special names & symbols which aren't defined on the command line, you need to include the appropriate header which defines the feature you wish to use.
    I have not included Arduino.h, I will try that, thanks. However, I still cannot understand how 8Mbyte Flash is insufficient when the Flash in T3.5 was enough to load my samples?

  13. #38
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    Quote Originally Posted by johanbilen View Post
    I have not included Arduino.h, I will try that, thanks. However, I still cannot understand how 8Mbyte Flash is insufficient when the Flash in T3.5 was enough to load my samples?
    It is sufficient. You just don't use it*



    *correctly


    The compiler tries to copy your arrays to the RAM. With PROGMEM you tell it: Don't do that, please.
    On Teensy 3.x, keeping const data in flash was the default.
    Last edited by Frank B; 12-03-2020 at 11:11 PM.

  14. #39
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,746
    The reason is that the linker scripts are designed differently on Teensy 4 than they were on Teensy 3. As Frank mentioned, the reason for the different design is memory performance.

    To explain a bit more, both have flash which is slower than the CPU. Both also have cache / buffers between the flash and CPU. There is no extra delay in many cases where data is in the cache. But the cost of a cache miss is very different. On Teensy 3.5 the worst case is waiting only 5 cycles. On Teensy 4.x, a cache miss can take hundreds of cycles, because the flash memory is external and accessed over only a 4-bit interface running at a relatively slow clock (compared to 600 MHz inside the chip).

    Because of the huge difference in cache miss performance, on Teensy 4.x the linker script is designed to put "const" variables in the fast DTCM RAM. On Teensy 3.x, "const" puts the variables in flash. To get the same behavior on Teensy 4.x, you need to add "PROGMEM", as we've tried to explain.

    There is indeed a good reason for this inconsistency between the boards. It's one of so many reasons why these new boards are called version 4, rather than Teensy 3.7 & 3.8. Many things about the memory architecture are very different from the older Teensy 3.x boards. We try to make most code compatible, but this situation where PROGMEM was optional on 3.x but required on 4.x is one of those places where the software support differs, because the hardware has very different performance trade-offs.
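
    A small sketch of the two declarations side by side. The table names and values are invented for illustration, and the empty `PROGMEM` fallback is an assumption that only exists so the snippet compiles off-target. The reading code is the same in both cases, since flash on these boards is read through ordinary pointers.

```cpp
#include <stdint.h>
#include <stddef.h>

#if defined(ARDUINO)
#include <Arduino.h>   // defines PROGMEM
#else
#define PROGMEM        // assumption: empty stand-in for off-target builds
#endif

// Teensy 4.x: placed in fast DTCM RAM. Teensy 3.x: stays in flash.
const uint32_t tableFast[4] = { 10, 20, 30, 40 };

// Teensy 4.x: stays in flash and costs no RAM, at the price of slower cache misses.
PROGMEM const uint32_t tableInFlash[4] = { 10, 20, 30, 40 };

// Works on either table; the caller does not need to know where the data lives.
uint32_t sumTable(const uint32_t* table, size_t count) {
    uint32_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += table[i];
    }
    return total;
}
```

    Which one to pick is the trade-off described above: large, rarely touched data (like audio samples) is a natural fit for PROGMEM, while small tables read in tight loops may be worth the RAM.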

  15. #40
    Quote Originally Posted by PaulStoffregen View Post
    The reason is that the linker scripts are designed differently on Teensy 4 than they were on Teensy 3. As Frank mentioned, the reason for the different design is memory performance.

    To explain a bit more, both have flash which is slower than the CPU. Both also have cache / buffers between the flash and CPU. There is no extra delay in many cases where data is in the cache. But the cost of a cache miss is very different. On Teensy 3.5 the worst case is waiting only 5 cycles. On Teensy 4.x, a cache miss can take hundreds of cycles, because the flash memory is external and accessed over only a 4-bit interface running at a relatively slow clock (compared to 600 MHz inside the chip).

    Because of the huge difference in cache miss performance, on Teensy 4.x the linker script is designed to put "const" variables in the fast DTCM RAM. On Teensy 3.x, "const" puts the variables in flash. To get the same behavior on Teensy 4.x, you need to add "PROGMEM", as we've tried to explain.

    There is indeed a good reason for this inconsistency between the boards. It's one of so many reasons why these new boards are called version 4, rather than Teensy 3.7 & 3.8. Many things about the memory architecture are very different from the older Teensy 3.x boards. We try to make most code compatible, but this situation where PROGMEM was optional on 3.x but required on 4.x is one of those places where the software support differs, because the hardware has very different performance trade-offs.
    Thanks to you guys for your patience in explaining this. I now get that I have to include Arduino.h in the .cpp file of my sample, as well as add PROGMEM, in order to put my samples into Flash memory.

  16. #41
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,369
    Quote Originally Posted by defragster View Post
    @luni the best ref to date is the pjrc.com/store/teensy40.html - and the thread that it came from - though it is long and spread out.
    A functional example or two would be great.
    Paul regularly uses FLASHMEM in the cores for CODE, and PROGMEM still works for DATA. But as noted, when used outside the sketch .ino (in a .cpp or other file), the file has to include Arduino.h for the compiler to know what it is.
    And FLASHMEM was created because PROGMEM on CODE conflicted in the linker when PROGMEM was used on data in the same unit.
    And const or static alone are not enough to get stuff kept on FLASH - it takes those linker segmentation commands.
    I started a WIKI article about this stuff here: https://github.com/TeensyUser/doc/wiki/Memory-Mapping. It currently lives in the Basics section. It is still a bit sketchy but good enough for a first review.

    While I wrote the article I thought a simple tool to do experiments with various memory locations and attributes might be fun. Here are a few examples of what you can do with it. (Please note, it currently only works with T3.x. I'll extend it to T4.x when I find some time):

    Code:
    int i;
    const double x = 42;  // <== constant
    
    void setup(){
        static int k;
    
        while (!Serial) {}
        printMemoryInfo(i);
        printMemoryInfo(x);
        printMemoryInfo(k);
    }
    
    void loop(){
    }
    Which prints:
    Code:
    i
      Start address: 0x1FFF'11E4
      End address:   0x1FFF'11E7
      Size:          4 Bytes
      Location:      RAM (not initialized)
    
    x
      Start address: 0x0001'08C8
      End address:   0x0001'08CF
      Size:          8 Bytes
      Location:      FLASH
    
    k
      Start address: 0x1FFF'11E8
      End address:   0x1FFF'11EB
      Size:          4 Bytes
      Location:      RAM (not initialized)
    More examples here: https://github.com/TeensyUser/doc/wi...the-memorytool


    Code:
    And const or static alone are not enough to get stuff kept on FLASH - it takes those linker segmentation commands.
    I wasn't able to reproduce that. Can you give an example where a const variable doesn't end up in FLASH?
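
    For Teensy 4.1, the same kind of experiment could start from a plain address classifier. This is a rough sketch and not the tool linked above. The region lengths are the ones printed by --print-memory-usage for the T4.1, while the origin addresses are my assumption about the T4.1 linker script and should be checked against imxrt1062_t41.ld. `MemRegion`, `RegionInfo` and `regionOf` are hypothetical names.

```cpp
#include <stdint.h>

enum class MemRegion { ITCM, DTCM, RAM2, FLASH, ERAM, Unknown };

struct RegionInfo {
    MemRegion id;
    uint32_t  origin;   // assumed start address, verify against the linker script
    uint32_t  length;   // region size in bytes
};

constexpr RegionInfo kRegions[] = {
    { MemRegion::ITCM,  0x00000000u, 512u * 1024u },
    { MemRegion::DTCM,  0x20000000u, 512u * 1024u },
    { MemRegion::RAM2,  0x20200000u, 512u * 1024u },        // where DMAMEM lands
    { MemRegion::FLASH, 0x60000000u, 7936u * 1024u },
    { MemRegion::ERAM,  0x70000000u, 16u * 1024u * 1024u }, // optional PSRAM
};

// Returns the region containing the address, or Unknown if none matches.
MemRegion regionOf(uint32_t address) {
    for (const RegionInfo& r : kRegions) {
        if (address >= r.origin && (address - r.origin) < r.length) {
            return r.id;
        }
    }
    return MemRegion::Unknown;
}
```

    On the board, an address can be fed in with something like `regionOf((uint32_t)(uintptr_t)&myVariable)`. If the placement rules discussed in this thread hold, a plain const array should report DTCM and a PROGMEM const array should report FLASH.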

  17. #42
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    Great tool, luni
    You may want to take a look at the .sym file, that gets generated on a build, too.
    (In the wiki there is a page which describes a way for a better *.sym generation)

    On Teensy 3, consts are located in flash by default. This is the reason why PROGMEM, for the case above, was not needed before Teensy 4.
    A look at the linker files reveals many interesting details.
    For example, the interrupt table was located in flash many years ago. Then I suggested copying it to RAM, where it is now. This was an unusual concept (for Arduino) at that time.

  18. #43
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,369
    Quote Originally Posted by Frank B View Post
    Great tool, luni
    You may want to take a look at the .sym file, that gets generated on a build, too.
    (In the wiki there is a page which describes a way for a better *.sym generation)
    Sure, when you added the info about nm to the WIKI some months ago, I added a call to it in VisualTeensy. I use it quite often to check things.

  19. #44
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,478
    Quote Originally Posted by luni View Post
    I started a WIKI article about this stuff here: https://github.com/TeensyUser/doc/wiki/Memory-Mapping. It currently lives in the Basics section. It is still a bit sketchy but good enough for a first review.

    While I wrote the article I thought a simple tool to do experiments with various memory locations and attributes might be fun. Here are a few examples of what you can do with it. (Please note, it currently only works with T3.x. I'll extend it to T4.x when I find some time):

    [CODE]
    ...

    Code:
    And const or static alone are not enough to get stuff kept on FLASH - it takes those linker segmentation commands.
    I wasn't able to reproduce that. Can you give an example where a const variable doesn't end up in FLASH?
    Just saw this when coming back for another post. Below is confirmation that what Paul said in his posts is correct - Wiki should get an update:

    Using the boards.txt ",--print-memory-usage" >> teensy41.build.flags.ld=-Wl,--print-memory-usage,--gc-sections,--relax "-T{build.core.path}/imxrt1062_t41.ld"
    Code:
    // BEFORE
    Memory region         Used Size  Region Size  %age Used
                ITCM:         64 KB       512 KB     12.50%
                DTCM:       17088 B       512 KB      3.26%
                 RAM:       12384 B       512 KB      2.36%
               FLASH:       88356 B      7936 KB      1.09%
                ERAM:          0 GB        16 MB      0.00%
    
    // Added :: const uint32_t myArr[30000]={0};
    Memory region         Used Size  Region Size  %age Used
                ITCM:         64 KB       512 KB     12.50%
                DTCM:      139968 B       512 KB     26.70%
                 RAM:       12384 B       512 KB      2.36%
               FLASH:      208548 B      7936 KB      2.57%
                ERAM:          0 GB        16 MB      0.00%
    
    // Edited to :: PROGMEM const uint32_t myArr[30000]={0};
    Memory region         Used Size  Region Size  %age Used
                ITCM:         64 KB       512 KB     12.50%
                DTCM:       17088 B       512 KB      3.26%
                 RAM:       12384 B       512 KB      2.36%
               FLASH:      208548 B      7936 KB      2.57%
                ERAM:          0 GB        16 MB      0.00%
    Added this code in setup so the compiler didn't optimize the allocation away:
    Code:
    	int vvv=0;
    	for ( int uuu=0; uuu<30000; uuu++) {
    		vvv+=myArr[uuu];
    	}
    	Serial.println(vvv);
    	Serial.println(myArr[29000]);
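
    Spelling out the arithmetic behind those three reports, using only the numbers printed above. The constant names are made up for this sketch. Both deltas come out slightly larger than the raw array size, and the thread does not establish why, so the sketch only records the differences.

```cpp
#include <stdint.h>

// Raw size of: const uint32_t myArr[30000]
constexpr uint32_t kArrayBytes = 30000u * sizeof(uint32_t);

// "Used Size" values copied from the three reports above.
constexpr uint32_t kDtcmBefore  = 17088u;
constexpr uint32_t kDtcmConst   = 139968u;   // plain const
constexpr uint32_t kDtcmProgmem = 17088u;    // PROGMEM const
constexpr uint32_t kFlashBefore  = 88356u;
constexpr uint32_t kFlashConst   = 208548u;
constexpr uint32_t kFlashProgmem = 208548u;

// RAM cost of the array in each variant.
constexpr uint32_t kDtcmCostConst   = kDtcmConst - kDtcmBefore;
constexpr uint32_t kDtcmCostProgmem = kDtcmProgmem - kDtcmBefore;

// Flash cost is the same either way: the initial values must be stored somewhere.
constexpr uint32_t kFlashCostConst   = kFlashConst - kFlashBefore;
constexpr uint32_t kFlashCostProgmem = kFlashProgmem - kFlashBefore;
```

    So the plain const version costs 122880 bytes of DTCM (exactly 120KB) for a 120000-byte array, the PROGMEM version costs no DTCM at all, and flash usage grows by 120192 bytes in both cases.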

  20. #45
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,369
    Mea culpa, as so often, I got distracted and never finalized this article. I'll add the missing info about PROGMEM asap.

  21. #46
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,478
    Cool, just came back here myself and saw there was a question.

    That ',--print-memory-usage' is nice to have turned on, I created a boards.local.txt to keep it around. I like the IMXRT-size thing - but not everyone can do that so easily.

  22. #47
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,369
    Quote Originally Posted by defragster View Post
    ...Just saw this coming back for another post. Below is confirmation what Paul said in his posts is correct - Wiki should get an update...
    Here you are: https://github.com/TeensyUser/doc/wiki/Memory-Mapping.
    Might be a good idea to double check all those special cases....

  23. #48
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    Great!

    Should we mention the cache and the three core cache-functions from imxrt.h?
    arm_dcache_flush();
    arm_dcache_delete();
    arm_dcache_flush_delete();

  24. #49
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    1,369
    Sure, never used them, where can I find info and use cases?

  25. #50
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    7,955
    It's very important if you use DMA.
    I can add the text.
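
    Until that text exists, here is a hedged sketch of how the first two functions are typically paired with a DMA buffer. It assumes the core's versions take a buffer address and a size in bytes, and that the buffer lives in cached RAM2 via DMAMEM. The buffer, the two wrapper functions and the off-target stand-ins (including the call counters) are invented for illustration.

```cpp
#include <stdint.h>

#if defined(ARDUINO) && defined(__IMXRT1062__)
#include <Arduino.h>   // brings in imxrt.h with the arm_dcache_* functions and DMAMEM
#else
// Assumption: stand-ins so the sketch compiles off-target. They only count calls.
#define DMAMEM
static int g_flushCalls = 0;
static int g_deleteCalls = 0;
static inline void arm_dcache_flush(void*, uint32_t)  { ++g_flushCalls; }
static inline void arm_dcache_delete(void*, uint32_t) { ++g_deleteCalls; }
#endif

// Cache lines are 32 bytes, so keep the buffer aligned to 32 and a multiple of 32 long.
DMAMEM static uint8_t dmaBuffer[256] __attribute__((aligned(32)));

// CPU filled the buffer and DMA will read it: write cached data back to memory first.
void beforeDmaTransmit() {
    arm_dcache_flush(dmaBuffer, sizeof(dmaBuffer));
}

// DMA writes the buffer and the CPU will read it: drop stale cached copies first.
void beforeDmaReceive() {
    arm_dcache_delete(dmaBuffer, sizeof(dmaBuffer));
}
```

    As I understand it, arm_dcache_flush_delete() does both steps and is meant for buffers used in both directions. Variables in DTCM should not need any of this, which is one more reason to know where each buffer ends up.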
