Heap stops expanding with 200Kb free RAM - Teensy3.6

simenzhor

Hi, I've been struggling with dynamically allocating buffer arrays for the SdFat library for several full work days now, and I figure it's time to ask for help.
To mention it up front: with smaller code snippets everything seems to work fine. The problems only arise in a larger project, which uses more memory.

I have modified the RamMonitor from this thread to work with Teensy3.6, by adding the following elif to the RAM start address constant:
Code:
static const uint32_t  HWADDRESS_RAMSTART = 
#elif defined(__MK66FX1M0__)
      0x1FFF0000; // Teensy 3.6 (from https://github.com/PaulStoffregen/cores/blob/238b102ac46d1184e7055943cb5fe60bcb8eabbe/teensy3/mk66fx1m0.ld)
And I have tried two other libraries to monitor the available RAM (MemoryFree and the sdFatUtil from SdFat). They all seem to roughly agree (within a few hundred bytes at least) that there is a lot of RAM left when my code crashes.

The exact thing my code is trying to do when it crashes is to dynamically allocate an array of size 380 using the new[] operator. The exact line can be found here:
https://github.com/SimenZhor/FerroF...problems/lib/Animation/src/Animation.cpp#L587.

Using the RamMonitor mentioned above I'm printing out this "state" right after that allocation is attempted:
Code:
Pointer to _duty_buf: 0x0,0
Pointer to _frame_buf: 0x1fff3350
==== memory report ====
heapsize: 25788
heapfree: 320
heaptotal: 26108
stacksize: 240
stackfree: 784
stacktotal: 1024
totalfree: 237648
total: 262144
free: 231 Kb (90.4% of 256 Kb)
stack: 1 Kb (0.4% of 256 Kb)
heap: 25 Kb (10.0% of 256 Kb)
As we can see, the pointer returned to _duty_buf by new[] is 0x0, which, by the way, the code fails to recognize as a null pointer for some reason.
I think the most important line is "heapfree: 320", which is obviously too small for an array of size 380. But for some reason the total heap size will not expand to allow this allocation. Comparing with a similar printout from earlier in the code (before any allocation happens), we see that heaptotal has expanded as expected up to this point:
Code:
==== memory report ====
heapsize: 2940
heapfree: 3304
heaptotal: 6244
stacksize: 160
stackfree: 864
stacktotal: 1024
totalfree: 248904
total: 262144
free: 242 Kb (94.6% of 256 Kb)
stack: 1 Kb (0.4% of 256 Kb)
heap: 6 Kb (2.4% of 256 Kb)

I'm using the PlatformIO environment in VSCode, but I tried to run my code in the Arduino IDE (version: 1.8.9 on Windows 10) as well with the same result.

I've tried to follow the tips from these threads:
https://forum.pjrc.com/threads/46506-3-6-Available-Memory
https://forum.pjrc.com/threads/2655...t-seems-to-just-keep-going-(Teensy-3-1)/page2

There is no other hardware connected to the Teensy except for an SD card in the SD slot.
 
With some of these random-ish crashes, it can be difficult to pinpoint what went wrong, especially since we have no code provided here to help figure out what is going on. That is why there is the forum rule that you should include complete code to reproduce the issue.

But if I were guessing, you could be barking up the wrong tree.

I have used malloc to allocate a lot of memory, for example a frame buffer for a display of 240x320x2 bytes, or 153600 bytes...

More likely there is something else going on, probably something that corrupted the heap. That can be lots of different things, like:
a) Stack and heap collide. I did not look through your stack monitor to see how it deduces how much stack is used, but I have seen programs that put a very large buffer on the stack. I have sometimes found those by filling all memory with a certain pattern and then having code walk the space between the heap and the stack pointer to see where that pattern ends...

b) Writing to some random uninitialized variable that happens to be up in high memory and happens to collide with the heap code...

c) Overwriting what you allocated. I have seen code in the past that does something like: my_array = malloc(array_size); for (i=1; i <= array_size; i++) my_array[i] = 0;
Here the person allocated the array for N bytes but used N+1. This works in many cases, as the heap will typically allocate blocks aligned to 8 or 16 bytes. But if the requested size turns out to be an exact multiple of that alignment, there is no slack after the array, and the extra write corrupts the heap.

d) Using memory you already freed and/or freeing it twice. I have seen this multiple times.

e) Or using some form of library that does multi-threading, or something similar; I am pretty sure the default heap code is not thread safe. This can probably also happen if you do things like malloc or new inside an interrupt handler. That is, suppose the main-line code is inside a malloc and is ready to dole out some memory when the interrupt code is called, and it also goes into the heap and tries to allocate... The two allocations may conflict.
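
The pattern-fill check from suggestion (a) can be sketched roughly as follows. This is only an illustration that runs on a plain byte array; on a Teensy the region would be the gap between the top of the heap and the stack pointer, which this sketch does not try to locate. The names kPaint, paint_region, and untouched_bytes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fill value; any byte unlikely to occur naturally works.
constexpr uint8_t kPaint = 0xA5;

// Fill a region with the pattern before the code under test runs.
void paint_region(uint8_t* region, size_t size) {
    assert(region != nullptr);
    std::memset(region, kPaint, size);
}

// Walk up from the low end and count bytes that still hold the pattern.
// With a descending stack, the low end is the part furthest from the
// stack pointer, so the count is the headroom that was never written.
size_t untouched_bytes(const uint8_t* region, size_t size) {
    assert(region != nullptr);
    size_t n = 0;
    while (n < size && region[n] == kPaint) {
        ++n;
    }
    return n;
}
```

If the count drops to zero after the program has run for a while, something has written all the way through the region, which is a strong hint that the stack and heap have met.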

So again not sure what else to suggest.
 
With some of these random-ish crashes, it can be difficult to pinpoint what went wrong, especially since we have no code provided here to help figure out what is going on. That is why there is the forum rule that you should include complete code to reproduce the issue.

I did include the entire repo of code in my original post. I even made a new branch just to reproduce this issue, and linked to the exact line where things go wrong. Here is the link again: https://github.com/SimenZhor/FerroF...problems/lib/Animation/src/Animation.cpp#L587

a) Stack and heap collide. I did not look through your stack monitor to see how it deduces how much stack is used, but I have seen programs that put a very large buffer on the stack. I have sometimes found those by filling all memory with a certain pattern and then having code walk the space between the heap and the stack pointer to see where that pattern ends...
I'll admit that I don't fully understand what all the variables and constants in the RamMonitor represent. That said, I have now tried three different repositories that claim to do the same thing (determine the space between the top of the heap and the bottom of the stack), and they match quite closely (within a few hundred bytes). Additionally, I have confirmed that the new[] operator returns 0x0 (which for some reason can't be detected as a null pointer by any of the tests in the next line: https://github.com/SimenZhor/FerroF...problems/lib/Animation/src/Animation.cpp#L588 ). So my code doesn't crash exactly at the point I mentioned, but it does later, when I attempt to write to address 0x0 plus some offset, which is in the read-only Flash.

I do appreciate all the suggestions though! They summarize very well what I was trying to find when googling around before I knew exactly what went wrong (the 0x0 pointer returned by new[], which does not happen if the array size is significantly smaller than the 'heapfree: 320' printout from the RamMonitor).

Edit: I'll post an extract of what the code does around where I get this pointer. I have removed a few lines for clarity; the rest of the code is available in the link above.
Code:
    uint8_t* frame_buf8 = new uint8_t[frames*cols*sizeof(uint32_t)];
    _frame_buf = (uint32_t*) frame_buf8;
    
    _duty_buf = new uint8_t[frames*cols*rows*sizeof(uint8_t)];              //This returns 0x0 (seen in the printout below)
    if (&_duty_buf[0] == 0 || _duty_buf == 0 || _duty_buf == (uint8_t*) nullptr || &_duty_buf[0] == (uint8_t*) nullptr || _duty_buf == nullptr || &_duty_buf[0] == nullptr || _duty_buf == NULL || &_duty_buf[0] == NULL ||&_duty_buf[0] ==(uint8_t *) 0 || _duty_buf == (uint8_t *)0 )
    {
        //For some reason none of the checks above return true, so the code believes that the pointer is valid and this return never hits.
        return -1;
    } else {
        Serial.printf("Pointer to _duty_buf: %p,%d\n",_duty_buf,_duty_buf); //This is where the 0x0 value is observed, when it is later passed on                  
                                                                            //to another method as (uint8_t*) it is identified as a nullptr
        Serial.printf("Pointer to _frame_buf: %p\n",_frame_buf);
        report_ram2();                                                      //This function prints out all the memory info from my OP
    }
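
One possible explanation for the null checks never firing, offered as an assumption rather than something confirmed in this thread: GCC treats the result of a plain new expression as never null, because standard new reports failure by throwing, so the optimizer may drop comparisons against null. If the build disables exceptions and the underlying allocator returns 0 anyway, the check is already gone by then. The std::nothrow form is specified to return a null pointer on failure, so its result can be tested. A minimal sketch with hypothetical helper names (try_alloc_bytes, allocate_duty_buf), not code from the repository:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>      // std::nothrow

// Hypothetical helper: allocates `count` bytes, returns nullptr on failure.
uint8_t* try_alloc_bytes(size_t count) {
    assert(count > 0);  // a zero-length request is treated as a caller bug here
    // The nothrow form reports failure through its return value, so the
    // compiler has to keep any null check made on the result.
    return new (std::nothrow) uint8_t[count];
}

// Returns 0 on success and -1 if the buffer could not be allocated,
// mirroring the early return in the excerpt above.
int allocate_duty_buf(uint8_t*& duty_buf, size_t frames, size_t cols, size_t rows) {
    duty_buf = try_alloc_bytes(frames * cols * rows);
    if (duty_buf == nullptr) {
        return -1;
    }
    return 0;
}
```

Calling allocate_duty_buf(buf, 2, 19, 10) requests 380 bytes; the frame, column, and row counts are placeholders chosen to match the array size mentioned in the post. Memory obtained this way is still released with delete[].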
 
Again, it is sometimes hard to fully figure out what is going on. You may need to try different things and see if anything helps.

Often it is a process of elimination. As I mentioned, the cause is often that the heap itself has been corrupted, not that you have exhausted it.

Again, I often start by looking at places that use lots of stack space, such as your function int Animation::save_to_SD_card(uint16_t file_index):

You have some arrays on the stack:
Code:
    uint32_t frame_buf[frames*cols];
    uint8_t duty_buf[frames*cols*rows];

Or, for example, in the function you are dying in, which I believe is the read_from_SD_card method?

You have things at start like:
Code:
    for (int i = 0; i < _num_frames; i++)
    {
        Serial.printf("i:%d\n",i);
        _frames[i]->delete_frame();
    }
    delete _frames;
Well, were these actually allocated previously? If not, you should probably skip the delete...
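
If _frames itself never came from new, one way to structure the cleanup is to delete only the objects the array points to and leave the array alone. The sketch below assumes a fixed-size global array of pointers; Frame, kMaxFrames, add_frame, and clear_frames are hypothetical stand-ins, not the repository's classes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the repository's frame class.
struct Frame {
    static int live;            // number of Frame objects currently alive
    Frame()  { ++live; }
    ~Frame() { --live; }
};
int Frame::live = 0;

constexpr size_t kMaxFrames = 4;    // hypothetical capacity

// The pointer array has static storage, so the array itself must never
// be passed to delete or delete[].
Frame* frames[kMaxFrames] = {};

// Put a heap-allocated Frame into a slot, releasing any previous occupant.
void add_frame(size_t slot) {
    assert(slot < kMaxFrames);
    delete frames[slot];
    frames[slot] = new Frame();
}

// Release only what came from new, and null each slot so that a second
// call is harmless (deleting a null pointer is a no-op).
void clear_frames() {
    for (size_t i = 0; i < kMaxFrames; ++i) {
        delete frames[i];
        frames[i] = nullptr;
    }
}
```

Nulling each slot also answers the "were these actually allocated?" question at run time: an empty slot is simply skipped.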

Again lots of different things to look at and try...
 
At line 45 you have:
Code:
        for (int y = 0; y < rows; y++)
        {
            for (int x = 0; x < cols; x++)
            {
                if (duty_cycle[x*cols + y] < DUTY_CYCLE_RESOLUTION)

I don't think you are addressing the array correctly. Shouldn't it be y*cols+x?
The array is declared to be 10x19, so when x and y reach their limits (x=18 and y=9), x*cols+y is 18*19+9 = 351, when there are only 190 elements in the array.
In this case you are only reading outside the array. If you have a similar addressing problem somewhere else when writing to the array, it might cause the problems you are seeing.

Pete
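
The indexing arithmetic above can be checked with a small row-major helper. This sketch assumes rows = 10 and cols = 19, as implied by the loop limits in the post; index_of and read_duty are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kRows = 10;   // from the post: array declared 10x19
constexpr size_t kCols = 19;

// Row-major offset of element (x, y): skip y full rows, then x columns.
constexpr size_t index_of(size_t x, size_t y) {
    return y * kCols + x;
}

// Compile-time checks of the corner cases discussed above.
static_assert(index_of(0, 0) == 0, "first element");
static_assert(index_of(kCols - 1, kRows - 1) == kRows * kCols - 1,
              "last element is index 189");
static_assert((kCols - 1) * kCols + (kRows - 1) == 351,
              "x*cols + y overshoots the 190-element array");

// Bounds-checked read using the corrected index.
uint8_t read_duty(const uint8_t* duty_cycle, size_t x, size_t y) {
    assert(x < kCols && y < kRows);
    return duty_cycle[index_of(x, y)];
}
```

With the corrected formula the largest index is 9*19+18 = 189, the last valid element.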
 
You have things at start like:
Code:
delete _frames;
Well, were these actually allocated previously? If not, you should probably skip the delete...
Ooooh, I think you're right! That _frames array was allocated in static memory at compile time (the objects it pointed to were not, though, so those were correctly deleted)! Thanks a heap (sorry for the pun)!

I don't think you are addressing the array correctly. Shouldn't it be y*cols+x?

You are right! Although this is a more innocent bug, I really appreciate that you took the time to look at my code this thoroughly!
 