malloc/free for EXTMEM and DTCM

luni

This morning I stumbled over Andrey Rys's static memory allocator library smalloc (https://github.com/electrorys/smalloc). This library provides malloc, calloc, free and a couple of other memory allocation functions which work on a static/global buffer that is passed into the library. Thus, it can easily be used for dynamic memory allocation on EXTMEM and DTCM without changing anything in the core files. I did a few tests, which worked without obvious issues.
Of course smalloc also provides sm_free and sm_calloc. You can choose whether you want your memory zeroed, and you can attach an "out of memory" callback to handle memory overrun issues. It can also handle more than one memory pool, which might come in handy from time to time.

For the convenience of Arduino users I cloned Andrey's (Unix?) library, sorted the original files into the folders required by the Arduino library specification and added a few usage examples. I also added a header handling the extern "C" {} wrapper required for the usual Arduino ino/cpp projects. You can find the clone here: https://github.com/luni64/static_malloc


Here is a quick example showing the usage:

Code:
#include "static_malloc.h"

constexpr size_t myHeapSize = 1024 * 100;
EXTMEM uint8_t myHeap[myHeapSize];       // 100kB memory pool on the external ram chip

//----------------------

float* floatArray;
uint32_t* u1;
char* text;

void setup()
{
    sm_set_default_pool(myHeap, myHeapSize, false, nullptr); // init with the EXTMEM pool, do not zero the buffer, no out of memory callback

    u1 = (uint32_t*)sm_malloc(sizeof(uint32_t));        // one uint32_t
    floatArray = (float*)sm_malloc(10 * sizeof(float)); // array of 10 floats
    text = (char*)sm_malloc(100);                       // c-string 100 bytes

    *u1 = 100;
    for (int i = 0; i < 10; i++) { floatArray[i] = i * M_PI; }
    text = strcpy(text, "Hello World");

    while (!Serial) {}
    Serial.println(text);
    Serial.println(*u1);
    Serial.println(floatArray[4]);
}

void loop(){
}
Output:
Code:
Hello World
100
12.57

Unfortunately, I forgot my T4.1 at another place and can currently only test it with a DTCM buffer. Would be great if someone could give the external RAM a try.

I added examples to the repo showing the use of the "out of memory" callback and how to use 'placement new' to construct c++ objects on the EXTMEM heap.
 
Quick mod gave:
Code:
T:\tCode\libraries\static_malloc\examples\Simple\simple.ino Oct 10 2020 14:09:20
PSRAM size =16 MB
f1 address: 0x7000000c, value: 3.141593
f2 address: 0x700000b4, value: 1.414214
t2 address: 0x70000030, value: This is a text
----------------------------

f3 address: 0x7000000c, value: 42.000000
f2 address: 0x700000b4, value: 1.414214
t2 address: 0x70000030, value: This is a text

with edits:
Code:
#include "static_malloc.h"

// reserve some space for sm_malloc. sm_malloc will never overwrite this buffer
// You can generate the buffer wherever you want. I.e. in normal DTCM Memory, on the external RAM chip (EXTMEM) etc.

extern uint8_t external_psram_size;
constexpr size_t myHeapSize = 1024 * 100;
EXTMEM uint8_t myHeap[myHeapSize];

void setup()
{
    sm_set_default_pool(myHeap, myHeapSize, 0, nullptr);

    while ( !Serial && millis()<4000 );
    Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
    Serial.printf( "PSRAM size =%d MB\n", external_psram_size );

    float* f1 = (float*)sm_malloc(sizeof(float)); // allocate a float
    char* t1 = (char*)sm_malloc(100);             // allocate a 100 byte cstring
    float* f2 = (float*)sm_malloc(sizeof(float)); // another float

    *f1 = M_PI;
    *f2 = M_SQRT2;
    snprintf(t1, 100, "This is a text");

    Serial.printf("f1 address: %p, value: %f\n", f1, *f1);
    Serial.printf("f2 address: %p, value: %f\n", f2, *f2);
    Serial.printf("t2 address: %p, value: %s\n", t1, t1);
    Serial.println("----------------------------\n");

    sm_free(f1);

    float* f3 = (float*)sm_malloc(sizeof(float)); // allocate a float
    *f3 = 42;

    Serial.printf("f3 address: %p, value: %f\n", f3, *f3); // should get the same address as f1 had
    Serial.printf("f2 address: %p, value: %f\n", f2, *f2);
    Serial.printf("t2 address: %p, value: %s\n", t1, t1);
}

void loop()
{
}
 
OK, I hooked up a T4.1 with one PSRAM chip and ran the three example programs:
Simple.ino
Code:
f1 address: 0x20000de8, value: 3.141593
f2 address: 0x20000e90, value: 1.414214
t2 address: 0x20000e0c, value: This is a text
----------------------------

f3 address: 0x20000de8, value: 42.000000
f2 address: 0x20000e90, value: 1.414214
t2 address: 0x20000e0c, value: This is a text
MemoryOverflow.ino
Code:
Start
10 bytes allocated
Memory overrun, 200 requested bytes can not be al
And the LED blinked repeatedly

UsingNew.ino
Code:
no serial output but LED Blinked in accordance with the sketch

Very cool - now I have to try it with one of my more involved sketches :) But that is for another day.

EDIT: Decided to run @defragster's test sketch as well and of course got the same results:
Code:
C:\Users\Merli\AppData\Local\Temp\arduino_modified_sketch_387168\simple.ino Oct 10 2020 19:17:04

PSRAM size =8 MB
f1 address: 0x7000000c, value: 3.141593
f2 address: 0x700000b4, value: 1.414214
t2 address: 0x70000030, value: This is a text
----------------------------

f3 address: 0x7000000c, value: 42.000000
f2 address: 0x700000b4, value: 1.414214
t2 address: 0x70000030, value: This is a text
 
Good that it works with real EXTMEM as well :)

Here is another example, showing how to use two heaps, one on DTCM and one on the external RAM:

Code:
#include "static_malloc.h"

smalloc_pool DTCM_Pool;
smalloc_pool EXTM_Pool;

void setup()
{
    constexpr size_t bufSize = 1024 * 200;                          // 200kB each
    static EXTMEM uint8_t EXTM_Buf[bufSize];                        // external RAM buffer
    static uint8_t DTCM_Buf[bufSize];                               // internal fast RAM (DTCM) buffer

    sm_set_pool(&EXTM_Pool, EXTM_Buf, bufSize, true, nullptr);      // register EXTMEM pool
    sm_set_pool(&DTCM_Pool, DTCM_Buf, bufSize, true, nullptr);      // register DTCM pool
                                                                    
    //-------------------------------------------------------------------------------------------
    while(!Serial){}

    float* f  = (float*)sm_malloc_pool(&DTCM_Pool, sizeof(float));  // dynamically allocate a float in the DTCM pool
    char* str  = (char*)sm_malloc_pool(&EXTM_Pool, 2 * 1024);       // dynamically allocate a 2kB text buffer in the EXTMEM pool

    *f = M_PI;                                                      // use heap variables..
    strcpy(str, "Hello external RAM");

    Serial.printf("DTCM:     f:  %f\n", *f);
    Serial.printf("EXTMEM: str:  %s\n", str);

    sm_free_pool(&DTCM_Pool, f);                                    // release used memory
    sm_free_pool(&EXTM_Pool, str);
}

void loop(){
}

Gives:
Code:
DTCM:     f:  3.141593
EXTMEM: str:  Hello external RAM
 
Cool that it can sm_alloc from multiple pools! There is a text label error in setup(): simple.ino prints 't2' where it should print 't1':
Code:
    Serial.printf("t1 address: %p, value: %s\n", t1, t1);

<edit>
>> CHEAT below: since the EXTMEM size isn't known at compile time, I reserve 8MB at compile time and double the pool size when there is 16MB of EXTMEM. It could also have been done the other way round: reserve 16MB and limit the pool to 8MB.
The allocs are too big for ALL of them to succeed with only 8MB of EXTMEM, but it seems to run right.

That means we could sm_alloc() from the trapped ITCM block where it could be 1KB to 31KB in size!

Wondered about that, but didn't look at the code. Wrote this to do some random alloc/free and it has been running for about 9 hours:
Code:
T:\tCode\libraries\static_malloc\examples\SimplePlus\simpleplus.ino Oct 11 2020 03:24:29
PSRAM size =16 MB
f1 address: 0x700000d8, value: 3.141593
f2 address: 0x700007d4, value: 1.414214
t2 address: 0x70000750, value: This is a text
----------------------------

f3 address: 0x700000d8, value: 42.000000
f2 address: 0x700007d4, value: 1.414214
t2 address: 0x70000750, value: This is a text
0 address: 0x708e5030 size: 0x40000
5 address: 0x70174f0c size: 0x10000
1 address: 0x70925050 size: 0x20000
 >>> 0 FREE add: 0x708e5030	7 address: 0x7094506c size: 0x70000
 >>> 1 FREE add: 0x70925050	4 address: 0x708e5030 size: 0x50000
6 address: 0x709b508c size: 0x50000
0 address: 0x70a050a8 size: 0x50000
9 address: 0x70a550c4 size: 0x60000
 >>> 4 FREE add: 0x708e5030	4 address: 0x708e5030 size: 0x30000
3 address: 0x70ab50dc size: 0x40000
 >>> 4 FREE add: 0x708e5030	 >>> 7 FREE add: 0x7094506c	1 address: 0x708e5030 size: 0x30000
7 address: 0x70915048 size: 0x60000
 >>> 9 FREE add: 0x70a550c4	 >>> 0 FREE add: 0x70a050a8	 >>> 1 FREE add: 0x708e5030	4 address: 0x70a050a8 size: 0x70000
 >>> 4 FREE add: 0x70a050a8	 >>> 6 FREE add: 0x709b508c	1 address: 0x70975060 size: 0x50000
0 address: 0x709c507c size: 0x70000
2 address: 0x708e5030 size: 0x10000
 >>> 7 FREE add: 0x70915048	7 address: 0x708f5050 size: 0x70000
 >>> 0 FREE add: 0x709c507c	 >>> 7 FREE add: 0x708f5050	 >>> 2 FREE add: 0x708e5030	8 address: 0x708e5030 size: 0x70000
6 address: 0x709c507c size: 0x50000
4 address: 0x70a15098 size: 0x50000
 >>> 5 FREE add: 0x70174f0c	 >>> 4 FREE add: 0x70a15098	9 address: 0x70a15098 size: 0x60000
7 address: 0x70af50fc size: 0x50000
 >>> 6 FREE add: 0x709c507c	 >>> 8 FREE add: 0x708e5030	 >>> 7 FREE add: 0x70af50fc	5 address: 0x708e5030 size: 0x50000
8 address: 0x709c507c size: 0x40000
 >>> 1 FREE add: 0x70975060	6 address: 0x7093504c size: 0x80000
 >>> 3 FREE add: 0x70ab50dc	 >>> 9 FREE add: 0x70a15098	2 address: 0x70174f0c size: 0x10000
1 address: 0x70a0509c size: 0x80000
 >>> 2 FREE add: 0x70174f0c	2 address: 0x70a850b8 size: 0x50000
 >>> 5 FREE add: 0x708e5030	3 address: 0x708e5030 size: 0x20000
 >>> 6 FREE add: 0x7093504c	4 address: 0x7090504c size: 0x80000
6 address: 0x70ad50d4 size: 0x80000
9 address: 0x70b550f0 size: 0x60000
 >>> 3 FREE add: 0x708e5030	 >>> 8 FREE add: 0x709c507c	3 address: 0x70985068 size: 0x70000
 >>> 3 FREE add: 0x70985068	 >>> 9 FREE add: 0x70b550f0	0 address: 0x70174f0c size: 0x10000
9 address: 0x70985068 size: 0x60000
5 address: 0x70b550f0 size: 0x70000
 >>> 2 FREE add: 0x70a850b8	 >>> 5 FREE add: 0x70b550f0	 >>> 6 FREE add: 0x70ad50d4	 >>> 9 FREE add: 0x70985068	 >>> 4 FREE add: 0x7090504c	5 address: 0x708e5030 size: 0x10000
7 address: 0x708f5050 size: 0x60000
4 address: 0x70955068 size: 0x20000
9 address: 0x70975084 size: 0x70000
8 address: 0x70a850b8 size: 0x30000
6 address: 0x70ab50d0 size: 0x80000
 >>> 7 FREE add: 0x708f5050	 >>> 1 FREE add: 0x70a0509c	 >>> 5 FREE add: 0x708e5030	 >>> 6 FREE add: 0x70ab50d0	 >>> 4 FREE add: 0x70955068	 >>> 0 FREE add: 0x70174f0c	 >>> 9 FREE add: 0x70975084	2 address: 0x70174f0c size: 0x10000
4 address: 0x708e5030 size: 0x30000
5 address: 0x70915048 size: 0x60000
 >>> 2 FREE add: 0x70174f0c	3 address: 0x70975060 size: 0x80000
7 address: 0x709f507c size: 0x70000
 >>> 5 FREE add: 0x70915048	0 address: 0x70ab50d0 size: 0x80000
 >>> 4 FREE add: 0x708e5030	5 address: 0x708e5030 size: 0x40000
 >>> 7 FREE add: 0x709f507c	1 address: 0x709f507c size: 0x60000
 >>> 5 FREE add: 0x708e5030	 >>> 3 FREE add: 0x70975060	7 address: 0x708e5030 size: 0x30000
 >>> 1 FREE add: 0x709f507c	5 address: 0x70915048 size: 0x70000
 >>> 0 FREE add: 0x70ab50d0	 >>> 5 FREE add: 0x70915048	 >>> 7 FREE add: 0x708e5030	 >>> 8 FREE add: 0x70a850b8	
	 0 Alloc 512B at address: 0x70000750 
	 1 Alloc 512B at address: 0x7000AC44 
	 2 Alloc 512B at address: 0x7000B07C 
	 3 Alloc 512B at address: 0x7000BD24 
	 4 Alloc 512B at address: 0x7000C15C 
	 5 Alloc 512B at address: 0x7000C378 
	 6 Alloc 512B at address: 0x7000C594 
	 7 Alloc 512B at address: 0x7000CE04 
	 8 Alloc 512B at address: 0x7000D020 
	 9 Alloc 512B at address: 0x7000D23C 

 >>> 1 FREE add: 0x7000ac44	1 address: 0x708e5030 size: 0x60000
 >>> 4 FREE add: 0x7000c15c	 >>> 5 FREE add: 0x7000c378	 >>> 8 FREE add: 0x7000d020	 >>> 7 FREE add: 0x7000ce04	 >>> 2 FREE add: 0x7000b07c	4 address: 0x70945048 size: 0x40000
5 address: 0x70985068 size: 0x60000
 >>> 6 FREE add: 0x7000c594	8 address: 0x709e5080 size: 0x60000
 >>> 1 FREE add: 0x708e5030	6 address: 0x708e5030 size: 0x50000
 >>> 3 FREE add: 0x7000bd24	 >>> 8 FREE add: 0x709e5080	 >>> 0 FREE add: 0x70000750	7 address: 0x709e5080 size: 0x30000...

With space reserved for 10 alloc pointers, each loop() picks one at random: if it is nullptr it allocates, otherwise it frees. Every hundred loop() calls, all pointers are freed and then re-allocated with 512 bytes each, followed by a 2 second pause. There is also a 1 second pause every 10 loops, so the net rate is 10 allocs or frees every second, for 9 hours.

The updated 'SimplePlus.ino', where any Serial USB input causes a PAUSE until more Serial input is sent:
Code:
#include "static_malloc.h"

// reserve some space for sm_malloc. sm_malloc will never overwrite this buffer
// You can generate the buffer wherever you want. I.e. in normal DTCM Memory, on the external RAM chip (EXTMEM) etc.

extern uint8_t external_psram_size;
constexpr size_t myHeapSize = 1024 * 1024 * 8;
EXTMEM uint8_t myHeap[myHeapSize];
void* someAllocs[10];

void setup()
{
    // sm_set_default_pool(myHeap, myHeapSize, 0, nullptr);
    // CHEAT
    if ( external_psram_size == 8 )
        sm_set_default_pool(myHeap, myHeapSize, 0, nullptr);
    else
        sm_set_default_pool(myHeap, 2 * myHeapSize, 0, nullptr);

    while ( !Serial && millis() < 4000 );
    Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
    Serial.printf( "PSRAM size =%d MB\n", external_psram_size );

    float* f1 = (float*)sm_malloc(sizeof(float)); // allocate a float
    char* t1 = (char*)sm_malloc(100);             // allocate a 100 byte cstring
    float* f2 = (float*)sm_malloc(sizeof(float)); // another float

    *f1 = M_PI;
    *f2 = M_SQRT2;
    snprintf(t1, 100, "This is a text");

    Serial.printf("f1 address: %p, value: %f\n", f1, *f1);
    Serial.printf("f2 address: %p, value: %f\n", f2, *f2);
    Serial.printf("t2 address: %p, value: %s\n", t1, t1);
    Serial.println("----------------------------\n");

    sm_free(f1);

    float* f3 = (float*)sm_malloc(sizeof(float)); // allocate a float
    *f3 = 42;

    Serial.printf("f3 address: %p, value: %f\n", f3, *f3); // should get the same address as f1 had
    Serial.printf("f2 address: %p, value: %f\n", f2, *f2);
    Serial.printf("t1 address: %p, value: %s\n", t1, t1);
    sm_free(t1);
    sm_free(f3);
    sm_free(f2);
    for ( int ii = 0; ii < 10; ii++ ) someAllocs[ii] = nullptr;
}

uint32_t lCnt = 0;
void loop()
{
    uint32_t ii = random(10);
    lCnt++;
    if ( nullptr != someAllocs[ii] ) {
        sm_free(someAllocs[ii]);
        Serial.printf(" >>> %ld FREE add: %p\t", ii, someAllocs[ii]);
        someAllocs[ii] = nullptr;
    }
    else {
        uint32_t jj = 1024 * 64 * random(1, 8);
        someAllocs[ii] = sm_malloc( jj );
        Serial.printf("%ld address: %p size: 0x%X\n", ii, someAllocs[ii], jj );
    }
    if ( !(lCnt % 100) ) {
        for ( ii = 0; ii < 10; ii++ ) {
            if ( nullptr != someAllocs[ii] ) {
                sm_free(someAllocs[ii]);
                Serial.printf(" >>> %ld FREE add: %p\t", ii, someAllocs[ii]);
                someAllocs[ii] = nullptr;
            }
        }
        Serial.print('\n');
        for ( ii = 0; ii < 10; ii++ ) {
            someAllocs[ii] = sm_malloc( 512 );
            // cast the pointer so it matches the %X conversion
            Serial.printf("\t %ld Alloc 512B at address: 0x%X \n", ii, (unsigned)(uintptr_t)someAllocs[ii] );
        }
        Serial.print('\n');
        delay(2000);

    }
    if ( !(lCnt % 10) )
        delay(1000);
    if ( Serial.available() ) {
        while ( Serial.available() ) Serial.read();
        Serial.print(" ... PAUSED ... \n");
        while ( !Serial.available() ) delay(10);
        while ( Serial.available() ) Serial.read();
    }
}

Here are the last two runs of 100 after 9 hours:
Code:
	 9 Alloc 512B at address: 0x7000E100 

 >>> 8 FREE add: 0x7000dee4	 >>> 1 FREE add: 0x7000ac44	 >>> 6 FREE add: 0x7000daac	 >>> 2 FREE add: 0x7000c15c	 >>> 3 FREE add: 0x7000c378	 >>> 4 FREE add: 0x7000ce04	8 address: 0x70995064 size: 0x30000
 >>> 9 FREE add: 0x7000e100	9 address: 0x709c507c size: 0x70000
3 address: 0x70a3509c size: 0x70000
4 address: 0x70aa50bc size: 0x50000
 >>> 8 FREE add: 0x70995064	 >>> 0 FREE add: 0x70000750	 >>> 4 FREE add: 0x70aa50bc	1 address: 0x70995064 size: 0x20000
 >>> 3 FREE add: 0x70a3509c	8 address: 0x70a3509c size: 0x20000
4 address: 0x70a550b8 size: 0x30000
 >>> 4 FREE add: 0x70a550b8	0 address: 0x70a550b8 size: 0x20000
 >>> 1 FREE add: 0x70995064	 >>> 8 FREE add: 0x70a3509c	6 address: 0x70995064 size: 0x20000
 >>> 0 FREE add: 0x70a550b8	 >>> 7 FREE add: 0x7000dcc8	4 address: 0x70a3509c size: 0x50000
 >>> 5 FREE add: 0x7000d458	8 address: 0x70a850b8 size: 0x70000
7 address: 0x70af50d8 size: 0x20000
5 address: 0x70b150f4 size: 0x70000
 >>> 7 FREE add: 0x70af50d8	7 address: 0x70b85114 size: 0x70000
3 address: 0x70bf5134 size: 0x60000
 >>> 5 FREE add: 0x70b150f4	 >>> 8 FREE add: 0x70a850b8	 >>> 6 FREE add: 0x70995064	 >>> 9 FREE add: 0x709c507c	2 address: 0x70995064 size: 0x20000
 >>> 3 FREE add: 0x70bf5134	 >>> 4 FREE add: 0x70a3509c	4 address: 0x709b5080 size: 0x50000
 >>> 2 FREE add: 0x70995064	0 address: 0x70a0509c size: 0x30000
9 address: 0x70a350b4 size: 0x50000
6 address: 0x70995064 size: 0x20000
 >>> 7 FREE add: 0x70b85114	1 address: 0x70a850d0 size: 0x50000
2 address: 0x70174f0c size: 0x10000
3 address: 0x70ad50ec size: 0x70000
7 address: 0x70b4510c size: 0x30000
 >>> 3 FREE add: 0x70ad50ec	 >>> 7 FREE add: 0x70b4510c	 >>> 9 FREE add: 0x70a350b4	3 address: 0x70a350b4 size: 0x30000
7 address: 0x70ad50ec size: 0x20000
 >>> 4 FREE add: 0x709b5080	 >>> 7 FREE add: 0x70ad50ec	8 address: 0x709b5080 size: 0x30000
4 address: 0x70ad50ec size: 0x50000
 >>> 6 FREE add: 0x70995064	6 address: 0x70b25108 size: 0x30000
 >>> 3 FREE add: 0x70a350b4	7 address: 0x70995064 size: 0x10000
 >>> 1 FREE add: 0x70a850d0	9 address: 0x70a350b4 size: 0x60000
 >>> 8 FREE add: 0x709b5080	3 address: 0x709a5084 size: 0x30000
 >>> 6 FREE add: 0x70b25108	 >>> 7 FREE add: 0x70995064	 >>> 2 FREE add: 0x70174f0c	1 address: 0x70b25108 size: 0x70000
 >>> 3 FREE add: 0x709a5084	8 address: 0x70995064 size: 0x50000
 >>> 0 FREE add: 0x70a0509c	 >>> 4 FREE add: 0x70ad50ec	 >>> 8 FREE add: 0x70995064	7 address: 0x70995064 size: 0x70000
3 address: 0x70a05084 size: 0x20000
0 address: 0x70a950cc size: 0x30000
 >>> 0 FREE add: 0x70a950cc	4 address: 0x70a950cc size: 0x30000
2 address: 0x70ac50e4 size: 0x40000
 >>> 7 FREE add: 0x70995064	0 address: 0x70995064 size: 0x60000
6 address: 0x70b95128 size: 0x60000
7 address: 0x70bf5140 size: 0x20000
8 address: 0x70c1515c size: 0x20000
 >>> 4 FREE add: 0x70a950cc	 >>> 6 FREE add: 0x70b95128	6 address: 0x70b95128 size: 0x40000
 >>> 3 FREE add: 0x70a05084	3 address: 0x709f507c size: 0x40000
 >>> 7 FREE add: 0x70bf5140	 >>> 8 FREE add: 0x70c1515c	4 address: 0x70174f0c size: 0x10000
 >>> 6 FREE add: 0x70b95128	6 address: 0x70b95128 size: 0x50000
 >>> 9 FREE add: 0x70a350b4	 >>> 0 FREE add: 0x70995064	7 address: 0x70a3509c size: 0x70000
 >>> 1 FREE add: 0x70b25108	 >>> 2 FREE add: 0x70ac50e4	 >>> 3 FREE add: 0x709f507c	 >>> 4 FREE add: 0x70174f0c	 >>> 6 FREE add: 0x70b95128	 >>> 7 FREE add: 0x70a3509c	
	 0 Alloc 512B at address: 0x70000750 
	 1 Alloc 512B at address: 0x7000AC44 
	 2 Alloc 512B at address: 0x7000C15C 
	 3 Alloc 512B at address: 0x7000C378 
	 4 Alloc 512B at address: 0x7000CE04 
	 5 Alloc 512B at address: 0x7000D458 
	 6 Alloc 512B at address: 0x7000DAAC 
	 7 Alloc 512B at address: 0x7000DCC8 
	 8 Alloc 512B at address: 0x7000DEE4 
	 9 Alloc 512B at address: 0x7000E100 

 >>> 1 FREE add: 0x7000ac44	1 address: 0x70995064 size: 0x50000
 >>> 8 FREE add: 0x7000dee4	 >>> 3 FREE add: 0x7000c378	 >>> 0 FREE add: 0x70000750	 >>> 2 FREE add: 0x7000c15c	3 address: 0x709e5080 size: 0x40000
 >>> 1 FREE add: 0x70995064	 >>> 9 FREE add: 0x7000e100	 >>> 3 FREE add: 0x709e5080	3 address: 0x70995064 size: 0x40000
 >>> 5 FREE add: 0x7000d458	9 address: 0x70174f0c size: 0x10000
2 address: 0x709d5084 size: 0x40000
 >>> 6 FREE add: 0x7000daac	0 address: 0x70a150a4 size: 0x60000
 >>> 2 FREE add: 0x709d5084	8 address: 0x709d5084 size: 0x30000
6 address: 0x70a750bc size: 0x10000
 >>> 6 FREE add: 0x70a750bc	 >>> 7 FREE add: 0x7000dcc8	6 address: 0x70a750bc size: 0x60000
1 address: 0x70ad50d4 size: 0x70000
5 address: 0x70b450f4 size: 0x40000
 >>> 6 FREE add: 0x70a750bc	7 address: 0x70a750bc size: 0x40000
 >>> 8 FREE add: 0x709d5084	 >>> 0 FREE add: 0x70a150a4	 >>> 9 FREE add: 0x70174f0c	 >>> 1 FREE add: 0x70ad50d4	 >>> 7 FREE add: 0x70a750bc	7 address: 0x70174f0c size: 0x10000
2 address: 0x709d5084 size: 0x10000
1 address: 0x709e50a4 size: 0x50000
 >>> 2 FREE add: 0x709d5084	8 address: 0x70a350c0 size: 0x30000
 >>> 5 FREE add: 0x70b450f4	6 address: 0x70a650d8 size: 0x50000
 >>> 6 FREE add: 0x70a650d8	 >>> 8 FREE add: 0x70a350c0	0 address: 0x70a350c0 size: 0x20000
5 address: 0x70a550dc size: 0x40000
2 address: 0x709d5084 size: 0x10000
 >>> 7 FREE add: 0x70174f0c	 >>> 3 FREE add: 0x70995064	7 address: 0x70a950fc size: 0x70000
 >>> 1 FREE add: 0x709e50a4	8 address: 0x70995064 size: 0x30000
 >>> 5 FREE add: 0x70a550dc	 >>> 2 FREE add: 0x709d5084	9 address: 0x70174f0c size: 0x10000
3 address: 0x709c507c size: 0x30000
2 address: 0x70b0511c size: 0x70000
 >>> 0 FREE add: 0x70a350c0	 >>> 3 FREE add: 0x709c507c	1 address: 0x709c507c size: 0x60000
 >>> 1 FREE add: 0x709c507c	5 address: 0x709c507c size: 0x20000
 >>> 9 FREE add: 0x70174f0c	3 address: 0x709e5098 size: 0x50000
9 address: 0x70a350b4 size: 0x20000
 >>> 9 FREE add: 0x70a350b4	 >>> 7 FREE add: 0x70a950fc	 >>> 5 FREE add: 0x709c507c	6 address: 0x709c507c size: 0x20000
 >>> 2 FREE add: 0x70b0511c	5 address: 0x70a350b4 size: 0x60000
1 address: 0x70a950cc size: 0x40000
 >>> 4 FREE add: 0x7000ce04	 >>> 1 FREE add: 0x70a950cc	2 address: 0x70174f0c size: 0x10000
 >>> 6 FREE add: 0x709c507c	 >>> 5 FREE add: 0x70a350b4	 >>> 3 FREE add: 0x709e5098	3 address: 0x709c507c size: 0x10000
5 address: 0x709d509c size: 0x70000
0 address: 0x70a450bc size: 0x40000
7 address: 0x70a850dc size: 0x70000
 >>> 7 FREE add: 0x70a850dc	 >>> 8 FREE add: 0x70995064	 >>> 2 FREE add: 0x70174f0c	9 address: 0x70995064 size: 0x20000
1 address: 0x70a850dc size: 0x30000
2 address: 0x70ab50f4 size: 0x60000
 >>> 3 FREE add: 0x709c507c	 >>> 9 FREE add: 0x70995064	 >>> 2 FREE add: 0x70ab50f4	2 address: 0x70ab50f4 size: 0x70000
 >>> 1 FREE add: 0x70a850dc	 >>> 0 FREE add: 0x70a450bc	7 address: 0x70995064 size: 0x40000
9 address: 0x70a450bc size: 0x40000
 >>> 2 FREE add: 0x70ab50f4	 >>> 9 FREE add: 0x70a450bc	1 address: 0x70a450bc size: 0x50000
8 address: 0x70a950d8 size: 0x40000
 >>> 5 FREE add: 0x709d509c	3 address: 0x709d5084 size: 0x20000
 >>> 8 FREE add: 0x70a950d8	9 address: 0x709f50a0 size: 0x50000
 >>> 1 FREE add: 0x70a450bc	 >>> 3 FREE add: 0x709d5084	 >>> 7 FREE add: 0x70995064	 >>> 9 FREE add: 0x709f50a0	
	 0 Alloc 512B at address: 0x70000750 
	 1 Alloc 512B at address: 0x7000AC44 
	 2 Alloc 512B at address: 0x7000C15C 
	 3 Alloc 512B at address: 0x7000C378 
	 4 Alloc 512B at address: 0x7000CE04 
	 5 Alloc 512B at address: 0x7000D458 
	 6 Alloc 512B at address: 0x7000DAAC 
	 7 Alloc 512B at address: 0x7000DCC8 
	 8 Alloc 512B at address: 0x7000DEE4 
	 9 Alloc 512B at address: 0x7000E100 

 ... PAUSED ...
 
Good test, however you make it easy for the library since you clean the memory every 100 iterations :). For the fun of it I was thinking of doing a quick GUI showing a live memory map to watch fragmentation. Maybe I will find some time for it.
I really like the fact that smalloc only operates on the chunk of memory you give it. Together with the out of memory callback this makes use of dynamic allocation quite safe/testable I'd say.

>> CHEAT below: since the EXTMEM size isn't known at compile time, I reserve 8MB at compile time and double the pool size when there is 16MB of EXTMEM. It could also have been done the other way round: reserve 16MB and limit the pool to 8MB.
The allocs are too big for ALL of them to succeed with only 8MB of EXTMEM, but it seems to run right.

Yes, that works, but as you know, this is a dangerous cheat :)

Code:
  sm_set_default_pool(myHeap, 2*myHeapSize, 0, nullptr);

With this call the lib assumes that it got a reserved block of contiguous memory starting at 'myHeap' and being 2*myHeapSize bytes long. Since you actually only reserved 8MB but told the library that it got 16MB, it happily uses the full 16MB. Of course the compiler/linker does not know anything about this and would place additional EXTMEM variables above the first 8MB, where they would be overwritten by the library. However, this trick might be a good solution if you can ensure that you (and libraries) will use the EXTMEM exclusively with smalloc.
 

I figured 100 would be a good start, and coded for only blocks of 10 allocs. I really wanted to prove that everything would get returned to the free heap, with no idea how robust the library was expected to be, or if I could even get the code done in time to sleep :) Having a second set of 10 interweave and go away less often would be another way to validate. Also, since the test mostly fails on 8MB PSRAM, the alloc sizes are on the large side and not very variable.

That worked according to the addresses indicated, but nothing was ever stored or validated to be there. I should have at least put the address in the block [start and end, and the size too], but that would have added casting overhead :)

INDEED - dangerous - thus the CHEAT comment. There isn't a perfect solution for validation on every given system that may have 0, 8 or 16 MB onboard - at least without some more advanced work to have it really incorporated.

The other idea noted above would be safer in some ways since the linker can help, but it is not much prettier and would still need verification/change if no PSRAM EXTMEM were present:
Code:
constexpr size_t myHeapSize = 1024 * 1024 * 16;
EXTMEM uint8_t myHeap[myHeapSize];

...

    if ( external_psram_size == 8 )
        sm_set_default_pool(myHeap, myHeapSize/2, 0, nullptr);
    else
        sm_set_default_pool(myHeap, myHeapSize, 0, nullptr);
 
For the fun of it I was thinking of doing a quick GUI showing a live memory map to watch fragmentation. Maybe I find some time for it.

I took defragster's idea and quickly hacked a GUI to watch the library allocating and freeing chunks of memory in a 56000 byte memory pool. (Win10 binaries and sources here: https://github.com/luni64/static_malloc/tree/main/extras/MallocViewer)

Here is a short meditative video observing smalloc at work:

And here is the Teensy code in case somebody is interested:

Code:
#include "static_malloc.h"
#include "smalloc/smalloc_i.h"

constexpr unsigned heapSize = 800 * 70; // must match  GUI
constexpr unsigned nrOfChunks = 500;
constexpr unsigned maxChunkLen = 250;

void* chunks[nrOfChunks];
uint8_t myHeap[heapSize];

void setup()
{
    pinMode(LED_BUILTIN, OUTPUT);
    memset(chunks, 0, sizeof(chunks));
    sm_set_default_pool(myHeap, sizeof(myHeap), false, onError);
}

void loop()
{
    if (Serial.available())
    {
        switch (Serial.read())
        {
            case '*': // allocate/deallocate
            {
                size_t chunkLen = random(1, maxChunkLen);
                unsigned idx = random(0, nrOfChunks);

                if (chunks[idx] == nullptr) // if slot empty allocate memory and store pointer
                {
                    chunks[idx] = sm_malloc(chunkLen);
                    printInfo(idx, '+');

                } else
                {
                    printInfo(idx, '-');
                    sm_free(chunks[idx]);
                    chunks[idx] = nullptr;
                }
                break;
            }

            case 'c': // clear all memory
                for (unsigned i = 0; i < nrOfChunks; i++)
                {
                    if (chunks[i] != nullptr)
                    {
                        sm_free(chunks[i]);
                        chunks[i] = nullptr;
                    }
                }
                break;

            default:
                break;
        }
    }

    static elapsedMillis stopwatch = 0;
    if (stopwatch > 500)
    {
        stopwatch = 0;
        digitalToggle(LED_BUILTIN);
    }
}

// Helpers ------------------------------------------

void printInfo(unsigned idx, char type)
{
    size_t total, totalUser, totalFree;
    int nrBlocks;

    sm_malloc_stats(&total, &totalUser, &totalFree, &nrBlocks);

    smalloc_hdr* hdr = USER_TO_HEADER(chunks[idx]);

    unsigned blockStart = (char*)hdr - (char*)myHeap; // address relative to buffer start
    unsigned usrStart = blockStart + 12;              // block starts with 12byte header
    unsigned TagStart = usrStart + hdr->rsz;          // tag starts rsz bytes after user start
    unsigned blockEnd = TagStart + 12;                // tag is always 12 byte long

    Serial.printf("%c %u %u %u %u %u %u %u\n", type, idx, blockStart, blockEnd, total, totalUser, totalFree, nrBlocks);
}

unsigned onError(smalloc_pool*, unsigned)
{
    Serial.println("e 0 0 0\n");
    pinMode(LED_BUILTIN, OUTPUT);
    while (true)
    {
        digitalToggle(LED_BUILTIN);
        delay(50);
    }
}
 
I'm going to bring the smalloc files into the core library. I see you tried to contact the author a couple weeks ago.

https://github.com/electrorys/smalloc/issues/1

Since the repository hasn't had any updates for 3 years and the author's personal website appears not to work, I'm going to guess this code is pretty much abandoned. Fortunately it's MIT licensed.

If anyone objects, please speak up now!
 
Adding smalloc with its MIT license seems like a good idea to allow controlled allocation of EXTMEM, or other memory.

I did some testing as @luni presented it and it seemed decently robust. With no license issue, more testing is possible if it is done in a beta to PJRC's liking.
 
Since the repository hasn't had any updates for 3 years and the author's personal website appears not to work, I'm going to guess this code is pretty much abandoned. Fortunately it's MIT licensed.
This is exactly what I was thinking :)

Cloned the current core but it looks like something is broken. I get the following warning:
Code:
CORE [CC]  startup.c 
C:\toolchain\Teensyduino\current_fork\cores\teensy4/startup.c: In function 'configure_external_ram':
C:\toolchain\Teensyduino\current_fork\cores\teensy4/startup.c:443:3: warning: implicit declaration of function 'memset' [-Wimplicit-function-declaration]
   memset(&extmem_smalloc_pool, 0, sizeof(extmem_smalloc_pool));
   ^
C:\toolchain\Teensyduino\current_fork\cores\teensy4/startup.c:443:3: warning: incompatible implicit declaration of built-in function 'memset'
C:\toolchain\Teensyduino\current_fork\cores\teensy4/startup.c:443:3: note: include '<string.h>' or provide a declaration of 'memset'
CORE [CC]  eeprom.c

And Serial.printxx doesn't work anymore. Compiles and blinks but doesn't print
Code:
void setup()
{
  while (!Serial);
  pinMode(LED_BUILTIN, OUTPUT);

  Serial.printf("Start");
}

void loop()
{
  digitalToggle(LED_BUILTIN);
  delay(250);
}
 
That fixed the warning but I still can't print (neither with my makefile build nor with the IDE).
- Just to make sure, I reinstalled Arduino 1.8.13
- Installed Teensyduino 1.54 beta2
- Replaced the core files in the Arduino folder with the cloned ones

Serial somehow works, since the sketch waits until I connect SerMon. But it doesn't print anything. Tried teensy.exe and TyCommander.

Prints perfectly with an older installation.

I'll rewind the commits and see when it works again...
 
I tested on Ubuntu 18.04 with a fresh copy of Arduino 1.8.13 and the latest work-in-progress. Serial.printf() works fine.

Maybe I need to dig out my Windows test laptop?
 
I get the problems since this commit. Works before that.

Commit: dc567880488dc53ba2e71eb07e04356bda7e2a7e [dc56788]
Parents: fa1c5a1bca
Author: PaulStoffregen <paul@pjrc.com>
Date: Sunday, 18 October 2020 06:16:07
Committer: PaulStoffregen
Changes to support HAB (thanks dresden-fx)
 
I get the problems since this commit. Works before that.

Does a fresh copy of Arduino 1.8.13 and Teensyduino 1.54-beta2 work?

That commit did indeed cause problems on Teensy 4.1 (or perhaps more clearly expose long-standing problems). The issue is believed to be fully resolved by a later commit, which is part of 1.54-beta2. You can see several messages about the mysterious startup issue on the 1.54-beta1 thread.
 
Hi Paul,

Good news, will sync up. Glad there are some memory allocation functions for external memory. I also wish we had one for the area of memory many may have on their machine in the ITCM, when their main usage of DTCM/ITCM leaves a lot of area that is currently only used for stack. It is always a shame not to use this memory, as it is the main one you want to use for speed.
 
I tested on Ubuntu 18.04 with a fresh copy of Arduino 1.8.13 and the latest work-in-progress. Serial.printf() works fine.

Maybe I need to dig out my Windows test laptop?

Just as a check, I downloaded and installed the latest core over 1.54b2 and am seeing the same thing for both the T4.0 and the T4.1: Serial.printxx is not printing to SerMon. T3.6 still works though.
 