Can arrays in DMAMEM be initialized?

Bill Greiman

I have a SdFat user posting an SdFat issue. He wrote an initialized array to an SD and said the content was not correct.

I looked at his example and it appears a DMAMEM array like the example below is not initialized. It appears to have random content.

Code:
DMAMEM uint8_t test[] = {1,2,3,4,5,6,7,8,9,10};
void setup() {
 Serial.begin(9600);
 while (!Serial) {
 }
 for (size_t i = 0; i < sizeof(test); i++) {
  Serial.print(test[i]);
  Serial.print(' ');
 }
 Serial.println();
}
void loop() {}

Typical output:
64 23 114 16 242 140 4 49 61 115

The correct output is printed if I remove the DMAMEM.
1 2 3 4 5 6 7 8 9 10

What should I tell him?
 
No, they cannot be initialized. From the Teensy 4.0 and 4.1 product pages:

  • DMAMEM - Variables defined with DMAMEM are placed at the beginning of RAM2. Normally buffers and large arrays are placed here. These variables can not be initialized, your program must write their initial values, if needed.
 
@MichaelMeissner is of course correct. As noted if user code wants known values there, they must be copied/created.

The T_4.x family has everything in FLASH on startup. The PROGMEM/Flash based ResetHandler() code moves RAM1 Program and Data from Flash to their expected locations before calling setup().

There are no such provisions for RAM2/DMAMEM. The processor uses some of that RAM2 space on initial starting, and the rest is left holding whatever happens to be there. On a warm restart that is prior values beyond the processor startup security scratch space.
 
If anyone knows a way to accomplish this, I would like to make future Teensyduino able to initialize DMAMEM. But so far, all my attempts have been unsuccessful.

The main problem is, at least by my limited knowledge of gcc, our definition of DMAMEM only puts the variable into a unique section but doesn't give a unique section name where its initialization data goes.

Code:
#define DMAMEM __attribute__ ((section(".dmabuffers"), used))

Maybe now that we've upgraded to gcc 11.3.1 (in 1.58-beta) perhaps there is some attribute we can use to tell gcc which section to put the initialization data into? Or maybe this has been possible all along with gcc 5.4.1 and I just don't know how to do it?

Anyway, just to explain, the reason we don't have DMAMEM initialization is I've never been able to find a way to tell gcc to put the initialization data for these variables into its own section, which the linker script would then place into flash and the startup code would copy to RAM.
 
I haven't delved into linker scripts for maybe 20 years now, but I believe the technique is to use the AT output attribute. Here is code clipped from:

Output section LMA

Every section has a virtual address (VMA) and a load address (LMA); see section Basic Linker Script Concepts. The address expression which may appear in an output section description sets the VMA (see section Output section address).

The linker will normally set the LMA equal to the VMA. You can change that by using the AT keyword. The expression lma that follows the AT keyword specifies the load address of the section.

This feature is designed to make it easy to build a ROM image. For example, the following linker script creates three output sections: one called `.text', which starts at 0x1000, one called `.mdata', which is loaded at the end of the `.text' section even though its VMA is 0x2000, and one called `.bss' to hold uninitialized data at address 0x3000. The symbol _data is defined with the value 0x2000, which shows that the location counter holds the VMA value, not the LMA value.

Code:
SECTIONS
  {
  .text 0x1000 : { *(.text) _etext = . ; }
  .mdata 0x2000 : 
    AT ( ADDR (.text) + SIZEOF (.text) )
    { _data = . ; *(.data); _edata = . ;  }
  .bss 0x3000 :
    { _bstart = . ;  *(.bss) *(COMMON) ; _bend = . ;}
}

The run-time initialization code for use with a program generated with this linker script would include something like the following, to copy the initialized data from the ROM image to its runtime address. Notice how this code takes advantage of the symbols defined by the linker script.

Code:
extern char _etext, _data, _edata, _bstart, _bend;
char *src = &_etext;
char *dst = &_data;

/* ROM has data at end of text; copy it. */
while (dst < &_edata) {
  *dst++ = *src++;
}

/* Zero bss */
for (dst = &_bstart; dst< &_bend; dst++)
  *dst = 0;
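
Applied to DMAMEM, the same idea might look roughly like the sketch below. Everything named here is an assumption for illustration: the `.dmadata` section, the `_sdmadata` / `_edmadata` / `_sdmadataload` symbols and the `copy_section()` helper are hypothetical and not part of the stock Teensy linker script or startup code. Whether gcc and the linker cooperate with this on Teensy is exactly the open question in this thread, so treat it as a starting point rather than a working patch.

```cpp
#include <stdint.h>

// Hypothetical linker-script fragment (shown as a comment, untested):
//   .dmadata : {
//       _sdmadata = .;
//       *(.dmadata*)
//       . = ALIGN(4);
//       _edmadata = .;
//   } > RAM AT > FLASH          /* runs in RAM2, image stored in flash */
//   _sdmadataload = LOADADDR(.dmadata);
// The region names RAM and FLASH are assumed here.

// Word-by-word copy of a section image from its load address (flash)
// to its run address (RAM2). Does nothing if both addresses coincide.
void copy_section(uint32_t *dest, const uint32_t *dest_end, const uint32_t *src) {
  if (dest == src) return;
  while (dest < dest_end) {
    *dest++ = *src++;
  }
}

// Startup code would then call something like:
//   copy_section(&_sdmadata, &_edmadata, &_sdmadataload);
```

The copy itself costs the same as a memcpy() done by hand in setup(); the only gain would be the nicer initializer syntax.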
 
If anyone knows a way to accomplish this, I would like to make future Teensyduino able to initialize DMAMEM. But so far, all my attempts have been unsuccessful.
Hi, sorry to dig this message from the past, but is this still true, or did you find a solution?
My question is: can we initialize arrays defined as DMAMEM and EXTMEM?
 
Thanks, is it worth the effort in terms of execution time reduction?
Do you mean the time for your application to zero the memory before using it? That seems like a question only you can answer, but if you do it in your setup() or one of the pre-setup "hook" functions, would that matter in your application?
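
As a rough sketch of the "hook" approach: the Teensy 4.x core provides weak startup hook functions, and the example below assumes `startup_late_hook()` runs after RAM1 has been set up and before setup(). The array names `dma_table` and `dma_table_init` are made up for illustration, and the fallback macros only exist so the snippet also compiles off-target.

```cpp
#include <stdint.h>
#include <string.h>

// Fallbacks so the sketch also compiles off-target; on a Teensy 4.x the
// core headers already define these macros.
#ifndef DMAMEM
#define DMAMEM
#endif
#ifndef PROGMEM
#define PROGMEM
#endif

// Initial values are stored in flash only.
PROGMEM const uint8_t dma_table_init[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

// Working copy in RAM2; its contents are arbitrary until we fill it.
DMAMEM uint8_t dma_table[sizeof(dma_table_init)];

// Assumption: the core calls this weak hook once before setup().
extern "C" void startup_late_hook(void) {
  memcpy(dma_table, dma_table_init, sizeof(dma_table));
}
```

If the hook is not available or not called in your core version, the same memcpy() at the top of setup() does the job.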
 
Actually, this is my first time using DMAMEM and EXTMEM on a Teensy 4.1
I have large arrays whose values are defined in .h files, for example:
Code:
static float weights[1234] = { 1.0, 2.0, ...
...
...
};

To put this array into RAM2, I understand that I have to do something like this:
Code:
DMAMEM float newWeights[1234];
#include "weights.h"
memcpy(newWeights, weights, 1234);

Is this correct?
I guess that the memcpy can take some time.

Then, if it is possible to use a customised linker script and modified startup code, just to write a single instruction as:
Code:
DMAMEM float newWeights[1234]
 = { 1.0, 2.0, ...
...
...
};
and that by doing this, my code will run faster, then it is worth the effort (the effort to learn how to do it). But if I only gain a few percent on execution speed, then it not worth it.
 
In startup.c PJRC pulls RAM1 contents for CODE and DATA using a manual copy using the extents defined in the linkage.

Nothing is FREE - the same would have to be done on startup to pull 'stored Flash' data to RAM2 or EXTMEM/PSRAM whether the linker script evolved to support it - or as noted in p#10 a manual: memcpy(newWeights, weights, 1234);
 
So the code snippet proposed in post 10 is correct?
Sometimes, at compilation, I get an error such as: "DMAMEM is not known" or "EXTMEM is not known". Do I need to #include something before using DMAMEM or EXTMEM ?
 
Sometimes, at compilation, I get an error such as: "DMAMEM is not known" or "EXTMEM is not known".

In Arduino IDE, click the Tools menu and look at the Board setting. If a non-Teensy board is configured, or if an older model Teensy is selected, then Arduino IDE will compile your program according to that other board's libraries. Teensy-specific keywords like DMAMEM are only known if you have the correct board configured.

If Teensy is connected to your PC, it should appear in the drop-down list on the main window's toolbar. If not, you can use the Tools > Board menu to open a window that lets you select Teensy (or any other type of board) even if it's not connected to your PC.
 
So the code snippet proposed in post 10 is correct?
'Conceptually' that looks right. Given a pointer to the resource "address[]" in FLASH then a memcpy() to the location desired for use should work to then reference from that new location.
 
So the code snippet proposed in post 10 is correct?
Sometimes, at compilation, I get an error such as: "DMAMEM is not known" or "EXTMEM is not known". Do I need to #include something before using DMAMEM or EXTMEM ?
Be sure that your 'init' table exists only in flash. I can imagine that "static" alone means it will be copied to standard RAM, and that you end up with 3 copies: 1 in flash (to initialize from), 1 in standard memory and 1 inside DMAMEM.
 
Do you mean that I should remove the static keyword?

Code:
float weights[1234] = { 1.0, 2.0, ...
...
...
};
 
So the code snippet proposed in post 10 is correct?

It's pretty close, but a couple small issues. You want PROGMEM on the array holding the actual const data, so it doesn't consume any RAM1 memory. In setup() where you call memcpy, use sizeof() because memcpy wants the actual number of bytes rather than the number of items in the array.


Code:
DMAMEM float newWeights[4];
PROGMEM const float newWeights_data[4] = {1.24, 0.42, -5.7, 0.51};

void setup() {
  memcpy(newWeights, newWeights_data, sizeof(newWeights));
  // rest of your program setup
}

void loop() {
  // your program...
}
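
To make the byte-count mistake harder to repeat, a small helper along these lines could be used. This is only a sketch and `copy_array()` is a made-up name, not something from the Teensy core. Both array lengths are deduced by the compiler, so arrays of different sizes fail to compile instead of silently copying too little.

```cpp
#include <stddef.h>
#include <string.h>

// Copies a whole array. N is deduced from both arguments, so a length
// mismatch between destination and source is a compile-time error.
template <typename T, size_t N>
void copy_array(T (&dst)[N], const T (&src)[N]) {
  memcpy(dst, src, sizeof(dst));  // sizeof(dst) is N * sizeof(T) bytes
}
```

With the arrays from the example above the call would be `copy_array(newWeights, newWeights_data);`. If the Arduino IDE's automatic prototype generation objects to a template in the .ino file, moving the helper into a header should avoid that.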
 
Thanks Paul. My initial array definition is in a .h file. Is it the same? Do I add PROGMEM as well?
 
Yup, #include just includes the contents of another file, as if you had typed it all. Simple stuff really.

Except things can get complicated quickly if you include that same header file from multiple .cpp / .ino files. Then it is the same as if you had typed it all in both places. If these arrays are created as global variables, hopefully it's easy to understand how having 2 global variables with the same name is an error. Or at least it's easy to understand if you had typed the code yourself.

When you get multiple .cpp files including header files which include other header files, everything feels more abstract and it can be difficult to understand why something becomes an error. The compiler doesn't enforce any rules about what you can and can't put in a header file, because it's really just including all the stuff from another file as if you had typed it. But generally speaking most C / C++ programmers follow conventions about the sort of code that goes into header files versus what goes into the .cpp / .ino files.

Normally declarations of functions and arrays, and constants, are put into header files. Some other things are too, but this message is already long, so I'm not covering everything. My main point is certain things are usually not put into .h header files. For global scope variables (which are also generally not considered to be best practice) normally only an "extern" declaration of the variable is put into header files. Then exactly 1 .cpp / .ino file would have the actual definition of the variable. None of this is a hard requirement from the compiler. It's just the common practice most programmers follow, because it prevents errors like duplicate definition of global scope variables.
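
A minimal sketch of that convention, shown as one listing with comments marking where each part would live. The file names `weights.h` / `weights.cpp` and the names `kWeightCount` and `weights_data` are made up for illustration, and the PROGMEM fallback is only there so the snippet compiles off-target.

```cpp
#include <stddef.h>

#ifndef PROGMEM
#define PROGMEM  // fallback; the Teensy core headers define the real macro
#endif

// ---- weights.h (hypothetical): declaration only, safe to include anywhere ----
const size_t kWeightCount = 4;
extern const float weights_data[kWeightCount];

// ---- weights.cpp (hypothetical): the one and only definition ----
// #include "weights.h"
PROGMEM const float weights_data[kWeightCount] = {1.24f, 0.42f, -5.7f, 0.51f};
```

Because the declaration carries the array length, `sizeof(weights_data)` still works in every file that includes the header, which is what the memcpy() into DMAMEM needs.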
 
It's also a very good practice to expose as little as possible inside .h file. Only those things that are needed for the other .c/.cpp files.

For a sensor interface for example don't expose internal info to get the sensor working.

Edit: With C++ you could use a namespace to isolate things. Quite handy when the .h/.cpp pair for a sensor exposes quite a lot. Unfortunately you then have to keep doing this every time a change happens. I would love it if the producers of sensor boards (e.g. SparkFun, Adafruit, ...) would do that.
 
Thanks for your answers. I understand now better how to use the header files and optimize the memory usage.
Here is what I do; all remarks and suggestions are welcome. My objective is to save as much memory as I can and also increase speed.

Header file example (conv_layer_3_0_weight.h):
Code:
#include <stdint.h>


PROGMEM const float conv_layer_3_0_weight_output_0[73728] =
{
    0.02301940880715847, -0.03338766098022461, 0.057616718113422394, -0.02217615395784378, -0.010713234543800354,
    0.021934974938631058, -0.017979929223656654, -0.06382313370704651, -0.05984314903616905, 0.020558379590511322,
...
};

Global variables declaration:
Code:
#include "..\include\parameters\conv_layer_3_0_weight.h" // moved from the local block
float input[32768];
DMAMEM float output[32768];
DMAMEM float weightsRAM2[73728];
EXTMEM float weightsPSRAM[294912];

Then, I use these variables in { } blocks like this, to ensure the variables do not exist when I don't need them:

Code:
{
// #include "..\include\parameters\conv_layer_3_0_weight.h" // Moved to global
#include "..\include\parameters\conv_layer_3_0_bias.h"
        memcpy(weightsRAM2, conv_layer_3_0_weight_output_0, 294912);
        convolution_forward<CONV_LAYER_2_CONV_LAYER_2_2_MAXPOOL_OUTPUT_0_NB_CHANNELS,
                            CONV_LAYER_2_CONV_LAYER_2_2_MAXPOOL_OUTPUT_0_IN_HEIGHT,
                            CONV_LAYER_2_CONV_LAYER_2_2_MAXPOOL_OUTPUT_0_IN_WIDTH,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_OUTPUT_0_NB_OUTPUTS,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_OUTPUT_0_OUT_HEIGHT,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_OUTPUT_0_OUT_WIDTH,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_PADDING_Y,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_PADDING_X,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_STRIDE_Y,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_STRIDE_X,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_DILATION_Y,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_DILATION_X,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_KERNEL_HEIGHT,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_KERNEL_WIDTH,
                            CONV_LAYER_3_CONV_LAYER_3_0_CONV_ACTIVATION>(input, output, weightsRAM2, conv_layer_3_0_bias_output_0, CONV_LAYER_3_CONV_LAYER_3_0_CONV_RESCALING);
        memcpy(input, output, 131072); // 32768 * 4 = 131072
    }

EDIT: it seems that a PROGMEM array cannot be a local variable, so I need to put the #include with the global variable definitions.
 