Curious: Code size for FreeRTOS port, ARM vs. AVR; CPU speed

Status
Not open for further replies.

stevech

Well-known member
Without changing the GCC optimizer settings, I did a build of the (greatly appreciated) FreeRTOS port for both ARM and AVR.
The ARM build (Cortex M4, a Thumb-2 CPU) was about 16K of code in flash; the AVR build was under 8K.
I would have expected the M4 to be smaller. Not a big issue with 128K. Maybe it's not code size but the number of libraries used in the ARM port and demo.

I also ask, as a newbie, about the 96MHz CPU mode for the Teensy3. Does this suggest that Freescale's CPUs are OK at that speed at, say, -20F to +130F ambient?
 
Well, the underlying machines are different, so you will see large differences. Beyond the basic ARM vs. AVR instruction sizes, etc., there is the fact that the AVR port enables -ffunction-sections on the compiler and --gc-sections on the linker, which, the last I checked, was not being done by the Teensy build process. Together these options let the linker delete from the final image any function that is not referenced. So, if your library has 10 different methods for different data types, and you define all 10 in the same file but only call one, under the AVR the 9 unused methods would not be included in the image, while under Teensy 3.0 all 10 would be. Paul has said that someday he would like to investigate using those options, but as far as I know, they aren't being used right now.
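To make the "10 methods, only one called" case concrete, here is a minimal sketch in C. The helper names (to_text_int and friends) and the file name fmt.c are made up for illustration, and this is not the actual Arduino or Teensy build recipe; only the two flags come from the discussion above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical library file "fmt.c" with three formatting helpers.
 *
 * Compiled with:  gcc -Os -ffunction-sections -c fmt.c
 * Linked with:    gcc -Wl,--gc-sections ...
 *
 * With -ffunction-sections each function gets its own section
 * (.text.to_text_int, .text.to_text_long, ...), so --gc-sections can
 * discard the ones nothing references. Without it they share one .text
 * section and are kept or dropped as a unit. */

int to_text_int(char *buf, size_t len, int v)
{
    return snprintf(buf, len, "%d", v);
}

int to_text_long(char *buf, size_t len, long v)
{
    return snprintf(buf, len, "%ld", v);
}

int to_text_double(char *buf, size_t len, double v)
{
    return snprintf(buf, len, "%.2f", v);
}

/* The only caller: it references to_text_int and nothing else, so the
 * long and double variants are candidates for removal at link time. */
int format_reading(char *buf, size_t len, int reading)
{
    return to_text_int(buf, len, reading);
}
```

Comparing the output of avr-size or arm-none-eabi-size with and without the two flags should show whether the unused helpers were dropped.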
 
Most C code does compile to larger binary sizes on 32 bit ARM than it does on 8 bit AVR. I've spent a little time looking into this.

#1: When accessing any statically allocated memory, ARM needs 32 bit memory addresses, plus code to fetch them from small lookup tables (literal pools) created near each function. AVR instead has instructions that embed a 16 bit memory address directly into a single 32 bit instruction.

#2: The newlib C library contains lots of code designed for larger systems with POSIX compliance. The avr-libc library implements very lean functions that give minimal but useful functionality, nothing even remotely like the POSIX standards. On ARM, just one use of printf or sprintf anywhere brings in a huge amount of library code.
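Point #1 above can be seen with a tiny function that touches one static variable. This is only a sketch: tick_count and bump_tick are made-up names, and the assembly in the comments is the rough shape GCC tends to produce, not output checked against a particular compiler version.

```c
#include <stdint.h>

/* Statically allocated counter; its address is fixed at link time. */
static volatile uint32_t tick_count;

/* Increment the counter and return the new value.
 *
 * Rough shape of typical Thumb-2 output:
 *     ldr  r3, .Lpool       ; fetch the 32 bit address from a literal pool
 *     ldr  r0, [r3]         ; then load the value through it
 *     ...
 * .Lpool:
 *     .word tick_count      ; 4 extra bytes of flash for the address
 *
 * Rough shape of typical AVR output, one byte at a time:
 *     lds  r24, tick_count  ; 32 bit instruction, 16 bit address embedded
 */
uint32_t bump_tick(void)
{
    tick_count = tick_count + 1;
    return tick_count;
}
```

Disassembling the object file with objdump -d for each target is the way to check what a given compiler really emits.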

The compiler settings are also not perfect. We are using -ffunction-sections and --gc-sections. However, the .o files are not being built into a .a library, as is done on AVR. That might also have an effect. It's on my to-do list, but at a lower priority because it only affects code size.

On AVR, there is some serious linker magic done with the interrupt vectors. I wrote about it in this message:

http://forum.pjrc.com/threads/23467-Using-std-vector?p=32561&viewfull=1#post32561

Michael, if you have any insights on how the AVR linker is working that magical trick, please please please tell me?!
 
I believe it is due to a weak reference. The file hardware/arduino/cores/arduino/HardwareSerial.h has this declaration:
Code:
extern void serialEventRun(void) __attribute__((weak));

Note, there are two underscores before/after the attribute. For good coding practice, the 'weak' should also have two leading and two trailing underscores, since some user out there might #define weak to be 53 or some such.

And then there is the reference in hardware/arduino/cores/arduino/main.cpp:
Code:
              if (serialEventRun) serialEventRun();

There are other weak references scattered through the library. HID/CDC also define a WEAK macro to simplify using the attribute.

Basically, a weak reference on its own will not pull serialEventRun in from the library. But if something else brings in the module that defines the function, the linker satisfies the weak reference with its address. If the symbol is never pulled in, the reference is resolved as 0. So the line from main.cpp says: if the function was pulled in, call it; if not, the if test fails and the call is skipped.
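The same pattern can be tried outside the Arduino core. The sketch below assumes GCC with an ELF toolchain; optional_event_run and run_optional_hook are hypothetical names that mirror serialEventRun and the test in main.cpp.

```c
/* Weak reference: no definition is required at link time. If nothing in
 * the final image defines optional_event_run, its address resolves to 0. */
extern void optional_event_run(void) __attribute__((__weak__));

/* Returns 1 if the hook was linked in and called, 0 if it was absent. */
int run_optional_hook(void)
{
    if (optional_event_run) {   /* address is 0 when the symbol is missing */
        optional_event_run();
        return 1;
    }
    return 0;
}
```

Linked by itself, run_optional_hook() returns 0. Adding another object file that defines void optional_event_run(void) makes it return 1, with no change to this file.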

Another way to do this is to declare a pointer to a function as a common symbol (i.e. declare it in one file without an initializer). If nothing else defines the variable, the linker will create a symbol in the bss area, which is initialized to 0 (presumably on an embedded board, the first thing that runs would zero out the bss area). The module that gets pulled in by some other reference would define the same variable, but initialize it with the address of a static function in that object file. Like main.cpp above, you would test the pointer before calling it (unless you have a return instruction located at location 0).
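Here is a single-file sketch of the function-pointer variant. All names are hypothetical, and the WITH_SERIAL_MODULE macro only stands in for "the linker pulled the optional object file in". One caveat when the two halves really are split across files: GCC 10 and later default to -fno-common, so the uninitialized declaration is no longer a common symbol unless you build with -fcommon.

```c
#include <stddef.h>

/* "Core" side: a tentative definition with no initializer. If no other
 * definition appears, the pointer lives in .bss and starts out as NULL. */
void (*serial_hook)(void);

#ifdef WITH_SERIAL_MODULE
/* "Optional module" side, normally a separate .c file in the library. */
static void serial_poll(void)
{
    /* service the port here */
}

/* Same variable, now initialized with the address of a static function. */
void (*serial_hook)(void) = serial_poll;
#endif

/* Returns 1 if a handler was installed and called, 0 otherwise. */
int call_serial_hook(void)
{
    if (serial_hook != NULL) {
        serial_hook();
        return 1;
    }
    return 0;
}
```

Unlike the weak reference, this costs a pointer's worth of RAM, but it does not depend on the weak attribute being available.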

For the first case, you need to have the weak attribute, or else the compiler will delete the test: it knows the ISO standard guarantees that no object or function lives at the null address, so without the attribute the if test could never fail and the call would be made unconditionally.

Looking at other attributes used, I see packed in the wifi library, which might be a slowdown on ARM (depending on whether the compiler believes it can do unaligned loads/stores on ARM, it might have to use load byte/store byte to deal with packed values). There are some naked/interrupt attributes, which are presumably interrupt stuff and would be completely different on ARM. A few things use the section attribute to get a specific section, which also might be machine dependent.
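For the packed case, a small sketch shows why the compiler may fall back to byte accesses. The frame_header struct is invented for illustration and is not taken from the wifi library.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical wire-format header. Because the struct is packed,
 * 'length' sits at offset 1 and is not 4-byte aligned. */
struct __attribute__((__packed__)) frame_header {
    uint8_t  type;
    uint32_t length;
};

/* The compiler knows 'length' may be misaligned, so it picks an access
 * sequence that is safe for the target. If unaligned word loads are not
 * allowed, that can mean four byte loads plus shifts instead of one
 * 32 bit load. */
uint32_t frame_length(const struct frame_header *h)
{
    return h->length;
}
```

With GCC, sizeof(struct frame_header) is 5 here rather than the 8 you would get without the attribute.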

One minor coda. Weak references only work on the ELF object file format, which is pretty much the standard object file format these days for software where GCC/binutils are the primary compilers (I believe LLVM also). Some systems use other object file formats, which might not support weak references. But for Arduino, Teensy, etc. you can use weak references.
 
I'm learning (slowly) about "weak" for the linker. This fuzzes things up for me. It used to be simple: if I don't reference a function, it doesn't get pulled from the .a.

In my experience, 8 bit micros like the AVR do have smaller code for some use cases, and for 16 bit ints. But code grows fast when the 8 bitter has to do lots of 32 bit load/store and indexing.
The ARM7TDMI in 16 bit thumb mode, for the same source, was about 30% smaller than in 32 bit mode, in my prior recent work, for a 100KB program in flash.

I'm such a newbie: I'm still looking for the Arduino IDE way to change the compiler command options!
 