Arduino / Teensy code size - huge difference

Status
Not open for further replies.
I have a puzzlement over compiled code size when swapping between Arduino and Teensy.

Project compile results below. When compiling for the Teensy, the resulting code is about two and a half times the size and uses 88% more memory.

It's identical code, all I've done is change the target board.

Any clues as to why?

It's not feasible to post the code as there are 23 files in the project. These are the external dependencies; no other libraries are used.

#include <EEPROM.h>
#include <LiquidCrystal_I2C.h>
#include <MFRC522.h>
#include <Wire.h>


Compiling 'teamTRACK' for 'ATmega2560 (Mega 2560) (Arduino Mega)'
Program size: 24,092 bytes (used 9% of a 253,952 byte maximum) (3.11 secs)
Minimum Memory Usage: 3019 bytes (37% of a 8192 byte maximum)

Compiling 'teamTRACK' for 'Teensy 3.5'
Program size: 60,824 bytes (used 12% of a 524,288 byte maximum) (6.31 secs)
Minimum Memory Usage: 5700 bytes (2% of a 262136 byte maximum)



Jim
 
ATMega is a completely different architecture. It is an 8-bit CPU - 1 byte per instruction.
Teensy is 32-bit. Each assembler instruction needs 2 to 4 bytes. (Edited here: Thumb instructions, have to check this.)
Then, the Mega does not need initializations on startup. It has no USB core, no cache that needs initialization, no temperature control, and the interrupt vector table is so much smaller... etc.
On the plus side: This does not matter much. The Teensy flash is so much larger... ;)
 
Clarification for Frank B's answer: not all X-bit CPUs use only X-bit instructions; some architectures have some shorter formats, and many 8-bit architectures have also at least some longer than 8-bit instructions. And all that of course ignores the effect of instructions' operands. Operand sizes in some cases also scale roughly in proportion to the CPU's native word size.

And while Teensy has more flash, in many cases flash (read) speeds are comparable between devices, so having to read more or less code can affect operation speed (if it gets limited by the flash transfer rate). Even more significant if trying to load code from external serial flash.

(Edit: in general though, the difference can be ignored.)
 
Clarification for Frank B's answer: not all X-bit CPUs use only X-bit instructions; some architectures have some shorter formats, and many 8-bit architectures have also at least some longer than 8-bit instructions.

Yes, that's correct.
 
Thanks for the architecture lesson, I've only worked with the 8-bit stuff up until now so hadn't fully realised what was going on.

My main worry was un-wanted libraries being included (=possible instability). Initialisation of the on-board hardware wasn't something I'd considered either, all makes sense.

On the plus side: This does not matter much. The Teensy flash is so much larger... ;)
Which is why I've moved to the Teensy, as my project uses the Arduino Nano and I've run out of flash and RAM! As you say, they have plenty of space :)

Had to use the 3.5 as the project is 5 V, but I'm looking at moving it to 3.3 V so I can use any of the range.



Jim
 
You will notice that it also needs more RAM. And the AVR program may behave differently on ARM.
The main difference is that on ARM an int is 4 bytes, versus 2 bytes on AVR. The same goes for pointers and derived data types. A float is 4 bytes on both, while a double is 8 bytes on ARM (on AVR it is 4 bytes, the same as float).
If a program relies on int being only 16 bits wide, trouble is inevitable. The 2 additional bytes need more RAM too, of course.
Then, the alignment is 4 bytes.
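
To make the int-width point concrete, here is a minimal sketch, assuming nothing beyond standard C++11. The helper names (wrappedSum16, naiveSum, checkWrap) are hypothetical and not taken from the project in this thread. The idea is that code which depends on 16-bit wraparound can say so with a fixed-width type instead of plain int:

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: a sum that is meant to wrap at 16 bits.
// uint16_t is 2 bytes on AVR and on ARM, so the result matches on both.
uint16_t wrappedSum16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(a + b);
}

// The same sum with plain unsigned int wraps at 65536 only where int is
// 16 bits wide (AVR); with a 32-bit int (ARM) it keeps counting.
unsigned int naiveSum(unsigned int a, unsigned int b) {
    return a + b;
}

// Fixed-width types keep their size whatever the target.
static_assert(sizeof(int16_t) == 2, "int16_t is 2 bytes");
static_assert(sizeof(int32_t) == 4, "int32_t is 4 bytes");

// Hypothetical self-check, callable once from setup() or from a host test.
void checkWrap() {
    assert(wrappedSum16(65535u, 1u) == 0);
}
```

Using uint16_t where 16-bit behavior is intended also keeps the RAM footprint of those variables the same on both boards.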

My main worry was un-wanted libraries being included (=possible instability).
This works the same on both :)

If a library uses "int" and is written for AVR, this should be understood as a warning sign...
 
Another common problem we often see here is that Teensy is so much faster. Wherever bit-banging is used, Teensy may simply be too fast for the attached hardware. Then, for example, successive digitalWrite() calls should be slowed down.
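
A rough sketch of what slowing down a bit-banged transfer can look like. The function name shiftOutSlow and its parameters are hypothetical, and the pin-write and delay operations are passed in by the caller so the same code can be tried off-board; on an Arduino or Teensy they would wrap digitalWrite() and delayMicroseconds():

```cpp
#include <assert.h>
#include <stdint.h>

// Shift out one byte MSB-first on a bit-banged data/clock pair, pausing after
// every clock edge so a fast CPU does not outrun the attached hardware.
// writePin(pin, level) and delayUs(us) are supplied by the caller.
template <typename WritePin, typename DelayUs>
void shiftOutSlow(uint8_t dataPin, uint8_t clockPin, uint8_t value,
                  unsigned halfPeriodUs, WritePin writePin, DelayUs delayUs) {
    assert(dataPin != clockPin);  // data and clock must be separate pins
    for (int bit = 7; bit >= 0; --bit) {
        writePin(dataPin, (value >> bit) & 1);  // present the data bit
        writePin(clockPin, 1);                  // rising clock edge
        delayUs(halfPeriodUs);                  // hold the clock high
        writePin(clockPin, 0);                  // falling clock edge
        delayUs(halfPeriodUs);                  // hold the clock low
    }
}
```

On a board this might be called as shiftOutSlow(dataPin, clockPin, 0xA5, 2, [](uint8_t p, uint8_t v) { digitalWrite(p, v); }, [](unsigned us) { delayMicroseconds(us); }); the 2 microsecond half-period is only a placeholder, and the real value has to come from the attached device's datasheet.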
 
So many reasons for the code size difference....

1: Compiler optimization: Arduino Mega optimizes for smallest code size. On AVR (and on ARM M0 chips), speed isn't much different. Teensy 3.5 defaults to optimize for speed. On ARM M4 & M7 it makes a substantial speedup. You can try other optimization levels on your Teensy by clicking Tools > Optimize. Setting this to "Smallest Code" gives compiler settings similar to those used on AVR.

2: USB Stack: Arduino Mega uses a 2nd dedicated chip for USB which has 16K flash, but that other chip's memory usage isn't included in the number the Arduino IDE shows you. On Teensy (and most of Arduino's newer products) the USB code is compiled into your program.

3: Advanced peripherals: Simpler peripherals on AVR require less code. Many are designed to auto-start, like the crystal oscillator. Compare that with Teensy's highly configurable clock system with a PLL. More advanced peripherals with more features generally require more code. Peripherals using DMA especially tend to use more code, but that extra complexity can offer far lower CPU usage during sustained fast data transfer.

4: printf(): If any of your code or any library uses printf() or sprintf(), the AVR printf is optimized for small size but lacks many features. On Teensy, a size optimized printf is used if you set Tools > Optimize to smallest code. Otherwise, you get a fully featured printf which supports floating point and all formatting options. It costs about 20K code size.

5: double: On AVR, double is treated the same as float, using only 32 bits. On Teensy double really is 64 bits. If your code or any libraries use double, especially on Teensy 3.5 / 3.6 where the FPU handles only 32 bit float, the compiler must link a lot of 64 bit library code.

6: AVR emulation: Arduino's API was designed around the AVR hardware, so many functions require very little code. On non-AVR boards, functions like digitalWrite() need to emulate AVR behavior for best compatibility.

7: Pointer size: On AVR a pointer is 16 bits. On ARM all memory addresses are 32 bits. Code using tables of pointers or other heavy use of indirect addressing tends to need twice as much memory for storing memory addresses.

8: "Unused" code: Some features like Serial1/2/3 get linked into every program on some Teensy boards, even if you never use them, mostly because the default fault handler tries to complete sending if your code crashes (AVR doesn't even have fault handling hardware). Because Arduino Uno is so small, years of intense work have gone into Arduino's AVR core library and IDE to eliminate unused code, as well as trade-offs for smaller code size. On Teensy 3 & 4 the development focus has been much more on optimizing performance and leveraging advanced hardware, even when it requires linking more code into every compiled program.
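
Points 5 and 7 can be illustrated with a small sketch. This is only an illustration under stated assumptions: every name in it (rmsFloat, scaleDouble, scaleFloat, kStateNames, kTableBytes) is made up for the example, and how much code size is actually saved depends on what else the project and its libraries pull in.

```cpp
#include <assert.h>
#include <math.h>
#include <stddef.h>

// RMS of a sample buffer using only 32-bit float: a float accumulator
// and sqrtf() rather than the double-precision sqrt().
float rmsFloat(const float* samples, size_t count) {
    assert(count > 0);  // an empty buffer has no RMS
    float sumSquares = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        sumSquares += samples[i] * samples[i];
    }
    return sqrtf(sumSquares / static_cast<float>(count));
}

// A literal without the f suffix is a double, so this whole expression is
// evaluated in 64-bit math before being narrowed back to float.
float scaleDouble(float x) { return static_cast<float>(x * 2.0 / 3.0); }

// With f suffixes the expression stays in 32-bit float.
float scaleFloat(float x) { return x * 2.0f / 3.0f; }

// A table of pointers costs sizeof(pointer) per entry:
// 2 bytes each on AVR, 4 bytes each on 32-bit ARM.
const char* const kStateNames[] = {"idle", "scan", "write", "error"};
const size_t kTableBytes = sizeof(kStateNames);
```

If a project does not need 64-bit precision, keeping literals and math calls in float like this is one way to avoid linking the 64-bit routines mentioned in point 5.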
 