Teensyduino 1.34 Beta #1 (ARM Toolchain Update)

Status
Not open for further replies.

Paul

Administrator
Staff member
Here is a first beta test for Teensyduino 1.34.

This version updates the ARM toolchain used to compile for Teensy LC and 3.x.


Old beta download links removed. Please use the latest version:
https://www.pjrc.com/teensy/td_download.html


Changes since Teensyduino 1.33:
  • Update ARM Toolchain to gcc 5.4 (was gcc 4.8)
  • Add Tools > Optimize menu
  • Fix driver install/update on Windows 7 & 8
  • Prevent "might not have installed correctly" message on Windows
 
This beta adds a Tools > Optimize menu, so you can easily experiment with 10 different optimization settings.

opt.png

LTO is Link Time Optimization. It can really reduce the size and increase the speed of many programs. But it might come with compatibility issues.

Fastest, Faster and Fast are optimization flags -O3, -O2 and -O. Debug is -Og, and Smallest is -Os plus use of the nano C library.
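For anyone scripting builds, here is a minimal sketch of how the 10 settings could be enumerated, assuming they are the 5 levels above, each with and without LTO. The struct, the function name and the label wording are made up for illustration. "-flto" and "--specs=nano.specs" are the usual GCC spellings for LTO and the nano C library, but whether Teensyduino passes exactly these flags is an assumption.

```cpp
#include <string>
#include <vector>

// One optimization setting: a menu-style label plus compiler flags.
// (Hypothetical type; the exact menu wording is an assumption.)
struct OptSetting {
    std::string label;
    std::string flags;
};

// Builds the 10 combinations: 5 optimization levels, each with and without LTO.
std::vector<OptSetting> optimizeSettings() {
    const std::vector<OptSetting> base = {
        {"Fastest",  "-O3"},
        {"Faster",   "-O2"},
        {"Fast",     "-O"},
        {"Debug",    "-Og"},
        {"Smallest", "-Os --specs=nano.specs"},  // assumed nano C library flag
    };
    std::vector<OptSetting> all;
    for (const OptSetting& s : base) {
        all.push_back(s);                                             // without LTO
        all.push_back({s.label + " with LTO", s.flags + " -flto"});   // with LTO
    }
    return all;
}
```

A build script could loop over this list and compile the same sketch once per entry.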

Please give these various settings a try. We probably have a couple months before Arduino releases a new version. If this toolchain update or LTO breaks too many programs & libraries, I can always revert to the 4.8 toolchain. But hopefully we can play with this quite a bit in the coming weeks and build enough confidence to use it in a non-beta release.
 
Does the Odroid XU4 use the same ARM ABI? All the info I've found says the RPi is ARMv6 and that v6 programs are upward compatible with v7 or v8, but programs compiled on those newer boards can't run on v6.

But if there is a way, I'd love to use faster hardware! So far I've been using the original model B. Ordered a RPi3. It's supposed to arrive early next week.
 
PURGED - wow <57 MB downloaded! TD_1.33 was 72 MB.

Question - would it be easy for 'somebody' to write a makefile that builds multiple binaries, one per optimization setting, with a single command?
(-O3, -O2 and -O; Debug is -Og; Smallest is -Os plus use of the nano C library.)

<edit>: TD_1.34 Installed fine on copy of 1.6.13, first few default compiles of open sketches no problems. Just this one accurate warning I see that may not be new:
TimedStartBlink_SerEv:55: warning: array subscript has type 'char'
That added 'optimize' Menu does look nice.
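For reference, that warning usually means a plain char is being used as an array index. A minimal sketch of the pattern and the usual fix follows. This is not the code from TimedStartBlink_SerEv, which isn't shown here; the function name is hypothetical.

```cpp
#include <cstddef>

// Counts how often each byte value occurs in a C string.
// counts must point to 256 zero-initialized entries.
void countBytes(const char* text, unsigned counts[256]) {
    for (std::size_t i = 0; text[i] != '\0'; ++i) {
        // Writing counts[text[i]]++ here would trigger
        // "array subscript has type 'char'": where char is signed,
        // bytes >= 0x80 would turn into negative indexes.
        unsigned char index = static_cast<unsigned char>(text[i]);
        counts[index]++;
    }
}
```

Casting to unsigned char before indexing keeps the index in 0..255 regardless of whether char is signed on the target.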
 
I'm getting the following errors in some of my sketches:

- error: 'strncmpi' was not declared in this scope
- error: 'strcasestr' was not declared in this scope

(not from libs, but from my own code)



Edit: They should be in string.h - I've included that header (and it worked with older Teensyduino versions)
 
string.h:
Code:
#if __GNU_VISIBLE
char    *_EXFUN(strcasestr,(const char *, const char *));

I added "#define __GNU_VISIBLE 1" to my Sketch, and now the compiler complains:

Code:
c:\arduino\hardware\tools\arm\arm-none-eabi\include\sys\features.h:256:0: note: this is the location of the previous definition

 #define __GNU_VISIBLE  0

features.h :
Code:
#ifdef _GNU_SOURCE
#define    __GNU_VISIBLE        1
#else
#define    __GNU_VISIBLE        0
#endif

So _GNU_SOURCE is not defined.

Is there a compiler switch missing?
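A minimal sketch of the usual workaround, going by the features.h excerpt above: define _GNU_SOURCE (not __GNU_VISIBLE) before the first include. The helper names are hypothetical. strncmpi is not a standard name, so the POSIX strncasecmp is used as a stand-in here, assuming it matches what the original code expected. In an Arduino sketch the core headers may already be included before the sketch's own first line, in which case a -D_GNU_SOURCE compiler flag would be needed instead.

```cpp
// Must come before the first #include so that sys/features.h
// sets __GNU_VISIBLE to 1 (per the excerpt quoted above).
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1
#endif
#include <string.h>   // strcasestr (GNU extension)
#include <strings.h>  // strncasecmp (POSIX)

// Hypothetical helper: true if haystack contains needle, ignoring case.
bool containsNoCase(const char* haystack, const char* needle) {
    return strcasestr(haystack, needle) != nullptr;
}

// Hypothetical stand-in for the non-standard strncmpi:
// compares at most n characters, ignoring case.
int compareNoCase(const char* a, const char* b, size_t n) {
    return strncasecmp(a, b, n);
}
```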
 
You're right, the RPi3 on jessie is still running in 32-bit mode.

I tested Arduino 1.6.12 with 1.34 beta 1 on Mac OS.
CoreMark (iterations/sec):
T3.2 @ 96 MHz: previously (-O2) 189.4 | now with Fastest + LTO 207.29
T3.6 @ 180 MHz: previously (-O2) 384.0 | now with Fastest + LTO 447.7
... so many optimization choices ...

Code:
T3.6@180mhz coremark
        fastest LTO 447.676389
        fastest     463.692033
        faster LTO  437.121360
        faster      434.528617
        fast  LTO   333.619557
        fast        333.032915
        small LTO   323.248789  no float printf
        small       320.692182
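To put numbers like these side by side, here is a small helper for the relative change between two CoreMark scores. The function name is made up for illustration.

```cpp
// Percent change from a baseline score to a new score.
// Positive means the new score is higher (faster for CoreMark).
double percentChange(double baseline, double updated) {
    return (updated - baseline) / baseline * 100.0;
}
```

With the figures above, T3.2 going from 189.4 to 207.29 is a gain of roughly 9.4%, and T3.6 going from 384.0 to 447.7 is roughly 16.6%. In the table, Fastest with LTO (447.68) is actually a few percent below Fastest without LTO (463.69).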
 
Is the compiler using hardware floating point? My linpack benchmark with Fastest + LTO is giving < 1 megaflop; it should be 30 megaflops.

Is it really using gcc 5.4? If I look at the version of /Applications/Arduino.app//Contents/Java/hardware/tools/arm/arm-none-eabi/bin/gcc
it still says 4.8.4.
Maybe I'm looking in the wrong place (I don't use Mac that much).

EDIT: found it: /Applications/Arduino.app//Contents/Java/hardware/tools/arm/bin/arm-none-eabi-gcc-5.4.1

EDIT 2: restarted the IDE and floating point seems OK now ... pilot error, I guess.
 
Installed just fine on Fedora 25 and Arduino 1.6.13. I'll let you know if I run into anything as I use it...
 
One simple sketch: TD_1.34 beats TD_1.33, and using LTO on TD_1.34 makes a real difference

I created a sketch that watches pin 3 and reports the frequency with FreqMeasure.countToFrequency().

Then, in the sketch under test, pin 12 is an output and is toggled in loop() with: #define q12() {GPIOC_PTOR=128;} // Toggle pin 12

That pin 12 feeds pin 3 of the first T_3.6, which shows the cycle rate each second.

So far, with a not-quite-empty loop() on a T_3.6 at 240 MHz, that shows:

TD_1.33 [ using an empty yield() ]: 1,153,846.13 Hz
TD_1.33 [ using PJRC yield() ]: 451,127.81 Hz
TD_1.34 without LTO [ using an empty yield() ]: FASTER == FASTEST == 1,304,347.88 Hz

TD_1.34 with LTO [ using an empty yield() ]:
FASTER = 1,935,483.88 Hz
FASTEST = 1,818,181.88 Hz

TD_1.34 with LTO [ using PJRC yield() ]:
FASTER = 857,142.88 Hz
FASTEST = 800,000.00 Hz
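A rough way to turn those frequencies into CPU cycles per pass through loop(), as a sketch only. It assumes the pin is toggled exactly once per pass (so one full output period spans two passes) and that the measured value is the full-period frequency. The function name is made up.

```cpp
// Estimated CPU cycles per pass through loop().
// Assumption: one pin toggle per pass, so one measured period = two passes.
double cyclesPerLoop(double cpuHz, double measuredHz) {
    return cpuHz / (2.0 * measuredHz);
}
```

Under those assumptions, 1,935,483.88 Hz at 240 MHz works out to about 62 cycles per pass, and 800,000 Hz to 150 cycles per pass.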
 
Raspberry Pi 3 is up and running here, after a few false starts with weaker power supplies. It really does draw over 2 amps when running at full load! (edit: or maybe not... could have been just crappy wall wart power supplies...) Building the toolchain now. Seems to be going a *lot* faster. The Broadcom chip gets scorching hot. I've got a 120mm fan cooling the whole thing.

Edit: "vcgencmd measure_temp" says "temp=51.0'C", and that's with a big 120mm fan blowing down.

rpi3.jpg
 
Hi Paul,

That sounds HOT! It is interesting that last week I purchased a RPI3 as it looks like Trossen Robotics will be using them in some of their products... Mine came with two small heat sinks to stick on the two chips and with the package I purchased from Amazon a 2.5 amp power supply.

For the fun of it, I pulled my spare Odroid XU3-Lite out of my cabinet (my 2 XU4s are mounted in two robots), downloaded the latest Ubuntu 16.04 image and updated the 32 GB eMMC...

I saw a project up in your github: https://github.com/PaulStoffregen/ARM_Toolchain_2016q3_Source And it looked like maybe it was the one you are trying to build?

So I downloaded it to Odroid and tried to follow your steps. I did not create virtual memory yet, but went through some of the steps.

It looked like it completed: ./build-prerequisites.sh
But it died somewhere in: ./build-toolchain.sh

If this is what you are building, it will be interesting to compare the build time on this board with the RPi3. But some of that may depend on whether the build process can use more of the processor's 8 cores...

If it works, I could probably loan it to you.
 
Yup, that's the new toolchain.

I almost got through ./build-toolchain.sh, until the USB hub I was using for power died. It ran for approx 5 hours to get into 2nd gcc compile stage.

On the old Pi (original version 1) running wheezy (from 2014), it complains about a gcc bug after about a day compiling.

This one is the old toolchain. It builds in approx 47 hours on the old Pi running wheezy.

https://github.com/PaulStoffregen/ARM_Toolchain_2014q3_Source

I'm restarting the new toolchain build again on the Pi3 with jessie. This time I've added a small heatsink on the chip and a lab bench power supply capable of 10 amps.
 
I have restarted the build on the Odroid 3 times now, as each time it failed on a different missing package that I needed to apt-get install... It is using one of their 4 amp power supplies. I don't think it is overheating too badly, as the automatic fan is only on part of the time, so I don't think the automatic throttling of the cores down to 900 MHz is happening due to heat... Could be wrong.

It was good that I added 2 GB of swap (virtual memory) using the information in: https://www.digitalocean.com/community/tutorials/how-to-add-swap-on-ubuntu-14-04

The htop command has shown all 8 cores doing something and, at times, the swap (SWP) being used.

Not sure how long this one will run.

It does run reasonably warm, judging by:
Code:
odroid@odroid:~$ cat /sys/devices/10060000.tmu/temp
sensor0 : 77000
sensor1 : 69000
sensor2 : 83000
sensor3 : 77000
sensor4 : 65000
I think it throttles when it gets to 100 C, so the hottest sensor here shows maybe 83 C.
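A minimal sketch for pulling the hottest reading out of output like that, assuming the values are millidegrees Celsius (which is how 83000 is read as 83 C above). The function name is made up, and reading the /sys file itself is left out.

```cpp
#include <sstream>
#include <string>

// Parses lines like "sensor2 : 83000" and returns the highest reading in
// degrees Celsius, assuming the values are millidegrees.
// Returns -1.0 if no line could be parsed.
double hottestSensorC(const std::string& text) {
    std::istringstream lines(text);
    std::string line;
    double hottest = -1.0;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name, colon;
        long milliC = 0;
        // Expect three fields: name, a literal ":", and an integer value.
        if (fields >> name >> colon >> milliC && colon == ":") {
            double c = milliC / 1000.0;
            if (c > hottest) hottest = c;
        }
    }
    return hottest;
}
```

The -1.0 sentinel is only a convenience for this sketch; it would not work for sensors that can report below-zero temperatures.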

Odroid-build-toolchain.jpg
 
I've completed building the new toolchain for Raspberry Pi and (hopefully) other ARM-based machines.

Edited the first post. The installer for linuxarm is now available.

If anyone tests on non-RPi boards, or on anything other than a Raspberry Pi 3, please let me know whether it worked.
 
Talkie compiles, runs and sounds right with the PJRC PROP on a T_3.6 @ 240 MHz, Faster and Fastest, with and without LTO: no warnings or errors. Same at 180 MHz with Fastest and Smallest with LTO.

<edit>: Verify also compiles for T_3.2 with no warnings or errors, but I have not uploaded and tested it.
 