Teensyduino 1.34 Beta #1 (ARM Toolchain Update)

Status
Not open for further replies.
What is it doing to your inline asm?
Here you can see it without downloading anything: just compile this sketch with and without LTO and look at the disassembly of the sketch using objdump. (I used the "Fastest with LTO" and "Fastest" menu options for this test.)
Code:
void setup() {
  
}

void loop() {
  delayMicroseconds(100000);
}
While 'delayMicroseconds' seems to work, it looks like LTO is inlining everything; the disassembly for 'delayMicroseconds' is not the same as without LTO. Also, with LTO I don't see a call to the 'setup' or 'loop' functions in 'main' either. This example is just to show what LTO is doing to the assembly, not to point to a problem with delayMicroseconds!

I will check with my scope today to see if the delay times are right.

Does a barrier work, enclosing it with asm volatile("" ::: "memory");?
see https://forum.pjrc.com/threads/17469-millis()-on-teensy-3?p=22279&viewfull=1#post22279
LTO still seems to muck with the inline assembly even when the memory barrier is added.
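
For reference, a minimal sketch of what such a barrier does. The names `compiler_barrier`, `shared_flag` and `publish_then_read` are made up for illustration. The "memory" clobber only tells GCC not to cache or reorder memory accesses across that point; it says nothing about inlining or register allocation, which may be why it does not help here.

```cpp
#include <cstdint>

// Compiler-only barrier: emits no instructions, but GCC must assume that
// memory may have been read or written at this point.
static inline void compiler_barrier()
{
  asm volatile("" ::: "memory");
}

// Hypothetical variable, used only to give the barrier something to order.
static uint32_t shared_flag = 0;

// The store must be completed before the barrier and the load is redone
// after it. The function itself can still be inlined into its caller.
uint32_t publish_then_read(uint32_t value)
{
  shared_flag = value;
  compiler_barrier();
  return shared_flag;
}
```

Calling `publish_then_read(42)` returns 42 either way; the barrier changes what the optimizer is allowed to do, not the result.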

For my particular problem: in my Zilch library I redefine yield, which does the context switch, and the compiled assembly needs to be exactly what the inline assembly is or it does not work. With LTO enabled it is really not even close. I tried to stop my yield function from being inlined and put memory barriers in the inline assembly part, but to no avail.

I think if this toolchain update does get adopted, there should be a menu option for using or not using LTO.

Sorry for all the LTO references, might give some a headache.:p


edit: I'm using a Teensy 3.2 for this.
 
I'm still debating what the default should be for each board. Obviously Teensy LC needs -Os. In most of the tests I've tried, -O3 adds significantly to the program size, so I'm a bit reluctant to make it the default for Teensy 3.0, 3.1, 3.2 where so many programs already exist.

Yes, why not stay with -O1? :)
The other added options are great. If one wants to use them, they are available now.

@Duff: why not print a #warning that your code does not work with LTO
 
It might be the LTO bug/feature has been fixed. However, unless somebody sends in a bug report to the proper channels, it may never be fixed (https://gcc.gnu.org/bugs/)
Is this a bug or a feature? I don't know, but LTO seems to really turn your code upside down.

@Duff: why not print a #warning that your code does not work with LTO
That could work, I'll see what I can do with that.
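
A rough sketch of the #warning idea. GCC does not appear to predefine a macro when -flto is in use, so this assumes the build passes a hypothetical `-DZILCH_LTO_BUILD` alongside -flto; both that macro and `kBuiltWithLto` are invented for illustration and are not part of Zilch.

```cpp
// ZILCH_LTO_BUILD is a hypothetical macro that the build would have to
// define itself (for example next to -flto in the compiler flags).
#ifdef ZILCH_LTO_BUILD
#warning "Zilch: the context switch has not worked with LTO; build without -flto"
constexpr bool kBuiltWithLto = true;
#else
constexpr bool kBuiltWithLto = false;
#endif
```

Without the define the header compiles silently; with it, every file that includes the header prints the warning.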

Frank B said:
Yes, @Duff should create a minimal example and report the bug ...
Before I go that far I would want to see if other people's code is affected; I really don't know enough right now to say one way or the other. Frank, did you look at the memcpy_audio.S disassembly? Does it work?
 
LTO (or link time optimization) essentially combines all of your modules together, and then optimizes the whole program. So if you have a function in a module, say foo.cpp:

Code:
float add_flt (float a, float b)
{
  return a+b;
}

And you have a call to add_flt in a different module, say bar.cpp or bar.pde.

If LTO is not used, then it will always generate a call instruction, because it does not know what add_flt does.

If LTO is used, then the compiler can see that it is a simple function, and it should inline the function, instead of generating a function call, floating point add, and return.
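
As a rough sketch of the call site described above: the caller name `sum_three` is made up, and both pieces are shown in one listing only so that it compiles on its own. In a real project add_flt would live in foo.cpp and the caller in bar.cpp, since a compiler that sees both in one file can inline even without LTO.

```cpp
// foo.cpp (per the example above): the small function.
float add_flt(float a, float b)
{
  return a + b;
}

// bar.cpp: only this declaration would be visible when compiling the caller.
float add_flt(float a, float b);

// Hypothetical caller. Without LTO, compiling bar.cpp alone sees just the
// declaration, so each use is a real call. With -flto the body is visible
// at link time and the calls can be replaced by plain floating point adds.
float sum_three(float a, float b, float c)
{
  return add_flt(add_flt(a, b), c);
}
```

Comparing the objdump disassembly of `sum_three` from a build with and without -flto should show whether the calls survived.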

<edit>
It really depends on the code whether LTO is a win or not. Since I'm more focused on individual optimizations in the PowerPC (specifically adding support for the forthcoming power9 processor), I don't tend to use LTO for my tests. Other people in IBM run SPEC with LTO and make pronouncements about the speed of the machine. I don't speak for IBM, but as a generalization, roughly half of the benchmarks were performance neutral, one benchmark was 3% slower, and the rest were faster (ranging from 2-24% faster). Of course SPEC code is much different from most of the code that runs on Teensys, so your mileage will vary.
 
Is LTO tied to the optimization level at all? Is there some way to stop it from inlining a function? I found a pragma "no-lto" but it gives all types of warnings about "plugin needed to handle lto object".

Just to update everyone: delayMicroseconds works the same for LTO and non-LTO; I checked with my scope just now.
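
On the question of stopping a function from being inlined, one thing that may be worth trying is GCC's `noinline` function attribute. The sketch below uses made-up names (`keep_out_of_line`, `caller`) and is untested against the Zilch context switch; it only asks GCC not to inline the function and makes no promise that the generated body matches hand-written assembly.

```cpp
// Hypothetical stand-in for a function whose body must stay out of line.
// noinline is a GCC function attribute. The empty asm statement gives the
// function a side effect so the call is not optimized away altogether.
__attribute__((noinline)) int keep_out_of_line(int x)
{
  asm volatile("");
  return x + 1;
}

// Hypothetical caller: with noinline the disassembly should still show a
// call to keep_out_of_line here instead of the add being folded in.
int caller(int x)
{
  return keep_out_of_line(x) * 2;
}
```

Checking the disassembly of `caller` with objdump, with and without LTO, would show whether the call is kept.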
 
Is this a bug or a feature? I don't know, but LTO seems to really turn your code upside down.
Duff!! Indeed !

Frank did you look at the memcpy_audio.S disassembly? Does it work?

Yes, it works, but, really it is totally different

Edit: err.. no my fault, it is identical :) I compared the wrong parts..lol..
sorry
 
So I found at least with my code this works:
Code:
#pragma GCC push_options
#pragma GCC optimize ("no-lto")

void funct() {

}

#pragma GCC pop_options
It does give all types of warnings when I tried it for delayMicroseconds though.
 
Not sure how important this is, but I have had the Teensy Loader app fault if its window is still open when I shut down Windows.

This is Windows 10 64 bit, with this beta...

TD-fault_shutdown.jpg

It happened today and yesterday when I did a shutdown.
 
@KurtE. I have had the same fault display (at shutdown every time I have used the loader) on my win 10 64-bit system since I updated my teensyduino to the latest version. However it doesn't seem to affect anything that I can see.
 
Arduino just released version 1.8.0. The version increase appears to be related to unifying Arduino.org and Arduino.cc boards into a single software release.

I'm going to revert to the old toolchain and publish a new beta. We'll probably do a week or so of testing and merging little last-minute updates. Then this gcc 5.4 toolchain testing can resume in January. Anyone who really wants to keep playing with gcc 5.4 can still use 1.34-beta1, just not with the new Arduino 1.8.0 release.
 
I found a fix for the crash-on-restart problem on Windows 10.

Right now I'm looking into compiler warnings with several libraries. Some happen with -O2 or -O3 optimization, even with gcc 4.8. Many others are just sloppy library code. I'm trying to clean up as much of this as I can. I'm debating whether to make 1.34-beta3, or just do a normal release.
 
Before I lose this... here's my list of libraries known to have errors with the new toolchain:


Code:
Adafruit_CC3000  buildtest example

ks0108  error compiling

LowPower fails on all boards, even Teensy 2.0

PS2Keyboard  errors

ST7565 error, C++ overload on srandom()


These libraries have warnings. Probably harmless, and most probably also happen with the old toolchain.

Code:
FlexCAN  CANtest warning

OSC   many warnings

RadioHead  warnings

teensy_ssd1351  warnings

TinyGPS  test_with_gps_device warning

VirtualWire  warnings, unused stuff

X10  many warnings - ancient arduino stuff


Adafruit_SleepyDog  warning on Teensy 3.x

AppleMidi  warnings

Eigen313  warnings

MFRC522  warnings

EthernetBonjour  many warnings
 
With Teensy 3.2, "Fastest with LTO", I get

"upload@1679610-Teensy Firmware 'print_mac.ino.TEENSY31.hex' is not compatible with '1679610-Teensy'"

with TYQT :)

Might be a TYQT issue ??
 