Thread: Teensyduino 1.34 Beta #1 (ARM Toolchain Update)

  1. #51
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,044
    Quote Originally Posted by duff View Post
    How do you stop LTO from touching my inline assembly code?
    What is it doing to your inline asm?

  2. #52
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    1,386
    Quote Originally Posted by duff View Post
    How do you stop LTO from touching my inline assembly code?
    Does a barrier work, i.e. enclosing the code with asm volatile("" ::: "memory"); ?
    see https://forum.pjrc.com/threads/17469...ll=1#post22279

  3. #53
    Senior Member duff's Avatar
    Join Date
    Jan 2013
    Location
    Las Vegas
    Posts
    899
    Quote Originally Posted by PaulStoffregen View Post
    What is it doing to your inline asm?
    Here you can see it without downloading anything: just compile this sketch with and without LTO and look at the disassembly of the sketch using objdump. (I used the "Fastest with LTO" and "Fastest" menu options for this test.)
    Code:
    void setup() {
      
    }
    
    void loop() {
      delayMicroseconds(100000);
    }
    While 'delayMicroseconds' seems to work, it looks like LTO is inlining everything: the disassembly for 'delayMicroseconds' is not the same as without LTO. Also, with LTO I don't see a call to the 'setup' or 'loop' functions in 'main' either. This example is just to show what LTO is doing to the assembly, not to point to a problem with delayMicroseconds!

    I will check with my scope to see if the delay times are right today.

    Quote Originally Posted by manitou View Post
    Does a barrier work, i.e. enclosing the code with asm volatile("" ::: "memory"); ?
    see https://forum.pjrc.com/threads/17469...ll=1#post22279
    LTO still seems to muck with the inline assembly even when the memory barrier is added.

    For my particular problem: in my Zilch library I redefine yield, which does the context switch, and the compiled assembly needs to be exactly what the inline assembly says or it does not work. With LTO enabled it is really not even close. I tried preventing my yield function from being inlined and adding memory barriers around the inline assembly, but to no avail.

    I think if this toolchain update does get adopted, there should be a menu option for turning LTO on or off.

    Sorry for all the LTO references, might give some a headache.


    edit: I'm using a Teensy 3.2 for this.
    Last edited by duff; 12-20-2016 at 04:59 PM.

  4. #54
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    1,386
    Quote Originally Posted by duff View Post
    LTO still seems to muck with the inline assembly even when the memory barrier is added.
    Since it is a Link-Time Optimization I guess it makes sense that a compile-time memory-barrier would have no effect
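
For readers unfamiliar with the barrier being discussed, here is a minimal sketch of what asm volatile("" ::: "memory") does. It tells the compiler not to reorder or cache memory accesses across that point; it makes no promise about inlining or about the instructions generated around it, which may be why it did not help in duff's case. The names shared_flag, compiler_barrier and publish are made up for illustration and do not come from Zilch or the Teensy core.

```cpp
// Hypothetical variable standing in for state shared with other code.
static int shared_flag = 0;

// Compiler-level barrier: emits no instructions, but forbids the compiler
// from moving memory reads/writes across this point.
static inline void compiler_barrier() {
    asm volatile("" ::: "memory");
}

// Store a value, then read it back after the barrier.
int publish(int value) {
    shared_flag = value;
    compiler_barrier();
    return shared_flag;  // reloaded from memory because of the "memory" clobber
}
```

Note that nothing here stops the optimizer from inlining publish() into its callers; the barrier only constrains memory accesses.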

  5. #55
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    Quote Originally Posted by PaulStoffregen View Post
    I'm still debating what the default should be for each board. Obviously Teensy LC needs -Os. In most of the tests I've tried, -O3 adds significantly to the program size, so I'm a bit reluctant to make it the default for Teensy 3.0, 3.1, 3.2 where so many programs already exist.
    Yes, why not stay with -O1 :-)
    The other added options are great. If one wants to use them, they are available now.

    @Duff: why not print a #warning that your code does not work with LTO
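
A rough sketch of what such a warning could look like. As far as I know, GCC does not predefine a macro when -flto is passed, so the library cannot detect LTO by itself; this example assumes the build system defines a macro when LTO is selected. ZILCH_LTO_ENABLED and kLtoWarningActive are hypothetical names invented for this sketch, not part of Zilch or Teensyduino.

```cpp
// ZILCH_LTO_ENABLED is a hypothetical macro; the build would have to define it
// (for example with a -D flag) whenever an LTO menu option is chosen.
#if defined(ZILCH_LTO_ENABLED)
#warning "This library's context switch is not known to work with LTO enabled"
constexpr bool kLtoWarningActive = true;
#else
constexpr bool kLtoWarningActive = false;
#endif
```

Without such a build-defined macro the #warning branch is never taken, so this only helps if the board definitions cooperate.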

  6. #56
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
    Location
    Ayer Massachussetts
    Posts
    2,631
    It might be the LTO bug/feature has been fixed. However, unless somebody sends in a bug report to the proper channels, it may never be fixed (https://gcc.gnu.org/bugs/)

  7. #57
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    Yes, @Duff should create a minimal example and report the bug ...

  8. #58
    Senior Member duff's Avatar
    Join Date
    Jan 2013
    Location
    Las Vegas
    Posts
    899
    Quote Originally Posted by MichaelMeissner View Post
    It might be the LTO bug/feature has been fixed. However, unless somebody sends in a bug report to the proper channels, it may never be fixed (https://gcc.gnu.org/bugs/)
    Is this a bug or a feature? I don't know, but LTO seems to really turn your code upside down.

    Quote Originally Posted by Frank B View Post
    @Duff: why not print a #warning that your code does not work with LTO
    That could work, I'll see what I can do with that.

    Quote Originally Posted by Frank B
    Yes, @Duff should create a minimal example and report the bug ...
    Before I go that far I would want to see if other people's code is affected. I really don't know enough right now to say one way or the other. Frank, did you look at the memcpy_audio.S disassembly? Does it work?

  9. #59
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
    Location
    Ayer Massachussetts
    Posts
    2,631
    LTO (or link time optimization) essentially combines all of your modules together, and then optimizes the whole program. So if you have a function in a module, say foo.cpp:

    Code:
    float add_flt (float a, float b)
    {
      return a+b;
    }
    And you have a call to add_flt in a different module, say bar.cpp or bar.pde.

    If LTO is not used, then it will always generate a call instruction, because it does not know what add_flt does.

    If LTO is used, then the compiler can see that it is a simple function, and it should inline the function, instead of generating a function call, floating point add, and return.
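
To make the two-module situation concrete, here is a small sketch. Both "files" are shown in one block so it compiles as a single unit; in a real project they would be separate source files, and only then does LTO make a difference. sum_twice is a hypothetical caller added for illustration.

```cpp
// ---- bar.cpp (caller): only sees a declaration of add_flt ----
float add_flt(float a, float b);

// Hypothetical caller. Without LTO, each add_flt() here becomes a real call,
// because the compiler cannot see the body while compiling bar.cpp.
// With LTO, the optimizer sees the body at link time and can replace each
// call with a single floating point add.
float sum_twice(float a, float b) {
    return add_flt(a, b) + add_flt(a, b);
}

// ---- foo.cpp (callee): the definition from the post above ----
float add_flt(float a, float b) {
    return a + b;
}
```

The result is the same either way; only the generated instructions differ, which is what shows up in an objdump comparison.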

    <edit>
    It really depends on the code whether LTO is a win or not. Since I'm more focused on individual optimizations in the PowerPC (specifically adding support for the forthcoming POWER9 processor), I don't tend to use LTO for my tests. Other people in IBM run SPEC with LTO and make pronouncements about the speed of the machine. I don't speak for IBM, but as a generalization, roughly half of the benchmarks were performance neutral, one benchmark was 3% slower, and the rest were faster (ranging from 2-24% faster). Of course SPEC code is much different from most of the code that runs on Teensys, so your mileage will vary.
    Last edited by MichaelMeissner; 12-20-2016 at 10:31 PM.

  10. #60
    Senior Member duff's Avatar
    Join Date
    Jan 2013
    Location
    Las Vegas
    Posts
    899
    Is LTO tied to the optimization level at all? Is there some way to stop it from inlining a function? I found a pragma "no-lto" but it gives all types of warnings about "plugin needed to handle lto object".

    Just to update everyone: delayMicroseconds works the same with and without LTO. I checked with my scope just now.
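
On the inlining question, one commonly used option is GCC's noinline function attribute, often paired with used so the function is kept even when nothing appears to reference it. This is only a sketch: it has not been checked against Zilch or the LTO builds in this thread, and it may not be enough for a context switch that depends on exact register usage. GCC also documents a noclone attribute for stopping specialized copies of a function. swap_marker and call_twice are hypothetical names.

```cpp
// Hypothetical stand-in for a function that should stay a real, separate
// function instead of being merged into its callers.
__attribute__((noinline, used))
int swap_marker(int value) {
    asm volatile("" ::: "memory");  // compiler-level memory barrier
    return value + 1;
}

// Each call below should remain an actual call to swap_marker().
int call_twice(int v) {
    return swap_marker(swap_marker(v));
}
```

Comparing the objdump output with and without the attribute is the practical way to see whether the optimizer honored it.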

  11. #61
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    Quote Originally Posted by duff View Post
    Is this a bug or a feature? I don't know, but LTO seems to really turn your code upside down.
    Duff!! Indeed !

    Quote Originally Posted by duff View Post
    Frank did you look at the memcpy_audio.S disassembly? Does it work?
    Yes, it works, but really it is totally different.

    Edit: err... no, my fault. It is identical; I compared the wrong parts. Sorry.
    Last edited by Frank B; 12-21-2016 at 06:04 PM.

  12. #62
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,044
    Just discovered -O3 with LTO optimizes away all the data from the audio lib sample player example. Not good.

  13. #63
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    My mp3 codecs work. It is quite complex code, with lots of tables and inline assembler, too.

  14. #64
    Senior Member duff's Avatar
    Join Date
    Jan 2013
    Location
    Las Vegas
    Posts
    899
    So I found at least with my code this works:
    Code:
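    // Save the current optimization options, then request "no-lto"
    // for the functions that follow.
    #pragma GCC push_options
    #pragma GCC optimize ("no-lto")
    
    void funct() {
      // code that must not be rearranged by LTO goes here
    }
    
    // Restore the options that were in effect before push_options.
    #pragma GCC pop_options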
    It did give all kinds of warnings when I tried it for delayMicroseconds, though.

  15. #65
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    3,432
    Not sure how important this is, but I have had the Teensy loader app fault if its window was still open when I tried to shut down Windows.

    This is Windows 10 64 bit, with this beta...

    [Attached image: TD-fault_shutdown.jpg]

    It happened today and yesterday when I did a shutdown.

  16. #66
    Senior Member bmillier's Avatar
    Join Date
    Apr 2016
    Location
    Halifax, N.S. Canada
    Posts
    127
    @KurtE. I have had the same fault display (at shutdown, every time I have used the loader) on my Windows 10 64-bit system since I updated my Teensyduino to the latest version. However, it doesn't seem to affect anything that I can see.

  17. #67
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,044
    Arduino just released version 1.8.0. The version increase appears to be related to unifying Arduino.org and Arduino.cc boards into a single software release.

    I'm going to revert to the old toolchain and publish a new beta. We'll probably do a week or so of testing and merging little last-minute updates. Then this gcc 5.4 toolchain testing can resume in January. Or anyone who really wants to keep playing with gcc 5.4 can still use 1.34-beta1, just not with the new Arduino 1.8.0 release.

  18. #68
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    Quote Originally Posted by PaulStoffregen View Post
    Or anyone who really wants to keep playing with gcc 5.4 can still use 1.34-beta1, just not with the new Arduino 1.8.0 release.
    I'll stay with 1.34-beta1

  19. #69
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    3,432
    I have both on my Windows machine

  20. #70
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,044
    I found a fix for the crash on Windows 10 restart problem.

    Right now I'm looking into compiler warnings with several libraries. Some happen with -O2 or -O3 optimization, even with gcc 4.8. Many others are just sloppy library code. I'm trying to clean up as much of this as I can. Debating whether to make 1.34-beta3, or just do a normal release.

  21. #71
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,044
    Before I lose this... here's my list of libraries known to have errors with the new toolchain:


    Code:
    Adafruit_CC3000  buildtest example
    
    ks0108  error compiling
    
    LowPower fails on all boards, even Teensy 2.0
    
    PS2Keyboard  errors
    
    ST7565 error, C++ overload on srandom()

    These libraries have warnings. Probably harmless, and most probably also happen with the old toolchain.

    Code:
    FlexCAN  CANtest warning
    
    OSC   many warnings
    
    RadioHead  warnings
    
    teensy_ssd1351  warnings
    
    TinyGPS  test_with_gps_device warning
    
    VirtualWire  warnings, unused stuff
    
    X10  many warnings - ancient arduino stuff
    
    
    Adafruit_SleepyDog  warning on Teensy 3.x
    
    AppleMidi  warnings
    
    Eigen313  warnings
    
    MFRC522  warnings
    
    EthernetBonjour  many warnings

  22. #72
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    With Teensy 3.2, "fastest with LTO" i get

    "upload@1679610-Teensy Firmware 'print_mac.ino.TEENSY31.hex' is not compatible with '1679610-Teensy'"

    with TYQT :-)

    Might be a TYQT issue ??

  23. #73
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    6,110
    Quote Originally Posted by Frank B View Post
    With Teensy 3.2, "fastest with LTO" i get

    "upload@1679610-Teensy Firmware 'print_mac.ino.TEENSY31.hex' is not compatible with '1679610-Teensy'"

    with TYQT :-)

    Might be a TYQT issue ??
    I posted a note to Koromix last year and he posted a fix shortly after. It is working [on T_3.0] with the latest version, as I confirmed last night - see this post

  24. #74
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,397
    Confirmed :-)

  25. #75
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    6,110
    Quote Originally Posted by Frank B View Post
    Confirmed :-)
    Good - I posted a note and link on the new 1.36 beta thread
