Thread: Teensyduino 1.36 Beta #1 (ARM Toolchain Update)

  1. #1
    Administrator Paul's Avatar
    Join Date
    Oct 2012
    Posts
    237

    Teensyduino 1.36 Beta #1 (ARM Toolchain Update)

    Here is a first beta test for Teensyduino 1.36.

    Linux 32 bit:
    https://www.pjrc.com/teensy/td_136-b...nstall.linux32

    Linux 64 bit:
    https://www.pjrc.com/teensy/td_136-b...nstall.linux64

    Linux ARM: (coming soon...)
    https://www.pjrc.com/teensy/td_136-b...stall.linuxarm

    Mac OS-X:
    https://www.pjrc.com/teensy/td_136-b...inoInstall.dmg

    Windows:
    https://www.pjrc.com/teensy/td_136-b...inoInstall.exe


    The only change since 1.35 is switching to the gcc 5.4 toolchain, which was briefly attempted with 1.34-beta1 before an Arduino release forced reverting to the old toolchain. With any luck we'll have time to test thoroughly.

    LTO (link time optimization) and the "Fastest" (-O3 optimization) setting may break some programs or libraries. Please report anything you find which doesn't work.

  2. #2
    Senior Member defragster's Avatar
    Join Date
    Feb 2015
    Posts
    4,272
    Saw this used elsewhere: would __attribute__((optimize("O1"))) be a workable way to control optimization for individual functions? On 1.34b1 I saw Zilch had an issue, though it may have come from the LTO rework after the compile. I adorned a couple of functions in a simple sketch; it compiles, changes the code size, and works, but it has not been tested on a known fail case.

    Code:
    __attribute__((optimize("O3"))) void foo(  ... )
    {
    }
    Last edited by defragster; 01-21-2017 at 02:27 AM.
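
    A minimal, host-compilable sketch of the per-function attribute discussed above. The function name checksum_o1 and the macro EXAMPLE_OPT_O1 are hypothetical, chosen only for illustration. The attribute is GCC-specific, so the macro expands to nothing on other compilers; whether it actually cures any LTO or -O3 breakage in Teensyduino is not established in this thread.

```cpp
#include <cstdint>

// The optimize attribute is a GCC extension; on other compilers the macro
// expands to nothing and the function uses the global optimization level.
#if defined(__GNUC__) && !defined(__clang__)
#define EXAMPLE_OPT_O1 __attribute__((optimize("O1")))
#else
#define EXAMPLE_OPT_O1
#endif

// Hypothetical helper: asks GCC to compile this one function at -O1,
// whatever -O level was passed on the command line.
EXAMPLE_OPT_O1
uint32_t checksum_o1(const uint8_t *data, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++) {
        sum += data[i];   // plain byte sum, nothing hardware specific
    }
    return sum;
}
```

    Comparing the code size or the disassembly of the function with and without the attribute is one way to confirm that it took effect.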

  3. #3
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    14,249
    Here's a list of the problems discovered from 1.34-beta1.



    Libraries which might have compile errors:

    • Adafruit_CC3000 buildtest example
    • ks0108 error compiling
    • LowPower fails on all boards, even Teensy 2.0
    • PS2Keyboard errors
    • ST7565 error, C++ overload on srandom()
    Last edited by PaulStoffregen; 01-21-2017 at 12:23 AM.

  4. #4
    Senior Member defragster's Avatar
    Join Date
    Feb 2015
    Posts
    4,272
    I'll look at delayMicroseconds() after my 1.8.1 and 1.36 downloads finish and are installed. I'll test it against millis() and manitou's RTC isr() on the T_3.5/3.6, and then if needed try __attribute__((optimize("O1"))) [or "O2"], since that is a single function to test against.

  5. #5
    Senior Member defragster's Avatar
    Join Date
    Feb 2015
    Posts
    4,272
    A quick test shows that elapsedMicros measured over 10 seconds agrees with 10 calls of delayMicroseconds( 1000000 ) on 1.34 and 1.36.

    This was under 1.8.0 w/TD_1.34 and 1.8.1 w/TD_1.36, with and without LTO and FASTEST.

    I ran it in a for() loop, so there is 13+ms of overhead that seems consistent enough.

  6. #6
    Senior Member defragster's Avatar
    Join Date
    Feb 2015
    Posts
    4,272
    For those using TYQT on a Teensy older than the T_3.5: if uploads fail, you may need an updated version of TYQT after the toolchain update. I found this last year with the 1.34 beta, Koromix addressed it, and it worked on my T_3.0 last night as noted in this linked post.

    I didn't get solid confirmation that the '__attribute__((optimize("O1")))' decoration on delayMicroseconds() was working, since there was nothing to fix. Without looking at the ASM output I could only go by code size, which didn't seem different for that single-function change. It did compile.

    NOTE: I noted before that when editing CORE files, the IDE used to force a rebuild on each compile. The latest IDEs never seem to recompile, even when I edited the core_pins.h file, until I force a full rebuild with a 'Tools' change.

  7. #7
    Senior Member Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    3,767
    I'm playing with GCC 6; it has some nice new features plus additions by ARM.
    It works, but I have not had the time to do any benchmarks. I wonder if the new "ARM PURECODE support for ARMv7-M" is helpful (-> faster code?) for the Teensies?
    Last edited by Frank B; 01-22-2017 at 05:25 PM.

  8. #8
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    14,249
    Quote Originally Posted by Frank B View Post
    I'm playing with GCC 6, it has some nice new features (https://gcc.gnu.org/gcc-6/changes.html) [...] I wonder if the new "ARM PURECODE support for ARMv7-M" is helpful (-> faster code?) for the Teensies?
    I'm curious about this too. But my guess is it'll be slower if multiple instructions are needed to create literals.

    Wasn't there some option to specify different code generation for fast vs slow program/flash memory? Maybe that would make a difference, especially on 3.2 & 3.5 where there's very little cache for the flash memory.

  9. #9
    Senior Member Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    3,767
    Quote Originally Posted by PaulStoffregen View Post
    I'm curious about this too. But my guess is it'll be slower if multiple instructions are needed to create literals.

    Wasn't there some option to specify different code generation for fast vs slow program/flash memory? Maybe that would make a difference, especially on 3.2 & 3.5 where there's very little cache for the flash memory.
    -mslow-flash-data ?

    This should work with GCC 5, too. I remember I did some tests, but the gain, if any, was very small.

  10. #10
    Senior Member KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    2,330
    Suggestion maybe put a quick summary of what these options are in the optimize menu. Like what is LTO...

    Also by the ordering in menu, is faster faster than fast?

    That is the order of the items in the menu (disregard LTO) are:
    Faster, Fast, Fastest, Debug, Smallest

  11. #11
    Senior Member duff's Avatar
    Join Date
    Jan 2013
    Location
    Las Vegas
    Posts
    790
    Quote Originally Posted by KurtE View Post
    Suggestion maybe put a quick summary of what these options are in the optimize menu. Like what is LTO...

    Also by the ordering in menu, is faster faster than fast?

    That is the order of the items in the menu (disregard LTO) are:
    Faster, Fast, Fastest, Debug, Smallest
    I agree the optimization menu needs to be fixed. It really makes sense to have a logical order, especially when testing the libraries against different options.

    Also, is this the same toolchain build as 1.34? The pragma I was using to disable LTO for certain code sections, "#pragma GCC optimize ("no-lto")", won't compile on this update but did on 1.34.
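
    For comparison, a minimal sketch of GCC's documented push_options / optimize / pop_options pragmas, which change the optimization level for a region of a source file. The function name scale_o1 is hypothetical. This only illustrates the per-section pragma syntax; it is not a confirmed replacement for the "no-lto" pragma above, and whether any pragma can switch LTO off per section in this toolchain is not established here.

```cpp
#include <cstdint>

// These pragmas are GCC-specific, so they are guarded to avoid
// unknown-pragma warnings on other compilers.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options        // save the current optimization options
#pragma GCC optimize ("O1")     // functions defined below use -O1
#endif

// Hypothetical function compiled under the region's options.
uint32_t scale_o1(uint32_t x)
{
    return x * 3u + 1u;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options         // restore the options saved above
#endif
```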

    Snooze does not work with this build yet, with either LTO or NO-LTO. I don't know what's going on, but there are problems.

    Paul, is there any guidance on how you plan to proceed with this toolchain update? Are you planning to make all these optimization build options available to the user? I ask because I don't want to spend a lot of time fixing things that won't be used anyway in the standard Teensyduino Arduino IDE update.

  12. #12
    Senior Member KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    2,330
    Yesterday in the SHT31 thread I mentioned that it stopped working properly. I was finding that some of the Wire.endTransmission calls were failing with timeout (4)... Looking at the logic output on SCL/SDA, it appeared that the Stop bit was not sent on the previous call to requestFrom... I put a hack into the other library: if endTransmission fails, I do it again...
    More in the thread: https://forum.pjrc.com/threads/41252...l=1#post131258

    I am quite sure this worked on the previous toolchain...

    I found that if I went back to the released version of Wire it worked... I checked my differences in these functions, and they had to do with using a pointer to the I2C structure to get to the member variables like C1 and S instead of hard-coded registers.

    What was interesting was that I then changed a couple of functions that I had as inline to not be inline... (Removed #if 0 code here...)
    Code:
    uint8_t TwoWire::i2c_status(KINETIS_I2C_t  *kinetisk_pi2c)
    {
    	return kinetisk_pi2c->S;
    }
    
    void TwoWire::i2c_wait(KINETIS_I2C_t  *kinetisk_pi2c)
    {
    	while (1) {
    		if ((i2c_status(kinetisk_pi2c) & I2C_S_IICIF)) break;
    	}
    	kinetisk_pi2c->S = I2C_S_IICIF;
    }
    And it started working.
    I am using this Beta, T3.6 Fast

    Will investigate more... For example, I am not sure which of these two changes made the difference...

    Update: Inlining the i2c_status function causes no problem, but inlining i2c_wait causes the issue...
    Last edited by KurtE; 01-25-2017 at 03:00 PM. Reason: Update

  13. #13
    Senior Member Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    3,767
    There might be a "volatile" missing somewhere, so that the compiler thinks it is enough to read the register once and the "break" never occurs. (Inside i2c_status? I did not read the code.)
    Last edited by Frank B; 01-25-2017 at 06:13 PM.
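
    A minimal host-side sketch of the point about volatile. MockReg, poll_flag, and the mask value used in the usage note are hypothetical stand-ins, not the Teensy core's definitions. Because the member is volatile, the compiler has to emit a fresh read on every pass through the loop; without the qualifier it would be allowed to read once and reuse the value. The loop is bounded so that the sketch terminates on a desktop machine.

```cpp
#include <cstdint>

// Hypothetical stand-in for a peripheral status register block.
struct MockReg {
    volatile uint8_t S;   // volatile: every access must really happen
};

// Polls the status byte up to max_polls times.
// Returns true as soon as any bit in `mask` is seen, false otherwise.
bool poll_flag(MockReg *reg, uint8_t mask, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (reg->S & mask) {   // re-read on each iteration
            return true;
        }
    }
    return false;
}
```

    With a MockReg whose S already has bit 0x02 set, poll_flag(&reg, 0x02, 5) returns true on the first read; with S equal to 0 it returns false after five reads.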

  14. #14
    Senior Member KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    2,330
    Thanks Frank,
    The kinetisk_pi2c pointer is of type KINETIS_I2C_t, which is defined in kinetis.h:

    Code:
    typedef struct {
    	volatile uint8_t	A1;
    	volatile uint8_t	F;
    	volatile uint8_t	C1;
    	volatile uint8_t	S;
    	volatile uint8_t	D;
    	volatile uint8_t	C2;
    	volatile uint8_t	FLT;
    	volatile uint8_t	RA;
    	volatile uint8_t	SMB;
    	volatile uint8_t	A2;
    	volatile uint8_t	SLTH;
    	volatile uint8_t	SLTL;
    } KINETIS_I2C_t;
    And as I mentioned, the i2c_status function appears to work inline, and it is the one that dereferences the pointer... I will try again and look at the generated code to see if I can spot anything obvious.

  15. #15
    Junior Member
    Join Date
    Nov 2016
    Location
    San Diego
    Posts
    5
    Just wanted to let you know that when targeting Teensy 3.1/2, Snooze results in the following error only when using "Faster with LTO", "Fastest with LTO" and "Smallest Code with LTO":
    Code:
    mk20dx256.ld:45 cannot move location counter backwards (from 00000000000004a8 to 0000000000000400)
    collect2: error: ld returned 1 exit status
    It compiles successfully for the remaining 7 options for Teensy 3.1/2. When targeting Teensy 3.6, all 10 optimization options compile successfully. I find it odd that it works for some of the LTO options. To test this, I used the following sketch:

    Code:
    #include <Snooze.h>
    SnoozeTimer timer;
    SnoozeBlock config(timer);
    void setup() {
      timer.setTimer(1000);
    }
    void loop() {
      (void)Snooze.sleep( config );
      (void)Snooze.deepSleep( config );
      (void)Snooze.hibernate( config );
    }

  16. #16
    Senior Member duff's Avatar
    Join Date
    Jan 2013
    Location
    Las Vegas
    Posts
    790
    Quote Originally Posted by jeffreytcash View Post
    Just wanted to let you know that when targeting Teensy 3.1/2, Snooze results in the following error only when using "Faster with LTO", "Fastest with LTO" and "Smallest Code with LTO":
    Code:
    mk20dx256.ld:45 cannot move location counter backwards (from 00000000000004a8 to 0000000000000400)
    collect2: error: ld returned 1 exit status
    Yes, I know this issue quite well; it's a problem with an externed variable in the wake.h file. I wanted to fix it, but since I haven't heard from Paul on the direction of this toolchain update, it's on the back burner for now, because this error is not consistent across the different compile options and/or Teensies.

  17. #17
    Senior Member Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    3,767
    Code:
    mk20dx256.ld:45 cannot move location counter backwards (from 00000000000004a8 to 0000000000000400)
    collect2: error: ld returned 1 exit status
    A way that always works is to edit the linker file (attached). My edit moves the startup code behind the flashconfig.
    This wastes about 100 bytes of flash, but we have 1MB.

    (Just in case you get this error and need a quick way to fix it.)
    Attached Files
    Last edited by Frank B; 01-31-2017 at 09:24 PM.

  18. #18
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    14,249
    Would a noinline attribute fix this?
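
    For reference, a minimal sketch of the attribute syntax in question. The names add_offset and EXAMPLE_NOINLINE are hypothetical. It shows only how a function is marked so that GCC does not inline it into its callers; whether this resolves the error above is not established in this thread.

```cpp
#include <cstdint>

// noinline is understood by GCC and compatible compilers; elsewhere the
// macro expands to nothing and the function may be inlined as usual.
#if defined(__GNUC__)
#define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#define EXAMPLE_NOINLINE
#endif

// Hypothetical function: the attribute asks the compiler to keep this as
// a separate, callable function rather than inlining it into callers.
EXAMPLE_NOINLINE
uint32_t add_offset(uint32_t x)
{
    return x + 0x400u;
}
```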

  19. #19
    Senior Member Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    3,767
    Paul, to prevent this error (it can occur at any time with different libs under LTO), would it be feasible to use the alternate startup address (linker file from above) if "lto" is enabled?

  20. #20
    Senior Member defragster's Avatar
    Join Date
    Feb 2015
    Posts
    4,272
    Frank- IIRC you ran your T_3.5's at (OC) speeds and found them to work? Would you suggest Paul might enable one or more T_3.5 OC speeds in this Beta/release?

  21. #21
    Senior Member Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    3,767
    Quote Originally Posted by defragster View Post
    Frank- IIRC you ran your T_3.5's at (OC) speeds and found them to work? Would you suggest Paul might enable one or more T_3.5 OC speeds in this Beta/release?
    I think the 3.6 is a better choice than overclocking a 3.5.
    The next logical step is a Cortex-M7 with twice the DMIPS/MHz.
    I don't know if NXP has an MCU that can be used.
    Last edited by Frank B; 02-01-2017 at 07:59 PM.

  22. #22
    Senior Member KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    2,330
    Maybe we should do for the T3.5 what we have for the T3.1: define some of the overclocked options, but leave the menu entries for them commented out.

    Yes, the T3.6 is nicer for higher speeds, but if you need/want 5V... then maybe it should be put in as an option that the user can try out if desired.

  23. #23
    Senior Member defragster's Avatar
    Join Date
    Feb 2015
    Posts
    4,272
    Boards.txt already has them commented out. I tried 144 the other day; it is now gone after installing this beta.

    Indeed running the T_3.6 at 240 is way cooler than OC on the T_3.5. But if it works? Maybe it isn't as reliable even for conditional OC support? Most of my time on T_3.5 was in beta when pushing it wasn't desired to assure KS release.

    Just bringing it up because of this thread: Teensy-3-5-overclocking-(not-3-6)

  24. #24
    Senior Member KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    2,330
    Quote Originally Posted by defragster View Post
    Boards.txt already has them under comment - I tried 144 the other day - now gone after installing this beta.

    Indeed running the T_3.6 at 240 is way cooler than OC on the T_3.5. But if it works? Maybe it isn't as reliable even for conditional OC support? Most of my time on T_3.5 was in beta when pushing it wasn't desired to assure KS release.

    Just bringing it up because of this thread: Teensy-3-5-overclocking-(not-3-6)
    I don't see any commented-out ones in either my 1.8.0 with the current released TD or my 1.8.1 with this beta... They both look like:
    Code:
    teensy35.menu.speed.120=120 MHz
    teensy35.menu.speed.96=96 MHz
    teensy35.menu.speed.72=72 MHz
    teensy35.menu.speed.48=48 MHz
    teensy35.menu.speed.24=24 MHz
    teensy35.menu.speed.16=16 MHz (No USB)
    teensy35.menu.speed.8=8 MHz (No USB)
    teensy35.menu.speed.4=4 MHz (No USB)
    teensy35.menu.speed.2=2 MHz (No USB)
    teensy35.menu.speed.120.build.fcpu=120000000
    teensy35.menu.speed.96.build.fcpu=96000000
    teensy35.menu.speed.72.build.fcpu=72000000
    teensy35.menu.speed.48.build.fcpu=48000000
    teensy35.menu.speed.24.build.fcpu=24000000
    teensy35.menu.speed.16.build.fcpu=16000000
    teensy35.menu.speed.8.build.fcpu=8000000
    teensy35.menu.speed.4.build.fcpu=4000000
    teensy35.menu.speed.2.build.fcpu=2000000

  25. #25
    Junior Member
    Join Date
    May 2016
    Posts
    3
    Hello, when compiling with the "Faster with LTO" option, this version does not show any error with the library https://github.com/orangkucing/analogComp/tree/teensy3, but the result does not work on the Teensy. All the other compile options work.

    I am using Arduino 1.8.1 for Mac OS X.
    Last edited by manmoi01; 02-01-2017 at 09:53 PM.
