Teensyduino 1.50 Released

Status
Not open for further replies.
I have no idea why your Mac is saying Teensy Loader requires 10.14. It doesn't (or shouldn't).

I ran it this way just now on 10.12.6. Here's a screenshot.

screen.jpg

At least 2 people on Twitter have confirmed it ran on their old Macs with 10.10 Yosemite. One specifically said uploading code to a Teensy 3.2 worked.

https://twitter.com/bikerglen/status/1223766258723639296

https://twitter.com/Dithermaster/status/1223777431099793408
 
Now that's *really* confusing. They're built from the same code base on the same Mac (running 10.14 Mojave).

I don't want to add to the confusion, but I did a Get Info on both files and there are some differences:
file size
creation date
enable app nap option

Screen Shot 2020-02-02 at 05.14.18.png

I consider for myself this problem fixed but if you want to get to the bottom of this I will gladly help troubleshooting.
Sadly, my High Sierra installer is corrupted, so I cannot test on a clean OS install.
 
Because I was bored, had shelved my debugger idea, and had to wait for something else, I played with the compiler again - this time with
GCC 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599]


GCC 5.4.1:
Coremark with -O2: 2313.57
Coremark with -O2 -fgcse-after-reload -finline-functions : 2344.67
Coremark with -O3: 2476.88

GCC 9.2.1:
Coremark with -O2: 2433.48
Coremark with -O2 -fgcse-after-reload -finline-functions : 2474.23
Coremark with -O3: 2399.04

With GCC 9.2.1, -O2 is faster than -O3, and -fgcse-after-reload -finline-functions gives a little more speed.
Performance-wise, switching to GCC 9.2.1 still does not gain much if you choose the right flags (which depend on the version): its best result, 2474.23, is about the same as GCC 5.4.1 with -O3 (2476.88). There is also a regression with -O3.
Otherwise I couldn't find any problems with 9.2.1, and it may have some bugs fixed.
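
To put the scores above in relative terms, here is a minimal sketch that computes the percentage change from GCC 5.4.1 to GCC 9.2.1. The function name `percentChange` is made up for illustration; the inputs are the CoreMark results quoted in this post.

```cpp
#include <cstdio>

// Relative change from an old score to a new score, in percent.
double percentChange(double oldScore, double newScore) {
    return (newScore - oldScore) / oldScore * 100.0;
}

// Prints the GCC 5.4.1 -> GCC 9.2.1 change for each flag set quoted above.
void printCoremarkComparison() {
    std::printf("-O2:                  %+.2f%%\n", percentChange(2313.57, 2433.48));
    std::printf("-O2 plus extra flags: %+.2f%%\n", percentChange(2344.67, 2474.23));
    std::printf("-O3:                  %+.2f%%\n", percentChange(2476.88, 2399.04));
    // Best 9.2.1 result against best 5.4.1 result.
    std::printf("best vs. best:        %+.2f%%\n", percentChange(2476.88, 2474.23));
}
```

With these inputs, -O2 improves by roughly 5.2 % and -O3 drops by roughly 3.1 %, while the best result of each compiler differs by only about 0.1 %.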
 
I can confirm the compatibility with the core libs. I'm using GCC 9 as default for a couple of weeks now. Didn't notice any issues so far.

Advantages I found:
  • It is stricter (more standard-conforming) about pointer conversions (it actually found some 'bugs' in my code which GCC 5.4 didn't complain about)
  • It also has a working objdump for the M7. The list files generated by GCC 5.4 are often not readable.
  • Compiler output (errors/warnings) is more nicely formatted
 
I am pleased to say that Teensyduino 1.50 and Arduino 1.8.11 with Teensy 4.0 work nicely on my Ubuntu 19.10. Software reboot works too.
 
So it looks like I have the same problem that @neurofun has with the Catalina version. When I try to open the Teensy.app it says it needs 10.14:
Screen Shot 2020-02-04 at 5.51.32 PM.png
T_ISSUE.png
Got it to work when I copied the 1.50 Teensy.app from another Arduino install I did. I'll try to download the Catalina version again now.

Edit: Teensy.app is still X'ed out :(
 
Admin-less installation on Windows 10 64-bit over Arduino IDE 1.8.11 worked fine.

The bug with the supposedly required admin rights - probably for installing USB drivers on obsolete Windows versions - is still there though.
 
https://www.pjrc.com/teensy/td_libs_LedControl.html, which is part of Teensyduino, uses shiftOut(), which is too fast on T4.

I want to fix it - which is better?

a) add its own slow _shiftOut for T4 to the library
b) rewrite it to use SPI - which limits the usable pins
c) patch Teensyduino and insert the delay into the core shiftOut()?

What do you want?

Code:
void shiftOut_msbFirst(uint8_t dataPin, uint8_t clockPin, uint8_t value)
{
        uint8_t mask;
        const int dly = 10;
        for (mask=0x80; mask; mask >>= 1) {
                digitalWrite(dataPin, value & mask);
                delayNanoseconds(dly);
                digitalWrite(clockPin, HIGH);
                delayNanoseconds(dly);
                digitalWrite(clockPin, LOW);
                delayNanoseconds(dly);
        }
}

Edit: I would vote for c) to keep Arduino compatibility.
In LedControl, it happens at >= 396 MHz - my display uses a MAX7219.
Other chips might be even slower.
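
For comparison, option a) could look roughly like the sketch below: a helper that lives in the library instead of the core. The name `slowShiftOutMsbFirst` and the `dlyNs` parameter are hypothetical, and the `#else` branch holds host-side stand-ins for `digitalWrite()` and `delayNanoseconds()` (not the Teensy implementations) so the bit sequence can be checked off-target. On hardware this assumes a core that provides `delayNanoseconds()`, as the T4 core does.

```cpp
#include <stdint.h>

#ifdef ARDUINO
#include <Arduino.h>
#else
// Host-side stand-ins so the sketch can be compiled and checked off-target.
// These are assumptions for illustration, not the Teensy core implementation.
#include <vector>
static const uint8_t LOW = 0, HIGH = 1;
struct PinWrite { uint8_t pin; uint8_t level; };
static std::vector<PinWrite> pinLog;   // every digitalWrite() call, in order
static uint32_t totalDelayNs = 0;      // sum of all requested delays
static void digitalWrite(uint8_t pin, uint8_t level) { pinLog.push_back({pin, level}); }
static void delayNanoseconds(uint32_t ns) { totalDelayNs += ns; }
#endif

// Hypothetical library-local replacement for an MSB-first shiftOut().
// The default of 10 ns is the value reported to work with a MAX7219 in this post.
void slowShiftOutMsbFirst(uint8_t dataPin, uint8_t clockPin, uint8_t value,
                          uint32_t dlyNs = 10)
{
    for (uint8_t mask = 0x80; mask; mask >>= 1) {
        digitalWrite(dataPin, (value & mask) ? HIGH : LOW);
        delayNanoseconds(dlyNs);   // let the data line settle before the clock edge
        digitalWrite(clockPin, HIGH);
        delayNanoseconds(dlyNs);   // keep the clock high for a minimum time
        digitalWrite(clockPin, LOW);
        delayNanoseconds(dlyNs);   // keep the clock low for a minimum time
    }
}
```

Each byte then requests 8 x 3 x 10 ns = 240 ns of delay on top of the `digitalWrite()` calls. The downside compared to option c) is that every library using `shiftOut()` would need its own copy.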
 
Well, the time is independent of F_CPU, so F_CPU shouldn't matter (but I have tested up to 960 MHz ;-) ). (...and I hope the laws of nature have taken the T4 into account.)
 
Indeed the time is fixed - but the bus timing changes for the write speed?
 
The only issue I see is that it is a little slower - but that's the intention.
For my MAX7219 it works without the delays if F_CPU is < 396 MHz. So of course one could add a bunch of #ifdefs - would that be better?
I don't think so. I'd think we will have to increase the delays someday, and we will find chips (ATMEL? with "shiftIn()") where my added delay is too small.

The bus MHz does not matter that much here.
 
With T4 having runtime changeable F_CPU an #ifdef would not be perfect - adding a runtime conditional would be time consuming too.

I thought a test of that code at 24 and 800 MHz would quickly show whether the delay ends up being too much or too little, since the problem comes from the speed of the T4. I assume a T_3.6 at 256 MHz doesn't have this trouble?
 
The delay is (almost) the same at 24 and 800 MHz (it will be a bit longer at 24 MHz due to the nature of the waiting loop). But I understand your point - digitalWrite is slower at 24 MHz. Edit: Adding an if-statement would maybe slow it down even more.
Is that an issue? It would be an issue if it was slower than an 8-bit Arduino. I don't think it is. Anyway, I have no hardware to compare both.

I've not tried 3.6 - can do that in the next days (don't want to rip my hw now)
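
The remark about 24 MHz can be made concrete with a bit of arithmetic: a wait can only last a whole number of CPU cycles, so a 10 ns request cannot be met exactly at every clock. The sketch below is only that arithmetic. It is not how `delayNanoseconds()` is implemented, it ignores call and loop overhead, and the function names are made up for illustration.

```cpp
#include <stdint.h>

// Number of whole CPU cycles needed to cover at least `ns` nanoseconds
// at a clock of `cpuHz` (rounded up).
uint32_t cyclesForNanoseconds(uint32_t ns, uint32_t cpuHz) {
    const uint64_t product = (uint64_t)ns * cpuHz;
    return (uint32_t)((product + 999999999ULL) / 1000000000ULL);
}

// Shortest wait, in nanoseconds, that covers the request when only whole
// cycles are available.
double achievableNanoseconds(uint32_t ns, uint32_t cpuHz) {
    return cyclesForNanoseconds(ns, cpuHz) * 1e9 / cpuHz;
}
```

For a 10 ns request this gives 1 cycle (about 41.7 ns) at 24 MHz, 6 cycles (10 ns) at 600 MHz and 10 cycles (about 10.4 ns) at 960 MHz, so the wait is necessarily longer at the slowest clock even before any overhead is counted.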
 
Oops - I thought you had hardware in front of you and were testing it, so you could watch it run and select the delay time in action.
 
No, I have no ATMEL Arduino. I should buy some, some day.
shiftOut is just too fast for the display ("too fast" is definitely NOT compatible with ATMEL Arduino) - it does not work.

Just try it - the displays are sold for 1 EUR on eBay.
Edit: The datasheet says it is good for 10 MHz. I doubt an Atmel Arduino can reach that. Taking the usual overclocking headroom into account, the chip can probably handle much more.

We could make it like this, perhaps:
Code:
if (F_CPU_ACTUAL > XY) { // XY = speed threshold, value still to be determined
        for (mask=0x80; mask; mask >>= 1) {
                digitalWrite(dataPin, value & mask);
                delayNanoseconds(shiftOutDly);
                digitalWrite(clockPin, HIGH);
                delayNanoseconds(shiftOutDly);
                digitalWrite(clockPin, LOW);
                delayNanoseconds(shiftOutDly);
        }
} else {
        // same loop without the delays
}
 
@defragster:
What do you think?
Code:
static const uint32_t shiftOutDly = 10;                 // delay between pin writes, in ns
static const uint32_t maxSpeedBeforeDelay = 300000000;  // add delays only above 300 MHz

void shiftOut_lsbFirst(uint8_t dataPin, uint8_t clockPin, uint8_t value)
{
  uint8_t mask;
  if (F_CPU_ACTUAL > maxSpeedBeforeDelay)
    for (mask = 0x01; mask; mask <<= 1) {
      digitalWrite(dataPin, value & mask);
      delayNanoseconds(shiftOutDly );
      digitalWrite(clockPin, HIGH);
      delayNanoseconds(shiftOutDly );
      digitalWrite(clockPin, LOW);
      delayNanoseconds(shiftOutDly );
    }
  else
    for (mask = 0x01; mask; mask <<= 1) {
      digitalWrite(dataPin, value & mask);
      digitalWrite(clockPin, HIGH);
      digitalWrite(clockPin, LOW);
    }
}

void shiftOut_msbFirst(uint8_t dataPin, uint8_t clockPin, uint8_t value)
{
  uint8_t mask;
  if (F_CPU_ACTUAL > maxSpeedBeforeDelay)
    for (mask = 0x80; mask; mask >>= 1) {
      digitalWrite(dataPin, value & mask);
      delayNanoseconds(shiftOutDly );
      digitalWrite(clockPin, HIGH);
      delayNanoseconds(shiftOutDly );
      digitalWrite(clockPin, LOW);
      delayNanoseconds(shiftOutDly );
    }
  else
    for (mask = 0x80; mask; mask >>= 1) {
      digitalWrite(dataPin, value & mask);
      digitalWrite(clockPin, HIGH);
      digitalWrite(clockPin, LOW);
    }
}
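
In the MSB-first function, both branches need to start at mask 0x80: a loop that starts at 0x01 and shifts right stops after a single bit. A small host-side check makes that kind of slip easy to catch. The helper `bitsEmitted` below is hypothetical and only mirrors the mask loop, without touching any pins.

```cpp
#include <stdint.h>
#include <vector>

// Returns the data bits a shiftOut-style loop would emit for the given
// start mask and shift direction (hypothetical helper for off-target checks).
std::vector<int> bitsEmitted(uint8_t value, uint8_t startMask, bool shiftRight) {
    std::vector<int> bits;
    for (uint8_t mask = startMask; mask;
         mask = shiftRight ? (uint8_t)(mask >> 1) : (uint8_t)(mask << 1)) {
        bits.push_back((value & mask) ? 1 : 0);
    }
    return bits;
}
```

For the value 0xB1, starting at 0x80 and shifting right yields 1,0,1,1,0,0,0,1, starting at 0x01 and shifting left yields 1,0,0,0,1,1,0,1, and starting at 0x01 while shifting right yields just one bit.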
 
Both F_CPU_ACTUAL and delayNanoseconds are new for T_4 and only available under #ifdef __IMXRT1062__

But that seems reasonable to test with. Though I'm not sure how accurate delayNanoseconds(10) is for such a low value?

I wrote a test delayCycles() on T4, and the minimum execution time wasn't good until about 20 cycles - after that it went up in steps of 4 cycles. That seems to agree with the 10 ns and 3 ns requests below?

Check my math/code but I get:
Code:
100 ns cycles == 71
100 ns time == 118.333333

and:
Code:
10 ns cycles == 18
10 ns time == 30.000000

and:
Code:
3 ns cycles == 19
3 ns time == 31.666667

with:
Code:
#define iWait 10  // requested delay in ns
uint32_t yy = ARM_DWT_CYCCNT;   // cycle counter before the delay
delayNanoseconds(iWait);
yy = ARM_DWT_CYCCNT - yy;       // elapsed CPU cycles, including call overhead
Serial.print(iWait);
Serial.print(" ns cycles == ");
Serial.println(yy);
double xx = yy * 1e9 / (double)F_CPU_ACTUAL;  // cycles -> nanoseconds
Serial.print(iWait);
Serial.print(" ns time == ");
Serial.printf("%f\n", xx);
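
The printed pairs above are consistent with a 600 MHz clock (71 cycles / 600 MHz = 118.33 ns). That clock value is inferred from the numbers, not stated in the post. Under that assumption, the overhead beyond the requested delay can be worked out as in this small sketch; the function names are made up for illustration.

```cpp
#include <stdint.h>

// Converts a cycle count to nanoseconds at the given CPU clock.
double cyclesToNanoseconds(uint32_t cycles, uint32_t cpuHz) {
    return cycles * 1e9 / cpuHz;
}

// Time spent beyond the requested delay (call overhead, loop granularity).
double overheadNanoseconds(uint32_t cycles, uint32_t cpuHz, uint32_t requestedNs) {
    return cyclesToNanoseconds(cycles, cpuHz) - requestedNs;
}
```

At 600 MHz the measurements work out to roughly 18 ns of overhead for the 100 ns request and roughly 20 ns for the 10 ns request, which fits the observation that very short requests bottom out at about 30 ns.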
 
Both F_CPU_ACTUAL and delayNanoseconds are new for T_4 and only available under #ifdef __IMXRT1062__
Sure - it's in the T4 core...
But that seems reasonable to test with. Though not sure how close delayNanoseconds(10) is to accurate for that low value?
I don't know, and it does not matter - 10 is the lowest value that works. The goal is not to get reproducible timing (SPI would be better for that). It should just shift out the data, but not too fast.
As said, we may have to increase this value even more. It depends on the receiving chips.
 
Good it works! I thought you had hardware to test on Teensy, I wasn't worried about ATMEL.

Wasn't sure if this was for generic lib change or for your purposes when #ifdef would be needed.
 
I have hardware to test on Teensy. But I have no Atmel*.
shiftOut is in the core and is part of Arduino compatibility! (digital.c)


*By the way, I really do not care if it works on the Atmel or other Arduinos. But it shouldn't be slower than Atmel - thank you for pointing out 24 MHz...!
 