Teensy 4.0 First Beta Test

Status
Not open for further replies.
OK - tweaked the timings; they now for the most part match the manual at 400k/1M/3.3MHz:
Code:
@100khz ---- 125khz
@400khz ---- 391khz
@1.0mhz ---- 961khz
@3.0mhz ---- 2.78mhz

Again this is for the MPU9250 so may be a bit different on your setup.
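
To put a number on how far each measured clock is from the requested one, here is a minimal sketch. The helper name `deviationPermille` is made up for illustration and is not part of the Wire library; the kHz figures are the scope readings listed above.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Wire): signed deviation of a measured
// I2C clock from the requested one, in tenths of a percent (truncated).
constexpr int32_t deviationPermille(int32_t requestedKhz, int32_t measuredKhz) {
  return (measuredKhz - requestedKhz) * 1000 / requestedKhz;
}

// Scope readings from the MPU9250 setup above.
static_assert(deviationPermille(100, 125) == 250, "100 kHz request ran 25% fast");
static_assert(deviationPermille(400, 391) == -22, "400 kHz request ran about 2.2% slow");
static_assert(deviationPermille(1000, 961) == -39, "1 MHz request ran about 3.9% slow");
static_assert(deviationPermille(3000, 2780) == -73, "3 MHz request ran about 7.3% slow");
```

Only the 100 kHz setting runs fast; the other three run a few percent under the requested rate.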

View attachment 17967

Just in case you want to give it a try.

Looks and works fine - THANKS @mjs513 !!!

I see this with T4 at 600 MHZ … temp is 48°C::

loopTime=26604964 :: @100khz ---- 125khz >> Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET, 100000, 100000);
loopTime= 8920821 :: default 400 >> Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET);
loopTime= 3959817 :: @1.0mhz ---- 961khz >> Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET, 1000000, 1000000);
loopTime= 1632100 :: @3.0mhz ---- 2.78mhz >> Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET, 3000000, 3000000);
loopTime= 1632100 :: same as 3MHZ - compiles okay >> Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET, 4000000, 4000000);

Code w/mod to run this on SSD1306 when ssd1306_128x64_i2c.ino works :

<edit> Add - runs SLOWLY fine at 100 Khz
 
@defragster

Thanks for testing - glad it worked :) Oh, by the way: the way the clocks are set up, you will only get 100 kHz, 400 kHz, 1 MHz and 3 MHz settings. Anything above 3 MHz defaults down to 3 MHz, so it didn't surprise me that 3 and 4 MHz were the same.
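
A minimal sketch of that behaviour, for anyone wondering what a given request ends up as. Only the "anything above 3 MHz gives 3 MHz" part is stated here; the rounding-down of in-between values and the function name `effectiveI2cClock` are assumptions for illustration, not the actual code in the Wire library.

```cpp
#include <cstdint>

// Assumption: a request is rounded DOWN to the nearest of the four supported
// rates, and anything below 400 kHz falls back to 100 kHz. Only the clamp at
// 3 MHz is confirmed in the post; the other thresholds are a guess.
constexpr uint32_t effectiveI2cClock(uint32_t requestedHz) {
  return requestedHz >= 3000000 ? 3000000
       : requestedHz >= 1000000 ? 1000000
       : requestedHz >=  400000 ?  400000
       :                           100000;
}

// A 4 MHz request lands on the same setting as a 3 MHz request.
static_assert(effectiveI2cClock(4000000) == effectiveI2cClock(3000000), "clamped to 3 MHz");
```

This matches the identical loopTime seen for the 3000000 and 4000000 constructor arguments.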
 
Glad to test - more glad you did the "ARGH!" work making the RefMan make sense in code :)

As expected. I copied the code without looking - but assumed just those named cases were covered - as before. I had 4 MHz in the code and uploaded it, then noticed that and the 3 MHz max clock, so it was just as easy to note that it worked as to ignore it.
 
@Paul - @other :: Is there a reason nothing higher than 1 MHz is tried in this WIRE file?:: \hardware\teensy\avr\libraries\Wire\WireIMXRT.cpp

Well, yes there's a reason. But that reason is simply that I didn't feel like doing more than the 3 standard speeds when so many other things were needing my attention during the beta test.

I did spend quite a lot of time fiddling with the many parameters while watching the results on my oscilloscope.

With the addition of 2.2k pullups got the following results:

Wow, that's pretty amazing you got 2.3 MHz to work with 2.2K pullups. Normally much lower value resistors would be used for that speed.
 
Oops - should have asked if there was a 'non-obvious' reason :)

Looking at SSD1306 running here almost 2 hours - that seems okay with 2.78 MHz:: DVM gives me the idea the resistors on the board are 4.68 K Ohm.
 
Hi defragster, just following up on older post about flexcan memory

Is there any interest in using unused flexcan controllers for RAM storage? If so, on T3, 48 dwords are accessible for 16 MB space, plus maybe 3-4 registers as 32 bytes read/write
Obviously on a 64MB 1062, effectively gives you 192 DWORDs usage.

Don't know if this is useful for temporary fast RAM accesses vs slower EEPROM usage; obviously the storage is not saved on resets....
Accessing it would be another idea to pursue: RAM.read(x), RAM.write(x)..?
Probably have an increment system and the library would handle the proper locations of the registers

We could also probably do something along the lines of:

RAM.data[0] = 1234;
Serial.println(RAM.data[0]);
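
A rough sketch of what such an interface could look like. Everything here is hypothetical: the class name `ScratchRam`, its members, and the idea of handing it a base pointer are illustration only. On real hardware the pointer would have to target the message-buffer area of an idle CAN controller; here it is just a caller-owned array, so the sketch can be tried anywhere.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: exposes a block of 32-bit words through read(),
// write() and a public 'data' pointer for the RAM.data[0] = 1234 style.
class ScratchRam {
public:
  ScratchRam(volatile uint32_t *base, size_t words) : data(base), words_(words) {}

  size_t size() const { return words_; }

  // Bounds-checked access: out-of-range reads return 0 and out-of-range
  // writes are refused, so a bad index cannot touch neighbouring registers.
  uint32_t read(size_t i) const { return i < words_ ? data[i] : 0; }
  bool write(size_t i, uint32_t value) {
    if (i >= words_) return false;
    data[i] = value;
    return true;
  }

  // Unchecked direct access; the caller must keep the index below size().
  volatile uint32_t *const data;

private:
  size_t words_;
};
```

With a 48-word backing array (the T3 figure mentioned above), `ScratchRam RAM(backing, 48); RAM.data[0] = 1234;` followed by `RAM.read(0)` gives back 1234, and `RAM.write(48, 1)` is refused.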
 
Paul:
in :: \hardware\teensy\avr\libraries\Wire\WireIMXRT.h
I see this which seems out of place for T4?
Code:
extern TwoWire Wire3;

Also on the rear of the T4 card the only place I see "SDA2" is on pin #25. But that text oddly looks to be in the 'GRAY' font, not black … like Pin #24 SCL2

Also a reminder - updated T4 cards need to indicate that SPI1 uses Pins #0 and #1. A recent post lost Serial1 after SPI1.begin() and it wasn't clear why.
 
PaulStoffregen said:
Well, yes there's a reason. But that reason is simply that I didn't feel like doing more than the 3 standard speeds when so many other things were needing my attention during the beta test.

I did spend quite a lot of time fiddling with the many parameters while watching the results on my oscilloscope.
After the last couple of days trying to understand the timings and playing with the parameters I can commiserate with you on what you had to go through.

Wow, that's pretty amazing you got 2.3 MHz to work with 2.2K pullups. Normally much lower value resistors would be used for that speed.
Well, maybe not that amazing: I forgot to mention that the mini-board I am using already has 4.7k pullups on SDA/SCL. Since you mentioned this, is there some sort of rule of thumb for the size of the resistors vs speed?
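
On the rule-of-thumb question, the I2C-bus specification (NXP UM10204) gives two bounds for the pull-up value: an upper bound from the allowed rise time and the bus capacitance, and a lower bound from the current a device can sink. A minimal sketch follows; the 50 pF bus capacitance is an assumed value for a short bus with one device, not something measured in this thread, and the function names are made up for illustration.

```cpp
// Bounds from the I2C-bus specification (NXP UM10204):
//   Rp(max) = tr / (0.8473 * Cb)    rise-time limit
//   Rp(min) = (VDD - VOL) / IOL     sink-current limit
constexpr double pullupMaxOhms(double riseTimeSec, double busCapFarad) {
  return riseTimeSec / (0.8473 * busCapFarad);
}
constexpr double pullupMinOhms(double vdd, double volMax, double iolAmp) {
  return (vdd - volMax) / iolAmp;
}
// Two pull-ups on the same line act in parallel.
constexpr double parallelOhms(double a, double b) { return a * b / (a + b); }

// Maximum rise times per mode, from the specification.
constexpr double kRiseStandard = 1000e-9;  // Standard-mode, 100 kHz
constexpr double kRiseFast     =  300e-9;  // Fast-mode, 400 kHz
constexpr double kRiseFastPlus =  120e-9;  // Fast-mode Plus, 1 MHz

// ASSUMPTION: about 50 pF total bus capacitance.
constexpr double kBusCap = 50e-12;

static_assert(pullupMaxOhms(kRiseFast, kBusCap) > 7000.0, "about 7.1k at 400 kHz");
static_assert(pullupMaxOhms(kRiseFastPlus, kBusCap) < 2900.0, "about 2.8k at 1 MHz");
static_assert(pullupMinOhms(3.3, 0.4, 0.003) > 900.0, "about 970 ohm at 3.3 V, 3 mA");
// 2.2k added next to the 4.7k already on the mini-board:
static_assert(parallelOhms(4700.0, 2200.0) < 1510.0, "about 1.5k effective");
```

Under that capacitance assumption, 4.7k alone is fine at 400 kHz but too high for 1 MHz, while 2.2k in parallel with the on-board 4.7k works out to roughly 1.5k. The sketch covers only the three modes listed; the 3 MHz setting is outside them.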
 
Thx - Was that like forever ago? Indeed I wondered about that misreading/looking at the RefMan memory map. MB == Mailbox not MegaByte :) So if any CAN was active that portion of the area would be reserved.

Would have been cool if it was Battery backed.

I never got an understanding of the 4 DWords the RefMan called battery backed; I found where two seemed to be in use. Though 64 'static' bits could save the day.
 
Yes it would be for inactive CAN controllers.
If a user uses CAN2, he can optionally use CAN1 or CAN3 for non-volatile ram storage which should be pretty fast; the controller will remain in freeze mode so the data won't be meddled with.
 
@defragster - @KurtE

Found my SSD1306. Wouldn't work initially at 3Mhz so I added 4k7 pullups to match what you were seeing with your display, then it started working at 3Mhz. This is what I was reading on the scope:
Code:
@100khz  ---- 126khz
@400khz  ---- 390khz
@1Mhz     ----  909khz
@3Mhz     ----  2.5Mhz (a bit slower than on the MPU-9250)
 
Good morning, looks like great progress. Will try to find some of my I2C devices and play. But it might be a few days as I still am playing with some other things like servos...
 
@Paul - there is a problem with millis() when set_arm_clock jumps across certain values? See Here : T4-set_arm_clock-and-micros()
> When I was testing micros() I did smooth increments DOWN and that works. But jumping from perhaps 600 to 24 fails to get the millis() clock counter adjusted, and 50ms reports as a 1ms change. Perhaps when executed at high F_CPU a wait is needed for a change to take effect - and when the clock is slower the wait naturally happens? There is a sketch {showing working and non-working transitions} and example output there in case you didn't catch it.


@KurtE - @defragster - @Paul

Just put in a PR: https://github.com/PaulStoffregen/Wire/pull/17, for wire library to make the clock changes before I forgot or did something stupid. :)

Cheers

Nice @mjs513:: I only tested with a Single SSD1306 device - but I did that on Wire and Wire1 and it gave good reliable speed increase/range to that display on all 4 available speeds - see p#4326
 
vcore vs overclock vs life

am i putting this in the right place?

just some food for thought... in attempting to fit the data in NXP app note AN12170,
to an equation, it seems that when warm, T4 mcu life is on average in inverse
proportion to the 12th power of vdd_soc_in - i'll call it vcore. that given, i
played with trying to lower vcore at several overclock freqs until cpu seemed to
have problems (changed the 2 constants in clockspeed.c that set the slope and upper
clamp). near 600 no big change and at 1000 no big change, but at 816 and 912 vcore
could be set lower to the point where life could be multiplied by about 3 at 816
and 2 at 912 relative to the default schedule, ie the default was 1.443 v but 1.327
seemed ok at 816 and the default was 1.529 v but 1.445 seemed ok at 912.

default :................600=1.250, 816=1.443, 912=1.529, 1008=1.575 (one slope)

worked on one unit: 600=1.250, 816=1.327, 912=1.445, 1008=1.553 (two slope, one
from 600 to 816 shallower than default, one from 816 to 1008 steeper than default)

the testing i did was NOT comprehensive and only encompassed 1 unit. maybe others
have already looked at this.

i am sure that only a VERY SMALL percentage of applications could benefit from
this, and i only bring it up because if you extrapolate the numbers from AN12170,
life can get pretty short in the higher overclock ranges.

what i do not know is if the trend shown in AN12170 between 528 and 600 keeps a
similar slope up near 900 (could be better up there, could be worse).
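
The "about 3" and "about 2" figures can be reproduced from the voltages quoted above. This is a minimal sketch that takes the 12th-power fit described in the post as given (it is the poster's fit to AN12170, not a formula from NXP); the function names are made up for illustration.

```cpp
// Integer power by recursion, so the whole thing can be checked at compile time.
constexpr double ipow(double base, int exp) {
  return exp == 0 ? 1.0 : base * ipow(base, exp - 1);
}

// ASSUMPTION from the post: life is proportional to 1 / vcore^12 when warm,
// so lowering vcore multiplies life by (default / lowered)^12.
constexpr double lifeGain(double defaultVolts, double loweredVolts) {
  return ipow(defaultVolts / loweredVolts, 12);
}

// 816 MHz: 1.443 V default vs 1.327 V lowered, roughly 2.7x.
static_assert(lifeGain(1.443, 1.327) > 2.6 && lifeGain(1.443, 1.327) < 2.9, "816 MHz");
// 912 MHz: 1.529 V default vs 1.445 V lowered, roughly 2.0x.
static_assert(lifeGain(1.529, 1.445) > 1.9 && lifeGain(1.529, 1.445) < 2.1, "912 MHz");
```

With an exponent of 12, a drop of only about 8% in vcore is enough for a gain of nearly 3x, which is why small voltage changes matter so much in this model.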
 
@Paul
Pulled Github\CORES for new USB code. Got fresh copy of github.com/PaulStoffregen/USB-Serial-Print-Speed-Test

Not sure if this helps, other than yes: it can RUN faster USB to PC. It looks to have at least DOUBLED the prior test here. If the T4 @600 MHz is all consumed with printing it can overwhelm the PC, but T4_USB can also get STALLED in the process until the PC's USB stack is tickled: 'Deja-Vu'.

It also shows that a delayMicroseconds( 2 ) per loop is enough to have T4_USB to PC persist at a rate on the PC that is a bit faster than before these github code changes.

Windows 10 - Opened IDE 1.8.9 T_SerMon and I see numbers like this, and then it halts any updates - even with AutoScroll on.
This run I pushed Program on a second T4 - thinking I was restarting the T4 under test:
Code:
count=47705587, lines/sec=438034
count=47705588, lines/sec=438034
count=47705589, lines/sec=438034
     // ...
count=47882144, lines/sec=438034
count=47882145, lines/sec=438034
count=47882146, lines/sec=438034
count
After running to a freeze, pushing Program on the 2nd T4 again, I can see numbers like this for a second or so:
Code:
count=20437490, lines/sec=506759
count=20437491, lines/sec=506759
c

So 'Deja-Vu' - if I do anything on USB - add remove Teensy - it picks up again. Even putting another T4 into Bootloader, or just pushing button again when in bootloader.

It is quitting close to the same point each time from a fresh start after upload:
Run #1:
Code:
count=10401760, lines/sec=0
c
Run #2
Code:
count=10397381, lines/sec=0
count=10397382,

I added a 'delayMicroseconds( 1 );' to the loop and it runs this far on a fresh start after Upload:
Code:
count=25685282, lines/sec=377090
count=25685283, l
Run#2:
Code:
count=11002308, lines/sec=378085
count=11002
Run #3
Code:
count=11932522, lines/sec=378034
count=11932523, l

Pushing to delayMicroseconds( 2 ); in loop() seems to get toward a steady state with this lower lines/sec count that is just above what T4 was showing before, when it was ~225K lps on last testing.
These are snippets from an uninterrupted test - so it only 'STALLS' above when there is a 'backup'? - Code is functional T4>PC in steady state
Code:
count=54556500, lines/sec=271626
count=54556501, lines/sec=271626
  // …
count=125881099, lines/sec=270751
count=125881100, lines/sec=270751
  // …
count=328396175, lines/sec=269904
count=328396176, lines/sec=269904
  // …
count=405324087, lines/sec=269784
count=405324088, lines/sec=269784
  // USER Stopped T4 while still running

Another run with delayMicroseconds( 2 ) gave this and stalled::
Code:
count=52981933, lines/sec=271604
and another:
Code:
count=21389847, lines/sec=273002


Misc Notes for ref on T4 loop() cycle rate and ...:

>> With delayMicroseconds( 2 ) in loop() the loop count is MAX 457,216 with only printing the count once per second. Yields Print count of 270K lps.

>> Removing delayMicroseconds( 2 ) from loop() the loop count is MAX 7,519,378 with only printing the count once per second. Short Term Yield of Print count of 440K to 500K lps.

>> Putting a void yield() {} in the sketch does not affect above cases of stalling

>> TaskMan shows JAVA overall using 13 to 21% of CPU, and T_SerMon is under 0.3 to 1.1%, and memory used after this time is 277MB.

>> TyCommander as SerMon is still slow limited 140K to 230K - so it seems to keep running, even with no delayMicroseconds() in loop(), somehow it properly slows the T4's USB stream without Stalls though.
 
Re 'vcore vs overclock vs life': if anywhere, this would be a good place. The IDE warning at/above 912 MHz, "(overclock, cooling req'd)", needs to be heeded. **
The TD_1.48 Voltage adjustment is enough to keep the CPU operating - but may be higher than needed - extra wear and heat.
Neither a complete table nor code was left to generate the new numbers. I got this table to print matching at 912, and reduced but still high at 816 - using three jumps.

Is this what it could look like?::
Code:
Table:	Frequency	Voltage	Voltage2
	600000000	1250	1250
	624000000	1275	1265
	648000000	1300	1280
	672000000	1325	1295
	696000000	1350	1310
	720000000	1375	1325
	744000000	1400	1340
	768000000	1425	1355
	792000000	1450	1370
	816000000	1475	1385
	840000000	1500	1400
	864000000	1525	1415
	888000000	1550	1430
	912000000	1575	1445
	936000000	1575	1544
	960000000	1575	1565
	984000000	1575	1575
	1008000000	1575	1575

Not tested but the logic for the table looks like this where Voltage is the current calculation and Voltage2 is the modification toward the suggested numbers:
Code:
#define OVERCLOCK_STEPSIZE  24000000
#define OVERCLOCK_MAX_VOLT  1575

// Prints the current (Voltage) and modified (Voltage2) core voltage in mV
// for each overclock step from 600 MHz to 1008 MHz.
void setup() {
  Serial.begin(115200);
  while (!Serial) ; // wait for the serial monitor
  Serial.printf( "Table:\tFrequency\tVoltage\tVoltage2\n" );
  for ( uint32_t frequency = 600000000; frequency <= 1008000000; frequency += OVERCLOCK_STEPSIZE ) {
    // compute required voltage
    uint32_t voltage = 1150; // default = 1.15V
    uint32_t voltage2 = 1150; // default = 1.15V
    if (frequency > 528000000) {
      voltage = 1250; // 1.25V
      voltage2 = 1250; // 1.25V
      uint32_t steps = (frequency - 600000000) / OVERCLOCK_STEPSIZE;
      voltage += steps * 25; // current calculation: 25 mV per step
      // modified calculation in three ranges; the first two currently
      // share the same 15 mV per step slope
      if (frequency < 900000000)
        voltage2 += steps * 15;
      else if (frequency < 920000000)
        voltage2 += steps * 15;
      else
        voltage2 += steps * 21;
      if (voltage > OVERCLOCK_MAX_VOLT) voltage = OVERCLOCK_MAX_VOLT;
      if (voltage2 > OVERCLOCK_MAX_VOLT) voltage2 = OVERCLOCK_MAX_VOLT;
    }
    Serial.printf( "\t%u\t%u\t%u\n", frequency, voltage, voltage2 );
  }
}

void loop() {}


It would need testing to make sure it provides function. I know before the current rescaling, of the two PJRC posted benchmarks one ran at 800 and the other ran at 700 but not 800 MHz.

** I did some bad welding inside a 1062 while starting a new sketch and looking into i2c speed in Arduino and PJRC code for some time without a heatsink yet added - the IDE was left set for 916 or above for another T4 w/heatsink test in the process.
 
your voltages look very much like what worked here - my final ver tested like this:

720 75 mv lower than std, 2% reduction in current, about 2.0:1 longer life
816 100 mv lower than std, 5% reduction in current, about 2.4:1 longer life
912 75 mv lower than std, 11% reduction in current, about 1.8:1 longer life

of course the longer life is just calculated from the voltage and is of unknown accuracy

here is the code change i used in clockspeed.c for the testing - also very similar
to yours.

Code:
.............................
	// compute required voltage
	uint32_t voltage = 1150; // default = 1.15V
	if (frequency > 528000000) {
		voltage = 1250; // 1.25V
#if defined(OVERCLOCK_STEPSIZE) && defined(OVERCLOCK_MAX_VOLT)
		if (frequency >  600000000) {
			voltage += ((frequency - 600000000) / 70000000) * 25;
			if (voltage > OVERCLOCK_MAX_VOLT) voltage = OVERCLOCK_MAX_VOLT;
		}
		if (frequency >= 816000000) {
			voltage += ((frequency - 816000000) / 20000000) * 25;
			if (voltage > OVERCLOCK_MAX_VOLT) voltage = OVERCLOCK_MAX_VOLT;
		}
#endif
	} else if (frequency <= 24000000) {
		voltage = 950; // 0.95
	}
....................
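
To sanity-check this schedule without flashing a board, the same arithmetic can be lifted into a small host-side function. This is only a sketch: the function names are made up, and the 1575 mV clamp is assumed to be the same OVERCLOCK_MAX_VOLT value used in the table code earlier in the thread.

```cpp
#include <cstdint>

// ASSUMPTION: same 1575 mV clamp as OVERCLOCK_MAX_VOLT earlier in the thread.
constexpr uint32_t kMaxMillivolts = 1575;

constexpr uint32_t clampMv(uint32_t mv) {
  return mv > kMaxMillivolts ? kMaxMillivolts : mv;
}

// First slope: 25 mV per full 70 MHz above 600 MHz.
constexpr uint32_t firstSlopeMv(uint32_t hz) {
  return hz > 600000000u ? clampMv(1250u + ((hz - 600000000u) / 70000000u) * 25u) : 1250u;
}

// Second slope, added on top: 25 mV per full 20 MHz from 816 MHz upward.
constexpr uint32_t secondSlopeMv(uint32_t hz, uint32_t mv) {
  return hz >= 816000000u ? clampMv(mv + ((hz - 816000000u) / 20000000u) * 25u) : mv;
}

// Mirrors the modified calculation posted above, returning millivolts.
constexpr uint32_t twoSlopeMillivolts(uint32_t hz) {
  return hz > 528000000u ? secondSlopeMv(hz, firstSlopeMv(hz))
       : hz <= 24000000u ? 950u
       :                   1150u;
}

static_assert(twoSlopeMillivolts(600000000u) == 1250, "600 MHz");
static_assert(twoSlopeMillivolts(816000000u) == 1325, "816 MHz");
static_assert(twoSlopeMillivolts(912000000u) == 1450, "912 MHz");
static_assert(twoSlopeMillivolts(1008000000u) == 1575, "1008 MHz, clamped");
```

The 1325 mV and 1450 mV targets at 816 and 912 MHz line up with the 1.327 V and 1.445 V reported earlier for the unit that was tested.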
 
@Paul - I put the current Github/CORES on this system, and building the 'USB Host Ethernet Driver' sample that uses TeensyThreads breaks with this.

Was there a rename of something?

Code:
T:\tCode\libraries\TeensyThreads\TeensyThreads.cpp: In function 'bool gtp1_init(unsigned int)':

T:\tCode\libraries\TeensyThreads\TeensyThreads.cpp:178:45: error: 'CCM_CCGR1_GPT' was not declared in this scope
       CCM_CCGR1 |= CCM_CCGR1_GPT(CCM_CCGR_ON) ;  // enable GPT1 module
                                             ^

T:\tCode\libraries\TeensyThreads\TeensyThreads.cpp: At global scope:

Also there was an update to TeensyThreads where T4 did a DOUBLE INTERRUPT - not sure if that got on list for TD 1.49 updates?

I put the github/cores on my IDE 1.8.9 - going to TD 1.48 on IDE 1.8.10 it builds and runs as it did before
 
Okay, that is what I saw on my machine ... didn't hit the source ... will see now what it changed to.

Will need a change in @ftrias's TeensyThreads
 