Teensy 4 - SPI bus > 38mhz possible?

Status
Not open for further replies.

1of9

Member
It looks like the SPI bus tops out at 38 MHz on the T4. Is it possible to go higher? I saw other posts saying that on the T3 the SPI clock is F_BUS / 2. Is it the same on the T4?

edit: It looks like the max SPI clock is F_BUS / 2, so can you increase F_BUS without overclocking the CPU, or is that the only way?
 
It is a little more complicated than that...

The SPI bus speed can be derived from any of 4 different clocks, with an initial divider applied. This part is controlled by the CCM_CBCMR register.
I'm not sure I am looking at the same version of the reference manual that is up on PJRC, but the CCM clock tree is shown at about page 1072.
CCM_CBCMR is described at about page 1110.

The 4 clock selects correspond to these speeds:
Code:
static const uint32_t clk_sel[4] = {664615384,  // PLL3 PFD1
                                    720000000,  // PLL3 PFD0
                                    528000000,  // PLL2
                                    396000000}; // PLL2 PFD2

The clock divider can be any value from 1 to 8.

The actual SPI speed is then computed by finding the divisor of the resulting clock that gives the highest rate not exceeding the value you requested.
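A minimal host-side sketch of that calculation, for anyone who wants to play with the numbers without a board. It assumes the root clock is the selected source divided by (PODF + 1) and that the final SPI divider can range from 2 to 257, as worked out later in this thread. The names `lpspiRootHz` and `pickDivider` are made up for illustration and are not part of the SPI library.

```cpp
#include <cassert>
#include <cstdint>

// Clock sources selectable for LPSPI, in CLK_SEL order (values from the post above).
static const uint32_t kClkSel[4] = {664615384,  // PLL3 PFD1
                                    720000000,  // PLL3 PFD0
                                    528000000,  // PLL2
                                    396000000}; // PLL2 PFD2

// Root clock fed to the LPSPI block: selected source divided by (PODF + 1).
uint32_t lpspiRootHz(unsigned clkSel, unsigned podf) {
  assert(clkSel < 4 && podf < 8);
  return kClkSel[clkSel] / (podf + 1);
}

// Smallest divider in [2, 257] whose output does not exceed requestedHz.
// If even divide-by-257 is too fast, 257 is returned and the bus runs
// faster than requested.
uint32_t pickDivider(uint32_t rootHz, uint32_t requestedHz) {
  if (requestedHz == 0) return 257;
  // ceil(rootHz / requestedHz), done in 64 bits to avoid overflow
  uint64_t d = ((uint64_t)rootHz + requestedHz - 1) / requestedHz;
  if (d < 2) d = 2;
  if (d > 257) d = 257;
  return (uint32_t)d;
}
```

With CLK_SEL 2 and PODF 6 this gives a root clock of about 75.43 MHz, so a 38 MHz request lands on divide-by-2, roughly 37.7 MHz on the wire.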

Currently it looks like we initialize CCM_CBCMR in SPIClass::begin:
Code:
	CCM_CBCMR = (CCM_CBCMR & ~(CCM_CBCMR_LPSPI_PODF_MASK | CCM_CBCMR_LPSPI_CLK_SEL_MASK)) |
		CCM_CBCMR_LPSPI_PODF(6) | CCM_CBCMR_LPSPI_CLK_SEL(2); // pg 714
You are free to update these two fields after this, and SPI.beginTransaction with an SPISettings will compute the speed from the current settings. Note: if you do an SPI.beginTransaction(SPISettings(12000000, MSBFIRST, SPI_MODE0));
then change the clock, and then ask for the same speed again (12 MHz in this case), it may not notice that something changed. So you may want to ask for a different speed first and then ask for the speed you want...
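To make the caveat concrete, here is a small host-side model of the caching behavior described above. It is a hypothetical sketch, not code copied from the SPI library: `CachedSpiClock` and `dividerFor` are invented names, and the 2 to 257 divider range is the one discussed later in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a per-bus cache: the divider is recomputed only
// when the requested speed differs from the previous request.
struct CachedSpiClock {
  uint32_t lastRequestedHz = 0;
  uint32_t divider = 0;

  uint32_t dividerFor(uint32_t rootHz, uint32_t requestedHz) {
    assert(requestedHz > 0);
    if (requestedHz != lastRequestedHz) {
      // ceil(rootHz / requestedHz), clamped to the 2..257 range
      uint64_t d = ((uint64_t)rootHz + requestedHz - 1) / requestedHz;
      if (d < 2) d = 2;
      if (d > 257) d = 257;
      divider = (uint32_t)d;
      lastRequestedHz = requestedHz;
    }
    return divider;  // may be stale if rootHz changed since the last request
  }
};
```

In this model, a 12 MHz request against a 75.43 MHz root clock caches a divider of 7. If the root clock is then raised to 240 MHz and 12 MHz is requested again, the stale divider of 7 would put about 34 MHz on the wire. Asking for a different speed in between forces the recompute and gives a divider of 20.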

FYI - disregard the page number mentioned in the code, as it is probably the page number from the first version of the 1052 manual and has not been updated for the 2nd (or later) edition of the 1062 PDF...

Also, I don't remember why PODF=6 and CLK_SEL=2...
 
1of9 said:
edit: looks like it is f_bus /2 = max spi clock so can you increase the f_bus w/o overclocking the cpu or is that the only way?

If you're concerned about "overclocking", please consider that the electrical specs datasheet (footnote under table 57 on page 67) says 30 MHz is the max SPI speed, regardless of how you internally configure the chip.

SPI probably can run faster (especially if the chip is used only near room temperature), but anything over 30 MHz is considered overclocking.
 
I am trying to trace the code backwards to see what is going on, but if 30 MHz is supposed to be the max SPI speed, then it is "overclockable" out of the box, because you can pass up to 38000000 into
SPI.beginTransaction(SPISettings(38000000, ... and you get a faster clock (as confirmed by timing the code and looking at it on an oscilloscope). If you pass in more than 38M, you don't get anything faster. I don't know whether it is intentional that it currently works this way. The registers that control the SPI clock are kind of cryptic, and I don't really understand what each bit does.
 
Yep - they are sort of cryptic, which is why I pointed you back to the PDF file for where the bits are defined, along with a sort of chart showing how all of the clocks relate to each other...

And I know there are some others who like to go as fast as possible, but I usually don't try to go that high, as things start not working, maybe because my jumper wires are not short enough or ...
 
I know I ran one display (ST7789) in another thread at 48 to 64 MHz and saw faster throughput up to that speed on T4. It was hardcoded lower, probably to the device spec, but going to a supported multiple and trying it is YMMV.
 
I put up a new branch of my SPI fork at: https://github.com/KurtE/SPI/tree/T4_FASTER

With this I changed the clock settings in begin:
Code:
	hardware().clock_gate_register &= ~hardware().clock_gate_mask;

	CCM_CBCMR = (CCM_CBCMR & ~(CCM_CBCMR_LPSPI_PODF_MASK | CCM_CBCMR_LPSPI_CLK_SEL_MASK)) |
		CCM_CBCMR_LPSPI_PODF(2) | CCM_CBCMR_LPSPI_CLK_SEL(1); // pg 714
//		CCM_CBCMR_LPSPI_PODF(6) | CCM_CBCMR_LPSPI_CLK_SEL(2); // pg 714
So it now uses clock select option 1 with PODF 2, instead of option 2 with PODF 6.

Now translated into English:
Before, clock select 2 was choosing PLL2, which is 528 MHz.
Now it chooses option 1, which is PLL3 PFD0, which is 720 MHz.

Then the PODF field sets a divisor for this value (the divisor is PODF plus 1):
So the clock going into LPSPI was 528/7 = 75.43 MHz.
Now the clock is 720/3 = 240 MHz (I thought about passing in 5, which would give 120 MHz...).

The actual SPI clock is then an integer division of that incoming clock, with a minimum divisor of 2 (the value in the register is actually the divisor minus 2).
So before, the max SPI speed was about 37.7 MHz...
With the above, the max is 120 MHz... I have not tried 120... It will probably screw up. I did try 60 MHz and found that there may be an issue with going that fast with how the IO pins were configured...

So I have tried changing the pin configuration to the following (old value commented out):
uint32_t fastio = IOMUXC_PAD_DSE(7) | IOMUXC_PAD_SPEED(2);
//uint32_t fastio = IOMUXC_PAD_DSE(6) | IOMUXC_PAD_SPEED(1);
That is, I think it drives the pins with more drive strength, which I think should work better at higher speeds.

And I have done a couple of tests that appear to work.

Example with our ili9488_t3 library (current stuff with 16 bit frame buffer on T4) and running at the default SPI speed, the speed test gave me times of:
Code:
ILI9488_t3n: (T4) SPI automatically selected

MOSI:11 MISO:12 SCK:13

ILI9488 Test!
Display Power Mode: 0x0
MADCTL Mode: 0x0
Pixel Format: 0x0
Image Format: 0x0
Self Diagnostic: 0x0
Benchmark                Time (microseconds)
Screen fill              615176
Text                     11924
Lines                    151482
Horiz/Vert Lines         50646
Rectangles (outline)     28136
Rectangles (filled)      1486768
Circles (filled)         178811
Circles (outline)        121726
Triangles (outline)      32838
Triangles (filled)       456314
Rounded rects (outline)  54180
Rounded rects (filled)   1618679
Done!

I then used the optional parameter on our begin method to pass in requested SPI speed of 60mhz and the times are now:


Code:
ILI9488_t3n: (T4) SPI automatically selected

MOSI:11 MISO:12 SCK:13

ILI9488 Test!
Display Power Mode: 0x0
MADCTL Mode: 0x0
Pixel Format: 0x0
Image Format: 0x0
Self Diagnostic: 0x0
Benchmark                Time (microseconds)
Screen fill              307987
Text                     7466
Lines                    92690
Horiz/Vert Lines         25840
Rectangles (outline)     14425
Rectangles (filled)      744867
Circles (filled)         99820
Circles (outline)        79335
Triangles (outline)      19801
Triangles (filled)       235967
Rounded rects (outline)  32342
Rounded rects (filled)   817098
Done!

It would be great if others could test these changes out, and if they work OK, I will issue a Pull Request back to @PaulStoffregen...
 
Works with ILI9341 after editing \hardware\teensy\avr\libraries\ILI9341_t3\ILI9341_t3.h:
#define ILI9341_SPICLOCK 60000000

I get these results in graphicstest.ino: first at 30M (before the edit), then at 60M.
{ EDIT: BUMPED to ILI9341_SPICLOCK 80000000 :: DemoSauce ili9341 also runs - as does Buddabrot }


Code:
ILI9341 Test!
Benchmark                Time (microseconds)
Screen fill              205428
Text                     10433
Lines                    69293
Horiz/Vert Lines         17271
Rectangles (outline)     11091
Rectangles (filled)      421862
Circles (filled)         68026
Circles (outline)        58521
Triangles (outline)      16341
Triangles (filled)       145309
Rounded rects (outline)  25290
Rounded rects (filled)   464878
Done!

ILI9341 Test!
// #define ILI9341_SPICLOCK 60000000

Benchmark                Time (microseconds)
Screen fill              102996
Text                     6728
Lines                    43180
Horiz/Vert Lines         8918
Rectangles (outline)     5776
Rectangles (filled)      211888
Circles (filled)         39060
Circles (outline)        39000
Triangles (outline)      9923
Triangles (filled)       76612
Rounded rects (outline)  15668
Rounded rects (filled)   236376
Done!

And with 80 MHz. It FAILS at 120 MHz, and 90 MHz gives the SAME results as 80:
Code:
// #define ILI9341_SPICLOCK 80000000
Benchmark                Time (microseconds)
Screen fill              77492
Text                     5807
Lines                    36751
Horiz/Vert Lines         6839
Rectangles (outline)     4466
Rectangles (filled)      159671
Circles (filled)         31954
Circles (outline)        33850
Triangles (outline)      8406
Triangles (filled)       59498
Rounded rects (outline)  13201
Rounded rects (filled)   179487
Done!
 
And BONUS TEST INFO: the prior tests were run with the T4 at 600 MHz.

This change to SPI also works well at lower CPU speeds: the SPI clock adjusts, and the ili9341 GraphicsTest.ino gives a valid screen display at 150 and 24 MHz. Results:
Code:
ILI9341 Test!
Benchmark                Time (microseconds)
Screen fill              153794
Text                     13711
Lines                    80781
Horiz/Vert Lines         12917
Rectangles (outline)     9536
Rectangles (filled)      316809
Circles (filled)         67104
Circles (outline)        77072
Triangles (outline)      17917
Triangles (filled)       115327
Rounded rects (outline)  28989
Rounded rects (filled)   351837
Done!

ILI9341 Test!
Benchmark                Time (microseconds)
Screen fill              582161
Text                     56286
Lines                    332034
Horiz/Vert Lines         49205
Rectangles (outline)     38512
Rectangles (filled)      1200165
Circles (filled)         269343
Circles (outline)        301194
Triangles (outline)      72822
Triangles (filled)       443262
Rounded rects (outline)  113318
Rounded rects (filled)   1336485
Done!
 
@KurtE

Nice @KurtE - guess you are back up to speed. I can't seem to find the lib changes to t3n with the clock in the begin?
 
Downloaded @KurtE's latest SPI changes and running Buddabrot and graphicsTest.ino while I had a scope on pin 13:

Code:
@600Mhz
w/60Mhz clock  ---- scope showed 57.14Mhz
w/40Mhz clock ----- scope showed 40mhz
w/80Mhz clock ----- FAILED

Note: This test was run on a board with a tri-state buffer chip on the MISO line. Tested at 816Mhz CPU and got same results.

Repeated the 80Mhz SPI clock test on a board with no tri-state buffer on MISO:
Code:
80Mhz ---- PASSED: scope 83Mhz
Probes are really rated to go higher:
@90 Mhz
Code:
Benchmark                Time (microseconds)
Screen fill              77323
Text                     5744
Lines                    36367
Horiz/Vert Lines         6837
Rectangles (outline)     4471
Rectangles (filled)      159244
Circles (filled)         31817
Circles (outline)        32787
Triangles (outline)      8354
Triangles (filled)       59390
Rounded rects (outline)  12918
Rounded rects (filled)   179058
Hit key to continue
Done!
Rectangles (filled) FB     19006
Hit key to continue
Done!
Rounded rects (filled) FB   21046
Hit key to continue
Done!

@110Mhz
Code:
Benchmark                Time (microseconds)
Screen fill              77323
Text                     5744
Lines                    36368
Horiz/Vert Lines         6837
Rectangles (outline)     4472
Rectangles (filled)      159246
Circles (filled)         31817
Circles (outline)        32766
Triangles (outline)      8354
Triangles (filled)       59390
Rounded rects (outline)  12921
Rounded rects (filled)   179055
Hit key to continue
Done!
Rectangles (filled) FB     19024
Hit key to continue
Done!
Rounded rects (filled) FB   21051
Hit key to continue
Done!
 
90 and 110 MHz times are the same - what were they at 80 MHz?
 
Hi @mjs513 @defragster @... Was out for the evening... Will play more tomorrow.

I would suspect that with the current setting of 240 MHz as the raw clock (after the initial divisor), and with the clock being divided by a register value plus 2,

the valid speeds are 120 MHz, 80 MHz, 60 MHz, 48, 40, 34.29, 30 MHz, 26.67, 24 MHz, ... I believe the slowest is about 934 kHz.

The question will be: is this slow enough? The currently released clock setup gets us down to maybe 293 kHz.
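For anyone who wants to check that list, here is a short host-side sketch that enumerates the achievable speeds for a given root clock. It assumes, as stated above, that the divider is the register value plus 2, so it runs from 2 to 257. `spiHzFor` and `fastestSpeeds` are illustrative names only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// SPI clock for a given root clock and divider. The divider is the
// register value plus 2, so it ranges from 2 to 257.
uint32_t spiHzFor(uint32_t rootHz, uint32_t divider) {
  assert(divider >= 2 && divider <= 257);
  return rootHz / divider;
}

// The `count` fastest achievable SPI clocks for a root clock, fastest first.
std::vector<uint32_t> fastestSpeeds(uint32_t rootHz, unsigned count) {
  std::vector<uint32_t> speeds;
  for (uint32_t d = 2; d <= 257 && speeds.size() < count; ++d) {
    speeds.push_back(spiHzFor(rootHz, d));
  }
  return speeds;
}
```

Calling `fastestSpeeds(240000000, 9)` reproduces the sequence from 120 MHz down to 24 MHz, and `spiHzFor(240000000, 257)` gives the roughly 934 kHz floor. The same call with the released 75.43 MHz root clock gives about 293 kHz.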

Of course we could make this more configurable, that if the user asks for something slower than we can give, we could reconfigure the clock. Would maybe need to then hack it up that any other SPI busses that had already calculated SPI register value for speed would recompute on the next SPIx.beginTransaction...
 
Cool. My ili9341 display went white with 120 MHz - so the code did it in some fashion but the display could not - and as noted other speeds over 80 indeed seemed to have just factored out to the same.

Having seen PJRC notes in code - I wasn't sure the same PLL clocks would be running at 24 MHz - but they seemed to be as tested.
 
KurtE said:
That the valid speeds are (120mhz, 80mhz, 60mhz, 48, 40, 34.29, 30mhz, 26.67, 24mhz, ..., I believe the slowest is about 934K.
Now that makes sense - was seeing 83Mhz at 90 and 110Mhz as well.
 
Good morning (I think it is looking promising). I guess the real question is: should we allow it to go this high at the cost of not being able to go lower?

That is, I set the initial divide to give us a 240 MHz clock coming in, sort of guessing 120 would not likely work, but hoping 80 would, as well as 60. It also gives us lots of nice even speeds: 60, 40, 30, 20, ... (with some in between).

Or does it make sense to restrict it lower and change the PODF from 2 to maybe 5?
Code:
CCM_CBCMR = (CCM_CBCMR & ~(CCM_CBCMR_LPSPI_PODF_MASK | CCM_CBCMR_LPSPI_CLK_SEL_MASK)) |
        CCM_CBCMR_LPSPI_PODF(5) | CCM_CBCMR_LPSPI_CLK_SEL(1);
which would give us 120 MHz going into the SPI system. So the possible speeds would be:
120 MHz/(n+2), so max 60 MHz, then 40, 30, ... down to 120/257, or 467 kHz.

But of course the whole code could get a little smarter: in beginTransaction, if the requested speed is different from the last, it computes the new value. If the new value is not low enough, we could update the PODF to a higher value and/or choose a different clock.

This could again impact other SPI ports if they have active SPI transactions. For ones that were initialized earlier, the code could clear the last speed requested from all SPI objects, such that their next call to beginTransaction would recompute...

The question is do we need to go there?
 
Morning @KurtE - @defragster.

To be honest, I don't have many devices that I use SPI on except for the different displays that we have been playing with. Right now I know that 80 MHz will work on an ILI9341 without a tri-state buffer chip; with the buffer chip it seems to be restricted to 60 MHz. To me this means that if I want to use an ILI9488 with touch, the max SPI would be 60 MHz. We also know that the RA8875 won't support those high clock speeds. I don't know about the ST7735/ST7789. Also remember that the XPT lib is set at 20 MHz, so that has to be supported. I don't remember seeing anything below 1 MHz for SPI, though that doesn't mean it's not possible.

What does my rant mean - probably best to use the 120Mhz. I am concerned about changing clocks mid application between beginTransactions - guess I don't really understand the impact to the specific application
 
Me neither on all possible issues, but I do think we can handle most...

That is, suppose you initialize a display at 60 MHz with the current stuff in my branch,
with settings: CCM_CBCMR_LPSPI_PODF(2) | CCM_CBCMR_LPSPI_CLK_SEL(1)

The code would compute the CCR register's SCKDIV from 240/(N+2) = 60, so N=2.

Now suppose on the same or a different SPI bus, some code wants to talk to a device at 250 kHz (I think the old AVR could go as low as 125 kHz).
The code would try to compute N: 960-2 = 958. But that won't fit in 8 bits... So we will fail and end up giving them, I believe, 934 kHz...

Now we could look at updating the PODF field, which has the range 0-7 (divide by 1-8). With this clock source, the slowest we could give is:
720000000/8 = 90 MHz, which with the maximum CCR divider (SCKDIV of 255, divide by 257) gives 350 kHz. That is closer but still higher...

We could look at changing to PLL2 at 528 MHz -> 66 MHz / 257 = 257 kHz, which is pretty close...

And if we really wanted to go that low, we could switch to PLL2 PFD2 at 396 MHz -> 49.5 MHz, where 250 kHz would have a CCR setting of 196, which would make this one work.

However, the expense of doing this is that the next time beginTransaction asks for 60 MHz, your display will now get a max speed of 24.75 MHz. Again, it should still work, but not as fast...
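The trade-off above can be explored with a small host-side search over all 4 clock sources and 8 PODF values. This is only a sketch under the assumptions already used in this thread (root clock = source / (PODF + 1), SPI divider from 2 to 257). `LpspiClockConfig` and `bestConfigFor` are hypothetical names, not anything in the SPI library. It picks the fastest root clock that can still reach a given slowest speed.

```cpp
#include <cassert>
#include <cstdint>

struct LpspiClockConfig {
  bool found;       // false if no combination is slow enough
  unsigned clkSel;  // 0..3, index into kSources
  unsigned podf;    // 0..7, divides the source by podf + 1
  uint32_t rootHz;  // clock fed to the LPSPI block
};

// Fastest root clock whose slowest SPI speed (root / 257) does not exceed
// slowestNeededHz. The fastest SPI speed for that root is root / 2.
LpspiClockConfig bestConfigFor(uint32_t slowestNeededHz) {
  static const uint32_t kSources[4] = {664615384,   // PLL3 PFD1
                                       720000000,   // PLL3 PFD0
                                       528000000,   // PLL2
                                       396000000};  // PLL2 PFD2
  LpspiClockConfig best = {false, 0, 0, 0};
  for (unsigned sel = 0; sel < 4; ++sel) {
    for (unsigned podf = 0; podf < 8; ++podf) {
      uint32_t root = kSources[sel] / (podf + 1);
      bool slowEnough = (uint64_t)root <= (uint64_t)slowestNeededHz * 257;
      if (slowEnough && root > best.rootHz) {
        best = {true, sel, podf, root};
      }
    }
  }
  return best;
}
```

Under these assumptions, a 1 MHz floor selects the 240 MHz setup from my branch, and a 300 kHz floor selects the released CLK_SEL 2 / PODF 6 setup. For a 250 kHz floor the search lands on 396 MHz / 7 (about 56.6 MHz, so a max of about 28.3 MHz), which gets just under 250 kHz rather than hitting it exactly the way the 49.5 MHz option does.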

So again the question is do we think this is needed? And again what trade off should we do for Low Speed versus High Speed?
 
I don't know, it just bothers me that once we do the switching we can't go back to the high speed mode: "However the expense of doing this, is that then next time beginTransaction asking for 60Mhz, your display will now get at max speed of 24.75Mhz Again it should still work, but not as fast."

But here is a crazy idea. I doubt that I am going to be mixing and matching high speed and low speed modes, though I'm probably wrong on this one. What if we give the user a choice in one of the constructors to use either HIGH or LOW speed mode? I know this is a departure, but with the T4, and probably future versions, we are in a realm that I don't think needed to be addressed before :)
 
If adding code could allow both via alternate clocks, that seems like a call for FLASHMEM. It would be a shame either to limit the high end (tonton81's T4-to-T4 SPI link at 120 MHz, or an 80 MHz display) or to throw away some old device needed to build or upgrade a system to T4 if under 1 MHz was needed. Though limiting the high end seems worse for the T4's future, with the T4.1 pending.
 
@mjs513 - I agree, just saying we could do it...

As for doing it using constructor, that probably won't work as the user does not create these objects. Also these settings are global. That is they impact the SPI, SPI1 and SPI2 objects on T4... But we could have a static member function, which is used to setup these two values (which clock and divisor) to setup for different options: could be by some form of keyword, (crawling, Slow, Medium, Fast, Blazing) or could pass in range of values you would like working...

Or we could simply document for people: if you don't like the defaults, set this one register and then call beginTransaction... A minor caveat with this: if you do something like call SPI.beginTransaction at 30 MHz, then change the register, and then call beginTransaction again with 30 MHz, it won't update properly, as it sees the same speed as last time and does not do the CCR calculations again... So in this case you need to first call with a different speed and then call with the desired speed to have it recompute...
 
13.7.6 CCM Bus Clock Multiplexer Register (CCM_CBCMR)
NOTE
Any change on the above multiplexer will have to be done
while the module that its clock is affected is not functional and
the respective clock is gated in LPCG. If the change will be
done during operation of the module, then it is not guaranteed
that the modules operation will not be harmed.
i.MX RT1060 Processor Reference Manual, Rev. 1, 12/2018

Not sure if this applies and what "be harmed" means. There seem to me to be some issues with changing the pll and divider on the fly - it affects everything and can have unexpected and undesired consequences to anyone running more than a single bus and/or spi device. I would vote for object instantiation time as when you choose these values - then they are locked for the object lifetime.
 