Driving 20 OLEDs, SPI termination, max output current

cinhcet

Hi,

I want to drive 20 OLED displays (SSD1306 type) from a single SPI port of a Teensy 4.1 (40 displays in total, using the two available SPI ports). The desired SPI speed is 20 MHz.

The initial design had two chains of 10 displays connected to the teensy 4.1. This has caused the SPI.transfer(...) function to hang. The cable length of each chain was 40 cm. Interestingly, with a teensy 3.6 this does not happen.

I then connected all 20 displays in one chain, which eliminated the SPI.transfer(...) hang issue. However, the communication still is not perfect. The two displays that are closest to the teensy don't work properly or not at all.

It seems that I am having termination issues/reflections etc.

I am now considering options:

1) Adding a series resistor in the clock and MOSI lines at the Teensy (source) end. Is 20 MHz possible with a series resistor? Which values should I try? 30R, 100R?

2) Adding two resistors at the end of the chain, one to ground and one to 3.3V. I tried 100R, which improved the situation a lot: it basically works, but there are still occasional glitches. I wonder if a smaller resistor value would improve things even more. I have not tried it because https://www.pjrc.com/store/teensy41.html says "The recommended maximum output current is 4mA.", while https://www.pjrc.com/teensy/techspecs.html says 10 mA. Which one is correct? With the 100R resistors, I am already beyond 10 mA.
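To put numbers on option 2, here is a minimal sketch of the DC load a split (pull-up plus pull-down) termination places on the pin. It assumes an ideal driver that reaches both rails; a real pin will droop, so the actual current is somewhat lower. The struct and function names are made up for illustration.

```cpp
#include <cassert>

// Split (Thevenin) termination: one resistor to VCC, one to GND at the far end.
struct SplitTermination {
    double vcc;     // supply voltage in volts
    double rToVcc;  // pull-up resistor in ohms
    double rToGnd;  // pull-down resistor in ohms
};

// Equivalent termination resistance seen by the line
// (the two resistors in parallel).
double theveninResistance(const SplitTermination& t) {
    assert(t.rToVcc > 0.0 && t.rToGnd > 0.0);
    return (t.rToVcc * t.rToGnd) / (t.rToVcc + t.rToGnd);
}

// DC current in mA an ideal driver must source while holding the line at VCC.
// All of it flows through the pull-down; the pull-up has 0 V across it.
double sourceCurrentHigh_mA(const SplitTermination& t) {
    assert(t.rToGnd > 0.0);
    return 1000.0 * t.vcc / t.rToGnd;
}

// DC current in mA an ideal driver must sink while holding the line at GND.
double sinkCurrentLow_mA(const SplitTermination& t) {
    assert(t.rToVcc > 0.0);
    return 1000.0 * t.vcc / t.rToVcc;
}
```

With `SplitTermination{3.3, 100.0, 100.0}` the line sees 50 ohms, and the ideal-driver current is about 33 mA in either state, well above both the 4 mA and the 10 mA figures quoted above. Smaller resistors would raise it further.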

3) Doing both 1 and 2.

4) Any other ideas?

The SPI lines for the most part go over a nice large ground plane.

Further question: Why is the single chain better than the two, smaller chains?
 
I tried adding 33R and 100R resistors in series close to the Teensy in the clock and MOSI lines. Unfortunately, this made the first half of the chain (counting from the source) inoperative. Does anyone have an idea why?
 
Are you sure the T4.1 pins have enough fan-out for that many loads at high speed? This sounds like a case for a high-current logic buffer, or a distribution tree (T4 drives 4 buffers, each drives 5 loads?)
 
Thanks for your answer!

I am not sure at all, this is why I am asking.

The fact that termination resistors at the end of the chain bring the setup close to working seems to argue against the pins lacking drive strength, but I am not sure. Is there a way of figuring this out?

Which buffer is recommended for high-speed SPI? I also wonder how I could test the setup, as building a buffer on a breadboard might not reflect the same HF characteristics as later on the PCB and it also increases cable length again.
 
A 'scope allows you to check the signal integrity on the bus - needs to be reasonable bandwidth for high speed logic, 100MHz+ really.
This is the sort of problem a logic analyzer will not help with (in fact it may cause more confusion!).

Any 3.3V capable modern CMOS family will be fast enough, HC, LVC, LCX, AHC etc etc. For fanout calculations you need to
consider the input capacitances, cable/trace capacitances and check against the output pin performance (usually only quoted at
one capacitance value alas). With CMOS DC fanout is usually not an issue, but for high speed it can be.
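As a rough illustration of that fanout estimate, the sketch below sums the capacitive load and converts it into an RC-limited rise time. It treats the bus as a single lumped capacitor behind a resistive driver, which only holds while the wiring is electrically short. The function names and any values you plug in (input capacitance per display, wiring capacitance, driver resistance) are assumptions to be replaced with datasheet figures.

```cpp
#include <cassert>

// Total capacitive load on one output: N identical inputs plus wiring.
double totalLoad_pF(int loads, double inputCap_pF, double wiringCap_pF) {
    assert(loads >= 0);
    return loads * inputCap_pF + wiringCap_pF;
}

// 10-90 % rise time of a single-pole RC network, approximately 2.2 * R * C.
// ohms * pF gives picoseconds, hence the division by 1000 for nanoseconds.
double rcRiseTime_ns(double driverOhms, double load_pF) {
    assert(driverOhms >= 0.0 && load_pF >= 0.0);
    return 2.2 * driverOhms * load_pF / 1000.0;
}
```

For example, 20 loads at an assumed 5 pF each plus an assumed 40 pF of wiring give 140 pF; behind an assumed 50 ohm driver that is about 15.4 ns of rise time, a large share of the 25 ns half-period at 20 MHz.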

A buffer can just be a logic gate, though high current line-drivers often perform better.

It is common to have 74HC14s or equivalent for such roles, even though they are inverting, because they have Schmitt-trigger inputs
and are generally handy to have for signal conditioning.
You chain two inverters together for a non-inverting buffer (to prevent timing skew you have to do this the same way for all the signals).

However there are some non-inverting buffers in these families, or you can use AND or OR gates, or XOR/XNOR gates can be used with
one input fixed (configurable as either inverting or not). The classic bus driver chips are 74xx244 (pinout is messy) and 74xx245 (pinout is
sensible, and it is bidirectional should you need this). However they are 20 pin devices which might be overkill.
 
Thank you!
I do have a scope, but I suspect what I see is more a factor of my probing technique.

I actually had a 74HC244 lying around and tested it. Very strange results, with the buffer the displays closest to the teensy work (which they have never before), but exactly the last one in each chain stopped working. I honestly think that building such a circuit on a breadboard, which increases the total line length by at least 20 cm, is of little value.

After some trial and error, I finally found a working setup. Each SPI port of the teensy now drives two chains of 10 displays each, which is close to my initial setup. However, now the clock and MOSI line of BOTH subchains each have 33R resistors in series close to the teensy. No idea why this works and all the other things I tried not...

I can drive 40 SSD1306 displays, i.e. 5120 x 64 pixels, using the 2 SPI ports of the teensy 4.1 with around 33 FPS (while still polling 20 i2c devices for encoder inputs and LED outputs). This is amazing. With the speed of the 4.1, communication really becomes the bottleneck. One might be able to achieve an even higher framerate with async SPI transfers.
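To see how close 33 FPS is to the bus limit, here is a back-of-the-envelope bound. It assumes 128 x 64 monochrome panels (5120 / 40 = 128 columns each), counts only pixel data (no command bytes or chip-select gaps), and assumes the two ports are served one after the other by blocking transfers. The function name is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on full-frame updates per second when every pixel bit of every
// display is clocked out serially at sckHz, one display after another.
double maxFramesPerSecond(std::uint32_t displays,
                          std::uint32_t bitsPerDisplay,
                          double sckHz) {
    assert(displays > 0 && bitsPerDisplay > 0 && sckHz > 0.0);
    const double bitsPerFrame =
        static_cast<double>(displays) * static_cast<double>(bitsPerDisplay);
    return sckHz / bitsPerFrame;
}
```

`maxFramesPerSecond(40, 128 * 64, 20e6)` comes out at roughly 61 FPS, so under these assumptions about half of the frame time goes to something other than clocking bits. If both ports could transfer concurrently (for example with asynchronous transfers), the bound would double.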

Thanks again!
 
Do these displays really support 20 MHz? Just because it works doesn’t mean it’s not overclocking. Might be worthwhile to look up the actual display specs.

Also check the buffer chip's delay spec. If you use a buffer, that extra delay needs to be taken into account for the max clock speed.
 
Ahhh, you are right. The datasheet gives a minimum clock cycle time of 100 ns, i.e. a maximum of 10 MHz.

However, I tried 400 kHz and 10 MHz with the initial setup, i.e. without the series resistors, and this did not work either. Isn't it usually the rise time of the output, rather than the actual clock speed, that causes reflections etc.?
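A common rule of thumb supports that intuition: a line starts to behave like a transmission line once the round trip along it takes longer than the signal's edge, whatever the clock frequency. The sketch below encodes that check. The propagation delay per metre and the rise time are assumptions you would have to supply, and stricter variants of the rule exist.

```cpp
#include <cassert>

// One-way propagation delay of a line in nanoseconds.
double oneWayDelay_ns(double length_m, double delay_ns_per_m) {
    assert(length_m >= 0.0 && delay_ns_per_m >= 0.0);
    return length_m * delay_ns_per_m;
}

// Rule of thumb: reflections matter once the round trip takes longer
// than the edge itself. The clock frequency does not enter the check.
bool reflectionsLikely(double length_m, double delay_ns_per_m,
                       double riseTime_ns) {
    assert(riseTime_ns > 0.0);
    return 2.0 * oneWayDelay_ns(length_m, delay_ns_per_m) > riseTime_ns;
}
```

With the 40 cm chain, an assumed 5 ns/m and an assumed 2 ns edge, the round trip is 4 ns and the check returns true, at 400 kHz just as at 20 MHz. With a slow 20 ns edge it would return false.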

With 10 MHz I get around 21 FPS. The displays seem to work totally fine even at 25 MHz. At 30 MHz, a few displays stop working. If it works at 20 MHz, is there any disadvantage to overclocking?

Here is an image of the setup. The small glitches visible in it are due to the display update itself and cannot be seen with the human eye. There are 40 OLEDs, 40 rotary encoders with buttons, and RGB LEDs, all driven by a Teensy 4.1 and 20 ATmega328P (controlled by the 4.1 via I2C).
panel.jpg
 
> Thank you!
> I do have a scope, but I suspect what I see is more a factor of my probing technique.
>
> I actually had a 74HC244 lying around and tested it. Very strange results, with the buffer the displays closest to the teensy work (which they have never before), but exactly the last one in each chain stopped working. I honestly think that building such a circuit on a breadboard, which increases the total line length by at least 20 cm, is of little value.

20MHz is just about doable on a breadboard, but you must use good layout - short direct cut-to-length wires, form a ground-plane
(well, ground-grid) using regularly spaced ground links. Decoupling capacitors for all logic chips as close as possible, shortest
leads possible. I've managed to get a 40MHz clocked SDRAM circuit running on a breadboard, but only just; it was right at the limit.

> After some trial and error, I finally found a working setup. Each SPI port of the teensy now drives two chains of 10 displays each, which is close to my initial setup. However, now the clock and MOSI line of BOTH subchains each have 33R resistors in series close to the teensy. No idea why this works and all the other things I tried not...

Low value resistors damp ringing (turn stray LC into LCR circuit) and reduce reflection (lower the reflection coefficient at the
impedance discontinuity at the end of a line), and for a clock signal in particular such artifacts can produce runt pulses
(causing double-clocking).
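The reflection coefficient mentioned here is the standard (Z_load - Z0) / (Z_load + Z0). A small sketch of how a series resistor changes it at the driver end follows; the driver output resistance and line impedance are not known for this setup, so any numbers plugged in are placeholders, and the function names are made up.

```cpp
#include <cassert>

// Voltage reflection coefficient at a discontinuity where a line of
// characteristic impedance z0 meets an impedance zLoad.
double reflectionCoefficient(double zLoad, double z0) {
    assert(zLoad >= 0.0 && z0 > 0.0);
    return (zLoad - z0) / (zLoad + z0);
}

// Impedance a wave travelling back towards the driver sees:
// the driver's own output resistance plus the added series resistor.
double sourceImpedance(double driverOhms, double seriesOhms) {
    assert(driverOhms >= 0.0 && seriesOhms >= 0.0);
    return driverOhms + seriesOhms;
}
```

For instance, with an assumed 17 ohm driver and a 50 ohm line, the bare driver gives a coefficient of about -0.49, so roughly half of each returning wave is re-reflected. Adding 33R brings the source to 50 ohms and the coefficient to 0, so the wave returning from the unterminated far end is absorbed instead of bouncing again.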

Reducing stray inductance on signal wires helps with ringing too - good layout can make a big difference to stray inductance, and
less inductance pushes the ringing frequency up and Q down, eventually becoming too fast or damped for the high speed logic gate
to register.
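Both effects can be read off the textbook series-RLC formulas, f0 = 1 / (2*pi*sqrt(L*C)) and Q = sqrt(L/C) / R. The sketch below evaluates them; the stray inductance and capacitance of the real wiring are unknown, so any values used with it are purely illustrative.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Resonant (ringing) frequency in MHz of a series LC, L in nH and C in pF.
double ringingFrequency_MHz(double L_nH, double C_pF) {
    assert(L_nH > 0.0 && C_pF > 0.0);
    const double lc = (L_nH * 1e-9) * (C_pF * 1e-12);  // henry * farad
    return 1.0 / (2.0 * kPi * std::sqrt(lc)) / 1e6;
}

// Quality factor of a series RLC circuit: sqrt(L/C) / R.
// A higher Q means the ringing takes longer to die away.
double qualityFactor(double R_ohm, double L_nH, double C_pF) {
    assert(R_ohm > 0.0 && L_nH > 0.0 && C_pF > 0.0);
    return std::sqrt((L_nH * 1e-9) / (C_pF * 1e-12)) / R_ohm;
}
```

With an assumed 10 pF and 33 ohms, cutting the inductance from 100 nH to 25 nH moves the ringing from about 159 MHz to about 318 MHz and halves Q from about 3.0 to about 1.5. Raising R lowers Q as well, which is the damping effect of the series resistor.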

Standard 'scope probes have a big problem with ringing due to the ground lead loop which has a lot of inductance. Specialized 'scope
probing techniques are needed for actually measuring the true ringing of a signal in high-speed logic, which is confounded/obscured
by the probe ringing.
 
Just out of curiosity, for SPI0 try running this after SPI.begin(), using the default SPI pins on the T4.1 master:

Code:
#if defined(__IMXRT1062__)
    IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_01 = IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3) | IOMUXC_PAD_PKE; /* LPSPI4 SDI (MISO) */
    IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_02 = IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3) | IOMUXC_PAD_PKE; /* LPSPI4 SDO (MOSI) */
    IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_03 = IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3) | IOMUXC_PAD_PKE; /* LPSPI4 SCK (CLK) */
    IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_00 = IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3) | IOMUXC_PAD_PKE; /* LPSPI4 PCS0 (CS) */
#endif

When I made a Teensy 4 SPI slave library, it worked only with this DSE setting on the master, while the Teensy 3.x/LC worked as-is.
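One possible reading of why the DSE value matters: as far as I understand the i.MX RT1060 pad control description, the DSE field divides a nominal driver resistance R0 of roughly 150 ohms (at 3.3V) by the field value, so a lower DSE acts much like a built-in series resistor. Treat both the 150 ohm figure and the helper below as assumptions to check against the reference manual.

```cpp
#include <cassert>

// ASSUMPTION: nominal pad driver resistance R0 at 3.3V, taken from memory of
// the i.MX RT1060 reference manual; verify before relying on it.
constexpr double kAssumedR0_ohm = 150.0;

// Approximate driver output resistance for a DSE field value of 1..7
// (0 disables the output driver, so it is rejected here).
double dseOutputOhms(unsigned dse) {
    assert(dse >= 1 && dse <= 7);
    return kAssumedR0_ohm / static_cast<double>(dse);
}
```

Under that assumption `dseOutputOhms(3)` is 50 ohms, while the strongest setting, `dseOutputOhms(7)`, is about 21 ohms, which would produce faster edges and more ringing on a long unterminated line.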
 