SPI data via DMA corrupt

JustAUser

Member
For my current project I want to send out data via the SPI bus using DMA. In my original tests on a prototyping board everything seemed fine. Now I switched to a PCB and somehow everything started breaking.
I attached the full code to the post. Note that I'm using PlatformIO to compile and upload the code.

So in my current code I do 2 steps. The 1st is filling a buffer with the data
Code:
            if(bitPosition % 2 == 0)
                memset(buffer, 0, sizeof(buffer));
            else
                memset(buffer, 0xFF, sizeof(buffer));

The buffer is used as a function argument and copied into the output buffer (in reverse)

Code:
    DMAMEM uint8_t outputBuffer[50];

[...]

    for (int i = bufferSize - 1; i >= 0; i--)
    {
        outputBuffer[maxSize - 1 - i] = buffer[i];
    }
    
    LOG_TRACE("Size", bufferSize, "data", LOG_AS_ARR(outputBuffer, bufferSize));
    
    SPI1.transfer(outputBuffer, NULL, maxSize, _spiDoneEventResponder);
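
For anyone following along, here is a minimal host-side sketch of where that loop puts the bytes. It assumes `bufferSize` can be smaller than `maxSize` (the elided part of the code may guarantee otherwise). `kMaxSize` and `reverseIntoTail` are names made up for illustration, and zeroing the buffer first is an addition, not something the original code is shown doing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 50-byte output buffer in the post.
constexpr std::size_t kMaxSize = 50;

// Copies `src` into the tail of `dst` in reverse byte order, using the same
// index arithmetic as the loop in the post (dst[max - 1 - i] = src[i]).
// The memset is an addition so the untouched head of `dst` is defined.
void reverseIntoTail(uint8_t (&dst)[kMaxSize], const uint8_t *src, std::size_t srcSize)
{
    assert(srcSize <= kMaxSize);
    std::memset(dst, 0, kMaxSize);
    for (std::size_t i = 0; i < srcSize; i++)
    {
        dst[kMaxSize - 1 - i] = src[i];
    }
}
```

With a 3-byte input the data lands in the last three bytes of the buffer. If `bufferSize` really is smaller than `maxSize`, a log of the first `bufferSize` bytes shows the head of the buffer rather than the bytes the loop wrote, while the transfer itself sends all `maxSize` bytes.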

But if I look at my scope I get the following output.
1772799666678.png


If you look at the actual code, there is some more stuff, but that is mostly to control the timing or to help me isolate the issue.

Any suggestions are appreciated.
 

Attachments

  • Project.zip
    3.3 KB
Since things changed when you changed the hardware it would be useful if you can post your schematic and pictures of your current hardware. Also give any circuit changes between the prototype and the PCB setup.
 
Since it's work related I have to be a bit careful here. But there haven't been any changes in the pinout. Only hardware was added.
Now that I read my message back I also realize that I forgot to add that I'm using a Teensy 4.1 🤦‍♂️.

Edit: The only new thing is the OE pin (Output Enable). But that is just set to high.

Below is the pinout I am using (partially obscured). I did cut some stuff from the picture but I'm measuring on the SCKEXT and MOSIEXT signals. (I am missing a 3rd channel on my scope right now)
1772800547396.png
 
So, just to check if I'm not crazy I switched back to my old breadboard and the results are interesting. On my PCB version the SPI always outputs 6 bytes with no consistency. On my breadboard it behaves as intended with the same code.
Breadboard:
1772802054782.png


PCB:
1772802247770.png


Did I break the teensy or something?
 
I didn't pay attention to this at first but since I have so many issues, I wanted to rule out everything. When I took a closer look at the PCB I noticed something strange. Somehow the 2 versions are different!?!?
My original on the breadboard: breadboard.jpg
The new one on the PCB: Mounted on PCB.jpg

I get this from an external supplier (built and soldered). Does anyone have any clue what this could be?
My breadboard version has the freescale logo (as does the one on the store https://www.pjrc.com/store/teensy41.html), but the PCB has the NXP logo?!?!

Sorry for all the new posts, I'm just trying to keep everything separated by topic.
 
It's exactly the same chip. They just changed the logo.

NXP acquired Freescale Semiconductor in 2015. But they waited 10 years to finally change the logo on the chips Freescale made, even ones that were released a few years after the merger. The main reason for the delay was that Qualcomm made moves to acquire NXP in late 2016. But by mid-2018, the Qualcomm acquisition fell through. The main obstacle was lack of approval in China. No official reason was ever given (as far as I know) but it was widely viewed in the context of worsening international relations due to new tariffs from Trump's first presidency.

Then on the heels of the Qualcomm deal falling through, the whole world suffered shutdowns and then supply chain disruptions from the Covid-19 pandemic. As the pandemic eased up, NXP's main focus was their (then) new MCX microcontroller product line, which combined stuff from NXP's "LPC" and Freescale's "Kinetis" microcontrollers.

Finally in early 2025 they got around to updating the logo from Freescale to NXP. The change is only the logo printed on the chip. Inside it's the same silicon.
 
Can you state what the actual problem is? Posting a picture of the scope and saying "this is what it's doing" is not a description of an issue.
 
The issue is that there is a clear difference in the SPI signals coming from the Teensy that is mounted on the PCB vs the one I have on the breadboard. Both Teensys are running the same code. (Also, sorry for being unclear. I have spent all day on this issue and I still don't have a clue what is wrong.)

The one that is mounted on the breadboard sends 2 bytes with one bit high and the one that is mounted on the PCB sends out (at least) 6 bytes with one bit high but also some "junk" data. The "breadboard" IC also has a higher clock, 8.25 us for 1 byte vs 10.1 us for 1 byte on the PCB.
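
As a rough sanity check on those numbers, the measured time per byte can be turned into an effective bit rate. This is only arithmetic on the two values quoted above, assuming 8 bits per byte and that each measurement covers exactly one byte (any gap between bytes would be included). `effectiveBitRateHz` is a hypothetical helper name.

```cpp
#include <cassert>

// Effective bit rate in Hz implied by a measured duration of one 8-bit byte.
double effectiveBitRateHz(double microsecondsPerByte)
{
    assert(microsecondsPerByte > 0.0);
    return 8.0 / (microsecondsPerByte * 1e-6);
}
```

8.25 us per byte works out to roughly 970 kHz and 10.1 us per byte to roughly 792 kHz, so the PCB figure is about 18% slower than the breadboard one.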
See post 4 for the screenshots

It's just confusing to be honest.

Ok, good to hear that it's the same chip.

I just can't seem to figure out what is wrong or if I'm missing something.
 
Are you using the exact same code on both the test bench and your PCB assembly?
How long are the wires between the Teensy and the slave device on both setups? Do you have good grounding on your PCB?

I was having some trouble with DMA enabled SPI Master/Slave between two Teensy 4's. I couldn't get it to work right on the final setup, so I moved to DMA UART that works very well for my application
 
Yes, I use the exact same code. In VSCode I just hit "upload" while connected to the test setup and to the PCB (not at the same time, I do swap cables).

Unfortunately I can't use UART since there are shift registers connected to the SPI bus.

The wires are a bit long, but if there is nothing connected to the SPI bus and I'm just using my scope I don't see how that could affect things. But I should be able to try some long wires next week.
 
Can you reproduce the problem using Arduino IDE rather than PlatformIO?

Does any special hardware need to be connected to measure waveforms like msg #4? If all that is needed is an oscilloscope connected to SCK and MOSI, can you share a program I can copy into Arduino IDE and upload to a Teensy 4.1 to recreate the problem?
 
One thing that is different between your setups is the SPI signal routing and wire lengths.

Your signals look clean, so this may not be your issue, but I am wondering if you are using series damping resistors (50-100 ohm) on the SPI CLK and MOSI lines near the Teensy, as they don't show in the schematic snippet. The junk data could possibly be signal reflections?

I have seen many strange SPI behavior problems get fixed when those are added.
 
This was going to be my next question

I've seen this in commercial hardware (Pioneer DJ) where long traces and/or wires are used between MCUs that communicate over SPI
 
Looking at the scope output in the first post: assuming that the blue trace is the SPI clock and the red one is data, I see 0x00 followed by 4 copies of 0x07 clocked out. So this isn't a problem with signal levels. It looks like incorrect data being placed in the SPI transmit buffer.

Figuring that out without a debugger and breakpoints is tricky. There is a helper function to print out various DMA registers so you could start there.
 
As far as I know, no additional hardware is required. So, as requested, an Arduino Sketch. I'm using IDE version 2.3.8 and Teensy version 1.60.0.

C:
#include <Arduino.h>
#include <SPI.h>
#include <atomic>
#define LOOP_DELAY 100
#define CS1_PIN 0
#define MOSI1_PIN 26
#define SCK1_PIN 27
#define OE_PIN 14
elapsedMillis _spiDelay;
EventResponder _spiDoneEventResponder;
uint8_t buffer[2] = {0};
uint16_t bitPosition = 0;
uint8_t state;
std::atomic<bool> _spiDone;
void setup()
{
    pinMode(OE_PIN, OUTPUT);
    digitalWriteFast(OE_PIN, HIGH);
  
    pinMode(CS1_PIN, OUTPUT);
  
    //Start the SPI interface and set all the other pins
    SPI1.begin();
    SPI1.setMISO(-1);
    SPI1.setMOSI(MOSI1_PIN);
    SPI1.setSCK(SCK1_PIN);
    SPI1.beginTransaction(SPISettings(1000000, MSBFIRST, SPI_MODE2));
    //Attach the lambda expression to the event responder
    _spiDoneEventResponder.attachImmediate([](EventResponderRef eventResponder)
    {
      _spiDone = true;
    });
}
void loop()
{
    switch (state)
    {
        case 0:
            if(_spiDelay >= LOOP_DELAY)
            {
                _spiDelay -= LOOP_DELAY;
                state++;
            }
            break;
        case 1:
            //Clear the buffer
            memset(buffer, 0, sizeof(buffer));
            // Set the single bit
            buffer[bitPosition / 8] |= (1 << (bitPosition % 8));
            //Go the next position
            bitPosition = (bitPosition + 1) % (sizeof(buffer) * 8);
            //Transfer the buffer
            SPI1.transfer(buffer, NULL, sizeof(buffer), _spiDoneEventResponder);
            state++;
            break;
        case 2:
            if(_spiDone)
                state++;

            break;
        case 3:
            delayNanoseconds(24);
            digitalWriteFast(CS1_PIN, LOW);
            delayNanoseconds(24);
            digitalWriteFast(CS1_PIN, HIGH);
            state = 0;
            break;
    }
}
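
To make the expected output explicit, here is a host-side sketch of what case 1 puts in the buffer on each pass. It mirrors the arithmetic in the sketch above; `fillWalkingBit` and `kBufferSize` are illustrative names, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kBufferSize = 2; // same size as `buffer` in the sketch

// Clears `buf`, sets the single bit selected by `bitPosition`, and returns
// the next bit position (wrapping after the last bit), like case 1 does.
uint16_t fillWalkingBit(uint8_t (&buf)[kBufferSize], uint16_t bitPosition)
{
    assert(bitPosition < kBufferSize * 8);
    std::memset(buf, 0, sizeof(buf));
    buf[bitPosition / 8] |= static_cast<uint8_t>(1u << (bitPosition % 8));
    return static_cast<uint16_t>((bitPosition + 1) % (kBufferSize * 8));
}
```

So every transfer should be exactly 2 bytes (16 clocks) with a single bit set: positions 0 to 7 land in the first byte and positions 8 to 15 in the second.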

On my breadboard this works as expected and I get the output as in the breadboard in post 4.

In total I have 5 PCBs but I want to keep it to 2 at the moment.

On my 1st and 2nd PCB I get the following outputs.
1773043908540.png
1773043923101.png


The 2nd set of 8 bits isn't always there (below only at trace 16).
1773044043000.png


For completeness: on the PCB there are a few additional components, but those are only for the protection of the Teensy (the OE should get these components as well).

The distance on the PCB from pins 26/27 to the connector is about 12 cm.
1773044309973.png


I hope this helps.

Edit: I also forced in the DMA debugging. On my breadboard the output is the following

Code:
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:8 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:8 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:8 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:8 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:8 BI:2

And on the PCB it's the following
Code:
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:88 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:88 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:88 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:88 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2
20203088 400e9000:SA:20001bd0 SO:1 AT:0 NB:1 SL:0 DA:4039c064 DO: 0 CI:2 DL:0 CS:88 BI:2
20203098 400e9020:SA:4039c074 SO:0 AT:0 NB:1 SL:0 DA:20001c86 DO: 0 CI:2 DL:0 CS:a BI:2

So, on my PCB the MAJORLINKCH seems to be 1, whereas on my breadboard it's off.
Major Loop Link Channel Number
If (MAJORELINK = 0) then:
• No channel-to-channel linking, or chaining, is performed after the major loop counter is exhausted.
Otherwise:
• After the major loop counter is exhausted, the eDMA engine initiates a channel service request at the channel defined by this field by setting that channel's START bit.

Edit: Just for fun I updated the library to clear the MAJORLINKCH and added a buffer for the RX; however, this didn't change the behavior.
 
I agree, the issue isn't (at the moment) with the signal levels. It's the transmission from the Teensy that does weird stuff.
 
Looks to me like signal reflections on your SPI clock line. If the SPI clock signal wire gets longer, the problem starts to bite at some point.
I can see the issue emerging already here in your scope image. If you had a better scope (higher bandwidth) then you would have seen that both the falling and the rising edges will in fact be not just 0->1 but 0->1->0->1, with tens of ns in between the latter two. The Teensy SPI master interprets that as two clocks instead of just one. Therefore the DMA transfer finishes way faster than you'd think.

You may also see that the problem disappears when you have a scope probe on SPI CLK, but comes back when the probe is detached. That's just the capacitive loading that 'fixes' it. But adding capacitance isn't a real fix here obviously.

The SPI clock that the shift registers use is effectively taken from the iMXRT1062 pin itself, not from the internally generated SPI master clock and not from a point before the output buffer that drives that pin. The echo signal from a 'long' wire can become so strong that it 'wins' against the output driver.

Some resistance in series with the SPI CLK 'output' pin as suggested above may help. But possibly not for you now. Elimination of the reflection by impedance matching at the clock receiving SPI part is a better approach.
Also often overlooked is the GND wire: how long it is, and how little is left of the intended solid ground plane that connects the GNDs of the SPI master and slave ICs.

1773069558170.png
 
Seems like an interesting idea. (And you are correct in the fact that I have a low bandwidth scope, 25MHz, Picotech 2205A).

Please correct me if I'm wrong here. But since the Teensy is the master (i.e. it determines the speed of the SPI bus), why would reflections on the bus cause wrong data outputs?
Assuming you are correct that would mean that if I change the following in my code:
C:
//From
SPI1.beginTransaction(SPISettings(1000000, MSBFIRST, SPI_MODE2));

//To
SPI1.beginTransaction(SPISettings(1, MSBFIRST, SPI_MODE2));
This will cause the SPI bus to run at its slowest speed (it probably won't make this speed), but it will be significantly slower and the error should disappear. Correct?

Something I have to try tomorrow.
 
The SPI clock speed in Hz has no impact. What matters are the ramps, so how fast the signal rises and falls. Unfortunately these are very fast transients, and despite the many pin property settings I think there is no way to adjust any of that when the pin is in SPI clock out mode. And in that mode the pin is also an input actually…

The extra problem with this processor chip is that the pin is both the output for the SPI clock AND the input for the SPI in and out shift registers. A pulse that bounces back from the other end of a long wire can make the shift registers see multiple edges, because the output driver isn't strong enough to maintain its high or low value. That is also visible in the scope traces, by the way.
 
I don't think this is a signal problem at all, there's no way that could affect the number of bytes being transmitted. It has to be a software issue.
 
The DMA channels used by the SPI class are dynamically allocated, that means there is a chance of this bug being triggered.

To rule it out, change this line to uint8_t channel = DMA_NUM_CHANNELS;

(The difference you're seeing in the CSR registers is the DONE bit, not the MAJORELINK bit. Somehow the DMA channel is "done" before SPI has its DMA requests enabled... This is a hint that the channel is being used by something else.)
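
For reference, a small host-side decoder for the CS values in the dumps above. The bit masks follow the TCD CSR layout as I understand it from the reference manual and the `DMA_TCD_CSR_*` defines in the Teensy core; treat them as an assumption and check them against the manual. `decodeCsr` and `checkLoggedValues` are made-up helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns the names of the flag bits set in the low byte of a TCD CSR value,
// joined with '|'. Masks are assumed from the eDMA TCD CSR layout.
std::string decodeCsr(uint16_t csr)
{
    struct Flag { uint16_t mask; const char *name; };
    static const Flag flags[] = {
        {0x0080, "DONE"},  {0x0040, "ACTIVE"},  {0x0020, "MAJORELINK"},
        {0x0010, "ESG"},   {0x0008, "DREQ"},    {0x0004, "INTHALF"},
        {0x0002, "INTMAJOR"}, {0x0001, "START"},
    };
    std::string out;
    for (const Flag &f : flags)
    {
        if (csr & f.mask)
        {
            if (!out.empty()) out += '|';
            out += f.name;
        }
    }
    return out.empty() ? std::string("none") : out;
}

// Sanity check against the three CS values that appear in the dumps.
void checkLoggedValues()
{
    assert(decodeCsr(0x08) == "DREQ");          // breadboard, TX channel
    assert(decodeCsr(0x0a) == "DREQ|INTMAJOR"); // RX channel on both boards
    assert(decodeCsr(0x88) == "DONE|DREQ");     // PCB, TX channel
}
```

With these masks, MAJORELINK would show up as 0x20, which is not set in any of the logged values; the only difference between 0x08 and 0x88 is bit 7.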
 
Well, it keeps on getting stranger and stranger. Today I decided to grab a different scope (Rigol DS1042CD, 40MHz) and to also get the CS1 pin.

So what we have:
1 (Yellow): CS1
2 (blue/green): MOSI
D0 (red): CLK

NewFile0.png


Note that the overall pattern is the same (post 15) but the CS1 is acting WEIRD. I would expect it to show up after the SPI transfer, but somehow it's before.
There is one other thing. If I change the SPI call to SPI1.transfer(buffer, sizeof(buffer)); (and remove if(_spiDone)) the code only loops once and crashes after that.

The full code (modified with DEBUG_DMA_TRANSFERS and the suggestion above)
C:
#include <Arduino.h>
#include <SPI.h>
#include <atomic>

#define CS1_PIN 0
#define MOSI1_PIN 26
#define SCK1_PIN 27
#define OE_PIN 14

EventResponder _spiDoneEventResponder;

uint8_t buffer[2] = {0};
uint8_t rxBuffer[2] = {0};
uint16_t bitPosition = 0;
uint8_t state;

std::atomic<bool> _spiDone;

void setup()
{
    Serial.begin(921600);
    while (!Serial && millis() < 4000) {
        // Wait for Serial
    }
  
    Serial.print("Start\r\n");
  
  
    pinMode(OE_PIN, OUTPUT);
    digitalWriteFast(OE_PIN, HIGH);
  
    pinMode(CS1_PIN, OUTPUT);
  
    //Start the SPI interface and set all the other pins
    SPI1.begin();
    SPI1.setMISO(-1);
    SPI1.setMOSI(MOSI1_PIN);
    SPI1.setSCK(SCK1_PIN);
    SPI1.beginTransaction(SPISettings(1000000, MSBFIRST, SPI_MODE2));

    //Attach the lambda expression to the event responder
    _spiDoneEventResponder.attachImmediate([](EventResponderRef eventResponder)
    {
        _spiDone = true;
    });
}

void loop()
{
    switch (state)
    {
        case 0:
            delay(10);
            state++;
            break;

        case 1:
            //Clear the buffer
            memset(buffer, 0, sizeof(buffer));

            // Set the single bit
            buffer[bitPosition / 8] |= (1 << (bitPosition % 8));

            //Go the next position
            bitPosition = (bitPosition + 1) % (sizeof(buffer) * 8);

            //Transfer the buffer 
            SPI1.transfer(buffer, rxBuffer, sizeof(buffer), _spiDoneEventResponder);
            //SPI1.transfer(buffer, sizeof(buffer));

            state++;
            break;

        case 2:
            if(_spiDone)
                state++;

            break;

        case 3:
            delayNanoseconds(500);
            digitalWriteFast(CS1_PIN, LOW);
            delayNanoseconds(500);
            digitalWriteFast(CS1_PIN, HIGH);
            state = 0;
            break;
    }
}
 
🤦‍♂️Facepalm moment. I updated that part to the following. However it doesn't change the output pattern.
C:
        case 2:
            if(_spiDone)
            {
                state++;
                _spiDone = false;
            }

To make sure the correct code was loaded I also did a factory reset. That didn't matter.
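
A slightly tighter way to write that check, as a sketch: `std::atomic<bool>::exchange` reads and clears the flag in one step, so the test and the clear cannot be separated by the callback. `spiDone`, `consumeSpiDone` and `selfCheck` are illustrative names, and this assumes the flag is only ever set from the EventResponder callback.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for the `_spiDone` flag that the EventResponder callback sets.
std::atomic<bool> spiDone{false};

// Returns true exactly once per completed transfer: exchange() returns the
// old value and stores false in a single atomic operation.
bool consumeSpiDone()
{
    return spiDone.exchange(false);
}

// Quick self-check of the intended behaviour.
void selfCheck()
{
    spiDone = true;             // what the callback would do
    assert(consumeSpiDone());   // first poll sees the completion
    assert(!consumeSpiDone());  // second poll does not see it again
}
```

In case 2 this would become `if (consumeSpiDone()) state++;`, which keeps case 3 (the CS1 pulse) from running again on a stale flag.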
 
I connected my scope to pins 0, 26, 27 while running the code from msg #22 with the fix from msg #24.

This is what I'm seeing. I turned on "persistence" so you can get an idea of the flickering nature of the blue trace which changes every update.

file.png


If I do a single scan capture, the blue trace usually has a single high pulse which appears at a different position every update.

file2.png


Not sure what I'm really looking at here, haven't followed this thread closely so far. But hopefully this helps give another view from a known-good oscilloscope.

Here's the hardware on my workbench, so you can see how I tested and where the scope probes connected.

photo.jpg
 