Different SPI Behavior on Teensy 4.0 and 4.1

Status
Not open for further replies.

eeforeveryone

New member
Hello,

I'm running into a wall with a strange SPI glitch.

Teensyduino Version: 1.53
Arduino IDE Version: 1.8.12

Observed Behaviors:
SPI.transfer(); appears to block on Teensy 4.0, as expected.
SPI.transfer(); does not appear to block on Teensy 4.1.
Uncommenting the delays shown below causes erratic behavior on 4.1 (clocking out 32+ bits of data), but the code behaves as expected on Teensy 4.0.
Code has been executed with and without the .begin() call before starting a transaction. It does not change the observed behavior.
I sometimes observe the MISO data received in call #1 being immediately returned on the second call of SPI.transfer.

Code:
SPI.begin();
    digitalWrite(self->ADC_CS, HIGH);
    SPI.beginTransaction(SPISettings(ADCEXT_SPI_CLK, MSBFIRST, SPI_MODE0)); //Fire up SPI interface, defined in adcext.h (14Max)
    digitalWrite(self->ADC_CS, LOW);
    //delayMicroseconds(2);
    upperdata = SPI.transfer((uint8_t) 0x00); //Send 0 while reading a byte
    //delayMicroseconds(10);
    lowerdata = SPI.transfer((uint8_t) 0x00); //Send 0 while reading a byte
    //delayMicroseconds(10);
    digitalWrite(self->ADC_CS, HIGH);
    SPI.endTransaction();
SPI.end();


Scope capture on Teensy 4.0:
TEENSY4_0_capture.png

Scope capture on Teensy 4.1:
TEENSY4_1_capture.png

So... this appears to be a blocking issue with the DMA-offloading of SPI transactions on Teensy 4.1.
I did some searching, but can't seem to find anything relating to SPI Issues on Teensy 4.1.

SPI.transfer16() appears to behave the same.
SPI.transfer(rxbuf*, txbuf*, numbytes) crashes on the second call. I suspect a memory access issue (attempting to write to a now obsolete pointer, because the function did not effectively block).
DMA is used for SPI on both Teensy 4.0 and 4.1, and they use the same microcontroller, which leaves me rather confused.

Thank you in advance to anyone who has come across this behavior before!
 
Interesting. Not sure what is going on here, as it is the same code and the same processor.
The code for simple transfer is:
Code:
	// Write to the SPI bus (MOSI pin) and also receive (MISO pin)
	uint8_t transfer(uint8_t data) {
		// TODO: check for space in fifo?
		port().TDR = data;
		while (1) {
			uint32_t fifo = (port().FSR >> 16) & 0x1F;
			if (fifo > 0) return port().RDR;
		}
		//port().SR = SPI_SR_TCF;
		//port().PUSHR = data;
		//while (!(port().SR & SPI_SR_TCF)) ; // wait
		//return port().POPR;
	}
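
For readers unfamiliar with the status check in the loop above, here is a minimal host-side sketch of the bit-field extraction. It assumes the LPSPI FSR register keeps the receive FIFO word count in bits 20:16 and the transmit count in bits 4:0. The helper names and the register values are made up for illustration and are not part of the SPI library.

```cpp
#include <cstdint>

// Assumed layout of the LPSPI FIFO status register (FSR):
//   bits 4:0   = words waiting in the transmit FIFO
//   bits 20:16 = words waiting in the receive FIFO
constexpr uint32_t txFifoCount(uint32_t fsr) { return fsr & 0x1F; }
constexpr uint32_t rxFifoCount(uint32_t fsr) { return (fsr >> 16) & 0x1F; }

// Compile-time checks with made-up register values.
static_assert(rxFifoCount(0x00000000) == 0, "nothing received yet, keep polling");
static_assert(rxFifoCount(0x00010000) == 1, "one word received, transfer() may return");
static_assert(txFifoCount(0x00010003) == 3, "transmit count sits in the low bits");
```

In other words, transfer() keeps spinning until the receive count becomes non-zero, and only then reads RDR.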

Transfer16 is more or less identical except it switches to and from 16 bit mode:
Code:
	uint16_t transfer16(uint16_t data) {
		uint32_t tcr = port().TCR;
		port().TCR = (tcr & 0xfffff000) | LPSPI_TCR_FRAMESZ(15);  // turn on 16 bit mode 
		port().TDR = data;		// output 16 bit data.
		while ((port().RSR & LPSPI_RSR_RXEMPTY)) ;	// wait while the RSR fifo is empty...
		port().TCR = tcr;	// restore back
		return port().RDR;
	}
Note: the version of code that sends a buffer:
Code:
void SPIClass::transfer(const void * buf, void * retbuf, size_t count)
{

	if (count == 0) return;
    uint8_t *p_write = (uint8_t*)buf;
    uint8_t *p_read = (uint8_t*)retbuf;
    size_t count_read = count;

	// Pass 1 keep it simple and don't try packing 8 bits into 16 yet..
	// Lets clear the reader queue
	port().CR = LPSPI_CR_RRF | LPSPI_CR_MEN;	// clear the queue and make sure still enabled. 

	while (count > 0) {
		// Push out the next byte; 
		port().TDR = p_write? *p_write++ : _transferWriteFill;
		count--; // how many bytes left to output.
		// Make sure queue is not full before pushing next byte out
		do {
			if ((port().RSR & LPSPI_RSR_RXEMPTY) == 0)  {
				uint8_t b = port().RDR;  // Read any pending RX bytes in
				if (p_read) *p_read++ = b; 
				count_read--;
			}
		} while ((port().SR & LPSPI_SR_TDF) == 0) ;

	}

	// now lets wait for all of the read bytes to be returned...
	while (count_read) {
		if ((port().RSR & LPSPI_RSR_RXEMPTY) == 0)  {
			uint8_t b = port().RDR;  // Read any pending RX bytes in
			if (p_read) *p_read++ = b; 
			count_read--;
		}
	}
}

Again, it simply loops, putting bytes out on the transmit FIFO, and waits for that many bytes to come back on the RX FIFO before it returns.

In all of the cases above there is no DMA involved.
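
To make the blocking behaviour concrete, here is a rough host-side model of the polling loop. FakeSpiPort, its members and the three polls per byte are all invented for illustration, and loopback wiring (MOSI tied to MISO) is assumed. It is a sketch of the idea, not the library code, and it ignores the transmit FIFO depth limit of the real hardware.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the SPI port, NOT the real LPSPI registers.
// A byte written to the transmit side appears on the receive side after a
// fixed number of status polls. Loopback (MOSI wired to MISO) is assumed,
// so each received byte equals the byte that was sent.
struct FakeSpiPort {
    std::deque<uint8_t> inFlight;  // bytes still being clocked out
    std::deque<uint8_t> rxFifo;    // bytes that have arrived
    int pollsPerByte = 3;          // made-up transfer time, in polls
    unsigned long totalPolls = 0;  // how often the caller had to poll

    void writeTdr(uint8_t b) {
        if (inFlight.empty()) countdown = pollsPerByte;
        inFlight.push_back(b);
    }
    // Each status poll also advances the simulated clock by one step.
    bool rxEmpty() {
        ++totalPolls;
        if (!inFlight.empty() && --countdown <= 0) {
            rxFifo.push_back(inFlight.front());
            inFlight.pop_front();
            countdown = pollsPerByte;
        }
        return rxFifo.empty();
    }
    uint8_t readRdr() {
        uint8_t b = rxFifo.front();
        rxFifo.pop_front();
        return b;
    }

private:
    int countdown = 0;
};

// Same two-counter idea as the buffer transfer quoted above: queue every
// byte, drain whatever has arrived, then wait for the remaining bytes.
inline void polledTransfer(FakeSpiPort& port, const uint8_t* tx, uint8_t* rx,
                           std::size_t count) {
    std::size_t toRead = count;
    for (std::size_t i = 0; i < count; ++i) {
        port.writeTdr(tx ? tx[i] : 0x00);
        if (!port.rxEmpty()) {
            uint8_t b = port.readRdr();
            if (rx) *rx++ = b;
            --toRead;
        }
    }
    while (toRead > 0) {  // this loop is what makes the call blocking
        if (!port.rxEmpty()) {
            uint8_t b = port.readRdr();
            if (rx) *rx++ = b;
            --toRead;
        }
    }
}
```

Calling polledTransfer() with a four byte buffer returns only once all four bytes are back in rx and nothing is left in flight, which is the behaviour expected from the real function on both boards.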

The only case that uses DMA is if you use the version of transfer with an EventResponder:
Code:
bool SPIClass::transfer(const void *buf, void *retbuf, size_t count, EventResponderRef event_responder)

The only difference in this SPI code between T4 and T4.1 is which IO pins can be used for certain functions, such as SPI1 having two possibilities for MISO and for the hardware CS pin.

So I have no idea what is different between your two setups. Which version of Arduino IDE and Teensyduino are you using?

It would help to see a full example sketch that shows this, including the crash you mentioned.

Kurt
 
As Kurt said - the Code is identical.

It would be good to see a complete program that we can copy and paste into Arduino ("Forum Rule: Always post complete source code & details to reproduce any issue!", above). Without that, Kurt has already said everything, and we can't help more.
 
Fair points, Kurt & Frank. The true source is a mess at the moment, with many files stitched together.

I'll try to clean this up and get it loaded on Git tonight.
I attempted to duplicate this issue out of circuit, no dice.

I think I heard something about clock reflections causing issues with this IC, though I can't find the PJRC forum post at the moment.

The fact this code runs differently in / out of circuit is strange. I'll attempt adding a small series resistor near the clock output of this IC, and see if that fixes the behavior.
 
I'll try to clean this up and get it loaded on Git tonight.

Rather than trying to clean up a large project, maybe try writing just the smallest possible program which can show the problem. If there really is something different between Teensy 4.0 and 4.1 which needs fixing (hard as that is to imagine), a small & simple program to demonstrate the problem will greatly improve the odds that we can reproduce the issue and ultimately fix it.
 
Hello all,

Thank you for the responsiveness and support! At this point, I'm very confident that it is not a software or library issue.

To update this, I did two tests:

1. Run the original code on a Teensy out of circuit.
2. Run the original code on a Teensy in circuit, with a 100 ohm series resistor electrically near the Teensy on the clock pin.

Both of these tests performed as expected (same performance on Teensy 4.0 and 4.1, function blocked as expected), so it appears that clock reflections were causing glitches in the SPI hardware!
 
Hi eeforeveryone.
I am currently using a T4.0 but need two SPI channels. As the T4.0 SPI1 and SPI2 channels are not easily accessible (requiring additional connectors to my main PCB), I'm considering moving over to a T4.1. As it seems that your issues were caused by external hardware, can you give me some advice on the wiring to your T4.1? As KurtE mentioned, one difference with the T4.1 is the two CS1 and MISO1 lines. Did you connect these together on your PCB (i.e. CS1 pin 0 --> CS1 pin 38 and MISO1 pin 1 --> MISO1 pin 39), or did you leave one of each pair floating?
Did you manage to fix whatever was causing the clock reflections?

Finally, does anyone know if the SPI1 or SPI2 pins on the T4.0 can be redefined to ones which are more accessible? I don't think they can, but it's worth asking.
Thanks.
 
Simple answers first:
T4 - There are no alternative pins that are exposed on the boards for the SPI functions.

Warning: I only do boards for my own enjoyment, typically one-offs, and have never sold any, so take this with a grain of salt!

As for getting to the SPI1 pins on T4, and optionally SPI2, I have played around with a couple of different ways. 1) You can use SMT header pins, like you would on a T3.x, to get to pins 24-33, and then depending on your board either headers to plug into or holes for the pins to go through and be soldered. 2) For SPI2/SDIO you could go the ribbon cable/connector route, but I found this problematic for my 10 left thumbs...

I had better luck soldering on an extender board 4236, which solders to the bottom pads and makes the T4 into a T3.6 (now T4.1) form factor. That works pretty well, but now I don't bother and go directly to T4.1.

T4.1 - CS pins: Yes, there are some alternatives, but in most cases, unless you are using a library which does hardcore SPI stuff that uses the hardware CS pins, then these don't matter. Most libraries just need any digital pin to work as their CS pin. MISO1: Yes, there are two possible ones.
When you do something like SPI1.begin(), it will only activate one of them. The default is pin 1, as this was common with T4, so I left it as the default.
But if you prefer pin 39 (I do), then your code that will use SPI1 needs to do something like:
Code:
SPI1.setMISO(39);
SPI1.begin();

Again, only one of these pins will be converted (mode set on the actual pin) to be MISO. The other pin can be used for anything else; SPI won't touch it...
 
Thanks KurtE.
So the duplicate MISO1/CS1 pins are not active at the same time. That's good to know. Hopefully eeforeveryone can enlighten me on the cause of his original problem so I don't stumble into the same trap. Hopefully it was just an issue caused by running on a breadboard instead of a PCB.

Currently, on the T4.0, I just use several GPIO lines as CS0, CS1... and set the desired one to low and talk to the device through the MISO/MOSI pins.
My issue stems from the need to talk to two different kinds of chip: several ADCs which use SPI mode 2, and several DACs which use SPI mode 3. I think (please correct me if I'm wrong) I could set the relevant mode, depending on which devices I am talking to, squirt out a dummy word and then write as normal. But I worry that this would slow things down too much and potentially cause instabilities.

I think moving to the T4.1 is a nicer option. From your experience, are there any pitfalls in running SPI code written for the T4.0 on the T4.1? Do they use the same SPI library, for example? I'm using one downloaded from Paul Stoffregen at the moment (Thanks Paul).
 
It is the exact same code on T4 and T4.1, except for the tables which say which pins are valid. The only differences are in the hardware tables that define those pins, for example for SPI1:
Code:
#if defined(ARDUINO_TEENSY41)
const SPIClass::SPI_Hardware_t  SPIClass::spiclass_lpspi3_hardware = {
	CCM_CCGR1, CCM_CCGR1_LPSPI3(CCM_CCGR_ON),
	DMAMUX_SOURCE_LPSPI3_TX, DMAMUX_SOURCE_LPSPI3_RX, _spi_dma_rxISR1,
	// MISO entries (marked in red in the original post)
	1, 39,
	7 | 0x10, 2 | 0x10,
	0, 1,
	IOMUXC_LPSPI3_SDI_SELECT_INPUT,
	26, 255,
	2 | 0x10, 0,
	1, 0,
	IOMUXC_LPSPI3_SDO_SELECT_INPUT,
	27, 255,
	2 | 0x10, 0,
	1,  0,
	IOMUXC_LPSPI3_SCK_SELECT_INPUT,
	0, 38, 255,
	7 | 0x10, 2 | 0x10, 0,
	1, 1, 0,
	0, 1, 0,
	&IOMUXC_LPSPI3_PCS0_SELECT_INPUT, &IOMUXC_LPSPI3_PCS0_SELECT_INPUT, 0
};
#else
const SPIClass::SPI_Hardware_t  SPIClass::spiclass_lpspi3_hardware = {
	CCM_CCGR1, CCM_CCGR1_LPSPI3(CCM_CCGR_ON),
	DMAMUX_SOURCE_LPSPI3_TX, DMAMUX_SOURCE_LPSPI3_RX, _spi_dma_rxISR1,
	// MISO entries (marked in red in the original post)
	1, 
	7 | 0x10,
	0,
	IOMUXC_LPSPI3_SDI_SELECT_INPUT,
	26,
	2 | 0x10,
	1,
	IOMUXC_LPSPI3_SDO_SELECT_INPUT,
	27,
	2 | 0x10,
	1, 
	IOMUXC_LPSPI3_SCK_SELECT_INPUT,
	0,
	7 | 0x10,
	1,
	0, 
	&IOMUXC_LPSPI3_PCS0_SELECT_INPUT
};
#endif
SPIClass SPI1((uintptr_t)&IMXRT_LPSPI3_S, (uintptr_t)&SPIClass::spiclass_lpspi3_hardware);
The parts in RED (the MISO entries right after the DMA line) define what is valid for MISO.
On T4, there is only one valid pin (and only one valid pin per function for any of the SPI devices, so each table entry is only 1 item long).

But the data here shows that if you choose, for example, pin 39, it will change that pin to mode 2, and its input select value is 1...

These differences are only seen when you call begin() and/or the setMISO() pin functions. Beyond that, there is no difference between T4 and T4.1.
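
As a rough illustration of how the MISO part of that table can be read, here is a small host-side sketch. The struct, the helper names and the lookup function are hypothetical and not the library's own types. It also assumes that bit 4 (0x10) of the mux value is the SION flag and that the low four bits are the ALT mux mode.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: in the mux value, bit 4 (0x10) is the SION flag and the low
// four bits are the ALT mux mode of the pad.
struct MisoOption {
    uint8_t pin;          // Teensy pin number
    uint8_t mux;          // value written to the pad's mux register
    uint8_t inputSelect;  // value written to the SDI select-input register
};

// The two SPI1 MISO choices from the Teensy 4.1 table quoted above.
const MisoOption kMisoOptions[] = {
    {1, 7 | 0x10, 0},
    {39, 2 | 0x10, 1},
};

inline uint8_t muxMode(uint8_t mux) { return mux & 0x0F; }
inline bool sionSet(uint8_t mux) { return (mux & 0x10) != 0; }

// Hypothetical lookup helper: returns the table row for a pin, or nullptr
// if the pin is not a valid MISO choice.
inline const MisoOption* findMiso(uint8_t pin) {
    for (const MisoOption& option : kMisoOptions) {
        if (option.pin == pin) return &option;
    }
    return nullptr;
}
```

Looking up pin 39 gives mux mode 2 and input select 1, matching the description above, while pin 1 gives mux mode 7 and input select 0.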
 