Teensy 4 as SPI Slave

Glen

New member
I have the need to create something that emulates a piece of hardware that will be built later. The hardware will act as a SPI slave and I thought "heck, that should be easy....right?"

I've done a bunch of searching and while there are plenty of examples of SPI master code, there just doesn't seem to be much for implementing a slave, particularly on the Teensy 4.0. I'm just making the rabid assumption that the NXP part has lovely built-in register functionality that would allow me to configure it as a slave, point to the data, and let it handle my life for me, but getting that exposed to the Arduino (Teensyduino?) development environment doesn't seem to be as simple as I would have hoped.

I'm betting that for my purposes I can actually just hack together some bit-banging, polled-not-even-interrupt-driven code that will do what I want (it's pretty basic), but before I jump off of that particular embarrassing bridge I thought I would first publicly display my ignorance and ask if there are any examples of SPI slave code for the Teensy 4.0 out there.

Anybody have any sage wisdom here? OK, how about any semi-snarky advice? Or should I consider a different profession in the lucrative lawn-mowing business? :cool:

..glen
 
Hi Glen,

I'm in the same position. You're correct that the NXP has excellent hardware support for running as an I2C slave. The bad news is that the teensyduino implementation for Wire only supports master mode.

I'm writing a replacement driver at the moment which I'll make public as soon as it's at beta quality. I've already got slave mode sorted and I'm currently working on master mode. (Which turns out to be quite hard!) I expect to make it available in 2 to 3 weeks. (I'm working on it at the weekends)

You could use one of the Teensy 3 boards. I believe the i2c_t3 library supports slave mode and is shipped as part of Teensyduino. It's probably pretty reliable. See the Teensy docs for a link to the main thread about it.

If you'd like to try my driver for the Teensy 4 but can't wait a few weeks, I could bundle up what I've got this weekend and you could give it a go. I'd appreciate having someone test it for me. :)

cheers,
Richard
 
These are two different things: Slave SPI and Slave I2C

Both should be possible.

As mentioned the i2c_t3 library supports slave mode on T3.x... Last I noticed on issues up on this library is he now has some T4's and hopefully when he gets time will add support for T4...

But I also believe that the Wire library also supports slave mode. Example shown in Arduino documentation: https://www.arduino.cc/en/Tutorial/MasterWriter

And in fact if you look at the examples under the wire library you will see examples on how to do so.

Unfortunately it looks like we have not yet implemented the slave mode For T4... from WireIMXRT.cpp
Code:
void TwoWire::begin(uint8_t address)
{
	// TODO: slave mode
}

Again it is probably not hard to do. And if someone gets to it, it would be great to do a Pull Request with updated code. I might take a look soon if no one else gets to it...


Now as for SPI slave again the chip supports it. Unfortunately SPI library does not. I know there have been a few threads about this for T3.x and I played some with it awhile ago (T3.x). At the time, I found it tricky to get SPI client on T3.x to work the way I wanted, in trying to manage when things were in the queue and the like.
Who knows might be easier on T4...

Again at some point might play again with it...
 
These are two different things: Slave SPI and Slave I2C...

Errr.... Apparently I completely failed to read the question properly. Sorry about that.

I'm not doing anything with SPI but I might go and brush up on my reading skills. :)

cheers,
Richard
 
Thanks for the responses, but as Kurt noted this particular need is a SPI issue, not an I2C issue. Slaves are always trickier than it seems at the beginning because if you want to do it right you almost have to do it with interrupts, which makes life instantly more interesting.

For this particular case I can be pretty brutish about things since I have have carnal knowledge of both the master and slave side of the communication, and I can just implement some sort of gross bit-bang solution, I just didn't want to be that inelegant if somebody had already solved the problem! :cool:

I shall now put on my boxing gloves and bang out some code that shall never see the light of day beyond my desk....

..gb
 
Simple SPI Slave

Here's a simple SPI slave that only receives data and only works on Teensy 4.0. It always receives the same number of bytes (the constant BufSize.) DMA is used to write to the destination buffer, and your sketch needs to define the function spiBuf() to return a pointer to the destination buffer. You can also declare swapSpiBuf() if you want to double-buffer the received data.

This code is obviously hacked out of the Teensy SPI library.


Just include SPISlave.h and call beginSlave() in your begin() function. Nothing is required in loop().



The following is C:\Program Files (x86)\Arduino\hardware\teensy\avr\libraries\SPISlave\SPISlave.h

Code:
/*
 * Copyright (c) 2010 by Cristian Maglie <c.maglie@bug.st>
 * Copyright (c) 2014 by Paul Stoffregen <paul@pjrc.com> (Transaction API)
 * Copyright (c) 2014 by Matthijs Kooijman <matthijs@stdin.nl> (SPISettings AVR)
 * SPI Master library for arduino.
 *
 * This file is free software; you can redistribute it and/or modify
 * it under the terms of either the GNU General Public License version 2
 * or the GNU Lesser General Public License version 2.1, both as
 * published by the Free Software Foundation.
 */

#ifndef _SPI_SLAVE_H_INCLUDED
#define _SPI_SLAVE_H_INCLUDED

#include <Arduino.h>

#if defined(__arm__) && defined(TEENSYDUINO)
#if defined(__has_include) && __has_include(<EventResponder.h>)
// SPI_HAS_TRANSFER_ASYNC - Defined to say that the SPI supports an ASYNC version
// of the SPI_HAS_TRANSFER_BUF
#define SPI_HAS_TRANSFER_ASYNC 1
#include <DMAChannel.h>
#include <EventResponder.h>
#endif
#endif

// SPI_HAS_TRANSACTION means SPI has beginTransaction(), endTransaction(),
// usingInterrupt(), and SPISetting(clock, bitOrder, dataMode)
#define SPI_HAS_TRANSACTION 1

// Uncomment this line to add detection of mismatched begin/end transactions.
// A mismatch occurs if other libraries fail to use SPI.endTransaction() for
// each SPI.beginTransaction().  Connect a LED to this pin.  The LED will turn
// on if any mismatch is ever detected.
//#define SPI_TRANSACTION_MISMATCH_LED 5

// SPI_HAS_TRANSFER_BUF - is defined to signify that this library supports
// a version of transfer which allows you to pass in both TX and RX buffer
// pointers, either of which could be NULL
#define SPI_HAS_TRANSFER_BUF 1


#ifndef LSBFIRST
#define LSBFIRST 0
#endif
#ifndef MSBFIRST
#define MSBFIRST 1
#endif

#define SPI_MODE0 0x00
#define SPI_MODE1 0x04
#define SPI_MODE2 0x08
#define SPI_MODE3 0x0C

#define SPI_CLOCK_DIV4 0x00
#define SPI_CLOCK_DIV16 0x01
#define SPI_CLOCK_DIV64 0x02
#define SPI_CLOCK_DIV128 0x03
#define SPI_CLOCK_DIV2 0x04
#define SPI_CLOCK_DIV8 0x05
#define SPI_CLOCK_DIV32 0x06

#define SPI_MODE_MASK 0x0C  // CPOL = bit 3, CPHA = bit 2 on SPCR
#define SPI_CLOCK_MASK 0x03  // SPR1 = bit 1, SPR0 = bit 0 on SPCR
#define SPI_2XCLOCK_MASK 0x01  // SPI2X = bit 0 on SPSR



/**********************************************************/
/*     32 bit Teensy 4.x                                  */
/**********************************************************/

#if defined(__arm__) && defined(TEENSYDUINO) && (defined(__IMXRT1052__) || defined(__IMXRT1062__))
#define SPI_ATOMIC_VERSION 1

void synchronousRead_isr(void);


class DMABaseClass;


void dumpDMA_TCD( DMABaseClass *dmabc );


class SPISettings 
{
public:
	SPISettings(uint32_t clockIn, uint8_t bitOrderIn, uint8_t dataModeIn) : _clock
	(clockIn) 
	{
		init_AlwaysInline(bitOrderIn, dataModeIn);
	}

	SPISettings() : _clock(4000000) 
	{
		init_AlwaysInline(MSBFIRST, SPI_MODE0);
	}
private:
	void init_AlwaysInline(uint8_t bitOrder, uint8_t dataMode)
	  __attribute__((__always_inline__)) 
	  {
			tcr = LPSPI_TCR_FRAMESZ(7);    // TCR has polarity and bit order too

			// handle LSB setup 
			if (bitOrder == LSBFIRST) tcr |= LPSPI_TCR_LSBF;

			// Handle Data Mode
			if (dataMode & 0x08) tcr |= LPSPI_TCR_CPOL;

			// Note: On T3.2 when we set CPHA it also updated the timing.  It moved the 
			// PCS to SCK Delay Prescaler into the After SCK Delay Prescaler	
			if (dataMode & 0x04) tcr |= LPSPI_TCR_CPHA; 
	}

	inline uint32_t clock() {return _clock;}

	uint32_t _clock;
	uint32_t tcr; // transmit command, pg 2955
	friend class SPIClass;
};



class SPIClass // Teensy 4
{ 
public:
	static const uint8_t CNT_MISO_PINS = 1;
	static const uint8_t CNT_MOSI_PINS = 1;
	static const uint8_t CNT_SCK_PINS = 1;
	static const uint8_t CNT_CS_PINS = 1;
	
	struct SPI_Hardware_t
	{
		volatile uint32_t &clock_gate_register;
		const uint32_t clock_gate_mask;
		uint8_t  tx_dma_channel;
		uint8_t  rx_dma_channel;
		void     (*dma_rxisr)();
		const uint8_t  miso_pin[CNT_MISO_PINS];
		const uint32_t  miso_mux[CNT_MISO_PINS];
		const uint8_t  mosi_pin[CNT_MOSI_PINS];
		const uint32_t  mosi_mux[CNT_MOSI_PINS];
		const uint8_t  sck_pin[CNT_SCK_PINS];
		const uint32_t  sck_mux[CNT_SCK_PINS];
		const uint8_t  cs_pin[CNT_CS_PINS];
		const uint32_t  cs_mux[CNT_CS_PINS];

		volatile uint32_t &sck_select_input_register;
		volatile uint32_t &sdi_select_input_register;
		volatile uint32_t &sdo_select_input_register;
		volatile uint32_t &pcs0_select_input_register;
		const uint8_t sck_select_val;
		const uint8_t sdi_select_val;
		const uint8_t sdo_select_val;
		const uint8_t pcs0_select_val;
	};
	
	static const SPI_Hardware_t spiclass_lpspi4_hardware;
	static const SPI_Hardware_t spiclass_lpspi3_hardware;
	static const SPI_Hardware_t spiclass_lpspi1_hardware;


public:
	constexpr SPIClass(uintptr_t myport, uintptr_t myhardware)
		: port_addr(myport), hardware_addr(myhardware) 
	{
	}

	void beginSlave();

	bool initDMA();
	
	void swapSpiBuf()
	{ 
	   _dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize ); 
	}
	

	friend void _spi_dma_rxISR0();
	inline void dma_rxisr();


public:
	IMXRT_LPSPI_t & port() { return *(IMXRT_LPSPI_t *)port_addr; }

private:
	const SPI_Hardware_t & hardware() { return *(const SPI_Hardware_t *)hardware_addr; }
	uintptr_t port_addr;
	uintptr_t hardware_addr;

	uint32_t _clock = 0;
	uint32_t _ccr = 0;

	uint8_t miso_pin_index = 0;
	uint8_t mosi_pin_index = 0;
	uint8_t sck_pin_index = 0;
	
	uint8_t interruptMasksUsed = 0;
	uint32_t interruptMask[(NVIC_NUM_INTERRUPTS+31)/32] = {};
	uint32_t interruptSave[(NVIC_NUM_INTERRUPTS+31)/32] = {};
	

	// DMA Support
	bool initDMAChannels();
	enum DMAState { notAllocated, idle, active, completed};
	enum { MAX_DMA_COUNT = 32767 };
	DMAState _dma_state = DMAState::notAllocated;
	uint32_t _dma_count_remaining = 0;	// How many bytes left to output after current DMA completes
	
	DMAChannel *_dmaRX = nullptr;
	EventResponder *_dma_event_responder = nullptr;

};


#endif


extern SPIClass SPI;
extern SPIClass SPI1;
extern SPIClass SPI2;

#endif


The following is C:\Program Files (x86)\Arduino\hardware\teensy\avr\libraries\SPISlave\SPISlave.cpp

Code:
/*
 *
 * Copyright (c) 2010 by Cristian Maglie <c.maglie@bug.st>
 * SPI Master library for arduino.
 *
 * This file is free software; you can redistribute it and/or modify
 * it under the terms of either the GNU General Public License version 2
 * or the GNU Lesser General Public License version 2.1, both as
 * published by the Free Software Foundation.
 */

#include "SPISlave.h"
#include "pins_arduino.h"

#define DEBUG_DMA_TRANSFERS


/**********************************************************/
/*     32 bit Teensy 4.x                                  */
/**********************************************************/


void _spi_dma_rxISR0(void) {SPI.dma_rxisr();}

const SPIClass::SPI_Hardware_t  SPIClass::spiclass_lpspi4_hardware = 
{
	CCM_CCGR1, CCM_CCGR1_LPSPI4(CCM_CCGR_ON),
	DMAMUX_SOURCE_LPSPI4_TX, 
	DMAMUX_SOURCE_LPSPI4_RX, 
	_spi_dma_rxISR0,
	12, 
	3 | 0x10,
	11,
	3 | 0x10,
	13,
	3 | 0x10,
	10,  // from https://forum.pjrc.com/threads/57328-Teensy-4-0-SPI-Chip-Select-pins If I remember correctly the one case, where you need to use the Hardware CS pin is if you implement an SPI Client setup, which I have not tried.

	3 | 0x10,
	IOMUXC_LPSPI4_SCK_SELECT_INPUT, 
	IOMUXC_LPSPI4_SDI_SELECT_INPUT, 
	IOMUXC_LPSPI4_SDO_SELECT_INPUT, 
	IOMUXC_LPSPI4_PCS0_SELECT_INPUT,
	0, 0, 0, 0
};

SPIClass SPI((uintptr_t)&IMXRT_LPSPI4_S, (uintptr_t)&SPIClass::spiclass_lpspi4_hardware);

// sketch defines these functions.
unsigned char *spiBuf();
void swapSpiBuf();


/*
LPSPI slave mode uses the same shift register and logic as the master mode, but does not
use the clock configuration register and the transmit command register must remain static
during SPI bus transfers.
*/


/*
	clock_gate_register = CCM_CCGR1  // 13.7.22 p 1140  
	clock_gate_mask = CCM_CCGR1_LPSPI4(CCM_CCGR_ON)   // CCM_CCGR_ON = 3
	tx_dma_channel = DMAMUX_SOURCE_LPSPI4_TX  // 80
	rx_dma_channel = DMAMUX_SOURCE_LPSPI4_RX  // 79
	dma_rxisr = SPI.gdr_isr
			
	cs_mux[0] = 3 | 0x10

	sck_select_input_register = IOMUXC_LPSPI4_SCK_SELECT_INPUT
	sdi_select_input_register = IOMUXC_LPSPI4_SDI_SELECT_INPUT
	sdo_select_input_register = IOMUXC_LPSPI4_SDO_SELECT_INPUT
	pcs0_select_input_register = IOMUXC_LPSPI4_PCS0_SELECT_INPUT
	
	sck_select_val = 0
	sdi_select_val = 0
	sdo_select_val = 0
	pcs0_select_val = 0
*/

static const int rxWater = 15;
//-----------------------------------------------------------------------------
void SPIClass::beginSlave()
//-----------------------------------------------------------------------------
{
//	Serial.printf( "clock_gate_register %08x\n", hardware().clock_gate_register );
//	Serial.printf( "clock_gate_mask %08x\n", hardware().clock_gate_mask );

    // turn off the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR &= ~CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register &= ~hardware().clock_gate_mask;


	Serial.printf( "SPI MISO: %d MOSI: %d, SCK: %d CS: %d\n", 
                   hardware().miso_pin[0], 
                   hardware().mosi_pin[0], 
                   hardware().sck_pin[0], 
                   hardware().cs_pin[0] );


    // CCM = Clock control module
	//Serial.printf( "before CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register

    // IOMUXC = IOMUX Controller
	// SRE = slew rate fast
	// DSE(3) = Drive strength R0/5
	// SPEED(3) = max(200MHz)	

	uint32_t fastio = IOMUXC_PAD_DSE(6) | IOMUXC_PAD_SPEED(1);
	//uint32_t fastio = IOMUXC_PAD_SRE | IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3);

	*(portControlRegister(hardware().miso_pin[0])) = fastio;
	*(portControlRegister(hardware().mosi_pin[0])) = fastio;
	*(portControlRegister(hardware().sck_pin[0])) = fastio;


//	*(portControlRegister(16) = fastio;
//	*(portControlRegister(17) = fastio;
//	*(portControlRegister(19) = fastio;


    // turn on the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR |= CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register |= hardware().clock_gate_mask;

	Serial.printf( "after CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register


	*(portConfigRegister(hardware().miso_pin[0])) 
	   = hardware().miso_mux[0];
	*(portConfigRegister(hardware().mosi_pin[0])) 
	   = hardware().mosi_mux[0];
	*(portConfigRegister(hardware().sck_pin[0])) 
	   = hardware().sck_mux[0];

	// why do I need this in slave mode when master mode doesn't do it?
	*(portConfigRegister(hardware().cs_pin[0])) 
	   = hardware().cs_mux[0];  


//	*(portConfigRegister(16) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_07;
//	*(portConfigRegister(17) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_06;
//	*(portConfigRegister(19) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_00;


	// Set the Mux pins 
   	hardware().sck_select_input_register = hardware().sck_select_val;
	hardware().sdi_select_input_register = hardware().sdi_select_val;
	hardware().sdo_select_input_register = hardware().sdo_select_val;

    // why do I need this in slave mode when master mode doesn't do it?
	hardware().pcs0_select_input_register = hardware().pcs0_select_val;  
	   
	// CR  = Control register
	// RST = Software reset
	port().CR = LPSPI_CR_RST;

	// Initialize the Receive FIFO watermark to FIFO size - 1... 
	port().FCR = LPSPI_FCR_RXWATER( rxWater );
	
	port().CR = LPSPI_CR_MEN;     // Module Enable

	
	initDMA();
	

	Serial.printf( "PARAM %x CR %x SR %x IER %x DER %x CFGR0 %x CFGR1 %x FCR %x TCR %x\n", 
	               port().PARAM,
	               port().CR, port().SR, 
				   port().IER, port().DER,
				   port().CFGR0, port().CFGR1, 
				   port().FCR, port().TCR );	
}


static int const BufSize = 3000;


//-----------------------------------------------------------------------------
bool SPIClass::initDMAChannels() 
//-----------------------------------------------------------------------------
{
	_dmaRX = new DMAChannel();

	if (_dmaRX == nullptr)
		return false;

Serial.printf( "rx_dma_channel = %d\n", hardware().rx_dma_channel );

	// Set up the RX chain
	_dmaRX->disable();

	_dmaRX->source( (volatile uint8_t&)port().RDR );
/*
Disable Request
If this flag is set, the eDMA hardware automatically clears the corresponding ERQ bit when the current
major iteration count reaches zero.
0b - The channel's ERQ bit is not affected.
1b - The channel's ERQ bit is cleared when the major loop is complete.
*/
	_dmaRX->disableOnCompletion();   // TCD->CSR |= DMA_TCD_CSR_DREQ;
	_dmaRX->triggerAtHardwareEvent( hardware().rx_dma_channel );
	_dmaRX->attachInterrupt( hardware().dma_rxisr );

/*
Enable an interrupt when major iteration count completes.
If this flag is set, the channel generates an interrupt request by setting the appropriate bit in the INT when
the current major iteration count reaches zero.
0b - The end-of-major loop interrupt is disabled.
1b - The end-of-major loop interrupt is enabled.
*/

    // cause the dma_rxisr() to be called at the end of the transfer
	_dmaRX->interruptAtCompletion();  // TCD->CSR |= DMA_TCD_CSR_INTMAJOR;
	
	_dma_state = DMAState::idle;  // Should be first thing set!
	
	_dmaRX->enable(); 
	
	return true;
}


EventResponder _event_responder;

#define DATA_BUFFER_MAX (16*1024)

//-----------------------------------------------------------------------------
bool SPIClass::initDMA()
//-----------------------------------------------------------------------------
{	
	initDMAChannels();
	
	
    if (_dma_state == DMAState::active)
		return false; // already active

	_event_responder.clearEvent();	// Make sure it is not set yet
	
	_dma_count_remaining = 0;
	
	_dmaRX->TCD->ATTR_SRC = 0;		// Make sure set for 8 bit mode...
	_dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize );
	_dmaRX->TCD->DLASTSGA = -BufSize;

	if( (uint32_t)spiBuf() >= 0x20200000u )  
		arm_dcache_delete( spiBuf(), BufSize );
	
	_dma_event_responder = &_event_responder;

	dumpDMA_TCD( _dmaRX );

	// Make sure port is in 8 bit mode and clear watermark
	port().TCR = (port().TCR & ~(LPSPI_TCR_FRAMESZ(31))) | LPSPI_TCR_FRAMESZ(7);	
	port().FCR = 0; 

	// Lets try to output the first byte to make sure that we are in 8 bit mode...
 	port().DER = LPSPI_DER_RDDE;	//enable DMA on RX
	port().SR = 0x3f00;	// clear out all of the other status...

	_dmaRX->enable();

	_dma_state = DMAState::active;
	
	return true;
}



//-----------------------------------------------------------------------------
inline void DMAChanneltransferCount( DMAChannel *dmac, unsigned int len ) 
//-----------------------------------------------------------------------------
{
	// note does no validation of length...
	DMABaseClass::TCD_t *tcd = dmac->TCD;
	
	if (!(tcd->BITER & DMA_TCD_BITER_ELINK)) 
	{
		tcd->BITER = len & 0x7fff;
	} 
	else 
	{
		tcd->BITER = (tcd->BITER & 0xFE00) | (len & 0x1ff);
	}
	tcd->CITER = tcd->BITER; 
}

//-----------------------------------------------------------------------------
void dumpDMA_TCD( DMABaseClass *dmabc )
//-----------------------------------------------------------------------------
{
	Serial.printf("spiData = %p BC: %x TCD: %x:", spiBuf(), (uint32_t)dmabc, (uint32_t)dmabc->TCD);

	Serial.printf("SA:%x SO:%d AT:%x NB:%x SL:%d DA:%x DO:%d CI:%x DL:%x CS:%x BI:%x\n", (uint32_t)dmabc->TCD->SADDR,
		dmabc->TCD->SOFF, dmabc->TCD->ATTR, dmabc->TCD->NBYTES, dmabc->TCD->SLAST, (uint32_t)dmabc->TCD->DADDR, 
		dmabc->TCD->DOFF, dmabc->TCD->CITER, dmabc->TCD->DLASTSGA, dmabc->TCD->CSR, dmabc->TCD->BITER);
}


//-------------------------------------------------------------------------
void SPIClass::dma_rxisr() 
//-------------------------------------------------------------------------
{
	_dmaRX->clearInterrupt();   // 		DMA_CINT = channel;
	_dmaRX->clearComplete();    // 		DMA_CDNE = channel;
	
	// receive a full SPI buffer
    DMAChanneltransferCount( _dmaRX, BufSize );

    // Optionally allow the user to double buffer the data.  Comment 
	//   the swapSpiBuf() functions out if you don't need them.
    ::swapSpiBuf();   // swap the pointers
    swapSpiBuf();     // tell _dmaRX the new buffer
	
	
    _dmaRX->enable();

    asm("dsb");

}

	
//-------------------------------------------------------------------------
void swapSpiBuf()
//-------------------------------------------------------------------------
{ 
  _dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize ); 
}
 
First of all: wow, cool!

For my work however I actually need to be clocking data OUT of the slave (think of the slave kind of acting like a SPI eeprom). However I'm going to bet that if I peruse the code that Gordon provided I'll figure out how to make that happen. Thanks!

..glen

Here's a simple SPI slave that only receives data and only works on Teensy 4.0. It always receives the same number of bytes (the constant BufSize.) DMA is used to write to the destination buffer, and your sketch needs to define the function spiBuf() to return a pointer to the destination buffer. You can also declare swapSpiBuf() if you want to double-buffer the received data.

This code is obviously hacked out of the Teensy SPI library.


Just include SPISlave.h and call beginSlave() in your begin() function. Nothing is required in loop().



The following is C:\Program Files (x86)\Arduino\hardware\teensy\avr\libraries\SPISlave\SPISlave.h

Code:
/*
 * Copyright (c) 2010 by Cristian Maglie <c.maglie@bug.st>
 * Copyright (c) 2014 by Paul Stoffregen <paul@pjrc.com> (Transaction API)
 * Copyright (c) 2014 by Matthijs Kooijman <matthijs@stdin.nl> (SPISettings AVR)
 * SPI Master library for arduino.
 *
 * This file is free software; you can redistribute it and/or modify
 * it under the terms of either the GNU General Public License version 2
 * or the GNU Lesser General Public License version 2.1, both as
 * published by the Free Software Foundation.
 */

#ifndef _SPI_SLAVE_H_INCLUDED
#define _SPI_SLAVE_H_INCLUDED

#include <Arduino.h>

#if defined(__arm__) && defined(TEENSYDUINO)
#if defined(__has_include) && __has_include(<EventResponder.h>)
// SPI_HAS_TRANSFER_ASYNC - Defined to say that the SPI supports an ASYNC version
// of the SPI_HAS_TRANSFER_BUF
#define SPI_HAS_TRANSFER_ASYNC 1
#include <DMAChannel.h>
#include <EventResponder.h>
#endif
#endif

// SPI_HAS_TRANSACTION means SPI has beginTransaction(), endTransaction(),
// usingInterrupt(), and SPISetting(clock, bitOrder, dataMode)
#define SPI_HAS_TRANSACTION 1

// Uncomment this line to add detection of mismatched begin/end transactions.
// A mismatch occurs if other libraries fail to use SPI.endTransaction() for
// each SPI.beginTransaction().  Connect a LED to this pin.  The LED will turn
// on if any mismatch is ever detected.
//#define SPI_TRANSACTION_MISMATCH_LED 5

// SPI_HAS_TRANSFER_BUF - is defined to signify that this library supports
// a version of transfer which allows you to pass in both TX and RX buffer
// pointers, either of which could be NULL
#define SPI_HAS_TRANSFER_BUF 1


#ifndef LSBFIRST
#define LSBFIRST 0
#endif
#ifndef MSBFIRST
#define MSBFIRST 1
#endif

#define SPI_MODE0 0x00
#define SPI_MODE1 0x04
#define SPI_MODE2 0x08
#define SPI_MODE3 0x0C

#define SPI_CLOCK_DIV4 0x00
#define SPI_CLOCK_DIV16 0x01
#define SPI_CLOCK_DIV64 0x02
#define SPI_CLOCK_DIV128 0x03
#define SPI_CLOCK_DIV2 0x04
#define SPI_CLOCK_DIV8 0x05
#define SPI_CLOCK_DIV32 0x06

#define SPI_MODE_MASK 0x0C  // CPOL = bit 3, CPHA = bit 2 on SPCR
#define SPI_CLOCK_MASK 0x03  // SPR1 = bit 1, SPR0 = bit 0 on SPCR
#define SPI_2XCLOCK_MASK 0x01  // SPI2X = bit 0 on SPSR



/**********************************************************/
/*     32 bit Teensy 4.x                                  */
/**********************************************************/

#if defined(__arm__) && defined(TEENSYDUINO) && (defined(__IMXRT1052__) || defined(__IMXRT1062__))
#define SPI_ATOMIC_VERSION 1

void synchronousRead_isr(void);


class DMABaseClass;


void dumpDMA_TCD( DMABaseClass *dmabc );


class SPISettings 
{
public:
	SPISettings(uint32_t clockIn, uint8_t bitOrderIn, uint8_t dataModeIn) : _clock
	(clockIn) 
	{
		init_AlwaysInline(bitOrderIn, dataModeIn);
	}

	SPISettings() : _clock(4000000) 
	{
		init_AlwaysInline(MSBFIRST, SPI_MODE0);
	}
private:
	void init_AlwaysInline(uint8_t bitOrder, uint8_t dataMode)
	  __attribute__((__always_inline__)) 
	  {
			tcr = LPSPI_TCR_FRAMESZ(7);    // TCR has polarity and bit order too

			// handle LSB setup 
			if (bitOrder == LSBFIRST) tcr |= LPSPI_TCR_LSBF;

			// Handle Data Mode
			if (dataMode & 0x08) tcr |= LPSPI_TCR_CPOL;

			// Note: On T3.2 when we set CPHA it also updated the timing.  It moved the 
			// PCS to SCK Delay Prescaler into the After SCK Delay Prescaler	
			if (dataMode & 0x04) tcr |= LPSPI_TCR_CPHA; 
	}

	inline uint32_t clock() {return _clock;}

	uint32_t _clock;
	uint32_t tcr; // transmit command, pg 2955
	friend class SPIClass;
};



class SPIClass // Teensy 4
{ 
public:
	static const uint8_t CNT_MISO_PINS = 1;
	static const uint8_t CNT_MOSI_PINS = 1;
	static const uint8_t CNT_SCK_PINS = 1;
	static const uint8_t CNT_CS_PINS = 1;
	
	struct SPI_Hardware_t
	{
		volatile uint32_t &clock_gate_register;
		const uint32_t clock_gate_mask;
		uint8_t  tx_dma_channel;
		uint8_t  rx_dma_channel;
		void     (*dma_rxisr)();
		const uint8_t  miso_pin[CNT_MISO_PINS];
		const uint32_t  miso_mux[CNT_MISO_PINS];
		const uint8_t  mosi_pin[CNT_MOSI_PINS];
		const uint32_t  mosi_mux[CNT_MOSI_PINS];
		const uint8_t  sck_pin[CNT_SCK_PINS];
		const uint32_t  sck_mux[CNT_SCK_PINS];
		const uint8_t  cs_pin[CNT_CS_PINS];
		const uint32_t  cs_mux[CNT_CS_PINS];

		volatile uint32_t &sck_select_input_register;
		volatile uint32_t &sdi_select_input_register;
		volatile uint32_t &sdo_select_input_register;
		volatile uint32_t &pcs0_select_input_register;
		const uint8_t sck_select_val;
		const uint8_t sdi_select_val;
		const uint8_t sdo_select_val;
		const uint8_t pcs0_select_val;
	};
	
	static const SPI_Hardware_t spiclass_lpspi4_hardware;
	static const SPI_Hardware_t spiclass_lpspi3_hardware;
	static const SPI_Hardware_t spiclass_lpspi1_hardware;


public:
	constexpr SPIClass(uintptr_t myport, uintptr_t myhardware)
		: port_addr(myport), hardware_addr(myhardware) 
	{
	}

	void beginSlave();

	bool initDMA();
	
	void swapSpiBuf()
	{ 
	   _dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize ); 
	}
	

	friend void _spi_dma_rxISR0();
	inline void dma_rxisr();


public:
	IMXRT_LPSPI_t & port() { return *(IMXRT_LPSPI_t *)port_addr; }

private:
	const SPI_Hardware_t & hardware() { return *(const SPI_Hardware_t *)hardware_addr; }
	uintptr_t port_addr;
	uintptr_t hardware_addr;

	uint32_t _clock = 0;
	uint32_t _ccr = 0;

	uint8_t miso_pin_index = 0;
	uint8_t mosi_pin_index = 0;
	uint8_t sck_pin_index = 0;
	
	uint8_t interruptMasksUsed = 0;
	uint32_t interruptMask[(NVIC_NUM_INTERRUPTS+31)/32] = {};
	uint32_t interruptSave[(NVIC_NUM_INTERRUPTS+31)/32] = {};
	

	// DMA Support
	bool initDMAChannels();
	enum DMAState { notAllocated, idle, active, completed};
	enum { MAX_DMA_COUNT = 32767 };
	DMAState _dma_state = DMAState::notAllocated;
	uint32_t _dma_count_remaining = 0;	// How many bytes left to output after current DMA completes
	
	DMAChannel *_dmaRX = nullptr;
	EventResponder *_dma_event_responder = nullptr;

};


#endif


extern SPIClass SPI;
extern SPIClass SPI1;
extern SPIClass SPI2;

#endif


The following is C:\Program Files (x86)\Arduino\hardware\teensy\avr\libraries\SPISlave\SPISlave.cpp

Code:
/*
 *
 * Copyright (c) 2010 by Cristian Maglie <c.maglie@bug.st>
 * SPI Master library for arduino.
 *
 * This file is free software; you can redistribute it and/or modify
 * it under the terms of either the GNU General Public License version 2
 * or the GNU Lesser General Public License version 2.1, both as
 * published by the Free Software Foundation.
 */

#include "SPISlave.h"
#include "pins_arduino.h"

#define DEBUG_DMA_TRANSFERS


/**********************************************************/
/*     32 bit Teensy 4.x                                  */
/**********************************************************/


void _spi_dma_rxISR0(void) {SPI.dma_rxisr();}

const SPIClass::SPI_Hardware_t  SPIClass::spiclass_lpspi4_hardware = 
{
	CCM_CCGR1, CCM_CCGR1_LPSPI4(CCM_CCGR_ON),
	DMAMUX_SOURCE_LPSPI4_TX, 
	DMAMUX_SOURCE_LPSPI4_RX, 
	_spi_dma_rxISR0,
	12, 
	3 | 0x10,
	11,
	3 | 0x10,
	13,
	3 | 0x10,
	10,  // from https://forum.pjrc.com/threads/57328-Teensy-4-0-SPI-Chip-Select-pins If I remember correctly the one case, where you need to use the Hardware CS pin is if you implement an SPI Client setup, which I have not tried.

	3 | 0x10,
	IOMUXC_LPSPI4_SCK_SELECT_INPUT, 
	IOMUXC_LPSPI4_SDI_SELECT_INPUT, 
	IOMUXC_LPSPI4_SDO_SELECT_INPUT, 
	IOMUXC_LPSPI4_PCS0_SELECT_INPUT,
	0, 0, 0, 0
};

SPIClass SPI((uintptr_t)&IMXRT_LPSPI4_S, (uintptr_t)&SPIClass::spiclass_lpspi4_hardware);

// sketch defines these functions.
unsigned char *spiBuf();
void swapSpiBuf();


/*
LPSPI slave mode uses the same shift register and logic as the master mode, but does not
use the clock configuration register and the transmit command register must remain static
during SPI bus transfers.
*/


/*
	clock_gate_register = CCM_CCGR1  // 13.7.22 p 1140  
	clock_gate_mask = CCM_CCGR1_LPSPI4(CCM_CCGR_ON)   // CCM_CCGR_ON = 3
	tx_dma_channel = DMAMUX_SOURCE_LPSPI4_TX  // 80
	rx_dma_channel = DMAMUX_SOURCE_LPSPI4_RX  // 79
	dma_rxisr = SPI.gdr_isr
			
	cs_mux[0] = 3 | 0x10

	sck_select_input_register = IOMUXC_LPSPI4_SCK_SELECT_INPUT
	sdi_select_input_register = IOMUXC_LPSPI4_SDI_SELECT_INPUT
	sdo_select_input_register = IOMUXC_LPSPI4_SDO_SELECT_INPUT
	pcs0_select_input_register = IOMUXC_LPSPI4_PCS0_SELECT_INPUT
	
	sck_select_val = 0
	sdi_select_val = 0
	sdo_select_val = 0
	pcs0_select_val = 0
*/

static const int rxWater = 15;
//-----------------------------------------------------------------------------
void SPIClass::beginSlave()
//-----------------------------------------------------------------------------
{
//	Serial.printf( "clock_gate_register %08x\n", hardware().clock_gate_register );
//	Serial.printf( "clock_gate_mask %08x\n", hardware().clock_gate_mask );

    // turn off the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR &= ~CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register &= ~hardware().clock_gate_mask;


	Serial.printf( "SPI MISO: %d MOSI: %d, SCK: %d CS: %d\n", 
                   hardware().miso_pin[0], 
                   hardware().mosi_pin[0], 
                   hardware().sck_pin[0], 
                   hardware().cs_pin[0] );


    // CCM = Clock control module
	//Serial.printf( "before CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register

    // IOMUXC = IOMUX Controller
	// SRE = slew rate fast
	// DSE(3) = Drive strength R0/5
	// SPEED(3) = max(200MHz)	

	uint32_t fastio = IOMUXC_PAD_DSE(6) | IOMUXC_PAD_SPEED(1);
	//uint32_t fastio = IOMUXC_PAD_SRE | IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3);

	*(portControlRegister(hardware().miso_pin[0])) = fastio;
	*(portControlRegister(hardware().mosi_pin[0])) = fastio;
	*(portControlRegister(hardware().sck_pin[0])) = fastio;


//	*(portControlRegister(16) = fastio;
//	*(portControlRegister(17) = fastio;
//	*(portControlRegister(19) = fastio;


    // turn on the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR |= CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register |= hardware().clock_gate_mask;

	Serial.printf( "after CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register


	*(portConfigRegister(hardware().miso_pin[0])) 
	   = hardware().miso_mux[0];
	*(portConfigRegister(hardware().mosi_pin[0])) 
	   = hardware().mosi_mux[0];
	*(portConfigRegister(hardware().sck_pin[0])) 
	   = hardware().sck_mux[0];

	// why do I need this in slave mode when master mode doesn't do it?
	*(portConfigRegister(hardware().cs_pin[0])) 
	   = hardware().cs_mux[0];  


//	*(portConfigRegister(16) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_07;
//	*(portConfigRegister(17) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_06;
//	*(portConfigRegister(19) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_00;


	// Set the Mux pins 
   	hardware().sck_select_input_register = hardware().sck_select_val;
	hardware().sdi_select_input_register = hardware().sdi_select_val;
	hardware().sdo_select_input_register = hardware().sdo_select_val;

    // why do I need this in slave mode when master mode doesn't do it?
	hardware().pcs0_select_input_register = hardware().pcs0_select_val;  
	   
	// CR  = Control register
	// RST = Software reset
	port().CR = LPSPI_CR_RST;

	// Initialize the Receive FIFO watermark to FIFO size - 1... 
	port().FCR = LPSPI_FCR_RXWATER( rxWater );
	
	port().CR = LPSPI_CR_MEN;     // Module Enable

	
	initDMA();
	

	Serial.printf( "PARAM %x CR %x SR %x IER %x DER %x CFGR0 %x CFGR1 %x FCR %x TCR %x\n", 
	               port().PARAM,
	               port().CR, port().SR, 
				   port().IER, port().DER,
				   port().CFGR0, port().CFGR1, 
				   port().FCR, port().TCR );	
}


static int const BufSize = 3000;


//-----------------------------------------------------------------------------
bool SPIClass::initDMAChannels() 
//-----------------------------------------------------------------------------
{
	_dmaRX = new DMAChannel();

	if (_dmaRX == nullptr)
		return false;

Serial.printf( "rx_dma_channel = %d\n", hardware().rx_dma_channel );

	// Set up the RX chain
	_dmaRX->disable();

	_dmaRX->source( (volatile uint8_t&)port().RDR );
/*
Disable Request
If this flag is set, the eDMA hardware automatically clears the corresponding ERQ bit when the current
major iteration count reaches zero.
0b - The channel's ERQ bit is not affected.
1b - The channel's ERQ bit is cleared when the major loop is complete.
*/
	_dmaRX->disableOnCompletion();   // TCD->CSR |= DMA_TCD_CSR_DREQ;
	_dmaRX->triggerAtHardwareEvent( hardware().rx_dma_channel );
	_dmaRX->attachInterrupt( hardware().dma_rxisr );

/*
Enable an interrupt when major iteration count completes.
If this flag is set, the channel generates an interrupt request by setting the appropriate bit in the INT when
the current major iteration count reaches zero.
0b - The end-of-major loop interrupt is disabled.
1b - The end-of-major loop interrupt is enabled.
*/

    // cause the dma_rxisr() to be called at the end of the transfer
	_dmaRX->interruptAtCompletion();  // TCD->CSR |= DMA_TCD_CSR_INTMAJOR;
	
	_dma_state = DMAState::idle;  // Should be first thing set!
	
	_dmaRX->enable(); 
	
	return true;
}


EventResponder _event_responder;

#define DATA_BUFFER_MAX (16*1024)

//-----------------------------------------------------------------------------
bool SPIClass::initDMA()
//-----------------------------------------------------------------------------
{	
	initDMAChannels();
	
	
    if (_dma_state == DMAState::active)
		return false; // already active

	_event_responder.clearEvent();	// Make sure it is not set yet
	
	_dma_count_remaining = 0;
	
	_dmaRX->TCD->ATTR_SRC = 0;		// Make sure set for 8 bit mode...
	_dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize );
	_dmaRX->TCD->DLASTSGA = -BufSize;

	if( (uint32_t)spiBuf() >= 0x20200000u )  
		arm_dcache_delete( spiBuf(), BufSize );
	
	_dma_event_responder = &_event_responder;

	dumpDMA_TCD( _dmaRX );

	// Make sure port is in 8 bit mode and clear watermark
	port().TCR = (port().TCR & ~(LPSPI_TCR_FRAMESZ(31))) | LPSPI_TCR_FRAMESZ(7);	
	port().FCR = 0; 

	// Lets try to output the first byte to make sure that we are in 8 bit mode...
 	port().DER = LPSPI_DER_RDDE;	//enable DMA on RX
	port().SR = 0x3f00;	// clear out all of the other status...

	_dmaRX->enable();

	_dma_state = DMAState::active;
	
	return true;
}



//-----------------------------------------------------------------------------
inline void DMAChanneltransferCount( DMAChannel *dmac, unsigned int len ) 
//-----------------------------------------------------------------------------
{
	// note does no validation of length...
	DMABaseClass::TCD_t *tcd = dmac->TCD;
	
	if (!(tcd->BITER & DMA_TCD_BITER_ELINK)) 
	{
		tcd->BITER = len & 0x7fff;
	} 
	else 
	{
		tcd->BITER = (tcd->BITER & 0xFE00) | (len & 0x1ff);
	}
	tcd->CITER = tcd->BITER; 
}

//-----------------------------------------------------------------------------
void dumpDMA_TCD( DMABaseClass *dmabc )
//-----------------------------------------------------------------------------
{
	Serial.printf("spiData = %p BC: %x TCD: %x:", spiBuf(), (uint32_t)dmabc, (uint32_t)dmabc->TCD);

	Serial.printf("SA:%x SO:%d AT:%x NB:%x SL:%d DA:%x DO:%d CI:%x DL:%x CS:%x BI:%x\n", (uint32_t)dmabc->TCD->SADDR,
		dmabc->TCD->SOFF, dmabc->TCD->ATTR, dmabc->TCD->NBYTES, dmabc->TCD->SLAST, (uint32_t)dmabc->TCD->DADDR, 
		dmabc->TCD->DOFF, dmabc->TCD->CITER, dmabc->TCD->DLASTSGA, dmabc->TCD->CSR, dmabc->TCD->BITER);
}


//-------------------------------------------------------------------------
void SPIClass::dma_rxisr() 
//-------------------------------------------------------------------------
{
	_dmaRX->clearInterrupt();   // 		DMA_CINT = channel;
	_dmaRX->clearComplete();    // 		DMA_CDNE = channel;
	
	// receive a full SPI buffer
    DMAChanneltransferCount( _dmaRX, BufSize );

    // Optionally allow the user to double buffer the data.  Comment 
	//   the swapSpiBuf() functions out if you don't need them.
    ::swapSpiBuf();   // swap the pointers
    swapSpiBuf();     // tell _dmaRX the new buffer
	
	
    _dmaRX->enable();

    asm("dsb");

}

	
//-------------------------------------------------------------------------
void swapSpiBuf()
//-------------------------------------------------------------------------
{ 
  _dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize ); 
}
 
Here's a simple SPI slave that only receives data and only works on Teensy 4.0. It always receives the same number of bytes (the constant BufSize.) DMA is used to write to the destination buffer, and your sketch needs to define the function spiBuf() to return a pointer to the destination buffer. You can also declare swapSpiBuf() if you want to double-buffer the received data.

This code is obviously hacked out of the Teensy SPI library.


Just include SPISlave.h and call beginSlave() in your begin() function. Nothing is required in loop().



The following is C:\Program Files (x86)\Arduino\hardware\teensy\avr\libraries\SPISlave\SPISlave.h

Code:
/*
 * Copyright (c) 2010 by Cristian Maglie <c.maglie@bug.st>
 * Copyright (c) 2014 by Paul Stoffregen <paul@pjrc.com> (Transaction API)
 * Copyright (c) 2014 by Matthijs Kooijman <matthijs@stdin.nl> (SPISettings AVR)
 * SPI Master library for arduino.
 *
 * This file is free software; you can redistribute it and/or modify
 * it under the terms of either the GNU General Public License version 2
 * or the GNU Lesser General Public License version 2.1, both as
 * published by the Free Software Foundation.
 */

#ifndef _SPI_SLAVE_H_INCLUDED
#define _SPI_SLAVE_H_INCLUDED

#include <Arduino.h>

#if defined(__arm__) && defined(TEENSYDUINO)
#if defined(__has_include) && __has_include(<EventResponder.h>)
// SPI_HAS_TRANSFER_ASYNC - Defined to say that the SPI supports an ASYNC version
// of the SPI_HAS_TRANSFER_BUF
#define SPI_HAS_TRANSFER_ASYNC 1
#include <DMAChannel.h>
#include <EventResponder.h>
#endif
#endif

// SPI_HAS_TRANSACTION means SPI has beginTransaction(), endTransaction(),
// usingInterrupt(), and SPISetting(clock, bitOrder, dataMode)
#define SPI_HAS_TRANSACTION 1

// Uncomment this line to add detection of mismatched begin/end transactions.
// A mismatch occurs if other libraries fail to use SPI.endTransaction() for
// each SPI.beginTransaction().  Connect a LED to this pin.  The LED will turn
// on if any mismatch is ever detected.
//#define SPI_TRANSACTION_MISMATCH_LED 5

// SPI_HAS_TRANSFER_BUF - is defined to signify that this library supports
// a version of transfer which allows you to pass in both TX and RX buffer
// pointers, either of which could be NULL
#define SPI_HAS_TRANSFER_BUF 1


#ifndef LSBFIRST
#define LSBFIRST 0
#endif
#ifndef MSBFIRST
#define MSBFIRST 1
#endif

#define SPI_MODE0 0x00
#define SPI_MODE1 0x04
#define SPI_MODE2 0x08
#define SPI_MODE3 0x0C

#define SPI_CLOCK_DIV4 0x00
#define SPI_CLOCK_DIV16 0x01
#define SPI_CLOCK_DIV64 0x02
#define SPI_CLOCK_DIV128 0x03
#define SPI_CLOCK_DIV2 0x04
#define SPI_CLOCK_DIV8 0x05
#define SPI_CLOCK_DIV32 0x06

#define SPI_MODE_MASK 0x0C  // CPOL = bit 3, CPHA = bit 2 on SPCR
#define SPI_CLOCK_MASK 0x03  // SPR1 = bit 1, SPR0 = bit 0 on SPCR
#define SPI_2XCLOCK_MASK 0x01  // SPI2X = bit 0 on SPSR



/**********************************************************/
/*     32 bit Teensy 4.x                                  */
/**********************************************************/

#if defined(__arm__) && defined(TEENSYDUINO) && (defined(__IMXRT1052__) || defined(__IMXRT1062__))
#define SPI_ATOMIC_VERSION 1

void synchronousRead_isr(void);


class DMABaseClass;


void dumpDMA_TCD( DMABaseClass *dmabc );


class SPISettings 
{
public:
	SPISettings(uint32_t clockIn, uint8_t bitOrderIn, uint8_t dataModeIn) : _clock
	(clockIn) 
	{
		init_AlwaysInline(bitOrderIn, dataModeIn);
	}

	SPISettings() : _clock(4000000) 
	{
		init_AlwaysInline(MSBFIRST, SPI_MODE0);
	}
private:
	void init_AlwaysInline(uint8_t bitOrder, uint8_t dataMode)
	  __attribute__((__always_inline__)) 
	  {
			tcr = LPSPI_TCR_FRAMESZ(7);    // TCR has polarity and bit order too

			// handle LSB setup 
			if (bitOrder == LSBFIRST) tcr |= LPSPI_TCR_LSBF;

			// Handle Data Mode
			if (dataMode & 0x08) tcr |= LPSPI_TCR_CPOL;

			// Note: On T3.2 when we set CPHA it also updated the timing.  It moved the 
			// PCS to SCK Delay Prescaler into the After SCK Delay Prescaler	
			if (dataMode & 0x04) tcr |= LPSPI_TCR_CPHA; 
	}

	inline uint32_t clock() {return _clock;}

	uint32_t _clock;
	uint32_t tcr; // transmit command, pg 2955
	friend class SPIClass;
};



class SPIClass // Teensy 4
{ 
public:
	static const uint8_t CNT_MISO_PINS = 1;
	static const uint8_t CNT_MOSI_PINS = 1;
	static const uint8_t CNT_SCK_PINS = 1;
	static const uint8_t CNT_CS_PINS = 1;
	
	struct SPI_Hardware_t
	{
		volatile uint32_t &clock_gate_register;
		const uint32_t clock_gate_mask;
		uint8_t  tx_dma_channel;
		uint8_t  rx_dma_channel;
		void     (*dma_rxisr)();
		const uint8_t  miso_pin[CNT_MISO_PINS];
		const uint32_t  miso_mux[CNT_MISO_PINS];
		const uint8_t  mosi_pin[CNT_MOSI_PINS];
		const uint32_t  mosi_mux[CNT_MOSI_PINS];
		const uint8_t  sck_pin[CNT_SCK_PINS];
		const uint32_t  sck_mux[CNT_SCK_PINS];
		const uint8_t  cs_pin[CNT_CS_PINS];
		const uint32_t  cs_mux[CNT_CS_PINS];

		volatile uint32_t &sck_select_input_register;
		volatile uint32_t &sdi_select_input_register;
		volatile uint32_t &sdo_select_input_register;
		volatile uint32_t &pcs0_select_input_register;
		const uint8_t sck_select_val;
		const uint8_t sdi_select_val;
		const uint8_t sdo_select_val;
		const uint8_t pcs0_select_val;
	};
	
	static const SPI_Hardware_t spiclass_lpspi4_hardware;
	static const SPI_Hardware_t spiclass_lpspi3_hardware;
	static const SPI_Hardware_t spiclass_lpspi1_hardware;


public:
	constexpr SPIClass(uintptr_t myport, uintptr_t myhardware)
		: port_addr(myport), hardware_addr(myhardware) 
	{
	}

	void beginSlave();

	bool initDMA();
	
	void swapSpiBuf()
	{ 
	   _dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize ); 
	}
	

	friend void _spi_dma_rxISR0();
	inline void dma_rxisr();


public:
	IMXRT_LPSPI_t & port() { return *(IMXRT_LPSPI_t *)port_addr; }

private:
	const SPI_Hardware_t & hardware() { return *(const SPI_Hardware_t *)hardware_addr; }
	uintptr_t port_addr;
	uintptr_t hardware_addr;

	uint32_t _clock = 0;
	uint32_t _ccr = 0;

	uint8_t miso_pin_index = 0;
	uint8_t mosi_pin_index = 0;
	uint8_t sck_pin_index = 0;
	
	uint8_t interruptMasksUsed = 0;
	uint32_t interruptMask[(NVIC_NUM_INTERRUPTS+31)/32] = {};
	uint32_t interruptSave[(NVIC_NUM_INTERRUPTS+31)/32] = {};
	

	// DMA Support
	bool initDMAChannels();
	enum DMAState { notAllocated, idle, active, completed};
	enum { MAX_DMA_COUNT = 32767 };
	DMAState _dma_state = DMAState::notAllocated;
	uint32_t _dma_count_remaining = 0;	// How many bytes left to output after current DMA completes
	
	DMAChannel *_dmaRX = nullptr;
	EventResponder *_dma_event_responder = nullptr;

};


#endif


extern SPIClass SPI;
extern SPIClass SPI1;
extern SPIClass SPI2;

#endif


The following is C:\Program Files (x86)\Arduino\hardware\teensy\avr\libraries\SPISlave\SPISlave.cpp

Code:
/*
 *
 * Copyright (c) 2010 by Cristian Maglie <c.maglie@bug.st>
 * SPI Master library for arduino.
 *
 * This file is free software; you can redistribute it and/or modify
 * it under the terms of either the GNU General Public License version 2
 * or the GNU Lesser General Public License version 2.1, both as
 * published by the Free Software Foundation.
 */

#include "SPISlave.h"
#include "pins_arduino.h"

#define DEBUG_DMA_TRANSFERS


/**********************************************************/
/*     32 bit Teensy 4.x                                  */
/**********************************************************/


void _spi_dma_rxISR0(void) {SPI.dma_rxisr();}

const SPIClass::SPI_Hardware_t  SPIClass::spiclass_lpspi4_hardware = 
{
	CCM_CCGR1, CCM_CCGR1_LPSPI4(CCM_CCGR_ON),
	DMAMUX_SOURCE_LPSPI4_TX, 
	DMAMUX_SOURCE_LPSPI4_RX, 
	_spi_dma_rxISR0,
	12, 
	3 | 0x10,
	11,
	3 | 0x10,
	13,
	3 | 0x10,
	10,  // from https://forum.pjrc.com/threads/57328-Teensy-4-0-SPI-Chip-Select-pins If I remember correctly the one case, where you need to use the Hardware CS pin is if you implement an SPI Client setup, which I have not tried.

	3 | 0x10,
	IOMUXC_LPSPI4_SCK_SELECT_INPUT, 
	IOMUXC_LPSPI4_SDI_SELECT_INPUT, 
	IOMUXC_LPSPI4_SDO_SELECT_INPUT, 
	IOMUXC_LPSPI4_PCS0_SELECT_INPUT,
	0, 0, 0, 0
};

SPIClass SPI((uintptr_t)&IMXRT_LPSPI4_S, (uintptr_t)&SPIClass::spiclass_lpspi4_hardware);

// sketch defines these functions.
unsigned char *spiBuf();
void swapSpiBuf();


/*
LPSPI slave mode uses the same shift register and logic as the master mode, but does not
use the clock configuration register and the transmit command register must remain static
during SPI bus transfers.
*/


/*
	clock_gate_register = CCM_CCGR1  // 13.7.22 p 1140  
	clock_gate_mask = CCM_CCGR1_LPSPI4(CCM_CCGR_ON)   // CCM_CCGR_ON = 3
	tx_dma_channel = DMAMUX_SOURCE_LPSPI4_TX  // 80
	rx_dma_channel = DMAMUX_SOURCE_LPSPI4_RX  // 79
	dma_rxisr = SPI.gdr_isr
			
	cs_mux[0] = 3 | 0x10

	sck_select_input_register = IOMUXC_LPSPI4_SCK_SELECT_INPUT
	sdi_select_input_register = IOMUXC_LPSPI4_SDI_SELECT_INPUT
	sdo_select_input_register = IOMUXC_LPSPI4_SDO_SELECT_INPUT
	pcs0_select_input_register = IOMUXC_LPSPI4_PCS0_SELECT_INPUT
	
	sck_select_val = 0
	sdi_select_val = 0
	sdo_select_val = 0
	pcs0_select_val = 0
*/

static const int rxWater = 15;
//-----------------------------------------------------------------------------
void SPIClass::beginSlave()
//-----------------------------------------------------------------------------
{
//	Serial.printf( "clock_gate_register %08x\n", hardware().clock_gate_register );
//	Serial.printf( "clock_gate_mask %08x\n", hardware().clock_gate_mask );

    // turn off the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR &= ~CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register &= ~hardware().clock_gate_mask;


	Serial.printf( "SPI MISO: %d MOSI: %d, SCK: %d CS: %d\n", 
                   hardware().miso_pin[0], 
                   hardware().mosi_pin[0], 
                   hardware().sck_pin[0], 
                   hardware().cs_pin[0] );


    // CCM = Clock control module
	//Serial.printf( "before CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register

    // IOMUXC = IOMUX Controller
	// SRE = slew rate fast
	// DSE(3) = Drive strength R0/5
	// SPEED(3) = max(200MHz)	

	uint32_t fastio = IOMUXC_PAD_DSE(6) | IOMUXC_PAD_SPEED(1);
	//uint32_t fastio = IOMUXC_PAD_SRE | IOMUXC_PAD_DSE(3) | IOMUXC_PAD_SPEED(3);

	*(portControlRegister(hardware().miso_pin[0])) = fastio;
	*(portControlRegister(hardware().mosi_pin[0])) = fastio;
	*(portControlRegister(hardware().sck_pin[0])) = fastio;


//	*(portControlRegister(16) = fastio;
//	*(portControlRegister(17) = fastio;
//	*(portControlRegister(19) = fastio;


    // turn on the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR |= CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register |= hardware().clock_gate_mask;

	Serial.printf( "after CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register


	*(portConfigRegister(hardware().miso_pin[0])) 
	   = hardware().miso_mux[0];
	*(portConfigRegister(hardware().mosi_pin[0])) 
	   = hardware().mosi_mux[0];
	*(portConfigRegister(hardware().sck_pin[0])) 
	   = hardware().sck_mux[0];

	// why do I need this in slave mode when master mode doesn't do it?
	*(portConfigRegister(hardware().cs_pin[0])) 
	   = hardware().cs_mux[0];  


//	*(portConfigRegister(16) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_07;
//	*(portConfigRegister(17) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_06;
//	*(portConfigRegister(19) = IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_00;


	// Set the Mux pins 
   	hardware().sck_select_input_register = hardware().sck_select_val;
	hardware().sdi_select_input_register = hardware().sdi_select_val;
	hardware().sdo_select_input_register = hardware().sdo_select_val;

    // why do I need this in slave mode when master mode doesn't do it?
	hardware().pcs0_select_input_register = hardware().pcs0_select_val;  
	   
	// CR  = Control register
	// RST = Software reset
	port().CR = LPSPI_CR_RST;

	// Initialize the Receive FIFO watermark to FIFO size - 1... 
	port().FCR = LPSPI_FCR_RXWATER( rxWater );
	
	port().CR = LPSPI_CR_MEN;     // Module Enable

	
	initDMA();
	

	Serial.printf( "PARAM %x CR %x SR %x IER %x DER %x CFGR0 %x CFGR1 %x FCR %x TCR %x\n", 
	               port().PARAM,
	               port().CR, port().SR, 
				   port().IER, port().DER,
				   port().CFGR0, port().CFGR1, 
				   port().FCR, port().TCR );	
}


static int const BufSize = 3000;


//-----------------------------------------------------------------------------
bool SPIClass::initDMAChannels() 
//-----------------------------------------------------------------------------
{
	_dmaRX = new DMAChannel();

	if (_dmaRX == nullptr)
		return false;

Serial.printf( "rx_dma_channel = %d\n", hardware().rx_dma_channel );

	// Set up the RX chain
	_dmaRX->disable();

	_dmaRX->source( (volatile uint8_t&)port().RDR );
/*
Disable Request
If this flag is set, the eDMA hardware automatically clears the corresponding ERQ bit when the current
major iteration count reaches zero.
0b - The channel's ERQ bit is not affected.
1b - The channel's ERQ bit is cleared when the major loop is complete.
*/
	_dmaRX->disableOnCompletion();   // TCD->CSR |= DMA_TCD_CSR_DREQ;
	_dmaRX->triggerAtHardwareEvent( hardware().rx_dma_channel );
	_dmaRX->attachInterrupt( hardware().dma_rxisr );

/*
Enable an interrupt when major iteration count completes.
If this flag is set, the channel generates an interrupt request by setting the appropriate bit in the INT when
the current major iteration count reaches zero.
0b - The end-of-major loop interrupt is disabled.
1b - The end-of-major loop interrupt is enabled.
*/

    // cause the dma_rxisr() to be called at the end of the transfer
	_dmaRX->interruptAtCompletion();  // TCD->CSR |= DMA_TCD_CSR_INTMAJOR;
	
	_dma_state = DMAState::idle;  // Should be first thing set!
	
	_dmaRX->enable(); 
	
	return true;
}


EventResponder _event_responder;

#define DATA_BUFFER_MAX (16*1024)

//-----------------------------------------------------------------------------
bool SPIClass::initDMA()
//-----------------------------------------------------------------------------
{	
	initDMAChannels();
	
	
    if (_dma_state == DMAState::active)
		return false; // already active

	_event_responder.clearEvent();	// Make sure it is not set yet
	
	_dma_count_remaining = 0;
	
	_dmaRX->TCD->ATTR_SRC = 0;		// Make sure set for 8 bit mode...
	_dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize );
	_dmaRX->TCD->DLASTSGA = -BufSize;

	if( (uint32_t)spiBuf() >= 0x20200000u )  
		arm_dcache_delete( spiBuf(), BufSize );
	
	_dma_event_responder = &_event_responder;

	dumpDMA_TCD( _dmaRX );

	// Make sure port is in 8 bit mode and clear watermark
	port().TCR = (port().TCR & ~(LPSPI_TCR_FRAMESZ(31))) | LPSPI_TCR_FRAMESZ(7);	
	port().FCR = 0; 

	// Lets try to output the first byte to make sure that we are in 8 bit mode...
 	port().DER = LPSPI_DER_RDDE;	//enable DMA on RX
	port().SR = 0x3f00;	// clear out all of the other status...

	_dmaRX->enable();

	_dma_state = DMAState::active;
	
	return true;
}



//-----------------------------------------------------------------------------
inline void DMAChanneltransferCount( DMAChannel *dmac, unsigned int len ) 
//-----------------------------------------------------------------------------
{
	// note does no validation of length...
	DMABaseClass::TCD_t *tcd = dmac->TCD;
	
	if (!(tcd->BITER & DMA_TCD_BITER_ELINK)) 
	{
		tcd->BITER = len & 0x7fff;
	} 
	else 
	{
		tcd->BITER = (tcd->BITER & 0xFE00) | (len & 0x1ff);
	}
	tcd->CITER = tcd->BITER; 
}

//-----------------------------------------------------------------------------
void dumpDMA_TCD( DMABaseClass *dmabc )
//-----------------------------------------------------------------------------
{
	Serial.printf("spiData = %p BC: %x TCD: %x:", spiBuf(), (uint32_t)dmabc, (uint32_t)dmabc->TCD);

	Serial.printf("SA:%x SO:%d AT:%x NB:%x SL:%d DA:%x DO:%d CI:%x DL:%x CS:%x BI:%x\n", (uint32_t)dmabc->TCD->SADDR,
		dmabc->TCD->SOFF, dmabc->TCD->ATTR, dmabc->TCD->NBYTES, dmabc->TCD->SLAST, (uint32_t)dmabc->TCD->DADDR, 
		dmabc->TCD->DOFF, dmabc->TCD->CITER, dmabc->TCD->DLASTSGA, dmabc->TCD->CSR, dmabc->TCD->BITER);
}


//-------------------------------------------------------------------------
void SPIClass::dma_rxisr() 
//-------------------------------------------------------------------------
{
	_dmaRX->clearInterrupt();   // 		DMA_CINT = channel;
	_dmaRX->clearComplete();    // 		DMA_CDNE = channel;
	
	// receive a full SPI buffer
    DMAChanneltransferCount( _dmaRX, BufSize );

    // Optionally allow the user to double buffer the data.  Comment 
	//   the swapSpiBuf() functions out if you don't need them.
    ::swapSpiBuf();   // swap the pointers
    swapSpiBuf();     // tell _dmaRX the new buffer
	
	
    _dmaRX->enable();

    asm("dsb");

}

	
//-------------------------------------------------------------------------
void swapSpiBuf()
//-------------------------------------------------------------------------
{ 
  _dmaRX->destinationBuffer((uint8_t*)spiBuf(), BufSize ); 
}

Well this might be JUST what I was looking for to get a framebuffer snarf from a sensor, was trying to manually port from the vendor provided SAM3XE code and bashing my head on desk while I tried to grok which registers to use.
 
Cool stuff thanks! By the way is there any example sketch? I know it would be simple but I'm having a hard time figuring it out.
 
Hello,

I am trying to make the SPI slave to work in teensy 4. I already tried GordonRush's solution and it works nicely if you only need to receive packets.

Since I am not concerned about speed a lot I created a very simplified version of his code that doesn't use DMA.

The beginSlave funtion looks like that:
Code:
//-----------------------------------------------------------------------------
void SPIClass::beginSlave()
//-----------------------------------------------------------------------------
{
	Serial.printf( "clock_gate_register %08x\n", hardware().clock_gate_register );
	Serial.printf( "clock_gate_mask %08x\n", hardware().clock_gate_mask );

    // turn off the LPSPI4 clock ( CCGR1 )
	// this is really CCM_CBCMR &= ~CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register &= ~hardware().clock_gate_mask;


	Serial.printf( "SPI MISO: %d MOSI: %d, SCK: %d CS: %d\n", 
                   hardware().miso_pin[0], 
                   hardware().mosi_pin[0], 
                   hardware().sck_pin[0], 
                   hardware().cs_pin[0] );


    // CCM = Clock control module
	Serial.printf( "before CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register

   // IOMUXC = IOMUX Controller
	// SRE = slew rate fast
	// DSE(3) = Drive strength R0/5
	// SPEED(1) = max(100MHz)	
  // Basically we select the pad that we want from the "External Signals and Pin Multiplexing" and we configure it's
  // characteristics from the it's register. For example for our SCK this should be the register: 
  // IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_03 
  // So does DSE = 0 means INPUT MODE?!??!?!?! 
	uint32_t fastio = IOMUXC_PAD_DSE(6) | IOMUXC_PAD_SPEED(1);

	*(portControlRegister(hardware().miso_pin[0])) = fastio;
	*(portControlRegister(hardware().mosi_pin[0])) = fastio;
	*(portControlRegister(hardware().sck_pin[0])) = fastio;

    // turn on the LPSPI clock ( CCGR1 )
	// this is really CCM_CBCMR |= CCM_CCGR1_LPSPI4(CCM_CCGR_ON);
	hardware().clock_gate_register |= hardware().clock_gate_mask;

	Serial.printf( "after CBCMR = %08lX CCGR1 = %08lX\n", CCM_CBCMR, CCM_CCGR1 ); // CCM Bus Clock Multiplexer Register

  // Configure the mode of operation for the pin. This can be for example, act as CLK for the LPSPI4 or act as FLEXIO pin3 and so on. 
  // For example for our (LPSPI4) CLK the register is this one:
  // IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_03 
	*(portConfigRegister(hardware().miso_pin[0])) 
	   = hardware().miso_mux[0];
	*(portConfigRegister(hardware().mosi_pin[0])) 
	   = hardware().mosi_mux[0];
	*(portConfigRegister(hardware().sck_pin[0])) 
	   = hardware().sck_mux[0];

	// why do I need this in slave mode when master mode doesn't do it?
	*(portConfigRegister(hardware().cs_pin[0])) 
	   = hardware().cs_mux[0];  


  // This selects which pad to connect a "functionality". So for example for the LPSPI4 Clock functionality you select the
  // one of the available pads. For example for LPSPI4 SCK you basically configure the following register.
  // IOMUXC_LPSPI4_SCK_SELECT_INPUT
	// Set the Mux pins 
   	hardware().sck_select_input_register = hardware().sck_select_val;
	hardware().sdi_select_input_register = hardware().sdi_select_val;
	hardware().sdo_select_input_register = hardware().sdo_select_val;

    // why do I need this in slave mode when master mode doesn't do it?
	hardware().pcs0_select_input_register = hardware().pcs0_select_val;  
	   
	// CR  = Control register
	// RST = Software reset
	port().CR = LPSPI_CR_RST;

	// Initialize the Receive FIFO watermark to FIFO size - 1... 
  // Receive FIFO Watermark
  // The Receive Data Flag is set whenever the number of words in the receive FIFO is greater than
  // RXWATER. Writing a value equal or greater than the FIFO size will be truncated.
	port().FCR = LPSPI_FCR_RXWATER( rxWater );


  // In slave mode we can only change the Transfer control register before the enable the module
	port().TCR = (port().TCR & ~(LPSPI_TCR_FRAMESZ(31))) | LPSPI_TCR_FRAMESZ(7);

  // Select slave mode 
  port().CFGR1 &= ~(LPSPI_CFGR1_MASTER);
	
	port().CR = LPSPI_CR_MEN;     // Module Enable

	
	

	Serial.printf( "PARAM %x CR %x SR %x IER %x DER %x CFGR0 %x CFGR1 %x FCR %x TCR %x\n", 
	               port().PARAM,
	               port().CR, port().SR, 
				   port().IER, port().DER,
				   port().CFGR0, port().CFGR1, 
				   port().FCR, port().TCR );	
}

I didn't change anything more in the configuration. Then when I want to receive data I just read the receive data register (RDR). This also works alright If I want to just receive data.

I have two problems, the first one is regarding the pin configuration. I realized that in both of version's of slave mode ( the DMA and simplified non-DMA version) the pin that is reading the data is the MOSI pin. Is this normal? I read in the documentation about the SIN and SOUT modes of the pins but it was not clear to me if this means that these are relevant to the modes of SPI. Meaning that in master mode SIN->MISO and SOUT->MOSI and in slave mode we need to revert that.

The second and most important question though is about sending data in SPI slave mode. In the documentation I couldn't find any relevant configuration which I didn't do. So right now I am just writing in the transmit data register (TDR) but nothing happens. I checked with the oscilloscope and no signal was there while I was writing to the transmit data register. By the way I know that SPI supports writes only when the master device is sending data in the MOSI pin(at least in full-duplex mode), so I had a master device sending data all the time while I was trying to reply.

Any suggestions will be very welcome since I am bit stuck with that to be honest.

Thanks for your help anyway!
 
UPDATE: I am checking the receive FIFO and the transmit FIFO and when I write to the TDR and read from RDR the numbers change but still there is no reply from the SPI slave device. I am also controlling the select pin accordingly but still there is nothing in the transmit pin.
 
Things like this are a lot easier to help with, if you post your current example code. That way might be able to see if there is something obvious. And maybe a description of a simple setup.

A long time ago I played some with SPI slave on a T3.x? don't remember which one I tried. At the time I remember there were some interesting issues on FIFO usage and when the data you are trying to transfer actually does.

You sort of had to preload the output register before receiving anything.

That is if for example you send a command to the T4, and expect that the response will be on that byte won't work. At times I had to send one or two extra characters to get the data returned... Which I don't think is totally unusual. Example lots of displays when you do something like read a pixel, you which returns a 3 byte color for the pixel, often times, you send a command, and then you send a NULL character which again you ignore response and then in this case three more bytes transfered which will return the pixel data... Reading rectangle of pixels is the same, in that again one blank returned and then the three bytes for each of the pixels...
 
A quick hack up of the T4 slave code from other thread, where I feed it data from SPI1 on the same T4...

Code:
#include <DMAChannel.h>
#include <SPI.h>
#define PRREG(x) Serial.print(#x" 0x"); Serial.println(x, BIN) //Serial.println(x,BIN)
#define SAMPLES 1
#define ChipSelectSlave 10
#define CS_MASTER 0
volatile bool received_data = false;

DMAMEM static uint16_t rx_buffer[SAMPLES];
uint16_t loop_count = 0;

DMAChannel rx(false);

void rxISR() {
  rx.clearInterrupt();
  received_data = true;
  arm_dcache_delete(rx_buffer, SAMPLES); //delete Cache!
  LPSPI4_TDR = rx_buffer[0];
  asm volatile ("dsb");
}

bool initSPISlaveDMA() {
  rx.begin(true);
  rx.source((uint16_t &) LPSPI4_RDR);
  rx.triggerAtHardwareEvent(DMAMUX_SOURCE_LPSPI4_RX);
  rx.attachInterrupt(rxISR);
  rx.interruptAtCompletion(); //TCD->CSR |= DMA_TCD_CSR_INTMAJOR;
  rx.destinationBuffer(rx_buffer, SAMPLES + 1);
  rx.enable();
  return 1;
}

bool initSPISlave() {
  LPSPI4_CR &= ~LPSPI_CR_MEN; //Modul ausschalten
  LPSPI4_CR = LPSPI_CR_RST; //Master Logic reset! (Control Register => Software Reset)
  LPSPI4_CR &=  ~LPSPI_CR_RST; //Master Logic reset! (Control Register => Software Reset)
  LPSPI4_TCR = LPSPI_TCR_FRAMESZ(15); //16Bit Mode
  LPSPI4_DER = LPSPI_DER_RDDE; //RX DMA Request Enable
  LPSPI4_CR |= LPSPI_CR_MEN; //Enable SPI Module!
  return 1;
}

void setup() {
  // I am going to use SPI1 as the master.
  SPI1.begin(); // initialize it... 
  pinMode(CS_MASTER, OUTPUT);
  digitalWrite(CS_MASTER, HIGH);
  
  Serial.begin(115200);
  SPI.begin();
  SPI.setCS(ChipSelectSlave);

  while (!Serial);

  Serial.println("Init SPI!");
  if (initSPISlave()) {
    Serial.println("SPI SLAVE init!");
  }
  if (initSPISlaveDMA()) {
    Serial.println("DMA Channel init!");
  }
}

void loop() {
  elapsedMicros emloop = 0;
  received_data = false;
  digitalWrite(CS_MASTER, LOW);
  SPI1.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
  uint16_t received_data = SPI1.transfer16(loop_count);
  SPI1.endTransaction();
  while (!received_data && (emloop < 250000)) ;
  Serial.printf("SPI Transfer: %x Return: %x SPI DMA: %x delay: %d\n", loop_count, received_data, rx_buffer[0], (uint32_t)emloop);
  delay(250);
  loop_count++;
}
And I have the Two SPI busses wired to each other... Note: using my own board with the FRDM 4236....
So have IO pins connected to each other: 0-10, 1-11, 26-12, 27-13

Output:
Code:
SPI Transfer: 506 Return: 504 SPI DMA: 505 delay: 3
SPI Transfer: 507 Return: 505 SPI DMA: 506 delay: 3
SPI Transfer: 508 Return: 506 SPI DMA: 507 delay: 3
SPI Transfer: 509 Return: 507 SPI DMA: 508 delay: 3
SPI Transfer: 50a Return: 508 SPI DMA: 509 delay: 3
SPI Transfer: 50b Return: 509 SPI DMA: 50a delay: 3
SPI Transfer: 50c Return: 50a SPI DMA: 50b delay: 3
SPI Transfer: 50d Return: 50b SPI DMA: 50c delay: 3
SPI Transfer: 50e Return: 50c SPI DMA: 50d delay: 3
SPI Transfer: 50f Return: 50d SPI DMA: 50e delay: 3
SPI Transfer: 510 Return: 50e SPI DMA: 50f delay: 3
SPI Transfer: 511 Return: 50f SPI DMA: 510 delay: 3
SPI Transfer: 512 Return: 510 SPI DMA: 511 delay: 3
SPI Transfer: 513 Return: 511 SPI DMA: 512 delay: 3
SPI Transfer: 514 Return: 512 SPI DMA: 513 delay: 2
 
Thank you very much for the reply KurtE!

You are correct, even though I shared my initialization code in a previous post I didn't send a example code that someone can run.

So my setup right now is an stm32f103 acting as SPI master and the teensy 4 acting as SPI slave. I think that I fixed my problem actually which was that the slave select pin was not working correctly but I want to investigate a bit further. I will write a post with more details and example code at some point later here.

As for the output register, it's true that you need to preload the output register before receiving anything. That makes sense because of the nature of the SPI protocol I think.

Thank you for the info about the displays! I didn't use many yet so it's good to know what to expect!
 
Thank you KurtE for providing your code example.
In my application I need SPI1 to be the slave SPI interface.

So for starters I tried to use your code with the only change that the two SPI interfaces are swapped. Its not working. I'm apparently getting interrupts but the received data is always the same.

Code:
#include <DMAChannel.h>
#include <SPI.h>
#define PRREG(x) Serial.print(#x" 0x"); Serial.println(x, BIN) //Serial.println(x,BIN)
#define SAMPLES 1
#define ChipSelectSlave 0
#define CS_MASTER 10
volatile bool received_data = false;

DMAMEM static uint16_t rx_buffer[SAMPLES];
uint16_t loop_count = 0;

DMAChannel rx(false);

void rxISR() {
  rx.clearInterrupt();
  received_data = true;
  arm_dcache_delete(rx_buffer, SAMPLES); //delete Cache!
  LPSPI3_TDR = rx_buffer[0];
  asm volatile ("dsb");
}

bool initSPISlaveDMA() {
  rx.begin(true);
  rx.source((uint16_t &) LPSPI3_RDR);
  rx.triggerAtHardwareEvent(DMAMUX_SOURCE_LPSPI3_RX);
  rx.attachInterrupt(rxISR);
  rx.interruptAtCompletion(); //TCD->CSR |= DMA_TCD_CSR_INTMAJOR;
  rx.destinationBuffer(rx_buffer, SAMPLES + 1);
  rx.enable();
  return 1;
}

bool initSPISlave() {
  LPSPI3_CR &= ~LPSPI_CR_MEN; //Modul ausschalten
  LPSPI3_CR = LPSPI_CR_RST; //Master Logic reset! (Control Register => Software Reset)
  LPSPI3_CR &=  ~LPSPI_CR_RST; //Master Logic reset! (Control Register => Software Reset)
  LPSPI3_TCR = LPSPI_TCR_FRAMESZ(15); //16Bit Mode
  LPSPI3_DER = LPSPI_DER_RDDE; //RX DMA Request Enable
  LPSPI3_CR |= LPSPI_CR_MEN; //Enable SPI Module!
  return 1;
}

void setup() {
  // I am going to use SPI1 as the master.
  SPI.begin(); // initialize it... 
  pinMode(CS_MASTER, OUTPUT);
  digitalWrite(CS_MASTER, HIGH);
  
  Serial.begin(115200);
  SPI1.begin();
  SPI1.setCS(ChipSelectSlave);

  while (!Serial);

  Serial.println("Init SPI!");
  if (initSPISlave()) {
    Serial.println("SPI SLAVE init!");
  }
  if (initSPISlaveDMA()) {
    Serial.println("DMA Channel init!");
  }
}

void loop() {
  elapsedMicros emloop = 0;
  received_data = false;
  digitalWrite(CS_MASTER, LOW);
  SPI.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
  uint16_t received_data = SPI.transfer16(loop_count);
  SPI.endTransaction();
  digitalWrite(CS_MASTER, HIGH);
  while (!received_data && (emloop < 250000)) ;
  Serial.printf("SPI Transfer: %x Return: %x SPI DMA: %x delay: %d\n", loop_count, received_data, rx_buffer[0], (uint32_t)emloop);
  delay(250);
  loop_count++;
}

This is the serial output that I'm getting:
Code:
Init SPI!
SPI SLAVE init!
DMA Channel init!
SPI Transfer: 0 Return: ffff SPI DMA: b9db delay: 3
SPI Transfer: 1 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 2 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 3 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 4 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 5 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 6 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 7 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 8 Return: ffff SPI DMA: b9db delay: 2
SPI Transfer: 9 Return: ffff SPI DMA: b9db delay: 2

Does anybody have an idea why this is not working? The HW setup should be fine. KurtEs original code is working.
 
I don't see anything obvious yet. Other than an issue with my earlier one as well.

There are two different received_data variables, which may cause things to screw up.
Notice Loop:
Code:
void loop() {
  elapsedMicros emloop = 0;
[COLOR="#FF0000"]  received_data = false;[/COLOR]
  digitalWrite(CS_MASTER, LOW);
  SPI1.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
 [COLOR="#FF0000"] uint16_t received_data = SPI1.transfer16(loop_count);[/COLOR]
  SPI1.endTransaction();
  while (!received_data && (emloop < 250000)) ;
  Serial.printf("SPI Transfer: %x Return: %x SPI DMA: %x delay: %d\n", loop_count, received_data, rx_buffer[0], (uint32_t)emloop);
  delay(250);
  loop_count++;
}
Try maybe changing like:
Code:
uint16_t spi_returned_data = SPI1.transfer16(loop_count)

and change it in:
Code:
Serial.printf("SPI Transfer: %x Return: %x SPI DMA: %x delay: %d\n", loop_count, spi_returned_data , rx_buffer[0], (uint32_t)emloop);
 
Quick update: I have made a version of the sketch that should be able to be configured for either of the two as master and other as slave, and I am seeing your issue.

Debugging!
 
Found bug in setCS - Pushed a change up as my current SPI Pull request:
https://github.com/PaulStoffregen/SPI/pull/57

Not sure how long it will be before @Paul will be free to pull stuff in for new beta build...

But in the mean time you should be able to get it to work by editing your SPI.cpp.
At about line 1396

Code:
// setCS() is not intended for use from normal Arduino programs/sketches.
uint8_t SPIClass::setCS(uint8_t pin)
{
	for (unsigned int i = 0; i < sizeof(hardware().cs_pin); i++) {
		if (pin == hardware().cs_pin[i]) {
//			*(portConfigRegister(pin)) = hardware().sck_mux[i];
			*(portConfigRegister(pin)) = hardware().cs_mux[i];
			return 1;
		}
	}
	return 0;
}
Was using the values from hardware table.

In case you wish to see the updated version of the program
If you comment out the line in RED: with the unedited copy of SPI, it will fail to communicate properly.

Code:
#include <DMAChannel.h>
#include <SPI.h>
#define PRREG(x) Serial.print(#x" 0x"); Serial.println(x, BIN) //Serial.println(x,BIN)
#define SAMPLES 1

[COLOR="#FF0000"]#define SPI1_MASTER[/COLOR]
#ifdef SPI1_MASTER
#define SPIM SPI1
#define SPIS SPI
#define ChipSelectSlave 10
IMXRT_LPSPI_t *spis_regs = &IMXRT_LPSPI4_S;
#define SPIS_DMAMUX_SOURCE_RX DMAMUX_SOURCE_LPSPI4_RX
#define CS_MASTER 0
#else
#define SPIM SPI
#define SPIS SPI1
#define ChipSelectSlave 0
IMXRT_LPSPI_t *spis_regs = &IMXRT_LPSPI3_S;
#define SPIS_DMAMUX_SOURCE_RX DMAMUX_SOURCE_LPSPI3_RX
#define CS_MASTER 10
#endif


volatile bool received_data = false;
DMAMEM static uint16_t rx_buffer[SAMPLES];
uint16_t loop_count = 0;

DMAChannel rx(false);

void rxISR() {
  rx.clearInterrupt();
  received_data = true;
  arm_dcache_delete(rx_buffer, SAMPLES); //delete Cache!
  spis_regs->TDR = rx_buffer[0];
  asm volatile ("dsb");
}

bool initSPISlaveDMA() {
  rx.begin(true);
  rx.source((uint16_t &) spis_regs->RDR);
  rx.triggerAtHardwareEvent(SPIS_DMAMUX_SOURCE_RX);
  rx.attachInterrupt(rxISR);
  rx.interruptAtCompletion(); //TCD->CSR |= DMA_TCD_CSR_INTMAJOR;
  rx.destinationBuffer(rx_buffer, SAMPLES + 1);
  rx.enable();
  return 1;
}

bool initSPISlave() {
  spis_regs->CR &= ~LPSPI_CR_MEN; //Modul ausschalten
  spis_regs->CR = LPSPI_CR_RST; //Master Logic reset! (Control Register => Software Reset)
  spis_regs->CR &=  ~LPSPI_CR_RST; //Master Logic reset! (Control Register => Software Reset)
  spis_regs->TCR = LPSPI_TCR_FRAMESZ(15); //16Bit Mode
  spis_regs->DER = LPSPI_DER_RDDE; //RX DMA Request Enable
  spis_regs->CR |= LPSPI_CR_MEN; //Enable SPI Module!
  return 1;
}

void setup() {
  while (!Serial);
  Serial.begin(115200);
  // I am going to use SPI1 as the master.
  SPIM.begin(); // initialize it... 
  pinMode(CS_MASTER, OUTPUT);
  digitalWrite(CS_MASTER, HIGH);
  
  SPIS.begin();
  pinMode(ChipSelectSlave, INPUT);
  uint8_t cs_mask =  SPIS.setCS(ChipSelectSlave);
  Serial.printf("Slave CS: %d set: %x\n", ChipSelectSlave, cs_mask);

  Serial.println("Init SPI!");
  if (initSPISlave()) {
    Serial.println("SPI SLAVE init!");
  }
  if (initSPISlaveDMA()) {
    Serial.println("DMA Channel init!");
  }
}

void loop() {
  elapsedMicros emloop = 0;
  received_data = false;
  digitalWrite(CS_MASTER, LOW);
  SPIM.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
  uint16_t spi_returned_data  = SPIM.transfer16(loop_count);
  SPIM.endTransaction();
  while (!received_data && (emloop < 250000)) ;
  Serial.printf("SPI Transfer: %x Return: %x SPI DMA: %x delay: %d\n", loop_count, spi_returned_data , rx_buffer[0], (uint32_t)emloop);
  digitalWrite(CS_MASTER, HIGH);
  delay(250);
  loop_count++;
}
 
Thank you very much for looking in this, KurtE!

I'll test this with my code tonight. I'm quite sure this will fix my problem!
 
Just wondering if there has been any further work on T4.0/4.1 SPI slave? Is it incorporated in a library somewhere?

Thanks for any word on this.

Also, should this code work for Teensy 4.1? I'm having trouble getting the SPI interrupt to occur. Thanks again.
 
Last edited:
SPI Slave 4.0/4.1

Just wondering if there has been any further work on T4.0/4.1 SPI slave? Is it incorporated in a library somewhere?

Thanks for any word on this.

Also, should this code work for Teensy 4.1? I'm having trouble getting the SPI interrupt to occur. Thanks again.


I'm new here, this is my first post.

I have been working on getting these examples working on two Teensys. I have a 4.0 and a 4.1 and I have tried swapping master/slave between the two and I have not gotten anything to work.

I have only modified the slave by adding pinMode(MOSI, INPUT); to setup because the waveform on the scope showed MOSI was distorted. The master is just the example DigitalPotControl which is putting out clear waveform data and is captured correctly using a UNO and with a logic analyzer.

I also created a dummy script that read MOSI, SCK, CS pins, one at a time, and output them onto pin 9 to prove that the hardware is able to work with these pins as inputs to make sure these signals are able to enter the components.

When I get something working, I'll update this!

Also, I'm a hardware engineer and not a programmer. :eek:
 
Just wondering if there has been any further work on T4.0/4.1 SPI slave? Is it incorporated in a library somewhere?

Thanks for any word on this.

Also, should this code work for Teensy 4.1? I'm having trouble getting the SPI interrupt to occur. Thanks again.

I would also have the same question on any progress on a official slave implementation for teensys 4.x
 
Back
Top