SPI-DMA-slave problem.

Status
Not open for further replies.

WMXZ

Well-known member
OK,
I have the following issue:
A Teensy 3.1 (T1) is SPI master and transfers data to a second Teensy 3.1 (T2), which is configured as slave; T1 simultaneously receives data from T2.
As the transfer has to run in parallel to the CPU, I use DMA.

Observation:
T1 correctly receives data from T2 (the slave sends different data than the master) and the DMA completion ISR (RX) is called. So T2 does send data in slave mode.
T2, however, never calls its DMA completion ISR (RX).

As my software is written entirely in C, posting the code here would not help and may not be necessary.

So I have two questions:
1st: Has anyone had similar issues and found a solution?
2nd: Is there a stable 'official' Teensy SPI-DMA (master/slave) library I could use to provide code that could be read and tested by others?

Grazie,
Walter
 
Update:
Further inspection revealed that T2 sends only the last word in the buffer, as if the DMA of T2 runs straight to the end of the buffer and the SPI of T2 sends only what is left in PUSHR_SLAVE.

I vaguely remember that someone else made a similar observation. However, I do not recall how the issue was resolved.

Getting an Arduino example up and running with two Teensies turned out to be more complicated than I thought. So no example code yet.
 
Attached is an image from a logic analyzer (still the same problem as mentioned above).
T1 is master, T2 is slave.

channel 0: not used
channel 1: SPI clock, T1.SCK and T2.SCK (four 16-bit words at 12.5 MHz can be seen)
channel 2: T1.Dout (connected to T2.Din)
channel 3: T2.Dout (connected to T1.Din)

(SCK is pin 14, Dout is pin 7, Din is pin 8)

Data transmitted from T1 to T2 are 1,2,3,4 (channel 2)
Data transmitted from T2 to T1 are 1,3,5,7 (channel 3)

dualTeensySpi.jpg

One may note that T1.Dout is high except while transferring data and between words.
T2.Dout is generally low except while it transfers data, and the value it sends seems to be always 3.

DMA of T1 (master) generates DMA atCompletion interrupt
DMA of T2 (slave) generates no DMA atCompletion interrupt.

IMO, the picture confirms that
the DMA on T2 (slave) loads PUSHR_SLAVE as fast as possible (in this case the DMA had reached the second value in the transmit array).

Note: both Teensies run the same program; which one is master is determined by connecting a digital pin to ground.
 
Update on dual-Teensy master-slave SPI communication

I nearly succeeded in getting two T3.1 to communicate via SPI.

To avoid any unintended side-effects, I programmed it without libraries.
Code:
#define NDAT 8

int m_isMaster=0;
short xx[NDAT],yy[NDAT];

void logg(short *yy, int nn)
{ int ii;
  if(m_isMaster)
    Serial.printf("Master: ");
  else
    Serial.printf("Slave: ");
    
  for(ii=0;ii<nn;ii++) Serial.printf("%d ",yy[ii]); Serial.println();
}

void dma_ch0_isr(void)
{ DMA_CINT=0;
  DMA_CDNE=0;
}
void dma_ch1_isr(void)
{ DMA_CINT=1;
  DMA_CDNE=1;
  SPI0_MCR |= SPI_MCR_HALT | SPI_MCR_MDIS;
  logg(yy,NDAT);
}

void startXfer(void)
{  int ii;
//
 for(ii=0;ii<NDAT;ii++) yy[ii]=0;
 if(m_isMaster)
   for(ii=0;ii<NDAT;ii++) xx[ii]=1+ii;
 else
   for(ii=0;ii<NDAT;ii++) xx[ii]=1+2*ii;
   
  //spi SETUP	
  SPI0.MCR =	//SPI_MCR_MDIS |   // module disable 
		SPI_MCR_HALT |   // stop transfer
		SPI_MCR_PCSIS(0x1F); // set all inactive states high
  SPI0.MCR |= SPI_MCR_CLR_TXF;
  SPI0.MCR |= SPI_MCR_CLR_RXF;
  
  if(m_isMaster) SPI0.MCR |= SPI_MCR_MSTR;
  
#if F_BUS == 48000000
#define SPI_CLOCK   (SPI_CTAR_PBR(2) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) //(48 / 5) * ((1+1)/2) = 9.6 MHz
#endif

  if(m_isMaster)
	SPI0.CTAR0 = SPI_CLOCK | 
	SPI_CTAR_FMSZ(15) |
	SPI_CTAR_PCSSCK(0) |
	SPI_CTAR_PASC(0) |
	SPI_CTAR_PDT(0) |
	SPI_CTAR_CSSCK(0) |
	SPI_CTAR_ASC(0) |
	SPI_CTAR_DT(0);
  else
	SPI0_CTAR0_SLAVE = SPI_CTAR_FMSZ(15);

  // start SPI
   SPI0.RSER = SPI_RSER_TFFF_DIRS | SPI_RSER_TFFF_RE | // transmit fifo fill flag to DMA
		SPI_RSER_RFDF_DIRS | SPI_RSER_RFDF_RE;  // receive fifo drain flag to DMA

  if(m_isMaster)
	SPI0.MCR = SPI_MCR_MSTR;
  else
	SPI0.MCR = 0;   

// set transmit
 	if(m_isMaster)
	    DMA_TCD0_DADDR=&SPI0_PUSHR;
	else
	    DMA_TCD0_DADDR=&SPI0_PUSHR_SLAVE;
        DMA_TCD0_DOFF=0;
        DMA_TCD0_DLASTSGA= 0;

        DMA_TCD0_ATTR=1<<8|1;
        DMA_TCD0_NBYTES_MLNO=2;
        
        DMA_TCD0_SADDR=xx;
        DMA_TCD0_SOFF=2;
        DMA_TCD0_SLAST=-2*NDAT;
        
        DMA_TCD0_CITER_ELINKNO = DMA_TCD0_BITER_ELINKNO=NDAT;
        
        DMA_TCD0_CSR = DMA_TCD_CSR_INTMAJOR | DMA_TCD_CSR_DREQ;
        
	DMAMUX0_CHCFG0 = DMAMUX_DISABLE;
	DMAMUX0_CHCFG0 = DMAMUX_SOURCE_SPI0_TX | DMAMUX_ENABLE;

// set receive
 	DMA_TCD1_SADDR=&SPI0_POPR;
	DMA_TCD1_SOFF=0;
        DMA_TCD1_SLAST= 0;

        DMA_TCD1_ATTR=1<<8|1;
        DMA_TCD1_NBYTES_MLNO=2;
        
        DMA_TCD1_DADDR=yy;
        DMA_TCD1_DOFF=2;
        DMA_TCD1_DLASTSGA=-2*NDAT;
        
        DMA_TCD1_CITER_ELINKNO = DMA_TCD1_BITER_ELINKNO=NDAT;
        
        DMA_TCD1_CSR = DMA_TCD_CSR_INTMAJOR | DMA_TCD_CSR_DREQ;
           
	DMAMUX0_CHCFG1 = DMAMUX_DISABLE;
	DMAMUX0_CHCFG1 = DMAMUX_SOURCE_SPI0_RX | DMAMUX_ENABLE;

  //start DMA
       	DMA_SERQ = 0;
	NVIC_ENABLE_IRQ(IRQ_DMA_CH0);
       	DMA_SERQ = 1;
	NVIC_ENABLE_IRQ(IRQ_DMA_CH1);
}

void setup(void)
{
    SIM_SCGC6 |= SIM_SCGC6_SPI0;
    SIM_SCGC6 |= SIM_SCGC6_DMAMUX;
    SIM_SCGC7 |= SIM_SCGC7_DMA;

    DMA_CR = 0;
    
//    while(!Serial);
    
    // here we must check if we are master or slave
    pinMode(23, INPUT_PULLUP);
    delay(10); // give some time to settle
    if (digitalRead(23)) m_isMaster=0; else m_isMaster=1;
    pinMode(23, INPUT); // to avoid too much current (is it necessary?)
    
    Serial.printf("isMaster = %d\n",m_isMaster);
    
	if(m_isMaster)
	{
		CORE_PIN2_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); // SRE slew rate enable
		CORE_PIN14_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); // DSE drive strength enable
		CORE_PIN7_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); 
		CORE_PIN8_CONFIG = PORT_PCR_MUX(2);
	}
	else
	{
		CORE_PIN2_CONFIG = PORT_PCR_MUX(2); 
		CORE_PIN14_CONFIG = PORT_PCR_MUX(2); 
		CORE_PIN7_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); 
		CORE_PIN8_CONFIG = PORT_PCR_MUX(2);
	}
    
}

void loop(void)
{ startXfer();
  delay(100);
}

The results are
Code:
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13 
Master: 15 1 3 5 7 9 11 13
and
Code:
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8 
Slave: 1 2 3 4 5 6 7 8

Observations:
The slave receives data correctly (as intended by the master).
The master receives data NEARLY correctly (i.e. the slave starts transmission with the LAST word in the array and then wraps to the first word).
Max speed is about 10 MHz clock rate.
If the speed is higher (e.g. 12.5 MHz), the slave transmission lags by one bit, i.e. the master does not 'see' the LSB.
To what extent the SCK duty cycle plays a role, I don't know.

So the open questions are:
- How can one tell the slave to start with the first word and not with the last word?
(That the problem is in the transmission and not in the reception was verified with the logic analyzer.)
- Can one adjust the timing such that the response time of the transmitting slave does not cause the master receiver to drop the LSB?
 
All working now!

Got it working

After commenting out the module disable, as in
Code:
 SPI0.MCR =	//SPI_MCR_MDIS |   // module disable 
		SPI_MCR_HALT |   // stop transfer
		SPI_MCR_PCSIS(0x1F); // set all inactive states high
the program works as expected.

It seems that clearing the FIFOs is only possible when the module is not disabled.
The manual says that disabling the FIFOs is only possible with the module enabled, but it does not mention this for clearing the FIFOs.

OK, in the end I found it.

I will see if I can get it a little bit faster (from 9.8 MHz to 12.5 MHz, which would be 25%).
 
By changing the F_BUS calculation in mk20dx128.c and the definition in kinetis.h from 48 to 36 MHz, I could use an SPI clock divider

Code:
#define SPI_CLOCK   (SPI_CTAR_PBR(1) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) //(36 / 3) * ((1+1)/2) = 12 MHz
that results in a 40/60 clock duty cycle, sufficient to allow the master receiver to capture all bits of the slave transmission at 12 MHz.
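
For reference, the divider arithmetic in the two SPI_CLOCK comments can be checked on a PC. The following is a minimal sketch, not code from the posts: spi_sck_hz() is a hypothetical helper, and the two lookup tables reflect my reading of the PBR and BR field encodings in the K20 reference manual (assumption: PBR codes 0..3 select prescalers 2, 3, 5, 7, and BR code 0 selects a scaler of 2). With these values the 48 MHz setting comes out at 9.6 MHz and the 36 MHz setting at 12 MHz.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field encodings (check against the reference manual). */
static const uint32_t PBR_DIV[4] = {2, 3, 5, 7};
static const uint32_t BR_DIV[16] = {2, 4, 6, 8, 16, 32, 64, 128, 256, 512,
                                    1024, 2048, 4096, 8192, 16384, 32768};

/* Hypothetical helper: SCK frequency in Hz for a bus clock and the
 * PBR, BR and DBR field values as written into CTAR0.
 * SCK = f_bus / prescaler * (1 + DBR) / scaler */
uint32_t spi_sck_hz(uint32_t f_bus_hz, unsigned pbr, unsigned br, int dbr)
{
    assert(pbr < 4 && br < 16);
    uint64_t numerator = (uint64_t)f_bus_hz * (dbr ? 2u : 1u);
    return (uint32_t)(numerator / (PBR_DIV[pbr] * BR_DIV[br]));
}
```

Calling spi_sck_hz(48000000, 2, 0, 1) corresponds to SPI_CTAR_PBR(2) | SPI_CTAR_BR(0) | SPI_CTAR_DBR at F_BUS = 48 MHz, and spi_sck_hz(36000000, 1, 0, 1) to the 36 MHz variant. The helper says nothing about the duty cycle, which also changes with the prescaler when DBR is set.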
 
Hey, nice that you got it working.
We are actually looking for some code which supports slave mode via DMA.
Did you wrap up your code into a library?
 
No, I did not wrap it up as a library,
but by following the examples in the previous messages you should be able to get a working example and make your own library.
Please note that the same code works for both master and slave.
 
So, bringing an old thread back to life :) I was wondering if this example works with the T3.5/3.6?
I have a Raspberry Pi with a program that sends data over SPI. So now I need a Teensy slave, and I stumbled upon this interesting thread.

Thank you!
 
It should work the same. I would simply try it.
Do you have a simple SPI program for the RPi that one could use to test?
 
Actually, I'm not the author of the program :) The program is called Machinekit, and someone has made a plugin that sends position and button commands via SPI to a PIC. All the firmware is available on Git. But since buying a PIC and programming it with precompiled code is no fun, I wanted to give it a go with a Teensy, and also I can expand and adapt the system to my needs :)
I tried a library provided by a PJRC member named btmcmahan, but it didn't work: I'm not getting any data into the Teensy. The SPI works :/
Maybe you can give the RPi firmware a quick look and see if it is doable with this library :)
RPi firmware:
https://github.com/kinsamanka/PICnc-V2/tree/master/HAL
PIC firmware:
https://github.com/kinsamanka/PICnc-V2/tree/master/firmware

Thank you!
 
The code you posted doesn't compile :/

sketch_nov09a:37: error: 'SPI0' was not declared in this scope
SPI0.MCR = //SPI_MCR_MDIS | // module disable

^

'SPI0' was not declared in this scope
 
Fixed it :) It seems there was a change in the Teensyduino core files. Instead of
Code:
SPI0.
I now use
Code:
KINETISK_SPI0.

Code:
#define NDAT 8

int m_isMaster=0;
short xx[NDAT],yy[NDAT];

void logg(short *yy, int nn)
{ int ii;
  if(m_isMaster)
    Serial.printf("Master: ");
  else
    Serial.printf("Slave: ");
    
  for(ii=0;ii<nn;ii++) Serial.printf("%d ",yy[ii]); Serial.println();
}

void dma_ch0_isr(void)
{ DMA_CINT=0;
  DMA_CDNE=0;
}
void dma_ch1_isr(void)
{ DMA_CINT=1;
  DMA_CDNE=1;
  SPI0_MCR |= SPI_MCR_HALT | SPI_MCR_MDIS;
  logg(yy,NDAT);
}

void startXfer(void)
{  int ii;
//
 for(ii=0;ii<NDAT;ii++) yy[ii]=0;
 if(m_isMaster)
   for(ii=0;ii<NDAT;ii++) xx[ii]=1+ii;
 else
   for(ii=0;ii<NDAT;ii++) xx[ii]=1+2*ii;
   
  //spi SETUP  
  KINETISK_SPI0.MCR =  //SPI_MCR_MDIS |   // module disable 
    SPI_MCR_HALT |   // stop transfer
    SPI_MCR_PCSIS(0x1F); // set all inactive states high
  KINETISK_SPI0.MCR |= SPI_MCR_CLR_TXF;
  KINETISK_SPI0.MCR |= SPI_MCR_CLR_RXF;
  
  if(m_isMaster) KINETISK_SPI0.MCR |= SPI_MCR_MSTR;
  
#if F_BUS == 48000000
#define SPI_CLOCK   (SPI_CTAR_PBR(2) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) //(48 / 5) * ((1+1)/2) = 9.6 MHz
#endif

  if(m_isMaster)
  KINETISK_SPI0.CTAR0 = SPI_CLOCK | 
  SPI_CTAR_FMSZ(15) |
  SPI_CTAR_PCSSCK(0) |
  SPI_CTAR_PASC(0) |
  SPI_CTAR_PDT(0) |
  SPI_CTAR_CSSCK(0) |
  SPI_CTAR_ASC(0) |
  SPI_CTAR_DT(0);
  else
  SPI0_CTAR0_SLAVE = SPI_CTAR_FMSZ(15);

  // start SPI
   KINETISK_SPI0.RSER = SPI_RSER_TFFF_DIRS | SPI_RSER_TFFF_RE | // transmit fifo fill flag to DMA
    SPI_RSER_RFDF_DIRS | SPI_RSER_RFDF_RE;  // receive fifo drain flag to DMA

  if(m_isMaster)
  KINETISK_SPI0.MCR = SPI_MCR_MSTR;
  else
  KINETISK_SPI0.MCR = 0;   

// set transmit
  if(m_isMaster)
      DMA_TCD0_DADDR=&SPI0_PUSHR;
  else
      DMA_TCD0_DADDR=&SPI0_PUSHR_SLAVE;
        DMA_TCD0_DOFF=0;
        DMA_TCD0_DLASTSGA= 0;

        DMA_TCD0_ATTR=1<<8|1;
        DMA_TCD0_NBYTES_MLNO=2;
        
        DMA_TCD0_SADDR=xx;
        DMA_TCD0_SOFF=2;
        DMA_TCD0_SLAST=-2*NDAT;
        
        DMA_TCD0_CITER_ELINKNO = DMA_TCD0_BITER_ELINKNO=NDAT;
        
        DMA_TCD0_CSR = DMA_TCD_CSR_INTMAJOR | DMA_TCD_CSR_DREQ;
        
  DMAMUX0_CHCFG0 = DMAMUX_DISABLE;
  DMAMUX0_CHCFG0 = DMAMUX_SOURCE_SPI0_TX | DMAMUX_ENABLE;

// set receive
  DMA_TCD1_SADDR=&SPI0_POPR;
  DMA_TCD1_SOFF=0;
        DMA_TCD1_SLAST= 0;

        DMA_TCD1_ATTR=1<<8|1;
        DMA_TCD1_NBYTES_MLNO=2;
        
        DMA_TCD1_DADDR=yy;
        DMA_TCD1_DOFF=2;
        DMA_TCD1_DLASTSGA=-2*NDAT;
        
        DMA_TCD1_CITER_ELINKNO = DMA_TCD1_BITER_ELINKNO=NDAT;
        
        DMA_TCD1_CSR = DMA_TCD_CSR_INTMAJOR | DMA_TCD_CSR_DREQ;
           
  DMAMUX0_CHCFG1 = DMAMUX_DISABLE;
  DMAMUX0_CHCFG1 = DMAMUX_SOURCE_SPI0_RX | DMAMUX_ENABLE;

  //start DMA
        DMA_SERQ = 0;
  NVIC_ENABLE_IRQ(IRQ_DMA_CH0);
        DMA_SERQ = 1;
  NVIC_ENABLE_IRQ(IRQ_DMA_CH1);
}

void setup(void)
{
    SIM_SCGC6 |= SIM_SCGC6_SPI0;
    SIM_SCGC6 |= SIM_SCGC6_DMAMUX;
    SIM_SCGC7 |= SIM_SCGC7_DMA;

    DMA_CR = 0;
    
//    while(!Serial);
    
    // here we must check if we are master or slave
    pinMode(23, INPUT_PULLUP);
    delay(10); // give some time to settle
    if (digitalRead(23)) m_isMaster=0; else m_isMaster=1;
    pinMode(23, INPUT); // to avoid too much current (is it necessary?)
    
    Serial.printf("isMaster = %d\n",m_isMaster);
    
  if(m_isMaster)
  {
    CORE_PIN2_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); // SRE slew rate enable
    CORE_PIN14_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); // DSE drive strength enable
    CORE_PIN7_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); 
    CORE_PIN8_CONFIG = PORT_PCR_MUX(2);
  }
  else
  {
    CORE_PIN2_CONFIG = PORT_PCR_MUX(2); 
    CORE_PIN14_CONFIG = PORT_PCR_MUX(2); 
    CORE_PIN7_CONFIG = PORT_PCR_SRE | PORT_PCR_DSE | PORT_PCR_MUX(2); 
    CORE_PIN8_CONFIG = PORT_PCR_MUX(2);
  }
    
}

void loop(void)
{ startXfer();
  delay(100);
}
 
OK, the code predates the Teensy LC.
To compile it on the T3.6,
add at the beginning
Code:
#define SPI0 KINETISK_SPI0

and compile with F_CPU = 144 MHz (to get a bus clock of 48 MHz).

Edit: OK, our posts crossed
 
Is it possible to modify your program to receive in 32-bit? I tried using your program and I get the data without problems, but the data is separated into 4 positions inside the array. And the array locations are a little messed up :)
 
AFAIK,
the SPI frame size is at most 16 bits, so a 32-bit value has to be transferred as two 16-bit words.

Note, my program was a test program to demonstrate the issue I had at that time.
For an operational program you have to remove the printout (here called logg) from the ISR.
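
One common way to get the printout out of the ISR is to let the ISR only set a flag and do the printing from loop(). The sketch below shows the pattern in plain C so it can be tried on a PC. The names rx_done, rx_complete_isr() and take_rx_done() are hypothetical; on the Teensy the ISR body would additionally clear the DMA interrupt (DMA_CINT = 1) as in the test program.

```c
#include <stdbool.h>

/* Set by the ISR, cleared by the main loop. volatile because it is
 * shared between interrupt context and normal code. */
static volatile bool rx_done = false;

/* Stand-in for dma_ch1_isr(): only record that the transfer finished. */
void rx_complete_isr(void)
{
    rx_done = true;
}

/* Called from the main loop: returns true once per completed transfer. */
bool take_rx_done(void)
{
    if (!rx_done)
        return false;
    rx_done = false;
    return true;
}
```

In loop(), a line such as `if (take_rx_done()) logg(yy, NDAT);` would then print outside interrupt context.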
 
I was reading some forums and it seems they are both little endian...
I will try your changes and see what happens...
The Raspberry Pi sends 64 bytes per transfer and the data is 32-bit. The RPi sends commands like 0x324D433E; using your program I get 4D323E43 on the terminal. The byte order is strangely messed up :/
 
Do you know where I can find info on all of these commands?
SPI_CTAR_FMSZ(15), DMA_TCD0_NBYTES_MLNO=2; ... (and others). I'm really a beginner at these things :))))
 
There are really three sources:
the code you have,
code from others, say in the SPI library and the one developed by KurtE,
and, more relevant, the reference manual for the K20 processor, https://www.pjrc.com/teensy/K20P64M72SF1RM.pdf for the T3.2.

Most, if not all, symbols are defined in kinetis.h in the cores/teensy3 directory.
Yes, I know it is a little bit cumbersome, but that is the reality.
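
As a starting point, this is how I read the values the test program writes into those fields. It is a sketch with hypothetical helper names (they are not macros from kinetis.h), based on my understanding of the field descriptions in the reference manual: FMSZ holds the frame size minus one, the ATTR value 1<<8|1 sets the source and destination size codes to 1 (16-bit accesses), NBYTES is the number of bytes moved per DMA request, and SLAST/DLASTSGA rewind the address after the last request.

```c
#include <stdint.h>

/* CTAR FMSZ field: the frame is FMSZ + 1 bits long, so 15 means 16 bits. */
unsigned spi_frame_bits(unsigned fmsz)
{
    return fmsz + 1u;
}

/* TCD ATTR: source size code in bits 10..8, destination size code in
 * bits 2..0. Size code 0 = 8-bit, 1 = 16-bit, 2 = 32-bit accesses. */
unsigned dma_attr_src_bytes(uint16_t attr)
{
    return 1u << ((attr >> 8) & 7u);
}

unsigned dma_attr_dst_bytes(uint16_t attr)
{
    return 1u << (attr & 7u);
}

/* SLAST / DLASTSGA: adjustment added after the major loop so that the
 * address points at the start of the buffer again. */
int32_t dma_rewind_bytes(unsigned bytes_per_request, unsigned n_requests)
{
    return -(int32_t)(bytes_per_request * n_requests);
}
```

For the test program this gives 16-bit frames, 2-byte source and destination accesses, and a rewind of -16 bytes for NDAT = 8, which matches -2*NDAT in the listing.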

On the word ordering:
If the RPi sends 0x32 4D 43 3E and you receive 0x4D 32 3E 43,
you see the byte swap:
if the RPi sends the 4 bytes as b[0],b[1],b[2],b[3], the Teensy receives them as b[1],b[0],b[3],b[2].
You can easily swap the bytes after receiving.
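
A minimal sketch of such a swap, working on the receive buffer as a plain byte array. The function name swap_bytes_in_words() is made up for this example and is not part of any Teensy library.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit word in place.
 * len is in bytes; a trailing odd byte is left untouched. */
void swap_bytes_in_words(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2) {
        uint8_t tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```

Applied to the received bytes 4D 32 3E 43, this restores the order 32 4D 43 3E that the RPi sent.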
 
I will take a look at the kinetis file.

About the word ordering: I did some more digging and inserted a space after every Serial print (I don't know why I didn't do that from the start), and I got this:
Data[0]=4D323E43. Everything is inside a single location, which is what I wanted, but the order is still wrong. This will sound weird, but I was digging on Google and found what looks like the perfect explanation: the data appears to be in mid-big endian format. This explains a lot, but this type of endianness is very rare, which is why I'm sceptical about this theory.
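
The pattern is also what one would expect from 16-bit SPI frames stored on a little-endian CPU: each 16-bit half is intact, but the two bytes inside it are exchanged. Assuming that is what happens here, a 32-bit value can be repaired with two masks and shifts. This is a sketch only, and swap_bytes_in_halves() is a hypothetical name.

```c
#include <stdint.h>

/* Exchange the two bytes inside each 16-bit half of a 32-bit value:
 * bytes A B C D become B A D C. */
uint32_t swap_bytes_in_halves(uint32_t v)
{
    return ((v & 0x00FF00FFu) << 8) | ((v >> 8) & 0x00FF00FFu);
}
```

With the value from the post, swap_bytes_in_halves(0x4D323E43) gives 0x324D433E, the command the RPi sent.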
 