Teensy 4.1 SPI Slave - here it is

tjaekel

If you are looking (like me) for a SPI Slave implementation on Teensy 4.1...
(everything I found from PJRC, Arduino and elsewhere on the Internet says: "only SPI master is supported")

Based on tonton81's approach: https://github.com/tonton81/SPISlave_T4 - I got
this SPI Slave working (but it needs some modifications, see the attached ZIP with my code and sketch).

How does it work?
It writes the MCU's SPI configuration registers directly (e.g. to select the SPI mode).
It installs an ISR handler for the SPI, so the SPI slave runs in interrupt mode (not DMA).
The SPI format (e.g. SPI mode, LSB vs. MSB first) is fixed and hard-coded (but it could be changed if you check the RT1062 reference manual for which register to write with which value).
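To make "which register to write with what" a bit more concrete, here is a minimal host-side sketch of how the format bits could be combined into one LPSPI Transmit Command Register (TCR) value. The bit positions are how I read them in the reference manual and makeTcr() is a hypothetical helper, not part of the library, so please verify against the manual before writing anything to hardware.

```cpp
#include <cstdint>

// Assumed TCR bit positions (check the i.MX RT1060 reference manual):
constexpr uint32_t TCR_CPOL = 1u << 31;  // clock polarity: SCK idles high when set
constexpr uint32_t TCR_CPHA = 1u << 30;  // clock phase: sample on second edge when set
constexpr uint32_t TCR_LSBF = 1u << 23;  // LSB first when set, MSB first when clear

// Hypothetical helper: build a TCR value from SPI mode (0..3), bit order
// and frame size in bits. The FRAMESZ field holds (frameBits - 1).
constexpr uint32_t makeTcr(unsigned spiMode, bool lsbFirst, unsigned frameBits) {
    return ((spiMode & 2u) ? TCR_CPOL : 0u)
         | ((spiMode & 1u) ? TCR_CPHA : 0u)
         | (lsbFirst ? TCR_LSBF : 0u)
         | ((frameBits - 1u) & 0xFFFu);
}
```

For example, makeTcr(0, false, 8) describes SPI mode 0, MSB first, 8-bit frames; on the Teensy the result would go into the TCR of the LPSPI instance in use.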
One oddity: the H file includes the "SPISlave_T4.tpp" file (the C++ source code, but not as a *.CPP file). Renaming it and building it as a CPP file instead fails with a compile error.
(maybe somebody could rework it into real *.H and *.CPP files).
So all the code comes in via the "SPISlave_T4.h" include, which in turn includes "SPISlave_T4.tpp" (note the file extension: not *.CPP).
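The likely reason for the *.tpp file is that SPISlave_T4 is a class template (it is used as SPISlave_T4<&SPI, SPI_8_BITS>), and template member bodies must be visible wherever the template is instantiated. A small sketch with a hypothetical DemoSlave class (not the library's code) shows the pattern, and the explicit instantiation that a separate *.CPP would need:

```cpp
#include <cstdint>

// ---- what would live in the header (hypothetical Demo.h) ----
template <typename WordT, unsigned Bits>
class DemoSlave {
public:
    unsigned frameBits() const;  // only declared here
    WordT mask() const;
};

// ---- what would live in Demo.tpp, included at the end of Demo.h ----
// The compiler needs these bodies in every file that uses DemoSlave<...>,
// which is why they cannot simply be moved into a separate .cpp.
template <typename WordT, unsigned Bits>
unsigned DemoSlave<WordT, Bits>::frameBits() const { return Bits; }

template <typename WordT, unsigned Bits>
WordT DemoSlave<WordT, Bits>::mask() const {
    return static_cast<WordT>((1ull << Bits) - 1ull);
}

// If the bodies did live in a .cpp, every combination in use would need an
// explicit instantiation like this one inside that same .cpp:
template class DemoSlave<uint8_t, 8>;
```

So a real *.H / *.CPP split looks possible in principle, but only for the port and bit-width combinations that are explicitly instantiated.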

My example code works this way:
The ISR places all bytes received by the SPI Slave into a shared buffer.
The buffer index keeps incrementing (each ISR call for a completely received byte stores it at the next position in the buffer).
When PCS, the SPI Slave Select signal (CS, SS), is deasserted (this is now enabled as an ISR trigger!), the ISR sets a flag so that the main thread (loop) can see that the SPI Rx has completed.
The main loop then prints all received bytes and resets the notification flag and the buffer index (a handshake between main() and the ISR, so that the next SPI Rx transaction writes from the start again).
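The handshake described above could be sketched like this. It is a host-side model, not the code from the ZIP: onRxByte(), onFrameComplete(), pollFrame() and the buffer size are hypothetical names and values, only spiRxComplete is the flag name used in the sketch.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kBufSize = 256;           // assumed buffer size
volatile uint8_t spiRxBuf[kBufSize];       // shared between ISR and loop()
volatile size_t  spiRxIdx = 0;             // next write position
volatile bool    spiRxComplete = false;    // set by ISR, cleared by loop()

// Called from the SPI ISR for every received byte.
void onRxByte(uint8_t b) {
    if (spiRxIdx < kBufSize) {             // drop bytes rather than overflow
        spiRxBuf[spiRxIdx] = b;
        spiRxIdx = spiRxIdx + 1;
    }
}

// Called from the SPI ISR when chip select is deasserted (frame complete).
void onFrameComplete() { spiRxComplete = true; }

// Called from loop(): copies out a finished frame and re-arms the receiver.
// Returns the number of bytes copied, or 0 if no frame is pending.
size_t pollFrame(uint8_t *out, size_t outSize) {
    if (!spiRxComplete) return 0;
    size_t n = spiRxIdx;
    if (n > outSize) n = outSize;
    for (size_t i = 0; i < n; ++i) out[i] = spiRxBuf[i];
    spiRxIdx = 0;                          // next transaction starts at index 0
    spiRxComplete = false;                 // hand the buffer back to the ISR
    return n;
}
```

Note that this simple scheme assumes the master leaves enough of a gap for loop() to fetch the frame before the next transaction starts; otherwise new bytes get appended to the old frame.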

The original implementation has some flaws, e.g.:
- the handling of the PCS signal (SPI Chip Select, CS) as the "Frame Complete Flag" (PCS goes high) was not enabled and not working (the order of the received bytes was wrong, the logic was wrong).
- the ISR keeps looping inside the ISR ("not nice") and also calls Serial.print() inside the ISR (which slows things down dramatically)
- so the maximum speed was limited by the USB VCP speed, and only 1 MHz SPI was working (doing printfs or Serial in an ISR is not a "smart idea", esp. if the USB VCP trickles data out at a 1 kHz rate)
- the order and number of received bytes were wrong (the logic inside the ISR was wrong because it did not really act on "Frame Complete" (PCS))

I could improve and fix some issues:
- it now runs at up to 22 MHz generated by a SPI master (faster clocks fail with a wrong number of received bytes or a wrong byte order)
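To get a feeling for why the ISR has to stay short, here is a rough back-of-the-envelope calculation of the time per 8-bit frame. The 600 MHz CPU clock is the Teensy 4.1 default and an assumption here; the helper names are made up for this sketch.

```cpp
// Time for one SPI frame in nanoseconds at a given SCK frequency.
constexpr double frameTimeNs(double sckHz, unsigned bitsPerFrame) {
    return bitsPerFrame * 1e9 / sckHz;
}

// CPU cycles that pass during one frame. cpuHz defaults to 600 MHz
// (assumed Teensy 4.1 default clock); adjust if the board runs differently.
constexpr double cyclesPerFrame(double sckHz, unsigned bitsPerFrame,
                                double cpuHz = 600e6) {
    return cpuHz * bitsPerFrame / sckHz;
}
```

At 22 MHz a byte arrives roughly every 364 ns, which is only about 218 CPU cycles at 600 MHz, compared with about 4800 cycles at 1 MHz. The hardware Rx FIFO gives some slack, but on average the ISR must keep up with that rate, so anything like Serial.print() inside it is far too slow.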

Important:
the original code and also my version still use the "swapped pin mapping":
So the slave's data input (MOSI) is located on the Teensy 4.1 MISO pin (pin number 12)!
(a bit strange, because MOSI remains MOSI by definition, regardless of slave or master)

So, if you are looking for a SPI Slave for Teensy 4.1... - give it a try.
 

Attachments

  • SPISlave_T4.zip
    2.9 KB · Views: 367
@tonton81 wrote that code years back (2018), when the T_3.6 and T_3.5 were newer and one was overloaded with IMU sensor data. SPI _Transfer was created to offload the data, at 24 MHz, to a second Teensy that Serial.print()'ed it to an external GUI. IIRC it was made to work on the LC through 3.6 of the day, and only Master was needed. It was a tremendous speed boost, allowing RAW struct data export for remote printf() formatting of float data. The T_4.x's came later ...
 
I have a couple of suggestions:
Having two libraries with the same name, especially with the similarities that exist, could (will) be extremely confusing, so
1) could you change the library name to remove the confusion, e.g. Improved_SPISlave_T4 or T4_SPISlave, or
2) could you collaborate with TonTon to come up with a definitive library.

Anyway, well done on getting it going so well.

Oh, as an aside, could you include some revision information in the code so we know if it has been updated.

EDIT: For my own use I have renamed it as T4_SPISlave.
 
Indeed good work - maybe PR to update the @tonton81 2 yr old Slave code?
 
@tjaekel: Great work, thanks a lot! I haven't been able to get your sketch to work using SPI1 on Teensy 4.1. I just replaced
C++:
SPISlave_T4<&SPI, SPI_8_BITS> mySPI;
with
C++:
SPISlave_T4<&SPI1, SPI_8_BITS> mySPI;
and rewired accordingly. spiRxComplete is never true. I assume it has something to do with the definitions in
C++:
SPISlave_T4_FUNC SPISlave_T4_OPT::SPISlave_T4() {
  if ( port == &SPI ) {
    _LPSPI4 = this;
    _portnum = 3;
    CCM_CCGR1 |= (3UL << 6);
    nvic_irq = 32 + _portnum;
    _VectorsRam[16 + nvic_irq] = lpspi4_slave_isr;

    /* Alternate pins not broken out on Teensy 4.0/4.1 for LPSPI4 */
    SLAVE_PINS_ADDR;
    spiAddr[0] = 0; /* PCS0_SELECT_INPUT */
    spiAddr[1] = 0; /* SCK_SELECT_INPUT */
    spiAddr[2] = 0; /* SDI_SELECT_INPUT */
    spiAddr[3] = 0; /* SDO_SELECT_INPUT */
    IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_03 = 0x3; /* LPSPI4 SCK (CLK) */
    IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_01 = 0x3; /* LPSPI4 SDI (MISO) */
    IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_02 = 0x3; /* LPSPI4 SDO (MOSI) */
    IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_00 = 0x3; /* LPSPI4 PCS0 (CS) */
  }
}
but I'm not adept enough to know what to change. Could you help me out here? Thanks!
 
Hi,
Thanks for the sample code.
I would like to know whether you have encountered delayed Slave replies when processing Master commands.
I made some changes so that the Slave processes Master commands and replies.
For the first Master CMD1, the Slave replies 0x00 (preloaded at the begin stage).
At this point I assume the Slave has read the Master's CMD1 and is processing it.
Next, the Master sends 0x00 and the Slave replies 0xFF (since RF FIFO empty).
On the next Master CMD2, the Slave replies with the data that is the result of processing Master CMD1.

Please advise

Thanks!
 
It is not possible to use a SPI Slave in Full-Duplex mode this way: when the Master sends a command, the Slave has to wait for the FIFO, read from the FIFO, decode the command, generate the response and write it to the Slave Tx FIFO. That takes much longer than ONE SPI CLK cycle (and it would have to be shorter: assume just half a clock cycle is left).

On the SPI Slave, the Slave Tx FIFO must be pre-loaded with the entire response to the Master.

Otherwise, implement a "protocol": split the exchange into two separate SPI transactions, with a delay (gap, pause) between them, to give the SPI Slave time to prepare the response.
(for Bluem ax: see the private conversation)
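The two-transaction idea can be illustrated with a small host-side model of a FIFO-based slave (not Teensy code). SlaveModel, its command handler and the 0xFF returned on an empty Tx FIFO are all assumptions for this sketch; the point is only that the reply to a command can go out no earlier than the next transaction.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

class SlaveModel {
public:
    // Firmware preloads the reply before the master starts clocking.
    void preload(const std::vector<uint8_t> &reply) {
        tx_.assign(reply.begin(), reply.end());
    }

    // One byte exchange, driven by the master's clock. The slave can only
    // shift out what is already in its Tx FIFO (0xFF on underrun: assumed).
    uint8_t exchange(uint8_t mosi) {
        rx_.push_back(mosi);
        if (tx_.empty()) return 0xFF;
        uint8_t miso = tx_.front();
        tx_.pop_front();
        return miso;
    }

    // Chip select deasserted: only now does firmware have time to decode
    // the command and queue the reply for the NEXT transaction.
    void endOfFrame() {
        if (!rx_.empty()) preload({handle(rx_.front())});
        rx_.clear();
    }

private:
    // Hypothetical command handler: the reply is the command plus one.
    static uint8_t handle(uint8_t cmd) { return static_cast<uint8_t>(cmd + 1); }

    std::deque<uint8_t> tx_;    // models the slave Tx FIFO
    std::vector<uint8_t> rx_;   // bytes received in the current frame
};
```

With 0x00 preloaded, a first transaction sending command 0x10 reads back 0x00; after endOfFrame(), a second transaction with a dummy byte reads back 0x11, and any further byte in that transaction reads 0xFF. That matches the "reply arrives one command later" behaviour described above.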

I use the SPI Slave just as a uni-directional receiver (Rx only): one SPI Master acts as Tx and another SPI acts as Slave for Rx. This lets me feed back the SPI CLK signal to compensate for the runtime (delay), e.g. due to cable length or level shifters.

A SPI Slave as a full-duplex device, where the "tail" of a SPI transaction depends on what was received at its start (e.g. a command), works only when everything is implemented in real HW (a chip, an FPGA). In MCU FW it is not possible: FW is too slow to change, during an ongoing SPI transaction, what the Master will "clock out". The Master keeps going and FW has no time to fill the Slave Tx FIFO.
 