Minimal example of DMA-driven SPI with CS toggle every 8/16/32 bits?

cermak

Member
I've roughly got the asynchronous SPI.transfer(tx_buf, rx_buf, sizeof(tx_buf), event_responder) call to work, but I don't know how to get a chip-select line to toggle between bytes/uint16s/uint32s. I need this to make some DACs and ADCs respond; they won't handle a long stream of bytes in a single transaction. Does anyone have a minimal example of this?

If not, can anyone point me to resources on this? I'm starting to look through the LPSPI chapter of the imxrt1060 manual and the SPI library (https://github.com/PaulStoffregen/SPI/blob/master/SPI.cpp), but any other resources would also be great. Thanks in advance!
 
Which pin are you using for CS? I'm pretty sure it would have to be one of the specific SPI hardware CS pins for this to work, but also I'm not sure if this capability is included in the SPI library.
 
@jmarsh I'd be happy to get this to work with any pin as a DMA-driven, SPI-controlled CS. (My impression is that the SPI library mostly doesn't use the SPI peripheral to control CS pins; all the examples I've seen just toggle CS as a GPIO output before and after the transaction.)
 
I think that's mainly because it's more flexible (it allows any pin to be used as CS) and because that's how the regular Arduino SPI library was designed.

Teensy's SPI library does have a setCS() method to configure a CS pin to be driven by the SPI module - it might "just work" if you use that to select the PCS[0] pin for whichever SPI interface you're using. Only the CS0 pin has a chance of working because the library never changes the default value in the TCR register.
 
Attached is an example of how it can be done. In this example I was using several ST MEMS sensors, a Microchip RTC, and a MAQ473 sensor. Some are on SPI1, some on SPI0 (aka SPI), and some on SPI2.

What you need from this zip is the T4_DMA_SPI*.* files as the library for using any of the three SPI ports on a T41, with DMA.

This T4 DMA SPI code also supports 3 wire SPI (bi-directional data line), and it allows toggling MOSI, MISO pin roles.

Intended use: first initialise the SPI slaves in your setup(), then configure a timer for triggering the 'real time' SPI DMA transactions. The SPI transactions are first pushed onto a list of pending SPI DMA transactions. A transaction may or may not toggle the SPI chip CS line.
Then you trigger it to run those transactions in the background. Typically that trigger would come from a timer interrupt service routine.

You'll need to write something for the specific SPI chip that you're using.
 

Attachments

  • Example_T4_DMA_SPI.zip
    41.6 KB
Thanks so much @jmarsh and @sicco!

@jmarsh that was a very good suggestion*, and I can indeed get CS0 to toggle on each byte.
C++:
#include <SPI.h>
#include <EventResponder.h>

#define BUF_SIZE 10

uint16_t spi_tx_buf[BUF_SIZE];
uint16_t spi_rx_buf[BUF_SIZE];
EventResponder event_responder;
volatile bool dma_busy;  // set in loop(), cleared by the completion callback

// Called when the asynchronous (DMA) transfer has finished.
void asyncEventResponder(EventResponderRef event_responder) {
  dma_busy = false;
  SPI.endTransaction();
}

void setup() {
  for (int i = 0; i < BUF_SIZE; i++) spi_tx_buf[i] = i;
  SPI.setCS(10);  // T4.1 pin 10 is default CS; let the SPI module drive it
  SPI.begin();
  event_responder.attachImmediate(&asyncEventResponder);
}

void loop() {
  dma_busy = true;
  SPI.beginTransaction(SPISettings(1000000, MSBFIRST, SPI_MODE0));
  // The count is in bytes, and each byte is sent as its own 8-bit frame,
  // so CS toggles on every byte even though the buffers hold uint16_t.
  SPI.transfer(spi_tx_buf, spi_rx_buf, sizeof(spi_tx_buf), event_responder);
  while (dma_busy)
    delay(1);
}

Produces the following (yellow is clock, purple is CS (pin 10), blue is MOSI)
Screenshot from 2024-12-08 12-37-27.png

The same code works on SPI1 with pin 1 as CS. It does not work with pin 39, which I had misread as the alternate CS1 pin, but it does work with the actual alternate CS pin, 38 (thanks @KurtE for catching my mistake there).

I need to figure out how to get this to work with a 16 or 32-bit payload. @sicco, I am starting to read through your library, thank you very much! I will post more when I run into questions/issues.

*like all good suggestions, I swear I tried this and it didn't work before, but clearly I didn't.
 
The same code works on SPI1 with pin 1 as CS, but as you noted, does not work with the alternative CS1 pin (39).
I assume you mean pin 38?
Pin | Pad | ALT0 | ALT1 | ALT2 | ALT3 | ALT4 | ALT5 | ALT6 | ALT7 | ALT8 | ALT9
0 | AD_B0_03 | FLEXCAN2_RX | XBAR1_INOUT17 | LPUART6_RX | USB_OTG1_OC | FLEXPWM1_PWMX01 | GPIO1_IO03 | REF_CLK_24M | LPSPI3_PCS0 | |
38/A14 | AD_B1_12 | FLEXSPIA_DATA01 | ACMP_OUT00 | LPSPI3_PCS0 | SAI1_RX_DATA00 | CSI_DATA05 | GPIO1_IO28 | USDHC2_DATA4 | KPP_ROW01 | ENET2_1588_EVENT2_OUT | FLEXIO3_FLEXIO12
39/A5 | AD_B1_13 | FLEXSPIA_DATA00 | ACMP_OUT01 | LPSPI3_SDI | SAI1_TX_DATA00 | CSI_DATA04 | GPIO1_IO29 | USDHC2_DATA5 | KPP_COL01 | ENET2_1588_EVENT2_IN | FLEXIO3_FLEXIO13

So both pins 0 and 38 are LPSPI3_PCS0.

For some of the SPIx objects we have multiple SPI CS pins available.
You can choose which one (if any) is active by setting the appropriate fields in the TCR register.
See the fields defined in the processor reference manual: 48.5.1.15 Transmit Command (TCR)

It's been a while since I worked with it, but you can choose which of the SPI CS pins/channels to use as CS, and then play around
with the CONT field, the frame size, and the like. For example, you could set the word size to 32 bits. There are no publicly
defined APIs to transfer 32-bit values, although there could be. Actually, there is one defined as private. Not sure why...
But...
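
As a rough illustration of the TCR fields mentioned above, here is a minimal host-side sketch of how a TCR value could be composed. The bit positions (PCS in bits 25..24, CONT in bit 21, FRAMESZ in bits 11..0) are taken from the LPSPI TCR description in the reference manual; the helper names are made up for this sketch and are not part of the SPI library. On real hardware the clock polarity, phase, and prescaler bits already in TCR would also have to be preserved.

```cpp
#include <cstdint>

// Hypothetical helpers; only the bit positions come from the reference manual.
constexpr uint32_t tcr_framesz(uint32_t frame_bits) { return (frame_bits - 1u) & 0xFFFu; }  // FRAMESZ, bits 11..0
constexpr uint32_t tcr_pcs(uint32_t n) { return (n & 0x3u) << 24; }                         // PCS, bits 25..24
constexpr uint32_t kTcrCont = 1u << 21;                                                     // CONT, bit 21

// Compose only the fields discussed in this thread; every other TCR bit is left at 0.
constexpr uint32_t make_tcr(uint32_t pcs, uint32_t frame_bits, bool keep_cs_asserted) {
  return tcr_pcs(pcs) | tcr_framesz(frame_bits) | (keep_cs_asserted ? kTcrCont : 0u);
}

// PCS0, 16-bit frames, CONT clear: CS is released at the end of every frame.
static_assert(make_tcr(0, 16, false) == 0x0000000Fu, "PCS0, FRAMESZ = 15");
// PCS1, 32-bit frames, CONT set: CS stays asserted across frames.
static_assert(make_tcr(1, 32, true) == 0x0120001Fu, "PCS1, CONT, FRAMESZ = 31");
```

With CONT clear, the peripheral negates CS at the end of each frame, which is the per-word toggling asked about at the top of the thread.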
 
@KurtE Thanks! I misread the pinout. Pin 38 works too! I didn't realize that both pins were PCS0. (I will update the above post to note that too.) I think I've got at least the beginnings of a handle on how to accomplish some of what I'm going for, and when I get a minimal example of DMA-driven 16-bit and 32-bit transfers, I'll post it.
 
It seems that a minimal example of this is not straightforward, as it requires modifying SPIClass::transfer: that function explicitly sets the frame size to 7 and the DMA to single-byte transfers.
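
To make the consequence concrete, here is a small host-side sketch of how frame width relates to DMA element size and to the number of CS pulses. It assumes one DMA element per SPI frame, with the element width rounded up to 1, 2, or 4 bytes; the function names are hypothetical and do not exist in the SPI library.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes moved per DMA element for a given frame width
// (assumption: one element per frame, rounded up to 1, 2 or 4 bytes).
constexpr std::size_t dma_element_bytes(unsigned frame_bits) {
  return frame_bits <= 8 ? 1 : (frame_bits <= 16 ? 2 : 4);
}

// Number of frames, and therefore CS pulses, needed to send a buffer.
constexpr std::size_t frame_count(std::size_t buffer_bytes, unsigned frame_bits) {
  return buffer_bytes / dma_element_bytes(frame_bits);
}

// A buffer of ten uint16_t values is 20 bytes:
static_assert(frame_count(20, 8) == 20, "8-bit frames: CS toggles 20 times");
static_assert(frame_count(20, 16) == 10, "16-bit frames: CS toggles 10 times");
static_assert(frame_count(20, 32) == 5, "32-bit frames: CS toggles 5 times");
```

So with the stock byte-wise transfer, each uint16_t is split across two CS pulses, which is why a DAC expecting one 16-bit word per CS cycle would not respond.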

I need to be able to do 16-bit transfers via DMA with the CS pin controlled by the SPI controller, and I'm trying to figure out the cleanest way to do this in one of my projects, and I see basically two possible approaches:
1. Update the existing SPI library to include this functionality.
2. Make my own fork of the Teensy SPI library and use it in my projects.

Is the former reasonable to do? I don't know what constraints there are re: uniformity of the SPI interface across "Arduino-compatible" devices, but if it were possible to update the library to expose this functionality, that would be amazing. I wrote a small PR to add this functionality for Teensy 4.x here: https://github.com/PaulStoffregen/SPI/pull/70, but it's not beautiful and elegant, and there are parts I don't understand deeply enough to be sure there aren't bugs. If there's not some fundamental reason this functionality can't/shouldn't be added to the library, would anyone be willing to help me get this PR up to snuff and get it into the library?
 
You are right. Originally, when I added the transfer(txbuf, rxbuf, cnt) methods as well as the async version,
I had 16-bit versions of these as well. But I was asked to remove the 16-bit versions, which I did...

Later, a 16-bit version, transfer16(tx, rx, cnt), as well as a 32-bit version was added, but it is marked private in the header file. That confuses me:
why have a private function that is not used in the library and that no one can call? I could sort of understand it if it were in a protected:
section, since then you could create a subclass that calls it.

But keeping my fingers crossed that hopefully it makes it in.
 
@KurtE thanks, that's good to know! If you have any feedback on the PR as well, I'd certainly appreciate it. Thanks again!
What goes into this is totally up to Paul, as it is his platform and his GitHub project.
If he sounds at all receptive to this and asks, there are probably a few of us who will try it out and the like.

This is just my guess, but I believe you would probably have a better chance with multiple functions, like transfer16 and transfer32. There are
already some implementations for the non-DMA versions of some of these.
That is, instead of your version:
Code:
bool transfer(const void *txBuffer, void *rxBuffer, size_t count, EventResponderRef event_responder, uint8_t framesize=7);

You would have:
Code:
bool transfer16(const void *txBuffer, void *rxBuffer, size_t count, EventResponderRef event_responder);

And if he does accept one like you mentioned, I would think you should pass in 8-32 instead of 7-31 as the parameter.
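
A minimal sketch of that suggestion: the caller passes the frame width in bits (8-32), and the conversion to the FRAMESZ register value (bits minus one) happens inside. The function names and the fallback to 8-bit frames for out-of-range input are assumptions made for illustration only; neither is taken from the library or from the PR.

```cpp
#include <cstdint>

// Hypothetical check: the caller-facing parameter is the width in bits.
constexpr bool frame_bits_valid(unsigned bits) { return bits >= 8 && bits <= 32; }

// Convert a width in bits to the FRAMESZ value (bits - 1).
// Out-of-range input falls back to 8-bit frames; that fallback is this
// sketch's own choice, not library behaviour.
constexpr uint32_t framesz_from_bits(unsigned bits) {
  return frame_bits_valid(bits) ? bits - 1u : 7u;
}

static_assert(framesz_from_bits(8) == 7, "8-bit frames");
static_assert(framesz_from_bits(16) == 15, "16-bit frames");
static_assert(framesz_from_bits(32) == 31, "32-bit frames");
static_assert(framesz_from_bits(7) == 7, "too small: falls back to 8-bit frames");
```

Taking the width in bits keeps the off-by-one register encoding out of user code, which is the point of passing 8-32 rather than 7-31.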

Questions you asked:
Should new API's such as this be applied to the Teensy 3.x? My guess is yes. And this adds complexity as the
DMA implementations are different for 3.6 versus 3.5 versus ...

I don't think you would need to implement for AVR boards.

Should other APIs be updated to add this consistency? I am guessing you would probably need to go the other way.

Does the code you mentioned need to be updated?

Maybe. For example, the < 2 check would need to go into the different versions of the API for the different word sizes.

The arm_dcache_... calls probably do not need to change. They exist because some memory regions are cached, while DMA always talks to actual physical memory. So before DMA reads from a buffer you have written, you need to flush whatever is in the cache out to memory, and before you read a buffer that DMA has filled, you need to delete (invalidate) the cached copy so that new reads pick up the updated data that came in from DMA.
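
One practical detail that follows from this: cache maintenance works on whole cache lines (32 bytes on the Cortex-M7), so it is safer if a DMA buffer starts on a line boundary and occupies whole lines. The sketch below is a host-compilable illustration with a hypothetical DmaBuffer wrapper; the arm_dcache_flush and arm_dcache_delete calls from the Teensy 4 core appear only in comments, since they are not available off-target.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLineBytes = 32;  // Cortex-M7 data cache line size

// Hypothetical wrapper: aligning the struct to a cache line also pads its
// size to a multiple of 32 bytes, so flushing or deleting its cache lines
// cannot affect neighbouring variables.
template <typename T, std::size_t N>
struct alignas(32) DmaBuffer {
  T data[N];
};

using Buf16 = DmaBuffer<uint16_t, 10>;  // 20 bytes of payload
static_assert(alignof(Buf16) == kCacheLineBytes, "starts on a cache line");
static_assert(sizeof(Buf16) == 32, "padded to one full cache line");

// On a Teensy 4.x the usage would look roughly like this (not compiled here):
//   arm_dcache_flush(tx.data, sizeof(tx));   // before DMA reads the TX buffer
//   arm_dcache_delete(rx.data, sizeof(rx));  // before the CPU reads DMA-written RX data
```

This only matters for buffers placed in cached memory; whether a given buffer is cached depends on where the linker puts it.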

Hope that helps
Kurt
 