External ADC with Teensy 4.0

DrM

I need to use an external 16-bit, 1 MS/s ADC over SPI with the Teensy 4.0. We are going to assert CONVERT and read the 16-bit data over SPI, hopefully at 1 MS/s.

Will the SPI library support reads at this rate on the T4.0?

Are there any issues or tricks?

Thank you very much
 
If you look at the Teensy datasheet (there is a link to it), it shows that:

"Absolute maximum frequency of operation (fop) is 30 MHz. The clock driver in the LPSPI module for fperiph must be guaranteed this limit is not exceeded."

Note: we have exceeded this speed when playing with some different displays, but that is what their data sheet says.

So if you need 1 million samples per second, each one returning a 16-bit value, and there are not too many gaps between samples, you might make it at 30 MHz.

My guess is you would probably not get enough speed by simply calling SPI.transfer16(0) 1 million times.

Although it would be simple enough to run a quick test and see how many milliseconds it takes to do 1 million transfers.
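
A test like that could be as small as the sketch below. This is only a minimal sketch and has not been run against the ADC: the 30 MHz clock, MSBFIRST and SPI_MODE0 settings are assumptions to check against the part's data sheet, and samplesPerSecond is a helper name made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turn a transfer count and an elapsed time in
// microseconds into transfers per second.
inline double samplesPerSecond(uint32_t count, uint32_t elapsedMicros) {
  assert(elapsedMicros > 0);
  return count * 1.0e6 / elapsedMicros;
}

#if defined(ARDUINO)
#include <Arduino.h>
#include <SPI.h>

void setup() {
  Serial.begin(115200);
  while (!Serial && millis() < 3000) {}  // wait briefly for the serial monitor
  SPI.begin();

  const uint32_t kCount = 1000000;
  volatile uint16_t sink = 0;  // keeps the reads from being optimized away

  // Assumed bus settings: 30 MHz, MSB first, mode 0.
  SPI.beginTransaction(SPISettings(30000000, MSBFIRST, SPI_MODE0));
  uint32_t start = micros();
  for (uint32_t i = 0; i < kCount; i++) {
    sink = SPI.transfer16(0);  // one blocking 16-bit transfer per call
  }
  uint32_t elapsed = micros() - start;
  SPI.endTransaction();

  Serial.print("elapsed us: ");
  Serial.println(elapsed);
  Serial.print("transfers per second: ");
  Serial.println(samplesPerSecond(kCount, elapsed));
}

void loop() {}
#endif
```

The results are thrown away on purpose, since 1 million 16-bit values would not fit in the Teensy 4.0's RAM; the point is only to time the calls.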
 
@KurtE Thank you. Yes, that is why I felt I needed to ask. To be concrete, the part we are contemplating is the MCP33131D-10. I should preface this by saying that I have never programmed one of these before.

What I make out from the data sheet is that the conversion time is 700 ns, and then it goes into input acquisition, where it clocks the bits out on SDO, and that has to fit in 300 ns to make the 1 MS/s rate. So, that means 16 bits/300 ns = 53 MHz?

Working backwards, 16 bits/30MHz = 533nsecs. So, 700ns + 533ns = 1.23 usec, and therefore the best rate is 810 kHz? There is probably another 10ns here and there for logic transition. So, call it 800kHz.
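
The arithmetic above can be written out as a few small functions. This is only a sketch of the estimate made in this post: the function names are made up for the example, and the 700 ns and 300 ns figures are the ones quoted above, not values re-checked against the data sheet.

```cpp
#include <cassert>

// Times are in nanoseconds, clock rates in MHz (1 MHz moves 1 bit per 1000 ns).

// Time needed to shift `bits` bits out at a given SPI clock.
inline double shiftTimeNs(int bits, double sclkMHz) {
  assert(sclkMHz > 0.0);
  return bits * 1000.0 / sclkMHz;
}

// SPI clock needed to shift `bits` bits inside a fixed time window.
inline double requiredClockMHz(int bits, double windowNs) {
  assert(windowNs > 0.0);
  return bits * 1000.0 / windowNs;
}

// Sample rate in kS/s when each sample costs one conversion plus one readout.
inline double sampleRateKsps(double convertNs, int bits, double sclkMHz) {
  return 1.0e6 / (convertNs + shiftTimeNs(bits, sclkMHz));
}

// requiredClockMHz(16, 300.0)      -> about 53.3 MHz
// shiftTimeNs(16, 30.0)            -> about 533 ns
// sampleRateKsps(700.0, 16, 30.0)  -> about 811 kS/s, before any extra dead time
```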

I think that can be okay for this application (with some expectation management, sigh..).

Is there a fast way to do this, including the convert line?
 
BTW What is FlexSPI? Is that relevant for this question? It seems to run faster according to the datasheet for the MCU.
 
FlexSPI is usually used for memory chips. It's available at the pads on the bottom side of Teensy 4.1 where a PSRAM or extra flash memory chip would normally be added.

Theoretically, FlexSPI could be used for non-memory devices. But the main obstacle is its high clock speed. I seem to recall the slowest we were able to go was about 49 MHz, but to be honest I haven't looked at the hardware config for quite some time. There might be an option to divide the speed by 2 which I never explored. Most SPI chips other than memory have maximum clock speed in the 20 to 40 MHz range, so FlexSPI with 49 MHz minimum is just too fast.

How useful it could be for a SPI peripheral chip requiring special timing synchronous to a sampling rate is questionable. Pretty much everything about FlexSPI is designed around the needs of the IMXRT's internal memory buses.

FlexSPI programming is also much more complex than normal LPSPI. You have to program the operations into special lookup tables. I've done it a few times, only ever for memory chips, and every time it takes quite a bit of adjustment to get back into the practice of how it works. It is possible, but not for the faint of heart. It's among the most complex and difficult to understand microcontroller peripherals I've ever used.

FlexSPI is fully documented in chapter 27 of the reference manual starting on page 1601. If you want to play with FlexSPI programming, you really need to first read chapter 27.
 
This chip, the MCP33131D and family, lists a maximum serial clock frequency of 100 MHz.
 
On further thought, it sounds like the effort to use FlexSPI exceeds the budget and benefit for this particular project. He will have to live with 16 bits at hopefully 800 kS/s or 14 at 1 MS/s.

So, any special advice on how to most rapidly iterate over setting CONVERT, waiting 700 ns and reading the data back over SPI at 30 MHz?

Thank you again.
 
This part?

https://www.microchip.com/en-us/product/MCP33131D-10

I would try first with the simplest possible code using SPI.transfer() and digitalWriteFast(). Get that working first before you dive into much more difficult low-level access to LPSPI. Maybe the performance will be acceptable? Even if it's too slow, having a working implementation will be really helpful as you attempt to optimize.
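
A first attempt along those lines might look like the sketch below. It is a starting point only, under several assumptions: the CONVERT pin number, the SPI mode, and the polarity and timing of CONVERT (raise, wait about 700 ns, lower, read) follow the description in this thread and need to be checked against the MCP33131D data sheet. The acquire template and the TeensyHw struct are names made up for the example; splitting them lets the loop be exercised without hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One conversion per sample: raise CONVERT, wait out the conversion time,
// lower CONVERT, then clock the 16-bit result out of the ADC.
// Hw is any type providing setConvert(bool), waitNs(uint32_t) and read16().
template <typename Hw>
void acquire(Hw& hw, uint16_t* out, size_t count, uint32_t convertNs) {
  assert(out != nullptr);
  for (size_t i = 0; i < count; i++) {
    hw.setConvert(true);
    hw.waitNs(convertNs);
    hw.setConvert(false);
    out[i] = hw.read16();
  }
}

#if defined(ARDUINO)
#include <Arduino.h>
#include <SPI.h>

const uint8_t kConvertPin = 9;  // assumed wiring, not taken from the thread

struct TeensyHw {
  void setConvert(bool high) { digitalWriteFast(kConvertPin, high ? HIGH : LOW); }
  void waitNs(uint32_t ns) { delayNanoseconds(ns); }
  uint16_t read16() { return SPI.transfer16(0); }
};

uint16_t samples[4096];

void setup() {
  pinMode(kConvertPin, OUTPUT);
  digitalWriteFast(kConvertPin, LOW);
  SPI.begin();
}

void loop() {
  TeensyHw hw;
  // Assumed bus settings: 30 MHz, MSB first, mode 0.
  SPI.beginTransaction(SPISettings(30000000, MSBFIRST, SPI_MODE0));
  acquire(hw, samples, 4096, 700);
  SPI.endTransaction();
  delay(1000);
}
#endif
```

Timing the loop with micros() around the acquire call would show how far this simple version is from the target rate.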
 
I second what Paul suggested.

You might then also try the buffer version: SPI.transfer16(buffer, retbuffer, count);

It is in the SPI library but marked private. It is easy to try, though: just move it into the public area of the class (or copy the implementation) and see how fast you can get that to work.

There is a DMA version of SPI.transfer. At one point we also had a DMA transfer16, but we did not integrate that in. Assuming it works fast enough, you can then swap the bytes.

The difference between transfer8 and transfer16 (other than the bytes being swapped) is a smaller gap of time between the two bytes of a word.
The simple transfer16 with buffer should give the same performance as the DMA version. The difference is it won't return until the whole transfer completes.
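
The byte swap mentioned above is cheap to do afterwards. A minimal sketch, assuming the data arrived through 8-bit transfers with the most significant byte first and is then viewed as 16-bit words on the little-endian Teensy (swapBytes16 is a name made up for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Swap the two bytes of every 16-bit word in place.
inline void swapBytes16(uint16_t* words, size_t count) {
  assert(words != nullptr || count == 0);
  for (size_t i = 0; i < count; i++) {
    words[i] = static_cast<uint16_t>((words[i] << 8) | (words[i] >> 8));
  }
}
```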
 
Yes that is the part. Yes of course, I would try the simple loop, exactly like that, verify and see if it might be fast enough. The T4 has been very impressive so far.

I was just working thru how dma might work, and thumbing through the eDMA. It seems like it would have to involve an ISR or loop against a clock, to set the convert and then wait and send a clock to the DMA. So, indeed it is hard to see how it would be much faster than just doing it all in the loop.

Okie, dokie, probably time to order the part and try it. Thanks for all the help, it's wonderful as always.
 
I would imagine an advanced approach might use FlexPWM output compare to generate the control waveforms and transmit a peripheral trigger through XBAR1, which would cause LPSPI to perform the 16 bit transfer, and of course LPSPI would trigger a DMA channel to store the data into a buffer.

Maybe the CCD which creates the analog signal also needs control waveforms? It could get complicated. Each FlexPWM submodule can create 2 outputs (not including the extra X output) so you might need to use multiple submodules in sync if more than 2 waveforms are required to sequence everything.
 
It should be fun,

As mentioned, the DMA operation will not speed up the actual transfers.

That is, the slowest way to do things is something like:
Code:
// Slowest approach: one blocking 16-bit transfer per call.
for (int i = 0; i < 1000000; i++) {
  buffer[i] = SPI.transfer16(0);
}

This code completely neuters the SPI hardware TX and RX buffers, by design: it can only do one transfer at a time and must wait until the hardware returns the actual value before it returns, and it does not start anything else on SPI until the next transfer call is issued. So the bus may only be busy half the time.

Whereas with calls like SPI.transfer(buffer, retbuf, count), the code keeps the 16-word TX buffer full for the whole transfer length, and it picks off each of the return values from the SPI hardware RX buffer and places it into the return buffer.
So the SPI will run at full speed. In this case there is also a shorter delay between words...
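
For comparison with the one-call-per-word loop, the buffered call could be used roughly like this. Treat it as a sketch under assumptions: the bus settings are guesses, readBlock is a name made up for the example, and it only shows the bus side (it does not toggle CONVERT between samples, which the ADC would still need). The bytes are assembled MSB first by hand because the buffered call moves 8-bit units.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read `words` 16-bit values in one buffered call and assemble them MSB first.
// Spi is any type with transfer(const void* tx, void* rx, size_t byteCount).
template <typename Spi>
void readBlock(Spi& spi, const uint8_t* txZeros, uint8_t* rxBytes,
               uint16_t* out, size_t words) {
  assert(txZeros != nullptr && rxBytes != nullptr && out != nullptr);
  spi.transfer(txZeros, rxBytes, words * 2);  // the library keeps the FIFO fed
  for (size_t i = 0; i < words; i++) {
    out[i] = static_cast<uint16_t>((rxBytes[2 * i] << 8) | rxBytes[2 * i + 1]);
  }
}

#if defined(ARDUINO)
#include <Arduino.h>
#include <SPI.h>

const size_t kWords = 2048;   // 4 KB of data per call
uint8_t txZeros[kWords * 2];  // zero-initialized dummy bytes to clock out
uint8_t rxBytes[kWords * 2];
uint16_t samples[kWords];

void setup() { SPI.begin(); }

void loop() {
  // Assumed bus settings: 30 MHz, MSB first, mode 0.
  SPI.beginTransaction(SPISettings(30000000, MSBFIRST, SPI_MODE0));
  readBlock(SPI, txZeros, rxBytes, samples, kWords);
  SPI.endTransaction();
  delay(1000);
}
#endif
```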

As I mentioned the DMA version will not be faster for the actual transfer. The main thing there is keeping the SPI doing stuff.

However where the DMA stuff makes a difference, is the actual transfer can be done with almost no overhead of the main processor. Which allows your code to do other things, like probably doing something with all of the data you are receiving.

So for example, you could setup a DMA transfer, to a logical Circular buffer. I typically end up using two DMASetting objects that are chained to each other. And I mark each of these two to interrupt me when they complete. For example, maybe each of these is sized for 4K bytes. So when the ISR is triggered, you can process the 4K bytes, while the SPI continues on to keep receiving more data in the other buffer.
Depending on how fast your process code is, that may be sufficient. If your processing code might be slower in some cases, you may need to expand on it, like maybe copy to external buffer, or maybe if fast enough in most cases, you could maybe link 4 or 8 of these buffers to each other, and hopefully the code will be able to keep up over the larger buffer.
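
The bookkeeping for two chained buffers can be sketched separately from the DMA setup. What follows is only a model of the hand-off logic, not DMA code: PingPong and its methods are hypothetical names, onHalfComplete() stands in for whatever the completion interrupt would call, and a real ISR version would need a closer look at the races between interrupt and main loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of two chained buffers. The producer (DMA in the real
// system) fills one half while the main loop processes the other.
template <size_t N>
class PingPong {
 public:
  // The two destination buffers, one per chained transfer.
  uint16_t* half(int index) {
    assert(index == 0 || index == 1);
    return buf_[index];
  }
  int fillingIndex() const { return filling_; }

  // Call when the half being filled completes. Returns false if the
  // previously completed half had not been taken yet (an overrun).
  bool onHalfComplete() {
    bool ok = !pending_;
    ready_ = filling_;
    filling_ = 1 - filling_;
    pending_ = true;
    return ok;
  }

  // Call from the main loop. Returns the completed half, or nullptr if
  // nothing is waiting.
  uint16_t* takeReady() {
    if (!pending_) return nullptr;
    pending_ = false;
    return buf_[ready_];
  }

 private:
  uint16_t buf_[2][N] = {};
  volatile int filling_ = 0;
  volatile int ready_ = 0;
  volatile bool pending_ = false;
};
```

With 4K-byte halves as in the example above, this would be PingPong<2048>. Linking 4 or 8 buffers instead of 2 follows the same idea with a queue of ready indices in place of the single flag.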
But first baby steps.
 
@Paul That's amazing. The "control waveforms" refers to the ADC? (In the plain DMA, I don't see how to set the convert for each sample, and, not only that, it has its own timing issues.)

The CCD in this instance, gets a pulse to shift data to the readout buffer, while its clock input is held high, and then with the resumption of the clock, the analog levels are presented on the output pin.

So, bottom line, for the length of the record, it only needs a clock. But at each clock, with some small delay for settling time, the ADC needs the leading edge on its convert input.


Most of them work something like that. If there is an extra pin it is usually a shutter. In this case the shift includes a hold, so in effect it's both in one line.
 
Kurt is correct, DMA won't directly make this sort of application run any faster. It just reduces the burden on the CPU, which helps indirectly since you need to actually do something with the rapidly incoming data.

But again, building a simple prototype first with digitalWriteFast() and SPI.transfer() is essential. You really want to learn any special issues with the CCD and ADC the "easy" way first.
 
Of course. As it turns out there is not a lot for the CPU to do while the data is coming in, "nothing, ... really nothing" (ala Gertrude Stein).

The worst case would be the user sending a command at that moment; there is a Serial.available() in the main loop. We could prevent that on the application side, but there must be a way on the Teensy side to hold that off until the read is done.


(I took two vaccines today, really knocked out and a little bit feverish. I'm sorry that I have to head to bed now, really wiped out. Thank you for everything, it's great fun and very elucidating.)
 
Okay, so I am starting on the electrical design. I have a quick question

Which set of pins to use?

The website notes for the Teensy 4.0 say that "the first SPI port features a FIFO for higher sustained speed transfers". So, that means I should use pins 10-13? What about the LED on 13? Having that flashing inside the spectrometer casing might not be ideal.

Any suggestions?


Thank you
 
If I could beg everyone's patience, just this one more time. Here is the interface circuit between the T4 and the digitizer.

Did I use the right pins on the Teensy?

There is a note on the webpage that the first SPI has a FIFO, implying that the others do not. But it looks like the LED is connected to the clock. So, instinctively I feel that is the one to use, but I am also a little concerned since stray light can be an issue for this application.


Screenshot from 2022-12-15 18-36-15.png
 
Based on the labels, MOSI, MISO and SCK seem correct. Odd that it doesn't have a CS control, but that is on the device end ...
 
Fine, thank you. But what about the LED? Do I need the FIFO or can I use another port and still get 30Mb/s for a 4KB transfer?
 
Pin 13 (LED) is the typical SPI SCK pin on T_3.x and T_4.x, so it will be blinking.

FIFO is low level driver specific detail for data transfer.
 
I understand all of that. The question is does each of the LPSPI ports have a receive FIFO? Can I use a different port to avoid the LED?

I found the section in the data sheet, Table 2 implies they all have it, and that makes more sense.

But the first SPI is the only one that is available from the pins around the edge of the board. That means we still have a problem.
 
I know, the datasheet says SPI-compatible, but I would try to use SAI/TDM and use frame-sync for conversion.
I have done that in the past for TI ADS series.
 