I2S Speed (768kHz?)

neltnerb

I'm so, so sorry to post but I'm not sure how to find this information. I know it must be somewhere in the forum but my search terms aren't fruitful.

To start out, I want to make clear that I am not that interested in the audio library, but need to use I2S so hopefully this is an okay place to ask. My primary goal is to output a signal on one channel and to read it on 1-2 other channels and then multiply the signals together and low-pass (i.e. a lock-in amplifier using a CODEC).
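For anyone unfamiliar with the idea, here is a minimal sketch of the multiply-and-low-pass step in plain Python. It is not Teensy code. The function name `lock_in`, the 12 kHz reference, the 50 kHz interferer and the 384 kHz sample rate are all made up for illustration, and a plain average over whole reference periods stands in for the low-pass filter.

```python
import math

def lock_in(samples, f_ref, f_s):
    """Estimate amplitude and phase of the f_ref component of samples."""
    i_sum = 0.0
    q_sum = 0.0
    for k, x in enumerate(samples):
        theta = 2.0 * math.pi * f_ref * k / f_s
        i_sum += x * math.cos(theta)   # in-phase product
        q_sum += x * math.sin(theta)   # quadrature product
    # Averaging is the crudest possible low-pass filter.
    i_avg = i_sum / len(samples)
    q_avg = q_sum / len(samples)
    amplitude = 2.0 * math.hypot(i_avg, q_avg)
    phase = math.atan2(q_avg, i_avg)
    return amplitude, phase

F_S = 384_000.0    # assumed sample rate
F_REF = 12_000.0   # assumed reference frequency (32 samples per period)
N = 3840           # 120 whole reference periods

# Test input: the wanted tone (amplitude 0.25, phase lag 0.5 rad) plus an
# unrelated 50 kHz tone that the lock-in should reject.
signal = [0.25 * math.cos(2.0 * math.pi * F_REF * k / F_S - 0.5)
          + 0.1 * math.sin(2.0 * math.pi * 50_000.0 * k / F_S)
          for k in range(N)]

amplitude, phase = lock_in(signal, F_REF, F_S)
print(amplitude, phase)
```

Using both the cosine and sine products means the result does not depend on the phase shift between the output and the measured signal, which is usually unknown in a real setup.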

I've read a bunch of other posts about the I2S interface speed, and I understand there is a library where you can change a define to set the speed.

I have this CODEC that can do 768kHz at 32-bit stereo in and out (https://www.ti.com/product/TAC5212), and I cannot imagine this will work with the Teensy 4.0 I2S interface but with my apologies I can't quite tell how to guess what the hardware limit is.

I found
https://www.nxp.com/docs/en/nxp/data-sheets/IMXRT1060CEC.pdf of course, and Figure 31/Table 50 says the hardware can go down to a 15ns clock period. I'm not 100% sure how to use that to predict audio sample rates but I am estimating with 1/15ns (MCLK freq) / (32-bits * 4 channels) = 520kHz.

So assuming that it's physically unable to reach 768kHz, does it seem reasonable for it to run the I2S interface at 384kHz? It seems like 192kHz is readily possible, but for this application the faster the better because it's not real audio.

Relatedly but a different question I suppose -- is there a reason to use the audio library to multiply the signal received by a reference waveform? I know there's a multiply function in the audio library but this is pretty high speed, I'm guessing that the people who wrote the audio library are way better programmers than me so I wonder if it will be faster to do that than literally do multiplication at each received sample, get the envelope, and then low-pass.

Again, apologies because this feels like a question that has been answered.
 
4 channels? They would run in parallel, so BCLK for 32-bit 768 kHz is 64 x 768 kHz = 49.152 MHz, which has a period of ~20 ns. There might be a required ratio between MCLK and BCLK, but it has nothing to do with the number of channels. Hopefully the device doesn't need MCLK anyway?
 
I can't quite tell how to guess what the hardware limit is.

I found
https://www.nxp.com/docs/en/nxp/data-sheets/IMXRT1060CEC.pdf of course, and Figure 31/Table 50 says the hardware can go down to a 15ns clock period. I'm not 100% sure how to use that to predict audio sample rates but I am estimating with 1/15ns (MCLK freq) / (32-bits * 4 channels) = 520kHz.

15ns is for MCLK, but the spec that really matters is 40ns for BCLK (bit clock). Ultimately the bit clock determines how many bits per second can move along each I2S data signal.

I2S is always stereo. If you use 32 bit data size (which means 64 bit frame size = BCLK / LRCLK, because stereo), then the I2S digital speed limit is 25 MHz / 64. So according to the I2S specs for Teensy, with the right code you ought to be able to get 384 kHz sample rate.

In the codec chip datasheet, that corresponds to this highlighted place in Table 7-6.

[Attached image 1709899421668.png: highlighted entry in Table 7-6 of the codec datasheet]


If you wanted to get 768 kHz, you would need to reduce the data size to only 16 bits (BCLK / LRCLK = 32), which brings BCLK back down to 24.576 MHz. But you can't get all 32 bits at that high sample rate. Teensy can't do it, and if I'm reading Table 7-6 correctly (will admit, I only quickly skimmed the codec datasheet) it looks like this codec chip can't give you all 32 bits either, if running at 768 kHz.
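The numbers above are easy to re-check with a few lines of arithmetic. This is only a sketch: the 40 ns minimum BCLK period is the figure quoted in this thread, and the function names are invented for illustration.

```python
BCLK_MIN_PERIOD_NS = 40.0  # minimum BCLK period quoted in this thread

def bclk_hz(sample_rate_hz, bits_per_slot, slots_per_frame=2):
    """Bit clock needed for one I2S data line (stereo means 2 slots)."""
    return sample_rate_hz * bits_per_slot * slots_per_frame

def max_sample_rate_hz(bits_per_slot, slots_per_frame=2,
                       min_period_ns=BCLK_MIN_PERIOD_NS):
    """Highest sample rate whose bit clock stays within the BCLK limit."""
    max_bclk_hz = 1e9 / min_period_ns
    return max_bclk_hz / (bits_per_slot * slots_per_frame)

def fits(sample_rate_hz, bits_per_slot, slots_per_frame=2,
         min_period_ns=BCLK_MIN_PERIOD_NS):
    """True if the required bit clock does not exceed the BCLK limit."""
    required = bclk_hz(sample_rate_hz, bits_per_slot, slots_per_frame)
    return required <= 1e9 / min_period_ns

for rate, bits in [(768_000, 32), (384_000, 32), (768_000, 16)]:
    print(rate, bits, bclk_hz(rate, bits), fits(rate, bits))
```

With these assumptions, 32-bit stereo at 768 kHz needs 49.152 MHz and does not fit, while 32-bit at 384 kHz and 16-bit at 768 kHz both need 24.576 MHz and do.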
 
so I wonder if it will be faster to do that than literally do multiplication at each received sample, get the envelope, and then low-pass.

The word "faster" can mean different things, so when talking about how to design the code, it's important to be clear about your goals.

The audio library is pretty efficient. It's also highly resilient to disruption from other libraries utilizing interrupts or needing CPU time for their tasks. This is achieved by receiving, processing, transmitting audio data in blocks, where the default block size is 128 samples. Data transfers from hardware to the buffers are done by DMA. So it is "fast" in the sense of minimizing CPU usage, which allows quite large and complicated audio processing to be achieved. But the trade-off is latency of up to 128 samples.
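To put the block-size trade-off into numbers, here is a small sketch. The function names are made up, and it only covers the time to fill one block; any extra buffering in a real signal path is not modelled.

```python
def block_latency_ms(block_size, sample_rate_hz):
    """Time needed to collect one full block of samples, in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz

def blocks_per_second(block_size, sample_rate_hz):
    """How often the per-block processing code has to run."""
    return sample_rate_hz / block_size

for rate in (44_100, 384_000):
    print(rate, block_latency_ms(128, rate), blocks_per_second(128, rate))
```

At 384 kHz a 128-sample block fills in about a third of a millisecond, so the block latency shrinks as the sample rate goes up, while the processing code has to run about 3000 times per second.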

Again, apologies because this feels like a question that has been answered.

Yes, over and over we've had people on this forum who wanted to achieve the other meaning of fast, processing each sample individually for minimum possible latency. It's quite inefficient CPU-wise. It's also extremely difficult to accomplish reliably, even at 44.1 or 48 kHz sample rate, because even just using USB or timer interrupts can mean you'll miss 1 or more audio samples. It also involves pretty incredible overhead, because you'll either need an interrupt for each audio sample or a tight polling loop. Your code will also suffer overhead like calling functions, setting up pointers to buffers, loading constants into CPU registers... just to process 1 sample.

With block processing, you suffer all that overhead too, but only once for each block. So the overhead gets amortized over 128 samples (or whatever your block size is) which results in much less overall CPU usage for reasonably large block sizes.
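As a rough illustration of the block approach applied to the lock-in from the first post, here is a Python sketch (not audio library code). The class `BlockLockIn`, the one-pole low-pass with `alpha = 0.001` and the test signal are assumptions for the example. The point is that filter and phase state carry over between blocks, so the result matches sample-by-sample processing while the setup work happens once per block.

```python
import math

class BlockLockIn:
    """Lock-in demodulator that processes samples one block at a time."""

    def __init__(self, f_ref, f_s, alpha):
        self.step = 2.0 * math.pi * f_ref / f_s  # reference phase step per sample
        self.alpha = alpha                       # one-pole low-pass coefficient
        self.k = 0                               # running sample index
        self.i_lp = 0.0                          # low-passed in-phase product
        self.q_lp = 0.0                          # low-passed quadrature product

    def process(self, block):
        # Per-block setup: load state into locals once, not once per sample.
        step, alpha, k = self.step, self.alpha, self.k
        i_lp, q_lp = self.i_lp, self.q_lp
        for x in block:
            theta = step * k
            i_lp += alpha * (x * math.cos(theta) - i_lp)
            q_lp += alpha * (x * math.sin(theta) - q_lp)
            k += 1
        # Store state so the next block continues where this one stopped.
        self.k, self.i_lp, self.q_lp = k, i_lp, q_lp
        return 2.0 * math.hypot(i_lp, q_lp)

F_S, F_REF, BLOCK = 384_000.0, 12_000.0, 128   # assumed values
signal = [0.25 * math.cos(2.0 * math.pi * F_REF * n / F_S - 0.5)
          for n in range(200 * BLOCK)]

demod = BlockLockIn(F_REF, F_S, alpha=0.001)
for start in range(0, len(signal), BLOCK):
    amplitude = demod.process(signal[start:start + BLOCK])
print(amplitude)  # settles close to the 0.25 input amplitude
```

A one-pole filter leaves a little ripple at twice the reference frequency, so the estimate hovers near 0.25 rather than landing on it exactly. A smaller `alpha` trades settling time for less ripple.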

Even on extremely fast PCs and Macs, audio is almost always processed in blocks. If you search for specialty audio interfaces used by musicians for recording while also listening to already recorded tracks, you'll find lots of info about loopback latency and reducing block sizes. The takeaway is that even on the fastest modern PCs with the latest AMD or Intel chips, and on Apple Silicon Macs, block processing is used for a good reason.

But unlike a PC or Mac, on Teensy you're running on bare metal with full control over pretty much everything. So it is theoretically possible to process each sample with an interrupt. Plenty of other tests have shown it's possible to run at 1 to 2 million interrupts per second on Teensy 4 (theoretically the hardware can do more but faster than 2 million is almost never practical due to software overhead issues). Just know that making such a design 100% reliable at capturing every audio sample is extremely difficult. Use of any other interrupts usually means you miss an audio sample if you try it this way, especially at a sample rate of 384 or 768 kHz!
 
T4.1 should be able to sample at 768 kHz, since according to Table 7-6 the Teensy only has to provide a 24.576 MHz bit clock.
IMO, you have two options:
1. 16-bit I2S (2 channels, per the I2S protocol, where frame sync is high for 16 bits and low for 16 bits, giving 32 bits per frame)
2. 32-bit mono TDM (single channel, with frame sync high for only 1 bit and low for 31 bits)

In both cases you must write your own SAI interface.

32-bit stereo (2 channels, I2S or TDM) is not possible at 768 kHz (as shown in Table 7-6).

I have a TLV320ADC6140 (not a codec) with identical limits, but I admit that I never tried 768 kHz.
 
Thanks for all the helpful insight, I think I know enough to make some progress. The simplest possible lock-in would require one mono input and one mono output, thanks for the thoughts on interrupts here and the risks and potential benefits.
 
You mentioned that it can only do stereo over I2S and I don't doubt that you're correct, I know little about the protocol. Windows has a USB driver to talk to it (which goes to their own MCU board) that claims it can do 32-bit stereo on 8 channels at 768kHz but, well, that explains why that driver keeps crashing.

I'm pretty sure you are correct that even if the Teensy were able to go faster, as you mentioned earlier, the CODEC wouldn't keep up at that speed and resolution due to the transition timing. I suspect that's why the eval board kept croaking at high sample rates; the entire Windows driver goes unstable after a minute or two. I had hoped that using the Teensy might avoid bad Windows driver programming and non-realtime OS issues, but I agree with you that (assuming I2S can only do stereo) it would require two stereo channels (one out, one in) and thus require either 16-bit accuracy or 384kHz.

I will be pretty thrilled if I can get either, especially since I now know the CODEC couldn't handle 768kHz anyway with multiple channels.
 
I2S protocol is stereo. LRCLK (called FSYNC on this chip) is 50% duty cycle in I2S, where left is transmitted in one phase of LRCLK and right is transmitted in the other phase.

Maybe you could configure that chip for some non-I2S protocol, perhaps with FSYNC asserted for only 1 bit clock cycle. Teensy's SAI port is highly configurable with 5 registers. So is the DMA controller. Both have quite a learning curve, and even if you're familiar with them, finding the proper settings for any new protocol requires quite a bit of fiddling and experimentation (at least for me...)
 
The CODEC says it also supports TDM and "Left-justified [edited for typo]", I'll have to look them up but sounds like a good idea if those support mono.
 
Left justified is a variant of I2S with a 1 bit timing shift of the data. It is easier for some chips to handle. Audio ADCs and DACs are all at least stereo because why wouldn't they be...
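To make the 1 bit shift concrete, here is a small Python sketch that lists the LRCLK level and data bit for each BCLK cycle of one stereo frame. The function `frame_bits` and its parameters are invented for illustration. Samples are treated as unsigned bit patterns, and the LRCLK polarity for the left channel is left as a parameter because it differs between formats and chips. As a simplification, the delayed case fills the first cycle with the last bit of the same frame, as if the previous frame had been identical.

```python
def frame_bits(left, right, bits=16, data_delay=1, left_level=0):
    """Return one (lrclk, data_bit) pair per BCLK cycle for a stereo frame.

    data_delay=1 models I2S (MSB one BCLK after the LRCLK edge);
    data_delay=0 models left-justified (MSB aligned with the edge).
    """
    def msb_first(value):
        return [(value >> (bits - 1 - i)) & 1 for i in range(bits)]

    payload = msb_first(left) + msb_first(right)
    n = 2 * bits
    out = []
    for t in range(n):
        lrclk = left_level if t < bits else 1 - left_level
        data_bit = payload[(t - data_delay) % n]  # wraps around for cycle 0
        out.append((lrclk, data_bit))
    return out

# Left sample 0x8001 has only its MSB and LSB set, so both are easy to spot.
i2s = frame_bits(0x8001, 0x0000, data_delay=1, left_level=0)
lj = frame_bits(0x8001, 0x0000, data_delay=0, left_level=1)
print(i2s[:3], i2s[15:18])
print(lj[:3], lj[14:17])
```

In the I2S case the left MSB shows up in cycle 1 and the left LSB spills into the first cycle of the right half; in the left-justified case the left sample sits entirely inside its own half of the frame.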
 
Oh for sure it's capable of it. I just only need mono out. But it seems like that won't change the situation much because the audio data format is default stereo, yeah.

I'm pleased in any case, I am going to be thrilled if I can get to 384kHz with such an odd approach.
 