Updated 8x8 and 16x16 audio

palmerr

Folks,

With the CS42448 going end of life, I'm updating my 8x8 audio hardware (Audio Toy) and drivers to use multiple TLV320AIC3104 chips (4 per board).

These CODECs pay proper attention to TDM transmission, with settings to enter 256 clocks/frame mode, to select the TDM slots used, and to let DOUT go tri-state when not sending data.

However, the CODECs require an inverted BCLK, as the data is valid on the falling BCLK edge, rather than the rising edge as for the CS42448.

Is there an easy way to invert the BCLK in the Teensy TDM driver, or should I just invert the signal with hardware?

BTW, the new design allows two boards to be stacked for 16x16 channels and has PGAs that allow direct connection of microphones without preamps.
 
Thought I remembered something … there’s a thread here … though as so many do it peters out without a report of success.

How are you planning to do the 16x16 stack? By this time I don’t think Paul’s demultiplexer is going to see the light of day…
 

h4yn0nnym0u5e,


This particular TLV chip has no I2C address lines, so I decode the I2C SCK line on each board with a 2:4 analogue mux. The enable pin on the mux is driven by a Teensy pin, with a jumper to differentiate between the boards. That takes care of the I2C control.

256 BCLKs gives a practical limit of 16 x 16-bit slots.

I took a quick look at Paul's approach for the CS42448 a while ago and I remember it involved shifting the LRCLK pulse and muxing the DO lines on the CS chips. In this case, the TLV CODECs have all that built in - you just have to tell them that it's TDM mode and which slot to occupy, the chip tri-states the DO pin when not driving its slot.
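The slot arithmetic can be sketched as follows. This is only an illustration under stated assumptions: 256 BCLKs per frame, 16-bit slots, and each stereo CODEC occupying two adjacent slots. The function names are hypothetical, and the `extraBits` parameter simply models any fixed shift the alignment might need.

```cpp
#include <cstdint>

// Assumptions for this sketch: 256 BCLKs per frame, 16-bit slots,
// and each stereo CODEC driving two adjacent slots (left, then right).
constexpr uint32_t kBclksPerFrame = 256;
constexpr uint32_t kBitsPerSlot   = 16;
constexpr uint32_t kSlotsPerChip  = 2;

// Number of 16-bit slots that fit in one frame.
constexpr uint32_t slotsPerFrame() { return kBclksPerFrame / kBitsPerSlot; }

// BCLK offset from the start of the frame at which chip `chipIndex`
// starts driving its first slot. `extraBits` adds a fixed shift if needed.
constexpr uint32_t chipOffsetBclks(uint32_t chipIndex, uint32_t extraBits = 0) {
    return chipIndex * kSlotsPerChip * kBitsPerSlot + extraBits;
}
```

With four chips per board, the first board would use offsets 0, 32, 64 and 96, and a second stacked board would continue from 128 up to 224, filling all 16 slots.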

WMXZ,

Thanks for the tip.

I inverted the BCLK with hardwired logic and shifted the slot one bit later to get the right DO/BCLK/LRCLK alignments and it worked fine. I'm now in two minds about staying 'Teensy Audio TDM' standard (with hardware inversion) and going non-standard with your mod.

I'll try out your soft approach and then decide - the software approach would simplify the hardware. If the I2Sx_RCR2 and I2Sx_TCR2 registers are directly available in user mode, then the software solution seems straightforward.
 
Ah yes, being able to configure the slots per-chip makes life a lot easier! Any particular reason you chose an analogue mux rather than a TCA9548A for the I2C routing? It would save on Teensy pins, though possibly more expensive…

There are moves afoot to allow more than just one pair of TDM streams, so you might want to add some jumpers to route the other three audio data pins to the TLV chips, and possibly to the second I2S bus, too. Unfortunately OUT1B is not easy to get to on Teensy 4.0 :( But you’d end up being able to stack 6 boards giving 48/48 I/O :cool:

You can access the registers directly from user space, though obviously it would be cleaner to have the library provide a function to do it.
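A minimal sketch of what the software inversion might look like, written as a pure function so it can be checked off-target. It assumes the bit clock polarity field (BCP) is bit 25 of the I2Sx_TCR2 and I2Sx_RCR2 registers; that position and the helper name are assumptions to verify against the core headers and the reference manual.

```cpp
#include <cstdint>

// Assumption: BCP (bit clock polarity) is bit 25 of the SAI TCR2/RCR2 registers.
// Verify against the core headers / reference manual before relying on it.
constexpr uint32_t kBcpBit = 1u << 25;

// Return the register value with the bit clock polarity flipped,
// leaving every other field unchanged.
constexpr uint32_t withInvertedBclk(uint32_t regValue) {
    return regValue ^ kBcpBit;
}
```

On hardware, the result would presumably be written back to both I2Sx_TCR2 and I2Sx_RCR2 after the TDM objects have configured the port, and possibly with the transmitter and receiver disabled while the registers change.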
 
Thanks for the thought about the I2C mux. I simply didn't think of it! I'll see if there's a 4way version available, as it will save some Teensy pins and be a more elegant solution, overall.

I've left enough overhang space on the PCB for a T4.1, making OUT1B easier to get at if needed. As for the jumpers for the extra data pins, that sounds like a good idea if I can make them fit on an already very complex board.

The reason for T4.1 compatibility is that I also want to get back to audio over ethernet. I'm heading towards Vincent Buren's VBAN protocol, as it has neat (and really cheap, if not always free) send and receive apps for the major PC and tablet platforms and fairly easy integration into most DAWs. Paul also mentioned an Apple protocol a while ago, but I can't find the reference again. Anyway, that's a discussion for another thread!
 
TCA9546 seems to be a 4-way version, though the TCA9548 in VQFN is physically smaller, as far as I can see. Both can have one of 8 different addresses, which you'd want if you have ambitions to a 6-board stack (bragging rights, if nothing else...).
 
The mux needs to be done per board, so the 1:4 mux is a better fit. (TCA/PCA 9543/4). It is bigger than the tiny TSSOP-10 analog mux I'm using currently and the wiring is more complex (individual SDA/SCL and resistors for each chip) than the mux version (individual SCK and one resistor each). I'll keep playing with the approach, however, as it's more elegant with the I2C mux and saves 4 Teensy pins.

I'm not really interested in bragging rights, but am keen to ensure that Teensy Audio users can get the hardware they need to deliver on their dreams!
 
Progress:

I have populated a Rev A board and have I2C multiplexing working across the four CODECS and am now working on getting the TDM slots properly aligned and enabled for the ADCs and DACs. The programming model for the chip is quite complex (more than 150 writable registers, with some needing to be written in a specific order!) so progress is somewhat tedious on the software side.

Rev B will use a TCA/PCA9544 for I2C multiplexing, as there is a suitably small package available.
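For reference, channel selection on a 4-channel mux of this type comes down to writing a single control byte. The sketch below assumes the usual TCA/PCA9544A layout, where bit 2 enables the selection and bits 1:0 pick the channel; the names are hypothetical and the layout should be checked against the datasheet of the exact part fitted.

```cpp
#include <cstdint>

// Assumption: TCA/PCA9544A-style control byte, where bit 2 enables
// the selection and bits 1:0 choose one of the four downstream channels.
constexpr uint8_t kMuxEnableBit = 0x04;

// Control byte that disconnects every downstream channel.
constexpr uint8_t kMuxAllOff = 0x00;

// Control byte that routes the upstream I2C bus to CODEC 0..3 on a board.
constexpr uint8_t muxControlByte(uint8_t codecIndex) {
    return static_cast<uint8_t>(kMuxEnableBit | (codecIndex & 0x03));
}
```

The byte would be sent to the mux's own I2C address before each burst of CODEC register writes, so the per-chip configuration code only needs to know a CODEC index.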

Hand soldering the VQFN CODEC chips was a pain, so I'd recommend buying boards with at least these chips assembled - I'll share the full design on JLCPCB when it's done.
 
The Rev A board is working nicely; however, there is an input spike at ~21kHz, the magnitude of which is directly proportional to the signal level (at least for inputs from 0.9 FSD to 0.05 FSD). There is no frequency spike on a sine wave output.

The code was compiled at 600 MHz for the T4.0 using the latest Teensyduino.

The spur frequency seems to be a sideband of 1/2Fs as its frequency is 21.025kHz with a primary signal of 1kHz and 21.85kHz with a 200Hz primary. The magnitude of the spike increases with frequency, being 50dB below the fundamental at 200Hz, and 5dB greater than the fundamental at 8kHz, where the spur frequency has moved down to 14.05kHz.
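The sideband relationship can be checked with a few lines of arithmetic. The sketch assumes a nominal sample rate of 44.1 kHz, which the thread does not state explicitly, and the function name is hypothetical.

```cpp
// Assumption: nominal sample rate of 44.1 kHz.
constexpr double kSampleRateHz = 44100.0;

// Frequency of a component sitting below Fs/2 by the tone frequency,
// i.e. the lower sideband of Fs/2 for a tone at `toneHz`.
constexpr double halfRateSidebandHz(double toneHz) {
    return kSampleRateHz / 2.0 - toneHz;
}
```

Under that assumption, 200 Hz and 8 kHz inputs give 21.85 kHz and 14.05 kHz, matching the measurements, while a 1 kHz input gives 21.05 kHz, close to the 21.025 kHz reported.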

Adding and removing the LRCLK series resistor doesn't change the magnitude, and the frequency isn't exactly half the sample rate, so I doubt if it's crosstalk from LRCLK.

I tried some input filtering (150 ohms/2700pF), but that didn't make any difference to the magnitude of the spur.

The artefact is there and the same magnitude with both single-ended and differential inputs.

There's no other active logic on the PCB, apart from the T4.0, the TLV320AIC3104s and a single inverter for BCLK.

Any clues?
 

Attachments: test.jpg, pcb.jpg, schematic.pdf
That seems to be behaving like aliasing, but having said that I don't know enough about how and where that might arise!

One thing I could and did calculate ... that input filtering with 150R/2700pF has a -3dB point of about 393kHz, so it won't make any difference. I found an online tool, though I only used it to check my hand calculation of 1/(2·pi·R·C) ... promise.
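For anyone wanting to repeat the hand calculation, a small sketch of the single-pole RC corner frequency, f = 1/(2·pi·R·C). The function name is hypothetical.

```cpp
constexpr double kPi = 3.14159265358979323846;

// -3 dB corner frequency in Hz of a single-pole RC low-pass filter.
constexpr double rcCornerHz(double ohms, double farads) {
    return 1.0 / (2.0 * kPi * ohms * farads);
}
```

Calling `rcCornerHz(150.0, 2700e-12)` gives roughly 393 kHz, far above the audio band, which is why the filter has no visible effect on the spur.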
 
Yes, your calculation of the pole frequency is correct.

CODECs oversample at 128 or 256 times Fs and single pole filters start to roll off at 1/10 of the corner frequency, so an antialiasing filter is usually set at around 10x the maximum desired sample frequency.

I think I actually have a TDM timing issue on the input which is bleeding a bit across two channels or sampling when the data isn't stable. The aliasing artefact doesn't appear if I feed the right channel input instead of the left.

Next test will be to connect the TDM inputs to the outputs and see what comes out!
 
There is some awesome work being done, that I've dreamed of for ages. Very nice.

I've done a lot with this type stuff so let me know if I can help.
 
JayShoe,
This is why I chose the 3105.
I must have missed something when reading the 3105 datasheet. How are the different I2C addresses selected?

My choice of the 3104 was for the differential inputs, as I do a lot of pro audio work where that's essential. On the 3104 I've used an I2C mux arrangement, which works fine.

I'm still narrowing down the aliasing issue, above, which I now suspect is a 3104 register programming issue, as it is only on the R channel across all four devices and only on inputs and not outputs.

BTW, once I have this version working, I'll take a look at the 4 stream work and add jumpers for the boards to select the appropriate DO/DI lines. 16 x 4 is a nice number of channels to be able to access! (or 8 x 4 in 24 bit mode).

BTW, have you got multiple 3105s working in TDM mode?
 
The CS42448 uses 32-bit TDM slots and the 3104/3105 chips use 16-bit slots. It appears that the audio library code assumes this and does some 'special' data juggling to put the 32-bit data into alternate audio-library buffer slots.

The relevant (input) code seems to be in the ISR code as the DMA blocks are processed, specifically lines 134-139:
Code:
    for (i=0; i < 16; i += 2) {
        uint32_t *dest1 = (uint32_t *)(block_incoming[i]->data);
        uint32_t *dest2 = (uint32_t *)(block_incoming[i+1]->data);
        memcpy_tdm_rx(dest1, dest2, src);
        src++;
    }

The memcpy_tdm_rx( ) code seems to do the actual juggling (lines 94...)

Code:
// TODO: needs optimization...
static void memcpy_tdm_rx(uint32_t *dest1, uint32_t *dest2, const uint32_t *src)
{
    uint32_t i, in1, in2;
    for (i=0; i < AUDIO_BLOCK_SAMPLES/2; i++) {
        in1 = *src;
        in2 = *(src+8);
        src += 16; // was  "src += 8;" prior to jMarsh's response.
        *dest1++ = (in1 >> 16) | (in2 & 0xFFFF0000);
        *dest2++ = (in1 << 16) | (in2 & 0x0000FFFF);
    }
}


The following version appears equivalent (it simply processes the samples individually instead of in pairs), yet it works properly for the 3104:

Code:
static void memcpy_tdm_rx_16(uint16_t *dest1, uint16_t *dest2, const uint32_t *src)
{
    uint32_t i, in1;
    for (i=0; i < AUDIO_BLOCK_SAMPLES; i++) { // 2 samples per word
        in1 = *src;
        *dest1++ = (uint16_t)((in1 >> 16) & 0x0000FFFF);
        *dest2++ = (uint16_t)(in1 & 0x0000FFFF);
        src += 8; // 2 samples per word
    }
}

...with the appropriate 16-bit destination pointers defined in the ISR:
Code:
    for (i=0; i < 16; i += 2) { // channel pairs
        uint16_t *dest1 = (uint16_t *)block_incoming[i]->data;
        uint16_t *dest2 = (uint16_t *)block_incoming[i+1]->data;
        memcpy_tdm_rx_16(dest1, dest2, src);
        src++;
    }

Any clues as to why one works and the other produces the weird aliasing described above?
 
Any clues as to why one works and the other produces the weird aliasing described above?
This doesn't look right:
Code:
// TODO: needs optimization...
static void memcpy_tdm_rx(uint32_t *dest1, uint32_t *dest2, const uint32_t *src)
{
    uint32_t i, in1, in2;
    for (i=0; i < AUDIO_BLOCK_SAMPLES/2; i++) {
        in1 = *src;
        in2 = *(src+8);
        src += 8;
        *dest1++ = (in1 >> 16) | (in2 & 0xFFFF0000);
        *dest2++ = (in1 << 16) | (in2 & 0x0000FFFF);
    }
}

The same src data is being used twice; first as in2, then as in1 during the next iteration. I think src should be incremented by 16 instead of 8.
 
jMarsh,

You are quite correct, I miscopied a partially edited piece of code from Paul's original version.

It should read:
Code:
static void memcpy_tdm_rx(uint32_t *dest1, uint32_t *dest2, const uint32_t *src)
{
    uint32_t i, in1, in2;
    for (i=0; i < AUDIO_BLOCK_SAMPLES/2; i++) {
        in1 = *src;
        in2 = *(src+8);
        src += 16;
        *dest1++ = (in1 >> 16) | (in2 & 0xFFFF0000);
        *dest2++ = (in1 << 16) | (in2 & 0x0000FFFF);
    }
}

I have fixed the original post so as not to confuse others.
 
The original library is intended to output 16 16-bit samples in each frame: the CS42448 gets its 32-bit samples by dint of the designer only wiring the even-numbered ports on the TDM output object. Purportedly tested by Paul, but maybe not as thoroughly as you!
 
…hit “post” too soon…

Maybe the omission of the masks for in1 is the issue? Hard to say…
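One way to probe that suspicion off-target is to apply the two packing expressions from memcpy_tdm_rx to known frame words and see which sample lands where. The sketch below assumes the layout that memcpy_tdm_rx_16 relies on (even slot in bits 31:16 of each TDM word, odd slot in bits 15:0) and a little-endian CPU, so the low half of a 32-bit store is the earlier of two 16-bit samples. All function names are hypothetical, and `packOddInOrder` is only an illustration, not the library's fix.

```cpp
#include <cstdint>

// Two consecutive 16-bit samples stored as one 32-bit word on a
// little-endian CPU: low half = earlier sample, high half = later sample.
constexpr uint16_t earlierSample(uint32_t packed) { return static_cast<uint16_t>(packed & 0xFFFF); }
constexpr uint16_t laterSample(uint32_t packed)   { return static_cast<uint16_t>(packed >> 16); }

// The two expressions used in memcpy_tdm_rx. in1 is the TDM word from the
// earlier frame, in2 the word from the following frame.
constexpr uint32_t packEven(uint32_t in1, uint32_t in2) { return (in1 >> 16) | (in2 & 0xFFFF0000); }
constexpr uint32_t packOdd(uint32_t in1, uint32_t in2)  { return (in1 << 16) | (in2 & 0x0000FFFF); }

// Hypothetical alternative for the odd slot that keeps the frames in time order.
constexpr uint32_t packOddInOrder(uint32_t in1, uint32_t in2) { return (in1 & 0x0000FFFF) | (in2 << 16); }

// Test pattern: frame 1 carries even=0xE001, odd=0x0D01; frame 2 carries even=0xE002, odd=0x0D02.
constexpr uint32_t kFrame1 = 0xE0010D01u;
constexpr uint32_t kFrame2 = 0xE0020D02u;
```

With this pattern, `packEven` keeps the frames in order, but `packOdd` places the frame 2 sample first and the frame 1 sample second. Swapping each adjacent pair of samples on the odd channel would be expected to add a component at Fs/2 minus the tone frequency whose level rises with the tone frequency, which is at least consistent with the spur described earlier in the thread.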

For my multi-output update I had to completely re-write the blocks-to-buffer code, because the interleaving changes with the number of outputs in use. I’d be grateful if you can find time to test it with your hardware at some point, as I don’t have any 16-bit capable codecs.
 
Argh. Sorry. We’re talking about inputs! Here’s a PR intended to fix an input problem, though it never got merged … it was ignored for 3 years, then the contributor closed it and replaced it with a PR for some totally broken multi-IO code … so I wrote mine, which I believe to be fully working and will submit a PR when a few people have tried it.
 
OK,

I'll do that at some point. Now that I have the TDM in and out working, I want to keep testing functionality on a stable software base while I finish the Rev B hardware.

I'm working on having it PCBA after the next revision, as I'm tired of trying to accurately solder VQFN packages.

I'll probably make them available on Tindie, as that's the easiest commercial platform for me.

I'm about to start reading the 4-channel posts, so that I can make DI/DO re-routable on the next version.

It looks like the available T4.0 TDM/I2S pins are 2, 6, 7, 8, & 9 (discounting 32 on the back of a T4.0 and an extended pin on T4.1). Is that correct?
 
2 and 5 belong to SAI2; 6, 7, 8, 9 and 32 to SAI1. The TDM objects in the Audio library use SAI1, and the TDM2 use SAI2.

Pin 32 is what’s used for AudioOutputTDMB and AudioInputTDMD. It may be the hardware can be configured to “discount” pin 32 and use pin 9 for the second output, but my code doesn’t cater for that option - you’d end up with either too many TDM objects, or having to have constructor parameters, which the Audio library doesn’t really make it easy to do and maintain.
 
OK, that's workable. Looks like we've got 6x16 audio channels possible.

That's a lot of buffers, particularly if there's some processing involved!

Both SAIs churning through 16 ins and 16 outs is probably enough for anything likely to be needed. Going past that may be an interesting exercise, but may not have practical application, other than a really big audio snake.
 
Now that TDM input is fixed for odd channels (see Audio lib PRs 440 and 480), a second tranche of boards has been ordered for testing in 16x16 mode, with mappable DI and DO pins for even further expansion, for final testing before going to the PCBA stage.
 