USB interface for multi channel outputs, not just stereo

Excellent work @acumartini! Apologies for my quietness the last week, I'm back on working on these software changes and my schematics for the mixer this week. I'll start by refactoring these changes into an easy-to-consume fork of the "cores" repository to open up contributions and experimentation for folks.

For what it's worth, the deinterlacing/memcpy code could almost certainly be optimized significantly (it's a pretty naive implementation right now), so I'm still optimistic that 8-in/8-out is achievable.
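
For readers unfamiliar with the step being discussed, here is a minimal sketch of a naive de-interleave, assuming 16-bit samples arriving as frames of ch0, ch1, ... chN-1. The function name and signature are hypothetical and are not taken from the cores fork; this only illustrates the kind of loop that could later be optimized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the fork): split interleaved 16-bit frames
// (ch0, ch1, ..., chN-1, ch0, ...) into one buffer per channel.
// `out` must point to `channels` buffers of at least `frames` samples each.
void deinterleave(const int16_t* in, int16_t* const* out,
                  size_t channels, size_t frames)
{
    assert(in != nullptr && out != nullptr && channels > 0);
    for (size_t f = 0; f < frames; ++f) {
        for (size_t c = 0; c < channels; ++c) {
            // Sample for channel c of frame f sits at offset f*channels + c.
            out[c][f] = in[f * channels + c];
        }
    }
}
```

The inner loop strides through the input one sample at a time, which is simple but leaves room for wider loads and stores.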

Glad to hear @UHF - as the project matures, this is exactly my target. I'm designing them with the hopeful target of keeping the cost of manufacture under $100 and offering them as kits.

@celoranta: I believe some others on the forum had been working on boards that had 2 codec chips (and two TDM lines) for 16 channels of input or output. Not sure of the status of it though. I'm personally sticking to 8 with the hopes of keeping the board and design cost lower, but if you don't need USB Audio for example, you'll have a larger performance margin for higher-bandwidth TDM communication. It all depends on your application.
 
Really looking forward to seeing this!

It's possible the (de)interlacing code could make use of the DSP instructions - it's a pretty common need in the audio library, though I can't say I've looked closely at whether it's consistently been optimised. Happy to take a look at this, if no-one gets to it before me.

Paul Stoffregen did start a hardware project to interleave two CS42448 parts together to get 16 in + 16 out, though that seems to have stalled. Can't think why...:D
 
@mcginty;

You are correct: I have no need for USB connections. This is for live gigs, and I’ll be playing my guitars on stage while the processing will be in the mixer rack at front of house. USB doesn’t travel so well!

I will see if I can locate the threads you’re referencing. I’m working with two Cirrus CS5368 for 16 channels, with hopes of muxing them together to use just one Teensy TDM input, and then adding a second of these devices to input #2 for a total of 32 channels.

(Theoretically this system will power a doubleneck six-string bass / guitar, each with hexaphonic electromagnetic pickups plus two standard monophonic pickups each; plus hexaphonic piezo pickups on each as well… with four channels to spare. OR a doubleneck 8-string guitar/8-string extended-range bass, each with piezo hex and EM hex sets. )
 
Alrighty, I've refactored the work and now have the cores fork at https://github.com/mcginty/teensy-cores. I'll continue to update it with changes as we go.

@h4yn0nnym0u5e, if you're interested in doing some performance research and optimization that would be most welcome! I'm sure there are some clever uses of the Arithmetic and SIMD instructions that could help us out here.

@acumartini, I'm assuming you also modified the AudioOutputUSB::update() function, but you didn't include it in your code block. Would need that for your modifications to compile and drop it into the fork!
 
I've done a "first try" pull request, though for some reason there's a conflict because git thinks I've made your changes, too! For some reason I can't fork your repo as well as the PJRC one, so it's all a bit weird.

As you'll see, I tend to work on branches and then merge to master when the branch looks stable - it may be helpful if you create a branch for me to make PRs to, which would avoid possibly polluting your master if I drop a goolie we don't spot before the pull is done...

EDIT: just put a 'scope on the receive and de-interlacing code (usb_audio_receive_callback), and it's only taking 4 to 5µs to run, so probably not worth tinkering with right now.
 
@mcginty I have my own audio framework that processes a single sample at a time and only buffers at IO boundaries. Here is my equivalent to the Audio lib update function, which handles both input and output (hopefully this will be of some use to your impl):

Code:
void USBAudio::process()
{
    Module::process();

	// input
	// if (PLATFORM_SWITCH1_LOW) Serial.printf("DEBUG RX %d %d\n", usb_audio_buffer_in_read_idx, usb_audio_buffer_in_write_idx);
	if (usb_audio_buffer_in_read_idx < usb_audio_buffer_in_write_idx || (usb_audio_buffer_in_rollover && usb_audio_buffer_in_read_idx != usb_audio_buffer_in_write_idx)) {
		if (usb_audio_rx) {
			uint16_t diff = usb_audio_buffer_in_rollover ? 
				(AUDIO_RX_BUFFER_SIZE - usb_audio_buffer_in_read_idx) + usb_audio_buffer_in_write_idx : 
				usb_audio_buffer_in_write_idx - usb_audio_buffer_in_read_idx;
			if (diff < AUDIO_RX_BUFFER_SIZE / 2) { // prevent overrun
				usb_audio_feedback_acc -= AUDIO_RX_BUFFER_SIZE / 2 - diff;
			} else { // prevent underrun
				usb_audio_feedback_acc += diff - AUDIO_RX_BUFFER_SIZE / 2;
			}
			usb_audio_rx = false;
		}

		for (uint8_t i = 0; i < AUDIO_CHANNELS; i++) {
			out[i].value = ((float)usb_audio_buffer_in[i][usb_audio_buffer_in_read_idx] / (float)INT16_MAX) * gain[i];
		}
		usb_audio_buffer_in_read_idx++;

		if (usb_audio_buffer_in_read_idx >= AUDIO_RX_BUFFER_SIZE) {
			usb_audio_buffer_in_read_idx = 0;
			usb_audio_buffer_in_rollover = false;
		}
	} else if (usb_audio_rx) {
		// Serial.println("DEBUG RX: underrun");
		usb_audio_feedback_acc += AUDIO_RX_BUFFER_SIZE * 4; // buffer underrun
		usb_audio_rx = false;
	}

	// output
	uint8_t buffer_idx = usb_audio_buffer_out_ping ? 0 : 1;
	if (usb_audio_buffer_out_idx[buffer_idx] < AUDIO_TX_BUFFER_SIZE) {
		for (uint8_t i = 0; i < AUDIO_CHANNELS; i++) {
			usb_audio_buffer_out[buffer_idx][i][usb_audio_buffer_out_idx[buffer_idx]] = (int16_t)(ctrl_d[i].value() * ctrl_d[i + AUDIO_CHANNELS].value() * (float)INT16_MAX);
		}
		usb_audio_buffer_out_idx[buffer_idx]++;
	}
}
 
Awesome! This is really coming together. I'll review the PRs when I'm back home later today @h4yn0nnym0u5e!

That makes sense @acumartini, and very likely where my mixer's code is going to end up too to minimize latency. Thanks for the code samples, this makes sense. And also going back to your original posting's comments, a big TODO before "production-worthiness" is a correctly implemented accumulator feedback for the asynchronous endpoint in both directions, lest we drop frames.
 
Great, thanks @mcginty. I’m hopeful the “conflicts” are actually pretty minor.

Had to stop for today, getting late in the UK, but I think there could be issues with high sample rates and/or small audio blocks. As it stands there can be at most two blocks queued for transmission, so sample rates higher than 256k will break the system, or similarly over 64k if the block size is reduced from 128 to 32 samples, for example. This probably needs addressing…
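
The limits quoted above can be reproduced with a little arithmetic. This is a sketch under the assumption that one USB audio packet goes out per 1 ms frame and that at most the queued blocks can be drained in that time; the function name is hypothetical.

```cpp
#include <cstdint>

// Assumption: one packet per 1 ms frame (1000 packets/s), and no more than
// `queued_blocks` audio blocks of `block_samples` samples are available to
// send in any one frame.
constexpr uint32_t max_sample_rate_hz(uint32_t queued_blocks,
                                      uint32_t block_samples,
                                      uint32_t packets_per_second = 1000)
{
    return queued_blocks * block_samples * packets_per_second;
}

// Two queued blocks of 128 samples cap the rate at 256 kHz...
static_assert(max_sample_rate_hz(2, 128) == 256000, "128-sample blocks");
// ...and shrinking the block to 32 samples lowers the cap to 64 kHz.
static_assert(max_sample_rate_hz(2, 32) == 64000, "32-sample blocks");
```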

I looked at the feedback accumulator… then looked away! Over to you on that…
 
Another PR done, this time on your Audio library fork, which implements and documents 4-, 6- and 8-input and output USB objects in the design GUI, and adds them to keywords.txt. For these you'll need this afternoon's cores update, too.
 
Meanwhile, another PR for cores, with some mostly minor fixes, but one that is vital for other sample rates. Tested so far only at 48kHz and 96kHz on a High-speed USB port.

For information, my current test code is below. USB audio output from the PC is routed to outputs 1-7 of a CS42448 TDM audio adaptor; the Teensy generates 8 sine waves from 110Hz up to 880Hz, and feeds them to the PC's USB audio inputs, and the 110Hz goes to audio adaptor output 0, so you can tell the code is running.
Code:
#include <Audio.h>

// GUItool: begin automatically generated code
AudioInputUSBOct         usb_oct_in;       //xy=482,409
AudioSynthWaveform       wav1;      //xy=483,298
AudioSynthWaveform       wav2;           //xy=485,620
AudioSynthWaveform       wav3;           //xy=485,665
AudioSynthWaveform       wav4;           //xy=485,710
AudioSynthWaveform       wav5;           //xy=485,755
AudioSynthWaveform       wav6;           //xy=485,800
AudioSynthWaveform       wav7;           //xy=485,845
AudioSynthWaveform       wav8;           //xy=485,890
AudioOutputTDM           tdm;           //xy=693,396
AudioOutputUSBOct        usb_oct_out;       //xy=696,744

AudioConnection          patchCord1(usb_oct_in, 0, tdm, 2);
AudioConnection          patchCord2(usb_oct_in, 1, tdm, 4);
AudioConnection          patchCord3(usb_oct_in, 2, tdm, 6);
AudioConnection          patchCord4(usb_oct_in, 3, tdm, 8);
AudioConnection          patchCord5(usb_oct_in, 4, tdm, 10);
AudioConnection          patchCord6(usb_oct_in, 5, tdm, 12);
AudioConnection          patchCord7(usb_oct_in, 6, tdm, 14);
AudioConnection          patchCord8(wav1, 0, tdm, 0);
AudioConnection          patchCord9(wav1, 0, usb_oct_out, 0);
AudioConnection          patchCord10(wav2, 0, usb_oct_out, 1);
AudioConnection          patchCord11(wav3, 0, usb_oct_out, 2);
AudioConnection          patchCord12(wav4, 0, usb_oct_out, 3);
AudioConnection          patchCord13(wav5, 0, usb_oct_out, 4);
AudioConnection          patchCord14(wav6, 0, usb_oct_out, 5);
AudioConnection          patchCord15(wav7, 0, usb_oct_out, 6);
AudioConnection          patchCord16(wav8, 0, usb_oct_out, 7);

AudioControlCS42448      cs42448;      //xy=688,547
// GUItool: end automatically generated code

AudioSynthWaveform* waves[] = 
  {&wav1, &wav2, &wav3, &wav4, 
   &wav5, &wav6, &wav7, &wav8};
   
extern uint32_t feedback_accumulator;
   
void setup() 
{
  Serial.begin(115200);
/*
  while (!Serial)
    ;
*/

  Serial.println(AUDIO_SAMPLE_RATE_EXACT);
  Serial.println(feedback_accumulator);
  
  AudioMemory(50); // needs plenty, as blocks are used for USB buffering
  
  cs42448.enable();
  cs42448.volume(1.0f);
  
  for (int i=0;i<8;i++)
    waves[i]->begin(0.25f,(i+1)*110.0f,WAVEFORM_SINE);

  pinMode(0,OUTPUT);
 // pinMode(1,OUTPUT);

  pinMode(LED_BUILTIN,OUTPUT);
}

void loop() 
{
  digitalWriteFast(LED_BUILTIN,1);
  delay(10);
  digitalWriteFast(LED_BUILTIN,0);
  delay(240);
  digitalWriteFast(LED_BUILTIN,1);
  delay(10);
  digitalWriteFast(LED_BUILTIN,0);
  delay(240);
  digitalWriteFast(LED_BUILTIN,1);
  delay(10);
  digitalWriteFast(LED_BUILTIN,0);
  delay(240);
  digitalWriteFast(LED_BUILTIN,1);
  delay(10);
  digitalWriteFast(LED_BUILTIN,0);
  delay(990);

}
 
Looks like some work is definitely needed on the feedback accumulator stuff. I found this post which looked hopeful, but implementing a version of it supposedly tailored to my current 96k setup (it's a pain to get Windows to change it, and I may as well torture test as I go!) doesn't seem to be working. I get an underrun and audible glitch every 30s or so, which is probably more a reflection of the clock mismatch I happen to have in my system than any reproducible result.

A couple of points occur to me, though I freely confess I'm no expert on the USB internals:
  • nothing ever actually seems to call sync_event() after the initial configuration
  • feedback_accumulator only ever seems to get modified in the input code, so USB transmit-only systems will likely have issues
 
@h4yn0nnym0u5e, I've started the arduous work of reading through the USB Audio Spec and comparing it to the descriptor as well as documenting more: https://github.com/mcginty/teensy-cores/pull/4

Interestingly, while the input side of the Teensy's USB descriptor is an Asynchronous Isochronous endpoint with a feedback synch endpoint, the output side of the descriptor does not in fact have an asynchronous synch endpoint, and reports itself as an Adaptive Isochronous endpoint, with no feedback synch endpoint...
 
I was wondering about this and am really interested to see what your investigation digs up!
 
Hi all!

As a side note, I had 5 PCBs made of that CS42448 test board, and have... 4 left, and am happy to mail them to anybody else in North America that want to try their hand at soldering one (can send my Digikey cart to show you what to order too). Feel free to message me.

Would be happy to collaborate if anybody else is working on similar projects. I'm doing this work to start on a DJ/DIY-focused mixer (kind of similar to the Teenage Engineering TX-6, but hackier of course ;). I made my first prototype with an STM32, and damn, so much respect to Paul and everybody who's worked on the Teensy and its libraries for creating such a fantastic playground.

A little late on the draw here, but if you still have any CS42448 boards you’d like to divest yourself of I’d happily cover shipping and any other handling costs.

Our goals are nearly in parity and I’d love to collaborate, though in the interest of full disclosure I’m a little over-encumbered at the moment. I’m happy to help with the caveat that it would be a bit haphazard. The likelihood is very high I have code to contribute that doesn’t crossover with your own - this might prove to be a productive start.

Feel free to reply or DM.
 
It would be good to complete this, though @mcginty seems to have gone AWOL in 2023. I think we got to the point that a lot of stuff was working, or very nearly so, with the following on my radar as needing completion:
  • USB sync - I was getting the occasional glitches using Windows, though mcginty seemed to be having better luck on Linux
  • working and valid packet sizes at all sample rates and channel counts; at the moment invalid packet sizes of >1024 bytes are working with 1ms intervals, but changing them to <1024 bytes at (say) 500µs intervals doesn't work
  • do something sensible for the relevant USB descriptors when more bandwidth is specified than full-speed USB can provide
This is all quite deep USB stuff which I don't understand - if you do, it would be great to have your input!
 
I'm wanting to implement a hardware USB-USB (which will be USB-TDM-USB in practise) bridge that will be 5.1 out and mic in, so effectively one 6i2o and one 2i6o. I have two Teensy 4.1s I'm hoping to do this with. It looks like the code you've got here is the best chance of doing that - this is something that seems to be sort-of half done in half a dozen projects. I'll see if I can get it going over the next week - at least I can perhaps provide some testing feedback.

My use case is I'm using pipewire on a Linux host to route all the audio for my desk setup, and this allows me to arbitrarily change routing for any of my audio devices and change processing based on either controls or scripts/automations without having to spend thousands on a top end integration system. I'm currently using Voicemeeter to send VBAN audio, which is fine, but I really want to have a solution that doesn't require software on the client devices (I've achieved this for webcams by using an HDMI capture adapter which OBS sends a 'preview' to). The commercial options are either pretty expensive or outrageously expensive.
 
Sorry, I decided to check in on this thread instead of opening another one. What's the current state of multi-channel USB audio? It seems there have been several attempts to get it going, such as in 4x4 configurations; however, they didn't make it into the main repository. Is there more work to be done?
 
Same as it was nearly a year ago at my post #42, I'm afraid. I'd like to close this out, but @mcginty bailed, and everyone else who's expressed "an interest in helping out" seems to be a one-post wonder. I simply don't have the low-level USB skills to be able to craft a complete and correct descriptor, together with the associated feedback code, to get 100% reliable completely glitch-free multi-channel audio at sample rates up to 96kHz through both Windows and Linux. I think we got close, but only with the use of a technically incorrect descriptor, and minor but still detectable glitches which only show up after a while and on very close examination. If you're not too picky, it might do. For some use cases. I'm picky...

The development petered out in this thread, which has a few links to the works-in-progress.
 
There is an open pull request for adding multiple USB channels as well as configurable sample rates.
 
This would be an awesome differentiating feature compared to Daisy, and I badly need it in at least a 4x4 config for my use case. Unfortunately, I also have no experience in USB stack development to help drive it.
 
Hi, I finally also gave it a try and implemented a multichannel version of the USB in- and output. Especially the USB input seems to work nicely on my Windows 11 notebook, but I would be glad if somebody else could also test my implementation. If somebody is interested, the code can be found here: https://github.com/alex6679/teensy-4-usbAudio/tree/main
I have to admit, my focus was on the USB input and I spent quite some effort on improving the computation of the feedback to the host. At the USB output I did not change that much. The only mechanism that prevents buffer over- and underruns here is the duplication or the skipping of single samples. On my notebook, for example, the output duplicates a sample every 5-6 seconds in order to prevent a buffer underrun. I know that is not the best solution, but maybe a slight improvement over the current implementation.
Also, I switched the USB standard from 1 to 2. I guess in hindsight this was probably not necessary.
I am curious about your experiences with my implementation.
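
The duplicate-or-skip mechanism described above could look roughly like the following. This is only a sketch with hypothetical names and thresholds, not the code from the linked repository: it copies a block of samples and stretches or shrinks it by one sample depending on how full the transmit buffer is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: copy `n` samples from `src` to `dst`, then correct
// drift by one sample. `fill` is the current buffer level; `low` and `high`
// are assumed watermarks. `dst` must have room for n + 1 samples.
// Returns the number of samples written (n - 1, n, or n + 1).
size_t copy_with_drift_fix(const int16_t* src, size_t n, int16_t* dst,
                           size_t fill, size_t low, size_t high)
{
    assert(n > 0 && low < high);
    size_t written = 0;
    for (size_t i = 0; i < n; ++i) {
        dst[written++] = src[i];
    }
    if (fill < low) {
        // Buffer is draining: repeat the last sample to avoid an underrun.
        dst[written] = dst[written - 1];
        ++written;
    } else if (fill > high) {
        // Buffer is filling up: drop the last sample to avoid an overrun.
        --written;
    }
    return written;
}
```

Repeating or dropping a single sample is audible in principle, which is why it is best kept as a last resort behind a proper feedback loop.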
 
Since I found some time, I want to describe my changes to the USB input, especially how I changed the feedback:
- I don't use only the raw number of samples currently stored in the buffer for the feedback computation. I also measure the time since the last packet of samples was received and use the rate at which the host sends samples to the Teensy. E.g. at 44.1kHz, when the last packet of samples arrived 0.5ms ago, I add 44.1kHz*0.5ms = 22.05 virtual samples to the buffer count (as if the samples arrived one by one and not in packets). This value does not depend that much on the short-term timing of the ISRs, and therefore I think it is better suited for the computation of the feedback.
- If I understand the current code correctly, at the moment the controller for the feedback is a pure integral controller. I use a PI controller and added the proportional part, so that deviations from the target number of buffered samples are corrected faster. The parameters of the PI controller can optionally be set in the constructor of the USB input. However, with my notebook they did not seem to be that critical, and I hope that the default values work for many hosts.
- When an audio stream starts, or if there really is a buffer under- or overrun, the buffer is reset and the number of buffered samples is set to the target number (filled with zeros). Hence, the controller starts at the optimum value and does not need to recover the buffer from e.g. a near underrun that just happened by chance at the beginning of a stream.
- Not related to the feedback, but maybe noteworthy: the bInterval is set depending on the number of channels, sampling rate and sample bit depth so that the USB packets do not exceed 1024 bytes.
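
The first two points of the list above can be sketched as follows. All names and gain values here are hypothetical and are not the API of the linked repository; the sketch only shows the "virtual samples" estimate and a textbook PI update, under the assumption that the controller output is a correction added to the nominal requested rate.

```cpp
#include <cassert>

// Buffer level estimate that grows smoothly between packets: the samples
// actually buffered plus those the host would have delivered since the last
// packet if it sent them one by one.
inline double virtual_fill(double buffered_samples, double sample_rate_hz,
                           double seconds_since_last_packet)
{
    assert(seconds_since_last_packet >= 0.0);
    return buffered_samples + sample_rate_hz * seconds_since_last_packet;
}

// Textbook PI controller (hypothetical, gains chosen by the caller).
struct PiController {
    double kp;        // proportional gain
    double ki;        // integral gain
    double integral;  // accumulated error over time

    PiController(double kp_, double ki_) : kp(kp_), ki(ki_), integral(0.0) {}

    // Positive result: buffer is below target, so request more samples.
    double update(double target, double measured, double dt_s)
    {
        const double error = target - measured;
        integral += error * dt_s;
        return kp * error + ki * integral;
    }
};
```

With only the integral term the correction builds up slowly; the proportional term reacts to a deviation immediately, which matches the faster settling described in the post.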

Here is an example plot of the number of buffered samples (plotted every 200ms). The blue line shows the actual samples and the orange line the actual samples + the virtual samples. The stream just started and the PC sends the samples a little bit too fast to the Teensy.
bufferedSamplesStart_new.png

After a while the number of buffered samples settles close to the target value of 79.3 samples:

bufferedSamplesSettled_new.png

I let that test run for quite a while, and it seemed that the buffered samples remained very close to the target value, without anything interesting happening.
Unfortunately, I only have a Windows 11 notebook and can't test how the input works with other hosts.
 

This looks great, I definitely plan to give it a go when I have time to do it properly. I’m understanding from what you say that we should be able to get 8 channels at 96kHz working properly? At 16 bits per sample that needs 1536 bytes/ms, so packets would have to go out every 0.5ms (4 microframes) to stay under 1024 bytes. I never managed to get that to work…
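
The numbers above can be checked with a small calculation. This sketch assumes a high-speed isochronous endpoint, where the packet interval is 2^(bInterval-1) microframes of 125µs each, and a 1024-byte packet limit as mentioned in the thread. Function names are hypothetical, and a real descriptor may want an extra sample of headroom per packet.

```cpp
#include <cstdint>

constexpr uint32_t kMicroframesPerSecond = 8000;  // 125 us each

// Microframes between packets on a high-speed isochronous endpoint.
constexpr uint32_t interval_microframes(uint32_t bInterval)
{
    return 1u << (bInterval - 1);
}

// Worst-case payload per packet, rounding partial samples up.
constexpr uint32_t max_packet_bytes(uint32_t rate_hz, uint32_t channels,
                                    uint32_t bytes_per_sample,
                                    uint32_t bInterval)
{
    return ((rate_hz * interval_microframes(bInterval)
             + kMicroframesPerSecond - 1) / kMicroframesPerSecond)
           * channels * bytes_per_sample;
}

// Largest bInterval in 1..4 whose packets fit in 1024 bytes, or 0 if none.
constexpr uint32_t pick_bInterval(uint32_t rate_hz, uint32_t channels,
                                  uint32_t bytes_per_sample,
                                  uint32_t candidate = 4)
{
    return candidate == 0 ? 0
         : max_packet_bytes(rate_hz, channels, bytes_per_sample, candidate) <= 1024
               ? candidate
               : pick_bInterval(rate_hz, channels, bytes_per_sample, candidate - 1);
}

// 8 channels of 16-bit audio at 96 kHz: 1536 bytes per 1 ms packet...
static_assert(max_packet_bytes(96000, 8, 2, 4) == 1536, "1 ms interval");
// ...so the interval has to drop to 4 microframes (bInterval = 3).
static_assert(pick_bInterval(96000, 8, 2) == 3, "0.5 ms interval");
```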

Will be testing on Windows 10 x64.
 