Help implementing full USB Audio Class 2.0 (UAC2) async on Teensy 4.1 (32-bit / 768 kHz)

krilix

Member
Hi all,
I’m working with Teensy 4.1 (i.MX RT1062) and I’d like to attempt a custom, true USB Audio Class 2.0 implementation.


I understand the current Teensy USB audio support is closer to UAC1 / adaptive, and that full async UAC2 is not officially supported. I’m intentionally looking beyond the stock implementation and am prepared to maintain a custom core if needed.


My target goals:


  • Full UAC2 descriptors
  • Asynchronous mode (device-driven clock with feedback endpoint)
  • USB → I2S output to an external DAC
  • 32-bit audio
  • Sample rates up to 768 kHz (experimentally, even if not all hosts support it)

What I’m hoping to get guidance on:


  • Which core files are best to modify (usb_desc.c, usb_audio.c, others?)
  • Any known limitations of the RT1062 USB device controller for async UAC2
  • Handling feedback endpoints correctly on Teensy
  • Strategies to decouple USB SOF timing from audio processing
  • Practical bandwidth and buffering limits at very high sample rates

I know these rates are extreme and not guaranteed across all hosts. This is partly experimental and partly to understand the true limits of the platform.


If anyone has attempted partial UAC2 support, async feedback, or high-bit-rate USB audio on Teensy, I’d really appreciate any pointers or warnings before I go too far.

PS: I found this repo on GitHub, https://github.com/laiudm/laiudm-Teensy4-192k-USB-Audio/tree/main, which changes the core and descriptor files. I’m hoping we can get equivalent files for full UAC2, and that they work.
 
I'm curious which DAC chip you would use?

But to answer whether it's possible, the first big issue I see is the I2S maximum BCLK frequency. The maximum is 25 Mbit/sec. A 64-bit frame (32 bits each for left and right) at 768 kHz would require 49.152 Mbit/sec.

 
Thanks Paul, for the clarification.


Given the I2S BCLK limit, I agree that 32-bit stereo at 768 kHz is not feasible. Reworking the numbers, 32-bit stereo at 384 kHz would require:


  • 64-bit frame (32 L + 32 R)
  • 384 kHz × 64 = 24.576 Mbit/s

Which appears to be right at the upper edge of the SAI/I2S capability on RT1062.
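
The two BCLK figures worked out so far (768 kHz and 384 kHz, both with 32-bit stereo slots) can be checked with a few lines of C. This is only a sketch of the arithmetic: the 25 MHz ceiling is the figure quoted in this thread, not a value read from the datasheet, and the function and macro names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* BCLK ceiling as quoted in this thread for the RT1062 I2S/SAI (assumption). */
#define SAI_MAX_BCLK_HZ 25000000u

/* Bit clock for plain I2S: every slot of every channel is clocked out once
 * per sample period, so BCLK = sample rate x channels x bits per slot. */
static uint32_t i2s_bclk_hz(uint32_t sample_rate_hz, uint32_t channels,
                            uint32_t slot_bits)
{
    return sample_rate_hz * channels * slot_bits;
}

/* True when the required bit clock is at or below the assumed ceiling. */
static bool i2s_bclk_fits(uint32_t sample_rate_hz, uint32_t channels,
                          uint32_t slot_bits)
{
    return i2s_bclk_hz(sample_rate_hz, channels, slot_bits) <= SAI_MAX_BCLK_HZ;
}
```

With these helpers, `i2s_bclk_hz(768000, 2, 32)` gives 49,152,000 Hz (over the ceiling) and `i2s_bclk_hz(384000, 2, 32)` gives 24,576,000 Hz (just under it).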


If staying at or below ~24–25 Mbit/s BCLK, would 32-bit / 384 kHz stereo I2S be considered realistic on Teensy 4.1, assuming careful clocking and minimal overhead?


For DACs, I’m considering ESS parts that accept standard 32-bit I2S at 384 kHz, probably the ES9038Q2M or ES9039Q2M, so no unusual framing beyond classic stereo I2S.


Thanks again for pointing out the hard limit, it helped narrow the design space considerably.
 
You may want to take a look at this thread. I’ve had USB running at 96k/8 channels/16 bit, so it’s on the way. Barring hardware constraints for your codec, of course.
 
If staying at or below ~24–25 Mbit/s BCLK, would 32-bit / 384 kHz stereo I2S be considered realistic on Teensy 4.1, assuming careful clocking and minimal overhead?

Yes, I believe it should be feasible. You'll need to dive into the USB and I2S code, but the existing code should give you a pretty good starting point.

With one isochronous packet on every microframe, the packet size should be 384 bytes (384,000 frames/sec × 8 bytes per frame ÷ 8,000 microframes/sec), which is well under the 1024 byte limit.
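
As a rough check of that packet size, here is a small C sketch. It assumes one packet per 125 µs microframe (8,000 per second at high speed) and, for the worst case, budgets one extra audio frame per packet, since an async source may occasionally send one frame more than nominal. The helper names are hypothetical and not taken from the Teensy core.

```c
#include <stdint.h>

#define USB_HS_MICROFRAMES_PER_SEC 8000u  /* one microframe every 125 us */
#define USB_HS_ISO_MAX_PACKET      1024u  /* per-packet limit mentioned above */

/* Nominal payload per microframe; audio frames per microframe are rounded down. */
static uint32_t iso_nominal_bytes(uint32_t sample_rate_hz, uint32_t channels,
                                  uint32_t bytes_per_sample)
{
    uint32_t frames = sample_rate_hz / USB_HS_MICROFRAMES_PER_SEC;
    return frames * channels * bytes_per_sample;
}

/* Largest expected payload: nominal plus one extra audio frame of headroom. */
static uint32_t iso_max_bytes(uint32_t sample_rate_hz, uint32_t channels,
                              uint32_t bytes_per_sample)
{
    uint32_t frames = sample_rate_hz / USB_HS_MICROFRAMES_PER_SEC + 1u;
    return frames * channels * bytes_per_sample;
}
```

For 384 kHz stereo with 4-byte samples this gives 384 bytes nominal and 392 bytes with headroom, both comfortably below 1024.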

I can't think of any hardware or other hard limits you'll face. This data rate isn't terribly fast, especially if the USB hardware just dumps raw data into buffers. Even if you copy all the data memory-to-memory a couple of times on the way to the buffers the I2S hardware reads with DMA, you'll need only a small fraction of the Teensy 4.x CPU power.

However, one part of what you've said doesn't make sense to me:

Asynchronous mode (device-driven clock with feedback endpoint)

You're saying you want async mode, but the description is of adaptive mode.
 
Thanks Paul, that’s helpful and I see where my wording caused confusion.


What I’m ultimately aiming for is true asynchronous USB audio, where the audio sample clock is generated locally on the device (from an audio PLL / external oscillator), and the USB host is kept in sync via a feedback endpoint, as defined in UAC2.


In my previous description I mixed that up with adaptive behavior. To clarify:


  • Adaptive mode: device clock tracks USB SOF rate
  • Asynchronous mode: device runs its own clock and reports rate back to host via feedback endpoint

My intent is the second case.
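
To make the asynchronous case concrete: on a high-speed feedback endpoint the device reports its rate as samples per 125 µs microframe in 16.16 fixed point (4 bytes), per the USB 2.0 specification. The sketch below assumes the device counts how many audio frames its own clock produced over a known number of microframes. How that count is obtained on the RT1062 is left open, and the function name is hypothetical.

```c
#include <stdint.h>

/* Feedback value for a high-speed endpoint: average samples per microframe
 * in 16.16 fixed point, computed from a count taken on the device's own
 * audio clock. Returns 0 if no microframes were observed. */
static uint32_t uac2_hs_feedback_q16(uint32_t samples_counted,
                                     uint32_t microframes)
{
    if (microframes == 0u) {
        return 0u;
    }
    /* Widen to 64 bits so the shift cannot overflow for large counts. */
    return (uint32_t)(((uint64_t)samples_counted << 16) / microframes);
}
```

At exactly 384 kHz, 384,000 samples over 8,000 microframes gives 48.0, encoded as 0x00300000. A DAC clock running slightly fast yields a value just above that, which tells the host to send slightly more data.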


For the initial implementation, I agree that starting from the existing Teensy USB audio code makes sense, especially since the data rate for 32-bit / 384 kHz stereo is modest and well within USB HS isochronous limits, as you pointed out.


The main work I expect is:


  • Modifying descriptors to expose async endpoints
  • Implementing feedback endpoint handling
  • Decoupling I2S clocking from USB SOF timing
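
One common way to approach the decoupling item is a ring buffer between the USB receive path and the I2S/DMA path, with the fill level serving as the input to whatever feedback calculation is used. The following is a minimal single-producer, single-consumer sketch under that assumption. It is not the structure used by the existing Teensy code, all names are hypothetical, and a real RT1062 implementation would also need to deal with cache maintenance and memory ordering around DMA.

```c
#include <stdint.h>

#define RING_FRAMES 1024u  /* capacity in stereo frames; must be a power of two */

typedef struct {
    int32_t data[RING_FRAMES * 2u];  /* interleaved L/R, 32-bit slots */
    volatile uint32_t head;          /* frames written so far (USB side) */
    volatile uint32_t tail;          /* frames read so far (I2S side) */
} audio_ring_t;

/* Frames currently buffered; unsigned wraparound keeps this correct. */
static uint32_t ring_fill(const audio_ring_t *r)
{
    return r->head - r->tail;
}

/* USB side: copy up to n frames in, dropping whatever does not fit. */
static uint32_t ring_write(audio_ring_t *r, const int32_t *frames, uint32_t n)
{
    uint32_t space = RING_FRAMES - ring_fill(r);
    if (n > space) {
        n = space;
    }
    for (uint32_t i = 0; i < n; i++) {
        uint32_t idx = ((r->head + i) & (RING_FRAMES - 1u)) * 2u;
        r->data[idx] = frames[2u * i];
        r->data[idx + 1u] = frames[2u * i + 1u];
    }
    r->head += n;
    return n;
}

/* I2S side: copy up to n frames out, returning how many were available. */
static uint32_t ring_read(audio_ring_t *r, int32_t *frames, uint32_t n)
{
    uint32_t avail = ring_fill(r);
    if (n > avail) {
        n = avail;
    }
    for (uint32_t i = 0; i < n; i++) {
        uint32_t idx = ((r->tail + i) & (RING_FRAMES - 1u)) * 2u;
        frames[2u * i] = r->data[idx];
        frames[2u * i + 1u] = r->data[idx + 1u];
    }
    r->tail += n;
    return n;
}
```

The design choice here is that the I2S side always drains at the local clock rate, while the USB side writes whatever arrives. If `ring_fill()` drifts away from half full, the host is delivering data too fast or too slowly relative to the local clock, and the feedback value should be nudged accordingly.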

Thanks for confirming there aren’t obvious hardware limits at this rate, that gives me confidence the remaining challenges are mostly architectural and software-side.
 
That's what we already implement.
Indeed.

And I’ll reiterate, a huge amount of work has been done on improving the existing Teensyduino implementation, which you can find discussed and linked to in this thread.

UAC2 is already done, up to 96k/8 channel/16bits has been tested and works, which is much closer to your desired rate than stock 44/2/16, and slot sizes of 16 or 24 bits are catered for. I’ve not tested 24-bit data, but I think someone did. I do know that the feedback implementation is measurably better.

The main work I expect is:
  • Modifying descriptors to expose async endpoints
  • Implementing feedback endpoint handling
  • Decoupling I2S clocking from USB SOF timing
I think you’ll be able to save quite a lot of this effort if you take a closer look at existing code, and try it out to see where extensions are needed for your purposes. @alex6679 has done most of the heavy lifting on this, and while he doesn’t have a lot of spare bandwidth right now, he is pretty responsive if you have a specific demonstrable issue. My contribution has been mostly to generate the issues :devilish: … though I do have a few PRs to my credit, too.
 
Thanks, that’s very useful context.


I hadn’t fully appreciated how far the existing work had progressed beyond stock Teensyduino. Knowing that UAC2 async with feedback is already working and tested up to 96 kHz / 8 channels / 16-bit, with improved feedback stability, definitely changes the starting point.


Given that, my plan is less about re-architecting and more about extending what’s already there, specifically toward:


  • Higher sample rates (targeting 384 kHz stereo)
  • Wider slot sizes (32-bit containers)
  • Verifying where the practical limits appear (USB side vs SAI/I2S side)

I’ll take a closer look at the code and the linked thread, and try to get it running first before attempting any extensions. Once I can reproduce the existing behavior, I should have a much clearer idea of what actually needs changing versus what’s already solved.


If and when I hit a concrete limitation or regression at higher rates, I’ll try to narrow it down to a specific, demonstrable issue before reaching out.


Thanks again for pointing this out, it likely saves a lot of duplicated effort.
 
For context, part of my motivation here is that I’ve already implemented a full UAC2 asynchronous device on STM32H723, including:


  • Async isochronous endpoints with feedback
  • Device-driven audio clock
  • High sample rates
  • USB → SAI with DMA

So I’m reasonably familiar with the UAC2 model and the feedback side of things, but I’m still very new to the Teensy codebase and RT1062 specifics. My hope is that, by building on the existing Teensy UAC2 work, I can focus mainly on understanding where the architectural limits are rather than redoing solved problems.

I’m not suggesting this as a drop-in solution for Teensy, but rather as a working reference for UAC2 async behavior, especially around descriptor layout and feedback handling. I’m sharing it in case any parts are useful for comparison while I get familiar with the RT1062 USB and audio code.

My intention is to first bring up the existing Teensy UAC2 implementation as-is, then see where extensions are actually needed for higher rates on this platform.


Happy to clarify or answer questions about the STM32 side if that helps.


Thanks again, this is really helpful.
 

Attachments

  • uac_2 stm32h723.zip
    1.8 MB
Sorry, let me clarify that wording.


I’ve been using Teensy boards for ~6 years, including Teensy 3.x and 4.x, and I currently have several Teensy 4.1 boards here. I’ve worked extensively with the Audio Library and the existing USB audio examples.


When I said “getting familiar with the RT1062 USB and audio code”, I specifically meant the lower-level implementation details (USB descriptors, endpoint handling, and SAI/USB interaction on RT1062), not the Teensy platform or audio framework in general.


So this isn’t a first-time bring-up situation, it’s more about digging deeper into the internals before extending them.
 