USB Audio Frame Sync on Teensy 4.0

This works even a bit better: control of the buffer level is good enough that the amount of buffering could be reduced for lower latency.

Code:
// Values are samples per 1 ms frame, scaled by 2^24
if (diff > 0)
    feedback_accumulator = 44.2 * (1<<24);       // fast
else if (diff < 0)
    feedback_accumulator = 44.0 * (1<<24);       // slow
else
    feedback_accumulator = 44.1 * (1<<24);       // nominal

But the question is how does this perform on PCs and Macs.
Answer ... after a couple of years ... for me ... not very well! Re-opening this thread because of this one, where all seems to be going pretty well apart from a very occasional glitch. I've tried to adapt the above approach slightly (because I'm also trying out different sample rates so magic numbers related to 44.1kHz aren't quite sufficient), but either (a) I don't fully understand what's going on, or (b) I do understand it, a bit, and it's Not Quite Right.
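
For other sample rates, the three magic numbers could be derived from the rate instead of hard-coded. This is only a minimal sketch: it assumes the accumulator holds samples per 1 ms frame scaled by 2^24 (which is what `44.1 * (1<<24)` implies) and keeps the same relative step as 44.0 / 44.2 versus 44.1. The names `feedback_for_rate` and `kStep` are made up for illustration and are not part of the Teensy core.

```cpp
#include <cstdint>

// Assumed relative step, matching 44.0 / 44.2 against a nominal 44.1.
constexpr double kStep = 0.1 / 44.1;

// Hypothetical helper: returns samples per 1 ms frame, scaled by 2^24.
// diff > 0 selects the "fast" value, diff < 0 the "slow" value, 0 nominal,
// mirroring the three-way choice in the snippet above.
inline uint32_t feedback_for_rate(double sample_rate_hz, int diff)
{
    double per_ms = sample_rate_hz / 1000.0;
    if (diff > 0)
        per_ms *= (1.0 + kStep);
    else if (diff < 0)
        per_ms *= (1.0 - kStep);
    // Round to nearest; fits in 32 bits for rates below 256 kHz.
    return static_cast<uint32_t>(per_ms * (1 << 24) + 0.5);
}
```

With this, `feedback_for_rate(44100, 0)` gives the same nominal value as `44.1 * (1<<24)` rounded to the nearest integer, and `feedback_for_rate(96000, 0)` gives `96 * (1<<24)`.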

I can't for the life of me see how the feedback accumulator is being sent to the PC (Windows 10 x64, in my case); there's a single call to sync_event(), then as far as I can tell (and I've instrumented the function, I think), it never gets called again, so it really should make no difference whether feedback_accumulator has a sensible value in it.

The other thing is that the above code is only called for input audio (PC->Teensy), so I can't see how / if sync can be maintained in an output-only (Teensy->PC) situation.

Have I missed something with either or both of the above points? Entirely likely, but can some kindly soul tell me what it is, please?!
 
If my research in https://github.com/mcginty/teensy-cores/pull/4 is correct, the reason `sync_event` is never called again is because the USB output is registered as an "adaptive" type audio endpoint, rather than "asynchronous" - the latter being what uses the feedback mechanism.

EDIT: Adaptive does seem to also use the feedback mechanism, but there is no synch endpoint (bSynchAddress) set for the USB Audio Output interface in the descriptor. That seems like it may be the issue? USB *Input* has one, but not output.
 
edit: some corrections.

I misunderstood (apologies), and the Audio Input is indeed giving feedback to the host, and you can look at ALSA's current frequency based on the feedback from the device, for example:

Code:
grep "Momentary freq" /proc/asound/card1/stream0

At least for me, the momentary frequency teeters between 44100-44102Hz, which seems quite acceptable, and seems like the algorithm is working.

I started by adding some debug output to both the sync function (called at a rate specified by the descriptor), and the update callback (called every (micro)frame, so every 125us in High Speed), and correlated the feedback value that the sync was providing to the host and the number of samples per second seen by the callback.

When I did that, though, I got massive (and ever-increasing) momentary frequency rises as well as both over and underruns.

Long story short: instrumenting the USB Audio loop is fraught as it affects the timing significantly.
 
As I'm on Windows, it's a bit difficult to reproduce the above! However, I do have an oscilloscope so I can generate pin pulses at key points in the code without unduly affecting the timings. Which is nice. Are you instrumenting within Linux, or on the Teensy?

It'll be a day or two before I can get back to this, but meanwhile, to re-iterate:
  • as far as I can tell, sync_event() in usb_audio.cpp is only ever called once
  • the feedback_accumulator is only updated for inputs, so even if sync_event() were being called there is no mechanism to keep in sync with outputs
 
Haven't managed to make any useful progress on this, and none of the previous contributors to this thread seem to be interested :(, but I have spotted one other thing that could cause an issue, which is that with 8 channels @ 96kHz the wMaxPacketSize ends up being calculated as over 1024 bytes (96*8*2 = 1536 ... more after allowing for the occasional "extra sample" packet), which is invalid. So for this instance, I think bInterval needs to be changed to suit. It probably gets worse if we want to allow for 24- or 32-bit samples in the future.
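
To make that arithmetic easy to repeat for other configurations, here is a rough sketch. It assumes one spare sample frame per packet for the occasional "extra sample", and it takes 1024 bytes as the limit for a single High Speed isochronous transaction; `iso_packet_bytes` and `kMaxIsoPacketHS` are hypothetical names, not anything in usb_desc.c.

```cpp
#include <cstdint>

// Largest payload for one High Speed isochronous transaction.
constexpr uint32_t kMaxIsoPacketHS = 1024;

// Bytes needed per isochronous packet.
// interval_us is the packet period: 1000 for one packet per 1 ms,
// 125 for one packet per microframe.
constexpr uint32_t iso_packet_bytes(uint32_t sample_rate_hz, uint32_t channels,
                                    uint32_t bytes_per_sample, uint32_t interval_us)
{
    // Round the samples per interval up, then add one spare sample frame
    // (an assumption) for the occasional long packet.
    return ((sample_rate_hz * interval_us + 999999u) / 1000000u + 1u)
           * channels * bytes_per_sample;
}
```

For 8 channels of 16-bit audio at 96kHz this gives 1552 bytes with a 1ms interval, which is over the limit, and 208 bytes with a 125µs interval.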
 
Sorry for my late reply (I've been focusing on my PCB design this last week)! Thanks for the followup.

I instrumented with a mix of Linux's ALSA debugging info and Serial4 output on the Teensy (using the debug printf solution scattered around the library). sync_event is called at a regular interval for me. The feedback_accumulator is updated in the update() function since that's directly tied to the "rate of consumption" from something like the I2S output, which does make sense, as you want to know the difference between the sample rate of the I2S interface and that of the USB host.

I've noticed while leaving my Teensyaudio on overnight that overruns/underruns still occur though, so something still seems off with the async feedback algorithm. Haven't gotten to the bottom of it yet.

As for high-sample-rate high-bit-rate audio limitations with wMaxPacketSize, I haven't approached that yet, but I imagine USB Audio 2.0 has better flexibility in that regard? One thing to note is that bInterval seems to not be *allowed* to be the value it's set at in the Teensy's mainline, at least as I could read from the specification. I left notes as I went through the code in this commit, which might be helpful although I can't guarantee the accuracy of it. Consider it a reverse engineer's in-progress notes :p.

https://github.com/mcginty/teensy-cores/commit/d10ba133e18371374478d05d2571e15ba70015d6
 
Well that's all very odd. On Windows I simply can't see the sync_event() function being called more than once, though after a closer reading of the code I can see it's supposed to be a callback from somewhere within the depths of the USB library. I don't have a Linux box to test on.

Please do look at the USB 2.0 specification, specifically section 9.6.6 Table 9-13 on Standard Endpoint Descriptor, and ignore the ancient comments that Paul put in about "Standard AS Isochronous Audio Data Endpoint Descriptor USB DCD for Audio Devices 1.0, Section 4.6.1.1, Table 4-20, page 61-62" - there is such a descriptor for the Full Speed (12Mb/s) configuration lower down in usb_desc.c, which has bInterval correctly set to the only valid value of 1, but this is definitely wrong for a High Speed device, which has historically used 4 to get an interval of 2^(4-1) = 8 microframes = 1ms, and said as much in the comments. I say "definitely wrong", it may be that the resulting poll interval of 125µs works for some magic reason I've not yet fathomed, but most of the code currently appears to expect an interval of 1ms so I can't see how!
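
For reference, the bInterval arithmetic can be written out as a small sketch. It covers High Speed isochronous endpoints only, and `hs_iso_period_us` is a made-up name rather than something in usb_desc.c.

```cpp
#include <cstdint>

// High Speed isochronous endpoint: period = 2^(bInterval-1) microframes,
// each microframe being 125 us. Valid bInterval values are 1..16
// (USB 2.0, Table 9-13). Returns 0 for an out-of-range value.
constexpr uint32_t hs_iso_period_us(uint8_t bInterval)
{
    return (bInterval >= 1 && bInterval <= 16) ? (125u << (bInterval - 1)) : 0u;
}
```

So `hs_iso_period_us(4)` is 1000 (8 microframes, 1ms) and `hs_iso_period_us(1)` is 125.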
 
Oops, thank you for correcting me there. Sorry, I'm very new to developing with USB. Well, the poll interval of 125µs did seem to work just fine which is good news :). I suppose it makes sense, as the buffers just get filled up as data comes in, and the async feedback mechanism as you noted is only changed in the update() function whose timing is not controlled by the USB interval. The good news is that now it seems more clear that the interrupt handling code is truly timing-agnostic ;P.

How are you checking your sync_event() function being called via Windows?
 
To be fair, I've never delved this far into USB before either ... still need to check the 125µs poll does work for me.

I've been checking the sync_event() by incrementing a counter within the function in the Teensy code, then outputting the count value every 0.5s or so via the serial port (I'm compiling for Serial+MIDI+Audio). Yes, I did make sure the counter is declared volatile! And it hardly ever counts - typically only at start-up, plus when Windows re-connects after sleeping. Again, more checking needed, I've been diverted by trying to flip between different sample rates and I/O counts without having to uninstall ... oh, and the day job ...
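
Roughly what that instrumentation looks like, as a sketch only: the counter would be bumped inside sync_event() and the reporter polled from loop() with millis(). `note_sync_event` and `EventRateReporter` are hypothetical names, and the actual serial printing is left out.

```cpp
#include <cstdint>

// Incremented from the USB callback (interrupt context), read from loop().
volatile uint32_t sync_event_count = 0;

inline void note_sync_event()
{
    sync_event_count = sync_event_count + 1;  // single writer assumed
}

// Hypothetical helper: reports how many events arrived since the last
// report, at most once per period.
struct EventRateReporter {
    uint32_t period_ms;
    uint32_t last_ms = 0;
    uint32_t last_count = 0;

    explicit EventRateReporter(uint32_t period) : period_ms(period) {}

    // now_ms would be millis() on the Teensy. Returns true when a report is due.
    bool poll(uint32_t now_ms, uint32_t &events_since_last)
    {
        if (now_ms - last_ms < period_ms)
            return false;
        uint32_t count = sync_event_count;   // one 32-bit read of the volatile
        events_since_last = count - last_count;
        last_count = count;
        last_ms = now_ms;
        return true;
    }
};
```

Keeping the callback down to a single increment matters here, since heavier instrumentation inside the USB audio path changes the timing being measured.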
 
Sorry just to clarify, in your application are you using AudioInputUSB, AudioOutputUSB, or both?

sync_event() is only called for AudioInputUSB, as AudioOutputUSB reports itself as an adaptive endpoint and has no associated bSynchAddress set. While I haven't fully grokked the role of an adaptive endpoint, that seems to be the reason why sync_event() wouldn't be called for the outgoing case.

Looking at these two bits of information from the USB 2.0 Spec:

[Attached image: Screenshot 2022-12-13 at 15.22.20.jpg]

Adaptive source endpoints produce data at a rate that is controlled by the data sink. The sink provides feedback (refer to Section 5.12.4.2) to the source, which allows the source to know the desired data rate of the sink. For adaptive sink endpoints, the data rate information is embedded in the data stream. The average number of samples received during a certain averaging time determines the instantaneous data rate. If this number changes during operation, the data rate is adjusted accordingly.

I don't know why they have to word things so unclearly, but it seems like in the Adaptive case the Teensy *could* be receiving feedback and adjusting its datarate accordingly (resampling or adjusting a clock, presumably) and is not? It's unclear from reading this quickly whether the sink (the PC in this case) is then expected to resample on its side if the source does not.
 
I had been using both, but more recently only actively using AudioOutputUSB, though both objects are in the sketch. So that's probably the reason sync_event() isn't being called, but recent tries using both input and output have just resulted in crashes. As noted in #35, I need to revert back to a semi-working system and re-check whether I see sync_event() firing when AudioInputUSB is in use.

I don't think there's anything the Teensy is doing to sync to the PC for the Adaptive / AudioOutputUSB Endpoint - if it could, then you'd expect USB to be capable of update responsibility, and it's clearly documented as not being so. Plus there would be a need to sync audio adaptors and such to the USB, which would be fairly horrible! So as you've speculated elsewhere, possibly Adaptive is the wrong choice and both should be Asynchronous, with the PC being responsible for re-sampling as needed. That would appear to be the better design choice, as the PC has a lot more grunt.

It's gonna get even more hairy when we have a system where the master clock is from an S/PDIF input to the Teensy, and both the audio adaptor and USB run from that! I think we quietly ignore that scenario for now, and just use AsyncAudioInputSPDIF3.
 
if it could, then you'd expect USB to be capable of update responsibility, and it's clearly documented as not being so.
Yes sorry, by "could" I meant that according to the USB specification, the Teensy has the *option* of reacting to explicit sink feedback, but currently does not. Strangely, grepping for uses of "usb_audio_sync_feedback", that variable is modified in tx_event() but never used, which indicates that maybe some adaptive efforts were a work-in-progress?

It's gonna get even more hairy when we have a system where the master clock is from an S/PDIF input to the Teensy, and both the audio adaptor and USB run from that! I think we quietly ignore that scenario for now, and just use AsyncAudioInputSPDIF3.

Ah yeah that would get very interesting. In my current project, I'm using AudioOutputTDM so my input feedback drifts based on the difference between the host clock and the TDM/I2S clock.

Don't know if this is interesting, but I left my Teensy on overnight playing audio from my Linux host, and logged the ALSA "Momentary Frequency" (the observed sample rate based on the datarate from the Teensy) every 300ms, and this is what the reported samplerate looks like as it drifts back/forth from the target 48000Hz samplerate I'm using:

[Attached image: plot.png]
 
Strangely, grepping for uses of "usb_audio_sync_feedback", that variable is modified in tx_event() but never used, which indicates that maybe some adaptive efforts were a work-in-progress?
True ... it all has the flavour of something punted out in a half-finished state because it mostly worked and there was more urgent stuff to do ... bit like a lot of my code!

I definitely see (relative) clock drift, the glitches tend to start out every ~30s at the beginning of the day, and get slightly more frequent as time goes on. That is an interesting graph, and strongly suggests the feedback calculation needs a good seeing-to; I think it's been commented on somewhere previous in this thread (yes, posts #21 to #25) ... can you tell where / when the audible glitches occur, relative to the observed sample rate at the time?
 
Is your glitching at 96kHz, or the library-standard 44.1kHz?

While I observed that drift, there were no underflows or overflows, so the feedback code works but could maybe be improved to not allow that slow drift (one "period" between 47990 and 48010 was on the order of 60 seconds at 48kHz). It's worth noting that I also did experience over/underflows when I used the default buffer size of 128, and had to increase it to 256 to stop seeing them. This could be reduced again if the feedback code allows for less drift, however.
 
At 96kHz, hence mucking about to be able to test at various different sample rates without having to give Windows a kicking every time I change something - the caching is a right pain then! I’ll put changing the audio block size on the list, too (assume that’s what you mean, the buffer is typically bigger than that for pretty much any channel count over the original 2).
 
OK, so I've done a PR to align my automatic selection of bInterval for High Speed with your documentation effort. Seems "no worse than before"...

You're right, I do get sync_event() calls if I send to the Teensy; I think the crashes were due to the various printf() calls in usb_audio.cpp: having diked (some of) those out I could test the AudioInputUSB without the system collapsing on me. Didn't get to the AUDIO_BLOCK_SAMPLES testing yet.
 
Sorry I missed your PR from a couple days ago! I'll check it out. FWIW, I was spending time myself with various levels of complexity in trying to fix the feedback overrun/underrun problem, and have ended up with this: https://github.com/mcginty/teensy-cores/commit/0382c12488e5d824344ebe5da4aef19dfc94c7e1

It takes the accumulator variable and instead turns that "diff" calculation into an averaged-out "pressure" over time that is used as a multiplier onto the nominal requested sample rate. I'll add more documentation to that code soon, but this seems to have fixed the feedback issue so far. I'll keep it running and see if any issues arise.
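
To give a flavour of the idea without pasting the commit: the sketch below is NOT the code from the linked commit, just one way an averaged "pressure" could scale the nominal rate. The exponential moving average, the constants `alpha` and `gain`, the ±0.2% clamp and the name `FeedbackController` are all assumptions, as is the 2^24 scaling carried over from the original snippet.

```cpp
#include <cstdint>

// Hypothetical smoothed feedback controller (illustrative only).
struct FeedbackController {
    double nominal_per_ms;   // e.g. 96.0 for 96 kHz
    double pressure = 0.0;   // smoothed buffer error, in samples
    double alpha = 0.01;     // smoothing factor (assumed)
    double gain = 1e-5;      // rate change per sample of pressure (assumed)

    explicit FeedbackController(double per_ms) : nominal_per_ms(per_ms) {}

    // diff: target buffer fill minus actual fill, in samples.
    // Positive means the buffer is running low, so request a faster rate.
    uint32_t update(int diff)
    {
        pressure += alpha * (diff - pressure);   // exponential moving average
        double ratio = 1.0 + gain * pressure;
        if (ratio > 1.002) ratio = 1.002;        // clamp to a small window
        if (ratio < 0.998) ratio = 0.998;
        // Samples per 1 ms frame, scaled by 2^24, rounded to nearest.
        return static_cast<uint32_t>(nominal_per_ms * ratio * (1 << 24) + 0.5);
    }
};
```

The point of averaging is that a single late or early packet nudges the requested rate only slightly, instead of flipping it between fixed "fast" and "slow" values.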
 
Quick update, I'm now able to stream 8 channels of 96kHz 16-bit audio with no overruns/underruns and a pretty stable "momentary frequency." Pushed my code to my repository.
 
Got it, total failure at this end. Not a sausage. B****r all. I am disappointed with this result...

Python election sketches aside, I'm getting a fair few warnings... I've also not re-re-re-merged all my changes which create valid USB descriptors and also allow me to test with serial support by selecting the Serial+MIDI+Audio build option. That's all in my PR, should have no effect on your feedback code.

EDIT: now my code is failing at 96k :mad: output seems functional but input (to Teensy) isn’t. No idea why not, but if you try my PR be prepared for disappointment…
 
Hah, well let me know how that goes as it continues to not be a sausage.

I left the Teensy on overnight at 8x 96kHz 16bit and no over/under runs and a relatively stable Momentary Frequency hovering around 96001 and 96002Hz. I also added some experimental code to have a "flood gate" that won't report or react to underflows until the first "ready" buffer is... ready.
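
The "flood gate" is conceptually just a latch. A minimal sketch of the idea follows; `UnderrunGate` and its members are illustrative names, not the ones in my repository.

```cpp
#include <cstdint>

// Hypothetical gate that ignores underruns until the stream has primed.
struct UnderrunGate {
    bool primed = false;      // set once the first full buffer has arrived
    uint32_t underruns = 0;   // underruns counted after priming

    void on_buffer_ready() { primed = true; }

    // Returns true only if the underrun should be counted and acted upon.
    bool on_underrun()
    {
        if (!primed)
            return false;     // still filling for the first time: ignore
        ++underruns;
        return true;
    }
};
```

That way the start-up period, when the buffer is empty by definition, does not feed bogus underflows into the feedback calculation.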

Seems promising!
 
Will do, though spare time is going to be scarce for a while, I suspect … good news that you’ve had stable results, suggests strongly that I’m mucking up somewhere along the line!

One thing I forgot to ask, have you tried 128-sample blocks at 44.1kHz? I noticed for 96k you’d gone to 240-sample blocks, as noted previously. Once feedback is working for both output and input it’d be good to figure out if it’s possible to use the standard block size, as a fair few objects fail otherwise.
 
Just tried 44.1kHz with 128 sample blocks and didn't notice any problems. Haven't left it running overnight, though :).
 
PR now amended. I've had to back out my attempts to make USB packet sizes valid, as they seem to break something fundamental. However, I've fixed a few intermittent crashes, the code to allow all Audio+<stuff> build options is still useful, and I've synced up with Paul's latest cores updates, so I think it's still worth merging, as long as it works for you! Tested at 44.1kHz / 128 samples and 96k / 240 samples.
 