Lower latency audio

Status
Not open for further replies.
https://www.pjrc.com/teensy/td_libs_AudioNewObjects.html

I see from this that each step in an audio chain should take about 2.9 milliseconds.
This is regardless of how fast the Teensy works or how little it has to do.
It will wait 128 samples before updating again. And if the work is not ready it should cause clipping like it would on a PC DAW.

I read the average audio hardware latency to be about 7ms.

So that's 2.9 * 3 = 8.7 ms.
Counting on the input and output adding delay, I am guessing 10 ms total would be about right. I should consider that the upper limit of what's acceptable.

The instructions however do offer up the option to lower the samples from 128 to some other number.
Maybe this number can be arbitrary but maybe not. Considering it is a power of 2 I am guessing not.

Changing it to 64 samples per update should shift the update time to 1.45ms. So a theoretical 6 steps in the audio chain for the same time period.

32 samples seems less plausible but it would allow 12 steps.
This assumes that the Teensy is able to do the work in that time. Load times would probably also become more of an issue. And it is probably less efficient in all sorts of ways, otherwise a lower sample size would be standard.
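
To sanity-check the block-time figures above, here is a minimal sketch of the arithmetic. It assumes a nominal 44.1 kHz sample rate; `kSampleRateHz` and `blockPeriodMs` are names made up for this illustration, not part of the Audio library.

```cpp
#include <cassert>

// Assumption: nominal 44.1 kHz sample rate (illustrative constant,
// not taken from the Audio library headers).
constexpr double kSampleRateHz = 44100.0;

// Time covered by one audio block, in milliseconds.
double blockPeriodMs(int blockSamples, double sampleRateHz = kSampleRateHz) {
    assert(blockSamples > 0 && sampleRateHz > 0.0);
    return 1000.0 * blockSamples / sampleRateHz;
}
```

With these assumptions, `blockPeriodMs(128)` comes out at roughly 2.9 ms, `blockPeriodMs(64)` at roughly 1.45 ms, and `blockPeriodMs(32)` at roughly 0.73 ms. This only gives the length of one block; it says nothing about how many blocks of delay a given design actually incurs.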

Anyway is this roughly accurate?

The other question I have that is slightly related is, how much latency is added from sending i2s from one Teensy to a second Teensy? I am guessing it is something like 3-5ms. And what about a Teensy sending i2s to the DAC?
 
Some but not all audio objects can use a different power-of-two for the samples per block - I know the FFT
objects assume 128, but many are agnostic. I wouldn't risk lower than 8 though, as the audio objects'
update code often uses unrolled loops processing chunks of samples at a time.

I am unsure where the power of two constraint applies, but I suspect some DMA driven I/O objects assume it.
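
If you do change the block size, a compile-time guard can catch values that break the rules of thumb in this thread. This is only a sketch: the power-of-two rule and the figure of 8 come from the discussion here, not from any documented library requirement, and `isPowerOfTwo` is a helper written for this example. The fallback `#define` is there purely so the snippet compiles outside a Teensy build.

```cpp
// Stand-in default so this compiles without the Teensy core headers.
// In a real build the value comes from the core / build flags.
#ifndef AUDIO_BLOCK_SAMPLES
#define AUDIO_BLOCK_SAMPLES 128
#endif

// True when n has exactly one bit set (1, 2, 4, 8, ...).
constexpr bool isPowerOfTwo(unsigned n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Rules of thumb from this thread, treated as assumptions rather than
// hard library limits.
static_assert(isPowerOfTwo(AUDIO_BLOCK_SAMPLES),
              "AUDIO_BLOCK_SAMPLES should be a power of two");
static_assert(AUDIO_BLOCK_SAMPLES >= 8,
              "block sizes below 8 may break unrolled update loops");
```

Passing these checks does not mean every audio object will work at the chosen size; it only rules out the obviously risky values.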

Some ADCs and DACs have lower latency: basically, if the chip is not sigma-delta there should be only a single
sample of latency, whereas sigma-delta chips add many samples of latency for their digital processing.
 
Not quite right, no :)

Although the audio designer shows the signal flow as a chain (or a mesh for more complex designs), the whole "engine" is run in its entirety every 2.9ms. Neglecting any buffering in the input and output hardware, you will thus get a latency of only 2.9ms from input to output ... except ... if your engine has a "loop" in it, where a later stage drives an earlier one, then there will be an additional 2.9ms on that loopback path only. Adding extra steps will thus not, in general, increase overall latency.
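
The point above can be illustrated with a toy model. This is not the Audio library's code, just a sketch assuming every stage runs once per update tick in signal-flow order, with an optional loopback from the last stage to the first. `arrivalTick` and its parameters are invented for this example, and the ticks it counts are extra block periods on top of whatever the input and output buffering costs.

```cpp
#include <cassert>
#include <vector>

// A marker block enters stage 0 on tick 0. Each block is tagged with how
// many times it has gone round the loopback; -1 means silence.
// Returns the tick on which a block tagged `laps` first leaves the last stage.
int arrivalTick(int stages, int laps) {
    assert(stages > 0 && laps >= 0);
    std::vector<int> out(stages, -1);  // each stage's output for this tick
    int fedBack = -1;                  // last stage's output from the previous tick
    for (int tick = 0; tick <= laps; ++tick) {
        // Stage 0 sees the fresh marker on tick 0, afterwards only the loopback.
        int input = (tick == 0) ? 0 : (fedBack >= 0 ? fedBack + 1 : -1);
        for (int s = 0; s < stages; ++s) {
            // Stage s-1 has already run this tick, so its block is ready now.
            out[s] = (s == 0) ? input : out[s - 1];
        }
        if (out[stages - 1] == laps) return tick;
        fedBack = out[stages - 1];     // only usable on the next tick
    }
    return -1;  // not reached for valid arguments
}
```

In this model `arrivalTick(1, 0)` and `arrivalTick(12, 0)` both give 0, so chain length adds no ticks, while each trip round the loopback adds one: `arrivalTick(3, 1)` is 1 and `arrivalTick(3, 2)` is 2.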

As noted by @MarkT, the internals of the various AudioStream objects can be a bit arbitrary, inconsistent and undocumented! My "deepest" experience to date is of updating the envelope generator, which deals with chunks of 8 samples at a time - other objects may well use different granularity. The easiest thing to do is build your application using the default libraries, see if the latency is OK, and then try reducing the blocksize gradually until it keels over...

Cheers

Jonathan
 
I use AUDIO_BLOCK_SAMPLES of 16 for effects to keep latency low. At that block size the only object I've found not to work is the USB audio out.

I haven't been able to work out how to consistently calculate expected latency (@jonathan your 2.9ms end to end doesn't ring true with practical experience), so I typically rely on measuring it with a scope, in to out. I normally do have a feedback loop or two, but with the standard block size (128) I can get quite significant latency measured in to out that is greatly reduced by changing the block size.

Cheers Paul
 
* Adding to my last post: I believe the WAV file player and FFT will also fail at 16 samples per block.

Also, @jonathan's advice to try it, see if latency is too much and if so reduce the block size is a good approach.
 
h4yn0nnym0u5e this is great news.
So 2.9 ms for the oscillator Teensy to get 128 samples done. I am guessing only then does it start outputting those samples to i2s-out. I think i2s is then more or less real time.

I am planning on having one or more oscillator teensy feed into a mixer/effect teensy.

Timeline
0.1ms note signal comes in.
2.9ms teensy 1 finishes 128 samples
5.8ms teensy 2 receives the last of 128 samples
8.7ms teensy 2 finishes 128 samples
8.72ms teensy 2 begins sending 128 samples to i2s DAC which immediately sends out the audio signal.

If it works something like this then doing the work to get it to 64 sample size might be worth it but it shouldn't be a deal breaker, and should be done at the end.
I am a hack when it comes to noticing latency, but I know many people can tell ~8.7ms from 4.35ms. A general feel that it is slightly less good, less responsive because of lag.

houtson this is also good news. If effects can really be run at a 16-sample block size then the latency added by the second Teensy becomes much more negligible.
Just getting the second teensy down to 64 sample size would bring the predicted latency from 8.7 to 5.8. Because i2s can be split to multiple sources, the output could go to a logic board as well.
That teensy or esp32 could run analysis objects and draw the GUI to the screen.
 
I encourage you to actually measure the latency.

Who, me? Well, I did, anyway :)

I created a simple audio engine, and physically connected i2s2 out R to i2s1 in R. Hence i2s2 out L will be the same as out R, but delayed by the overall latency of the engine. I then recorded the USB output, and the result is:
[Image: Simple engine latency.jpg]

The selected area shows the in-to-out latency to be 281 samples, or about 6.4ms. So there's an additional latency of 153 samples over and above the expected 128. This is with a Teensy 4.1 and rev D audio board. I then made the in-to-out chain more complex, but ended up with the same latency, as I expected:
[Image: Chained engine latency.jpg]
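
For anyone wanting to redo the numbers from the measurement, here is a small sketch that converts a measured sample count into milliseconds and splits off the part beyond one block. It assumes the recording is at 44.1 kHz; `LatencySplit` and `splitLatency` are names made up for this example. It does not say where the extra samples come from, which isn't established in this thread.

```cpp
#include <cassert>

struct LatencySplit {
    double totalMs;    // whole measured latency, in milliseconds
    int extraSamples;  // samples over and above one block
    double extraMs;    // the same excess, in milliseconds
};

// Split a measured in-to-out delay into "one block" plus "everything else".
LatencySplit splitLatency(int measuredSamples, int blockSamples,
                          double sampleRateHz) {
    assert(blockSamples > 0 && measuredSamples >= blockSamples);
    assert(sampleRateHz > 0.0);
    LatencySplit r;
    r.totalMs = 1000.0 * measuredSamples / sampleRateHz;
    r.extraSamples = measuredSamples - blockSamples;
    r.extraMs = 1000.0 * r.extraSamples / sampleRateHz;
    return r;
}
```

Calling `splitLatency(281, 128, 44100.0)` gives a total of about 6.4 ms, with 153 extra samples (about 3.5 ms) beyond the 128-sample block.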
Here's the code I used; uncomment the #define SIMPLE_TEST line to revert to the simpler engine:
/*
 * Generate a frequency-modulated waveform, output it
 * to the right channel, and feed the input
 * right channel to the left channel.
 *
 * In hardware, connect the right output to the right
 * input, which will allow us to check the latency.
 */
#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

//#define SIMPLE_TEST
#if defined SIMPLE_TEST
// GUItool: begin automatically generated code
AudioSynthWaveform       waveform1;      //xy=160,506
AudioSynthWaveformModulated waveformMod1;   //xy=332,512
AudioSynthWaveform       waveform2;      //xy=348,563
AudioInputI2S            i2s1;           //xy=513,437
AudioMixer4              mixer1;         //xy=518,531
AudioOutputI2S           i2s2;           //xy=721,525
AudioOutputUSB           usb1;           //xy=722,449
AudioConnection          patchCord1(waveform1, 0, waveformMod1, 0);
AudioConnection          patchCord2(waveformMod1, 0, mixer1, 0);
AudioConnection          patchCord3(waveform2, 0, mixer1, 1);
AudioConnection          patchCord4(i2s1, 1, i2s2, 0);
AudioConnection          patchCord5(i2s1, 1, usb1, 0);
AudioConnection          patchCord6(mixer1, 0, i2s2, 1);
AudioConnection          patchCord7(mixer1, 0, usb1, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=714,576
// GUItool: end automatically generated code


#else

// GUItool: begin automatically generated code
AudioInputI2S            i2s1;           //xy=247,249
AudioSynthWaveformDc     dc1;            //xy=248,295
AudioSynthWaveform       waveform1;      //xy=295,349
AudioEffectMultiply      multiply1;      //xy=386,262
AudioSynthWaveformModulated waveformMod1;   //xy=480,356
AudioSynthWaveform       waveform2;      //xy=498,400
AudioEffectWaveshaper    waveshape1;     //xy=550,266
AudioMixer4              mixer1;         //xy=704,285
AudioMixer4              mixer2;         //xy=704,386
AudioOutputUSB           usb1;           //xy=917,321
AudioOutputI2S           i2s2;           //xy=917,361
AudioConnection          patchCord1(i2s1, 1, multiply1, 0);
AudioConnection          patchCord2(dc1, 0, multiply1, 1);
AudioConnection          patchCord3(waveform1, 0, waveformMod1, 0);
AudioConnection          patchCord4(multiply1, waveshape1);
AudioConnection          patchCord5(waveformMod1, 0, mixer2, 0);
AudioConnection          patchCord6(waveform2, 0, mixer2, 1);
AudioConnection          patchCord7(waveshape1, 0, mixer1, 0);
AudioConnection          patchCord8(mixer1, 0, usb1, 0);
AudioConnection          patchCord9(mixer1, 0, i2s2, 0);
AudioConnection          patchCord10(mixer2, 0, usb1, 1);
AudioConnection          patchCord11(mixer2, 0, i2s2, 1);
AudioControlSGTL5000     sgtl5000_1;     //xy=906,258
// GUItool: end automatically generated code

#endif

const int myInput = AUDIO_INPUT_LINEIN;
//const int myInput = AUDIO_INPUT_MIC;


void setup() {
  Serial.begin(115200);
  // Audio connections require memory to work.  For more
  // detailed information, see the MemoryAndCpuUsage example
  AudioMemory(20);

  // Enable the audio shield, select input, and enable output
  sgtl5000_1.enable();
  sgtl5000_1.inputSelect(myInput);
  sgtl5000_1.volume(0.5);

  waveformMod1.begin(0.8f,100.0f,WAVEFORM_TRIANGLE);
  waveformMod1.frequencyModulation(1.0f);
  waveform1.begin(0.5f,1.0f,WAVEFORM_TRIANGLE); // modulate @ 1Hz
  waveform2.begin(0.1f,700.0f,WAVEFORM_TRIANGLE); // "marker" waveform

#if !defined(SIMPLE_TEST)
  float shapeArray[]={-1.0f,-0.25f,0.0f,0.25f,1.0f};
  waveshape1.shape(shapeArray,5);
  dc1.amplitude(1.0f);
  mixer1.gain(0,1.0f);
  mixer2.gain(0,1.0f);
  mixer2.gain(1,1.0f);
#endif // !defined(SIMPLE_TEST)
}

void loop() {
 delay(1000);
 Serial.println("blip!");
}

Cheers

Jonathan
 
Did you measure it compared to USB? USB has its own latency.. :)

Don’t think I needed to, using this technique - USB-right is the reference signal, and USB-left the delayed one. I believe the OP was interested in i2s in to out delay, and I think I’ve measured that. Do show me where my logic falls down, if you think it does :D
 