Thread: Lower latency audio

  1. #1
    Junior Member
    Join Date
    Jul 2021
    Posts
    8

    Lower latency audio

    https://www.pjrc.com/teensy/td_libs_...ewObjects.html

    I see from this that each step in an audio chain should take about 2.9 milliseconds.
    This is regardless of how quickly the Teensy does the work, or how little work there is.
    It will wait 128 samples before updating again. And if the work is not ready in time it should cause clipping, like it would on a PC DAW.

    I read the average audio hardware latency to be about 7ms.

    So three steps would give 2.9 * 3 = 8.7 ms.
    Counting on the input and output adding delay, I am guessing 10 ms total would be about right. I should consider that the upper limit of what's acceptable.

    The instructions however do offer up the option to lower the samples from 128 to some other number.
    Maybe this number can be arbitrary but maybe not. Considering it is a power of 2 I am guessing not.

    Changing it to 64 samples per update should shift the update time to 1.45ms. So a theoretical 6 steps in the audio chain for the same time period.

    32 samples seems less plausible but it would allow 12 steps.
    This assumes that the teensy is able to do the work in that time. Load times would probably also become more of an issue. And it is probably less efficient in all sorts of ways, otherwise lower sample size would be standard.

    Anyway is this roughly accurate?

    The other question I have, which is slightly related: how much latency is added by sending I2S from one Teensy to a second Teensy? I am guessing it is something like 3-5 ms. And what about a Teensy sending I2S to the DAC?
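    The block-period arithmetic in this post can be checked with a few lines of host-side C++. This is only a sketch: the 44100 Hz sample rate is an assumption (the exact rate on a given Teensy may differ slightly), and `blockPeriodMs` and `printBlockPeriods` are hypothetical helper names, not part of the Audio library.

```cpp
#include <cassert>
#include <cstdio>

// Assumed nominal sample rate; the exact rate on real hardware may differ slightly.
constexpr double kSampleRateHz = 44100.0;

// Duration of one audio block in milliseconds.
inline double blockPeriodMs(int samplesPerBlock, double sampleRateHz = kSampleRateHz) {
    assert(samplesPerBlock > 0 && sampleRateHz > 0.0);
    return 1000.0 * samplesPerBlock / sampleRateHz;
}

// Print the block period for the block sizes discussed in the post.
inline void printBlockPeriods() {
    const int sizes[] = {128, 64, 32};
    for (int n : sizes) {
        std::printf("%3d samples -> %.2f ms\n", n, blockPeriodMs(n));
    }
}
```

    Halving the block size halves the period, which is where the 2.9 ms and 1.45 ms figures come from. It says nothing about whether a given audio object still works at the smaller size.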

  2. #2
    Senior Member
    Join Date
    Jul 2020
    Posts
    1,791
    Some but not all audio objects can use a different power-of-two for the samples per block - I know the FFT
    objects assume 128, but many are agnostic. I wouldn't risk going lower than 8 though, as the audio objects'
    update code often uses unrolled loops that process several samples at a time.

    I am unsure where the power of two constraint applies, but I suspect some DMA driven I/O objects assume it.

    Some ADCs and DACs have lower latency: basically, if the chip is not sigma-delta there should be only a
    single sample of latency, whereas sigma-delta chips add many samples of latency for their digital processing.

  3. #3
    Senior Member h4yn0nnym0u5e's Avatar
    Join Date
    Apr 2021
    Location
    Cambridgeshire, UK
    Posts
    638
    Not quite right, no

    Although the audio designer shows the signal flow as a chain (or a mesh for more complex designs), the whole "engine" is run in its entirety every 2.9ms. Neglecting any buffering in the input and output hardware, you will thus get a latency of only 2.9ms from input to output ... except if your engine has a "loop" in it, where a later stage drives an earlier one: then there will be an additional 2.9ms on that loopback path only. Adding extra steps will thus not, in general, increase overall latency.

    As noted by @MarkT, the internals of the various AudioStream objects can be a bit arbitrary, inconsistent and undocumented! My "deepest" experience to date is of updating the envelope generator, which deals with chunks of 8 samples at a time - other objects may well use different granularity. The easiest thing to do is build your application using the default libraries, see if the latency is OK, and then try reducing the blocksize gradually until it keels over...

    Cheers

    Jonathan
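    The point that extra stages do not add block delays can be illustrated with a toy model. This is not how the Audio library is implemented internally. It is a minimal sketch in which `Block`, `Stage`, `runUpdate` and `impulsePositionAfter` are all hypothetical names, and every stage is a memoryless gain (a real filter would add its own delay within the block).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

constexpr std::size_t kBlockSamples = 128;      // default block size mentioned in the thread
using Block = std::array<float, kBlockSamples>;
using Stage = std::function<void(Block&)>;      // hypothetical stand-in for an audio object

// One engine update: every stage runs once, in order, on the same block.
inline void runUpdate(const std::vector<Stage>& stages, Block& block) {
    for (const Stage& stage : stages) {
        stage(block);
    }
}

// Position of the first non-zero sample, or kBlockSamples if the block is silent.
inline std::size_t firstNonZero(const Block& block) {
    for (std::size_t i = 0; i < block.size(); ++i) {
        if (block[i] != 0.0f) return i;
    }
    return block.size();
}

// Push an impulse at `position` through `stageCount` gain stages in a single
// update and report where it lands in the output block.
inline std::size_t impulsePositionAfter(std::size_t stageCount, std::size_t position) {
    assert(position < kBlockSamples);
    Block block{};                               // all zeros
    block[position] = 1.0f;
    const Stage halfGain = [](Block& b) { for (float& s : b) s *= 0.5f; };
    const std::vector<Stage> stages(stageCount, halfGain);
    runUpdate(stages, block);
    return firstNonZero(block);
}
```

    In this model the impulse comes out at the same sample position whether it passes through one stage or six, because all stages run inside the same update. A feedback path would be the exception, since it can only see the previous update's output.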

  4. #4
    Senior Member houtson's Avatar
    Join Date
    Aug 2015
    Location
    Scotland
    Posts
    249
    I use AUDIO_BLOCK_SAMPLES of 16 for effects to keep latency low. At that block size the only object I've found not to work is the USB audio out.

    I haven't been able to work out how to consistently calculate expected latency (@jonathan, your 2.9ms end to end doesn't ring true with practical experience), so I typically rely on measuring it with a scope, in to out. I normally do have a feedback loop or two, but with the standard block size (128) I measure quite significant in-to-out latency, which is greatly reduced by changing the block size.

    Cheers Paul

  5. #5
    Senior Member houtson's Avatar
    Join Date
    Aug 2015
    Location
    Scotland
    Posts
    249
    * more to last - I believe the WAV file player and FFT will also fail at 16 samples per block.

    Also, @jonathan's advice to try it, see if the latency is too much, and if so reduce the block size is a good approach.
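    Before rebuilding with a smaller block size, a candidate value can be sanity-checked against the constraints mentioned so far in this thread (a power of two, and not below 8). The sketch below is a heuristic only: `isPowerOfTwo`, `isPlausibleBlockSize` and `checkedBlockSize` are hypothetical helpers, and passing the check does not mean every audio object works at that size (USB audio out, the WAV file player and FFT are reported above as failing at 16).

```cpp
#include <cassert>

// True if n is a power of two (1, 2, 4, 8, ...).
constexpr bool isPowerOfTwo(unsigned n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Heuristic taken from the posts above: power of two, and at least 8 samples.
constexpr bool isPlausibleBlockSize(unsigned n) {
    return isPowerOfTwo(n) && n >= 8;
}

// Compile-time checks for the sizes discussed in the thread.
static_assert(isPlausibleBlockSize(128), "default block size should pass");
static_assert(isPlausibleBlockSize(16), "houtson's effects block size should pass");
static_assert(!isPlausibleBlockSize(4), "below 8 is considered too risky");

// Runtime variant: returns n unchanged, or aborts if n fails the heuristic.
inline unsigned checkedBlockSize(unsigned n) {
    assert(isPlausibleBlockSize(n));
    return n;
}
```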

  6. #6
    Junior Member
    Join Date
    Jul 2021
    Posts
    8
    h4yn0nnym0u5e this is great news.
    2.9 milliseconds for the oscillator Teensy to get 128 samples done. I am guessing only then does it start outputting those samples to i2s-out. I think i2s is then more or less real time.

    I am planning on having one or more oscillator teensy feed into a mixer/effect teensy.

    Timeline
    0.1ms note signal comes in.
    2.9ms teensy 1 finishes 128 samples
    5.8ms teensy 2 receives the last of 128 samples
    8.7ms teensy 2 finishes 128 samples
    8.72ms teensy 2 begins sending 128 samples to i2s DAC which immediately sends out the audio signal.

    If it works something like this then doing the work to get it to 64 sample size might be worth it but it shouldn't be a deal breaker, and should be done at the end.
    I am a hack when it comes to noticing latency, but I know many people can tell ~8.7ms from 4.35ms. A general feel that it is slightly less good, less responsive because of lag.

    houtson this is also good news. If effects can really be run at a block size of 16 samples, then the latency added by the second Teensy becomes much more negligible.
    Just getting the second Teensy down to a block size of 64 samples would bring the predicted latency from 8.7ms to 5.8ms. Because i2s can be split to multiple destinations, the output could go to a logic board as well.
    That Teensy or ESP32 could run analysis objects and draw the GUI to the screen.
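    The 8.7ms and 5.8ms predictions follow from the timeline in this post: one block period on the oscillator Teensy, then one block period for the effects Teensy to receive the samples and one more to process them. The sketch below only reproduces that arithmetic. It is the poster's untested model, not a measurement, and the 44100 Hz rate and the names `blockMs` and `predictedTwoTeensyLatencyMs` are assumptions made for illustration.

```cpp
#include <cassert>

// Assumed nominal sample rate; the exact hardware rate may differ slightly.
constexpr double kAssumedSampleRateHz = 44100.0;

// Duration of one block of `samples` samples, in milliseconds.
inline double blockMs(int samples) {
    assert(samples > 0);
    return 1000.0 * samples / kAssumedSampleRateHz;
}

// Poster's model: the oscillator Teensy spends one of its block periods
// rendering, then the effects Teensy spends one of its block periods
// receiving over I2S and another one processing.
inline double predictedTwoTeensyLatencyMs(int oscillatorBlock, int effectBlock) {
    return blockMs(oscillatorBlock) + 2.0 * blockMs(effectBlock);
}
```

    With both devices at 128 samples this gives about 8.7ms, and with the effects Teensy at 64 samples about 5.8ms, matching the figures in the post.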

  7. #7
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    1,089
    I encourage you to actually measure the latency.

  8. #8
    Senior Member h4yn0nnym0u5e's Avatar
    Join Date
    Apr 2021
    Location
    Cambridgeshire, UK
    Posts
    638
    Quote Originally Posted by jonr View Post
    I encourage you to actually measure the latency.
    Who, me? Well, I did, anyway

    I created a simple audio engine, and physically connected i2s2 out R to i2s1 in R. Hence i2s2 out L will be the same as out R, but delayed by the overall latency of the engine. I then recorded the USB output, and the result is:
    [Attached image: Simple engine latency.jpg]

    The selected area shows the in-to-out latency to be 281 samples, or about 6.4ms. So there's an additional latency of 153 samples over and above the expected 128. This is with a Teensy 4.1 and rev D audio board. I then made the in-to-out chain more complex, but ended up with the same latency, as I expected:
    [Attached image: Chained engine latency.jpg]
    Here's the code I used: uncomment the "#define SIMPLE_TEST" line to revert to the simpler engine:
    Code:
    /*
     * Generate a frequency-modulated waveform, output it 
     * to the right channel, and feed the input
     * right channel to the left channel.
     * 
     * In hardware, connect the right output to the right
     * input, which will allow us to check the latency.
     */
    #include <Audio.h>
    #include <Wire.h>
    #include <SPI.h>
    #include <SD.h>
    #include <SerialFlash.h>
    
    //#define SIMPLE_TEST
    #if defined SIMPLE_TEST
    // GUItool: begin automatically generated code
    AudioSynthWaveform       waveform1;      //xy=160,506
    AudioSynthWaveformModulated waveformMod1;   //xy=332,512
    AudioSynthWaveform       waveform2;      //xy=348,563
    AudioInputI2S            i2s1;           //xy=513,437
    AudioMixer4              mixer1;         //xy=518,531
    AudioOutputI2S           i2s2;           //xy=721,525
    AudioOutputUSB           usb1;           //xy=722,449
    AudioConnection          patchCord1(waveform1, 0, waveformMod1, 0);
    AudioConnection          patchCord2(waveformMod1, 0, mixer1, 0);
    AudioConnection          patchCord3(waveform2, 0, mixer1, 1);
    AudioConnection          patchCord4(i2s1, 1, i2s2, 0);
    AudioConnection          patchCord5(i2s1, 1, usb1, 0);
    AudioConnection          patchCord6(mixer1, 0, i2s2, 1);
    AudioConnection          patchCord7(mixer1, 0, usb1, 1);
    AudioControlSGTL5000     sgtl5000_1;     //xy=714,576
    // GUItool: end automatically generated code
    
    
    #else
    
    // GUItool: begin automatically generated code
    AudioInputI2S            i2s1;           //xy=247,249
    AudioSynthWaveformDc     dc1;            //xy=248,295
    AudioSynthWaveform       waveform1;      //xy=295,349
    AudioEffectMultiply      multiply1;      //xy=386,262
    AudioSynthWaveformModulated waveformMod1;   //xy=480,356
    AudioSynthWaveform       waveform2;      //xy=498,400
    AudioEffectWaveshaper    waveshape1;     //xy=550,266
    AudioMixer4              mixer1;         //xy=704,285
    AudioMixer4              mixer2;         //xy=704,386
    AudioOutputUSB           usb1;           //xy=917,321
    AudioOutputI2S           i2s2;           //xy=917,361
    AudioConnection          patchCord1(i2s1, 1, multiply1, 0);
    AudioConnection          patchCord2(dc1, 0, multiply1, 1);
    AudioConnection          patchCord3(waveform1, 0, waveformMod1, 0);
    AudioConnection          patchCord4(multiply1, waveshape1);
    AudioConnection          patchCord5(waveformMod1, 0, mixer2, 0);
    AudioConnection          patchCord6(waveform2, 0, mixer2, 1);
    AudioConnection          patchCord7(waveshape1, 0, mixer1, 0);
    AudioConnection          patchCord8(mixer1, 0, usb1, 0);
    AudioConnection          patchCord9(mixer1, 0, i2s2, 0);
    AudioConnection          patchCord10(mixer2, 0, usb1, 1);
    AudioConnection          patchCord11(mixer2, 0, i2s2, 1);
    AudioControlSGTL5000     sgtl5000_1;     //xy=906,258
    // GUItool: end automatically generated code
    
    #endif 
    
    const int myInput = AUDIO_INPUT_LINEIN;
    //const int myInput = AUDIO_INPUT_MIC;
    
    
    void setup() {
      Serial.begin(115200);
      // Audio connections require memory to work.  For more
      // detailed information, see the MemoryAndCpuUsage example
      AudioMemory(20);
    
      // Enable the audio shield, select input, and enable output
      sgtl5000_1.enable();
      sgtl5000_1.inputSelect(myInput);
      sgtl5000_1.volume(0.5);
    
      waveformMod1.begin(0.8f,100.0f,WAVEFORM_TRIANGLE);
      waveformMod1.frequencyModulation(1.0f);
      waveform1.begin(0.5f,1.0f,WAVEFORM_TRIANGLE); // modulate @ 1Hz
      waveform2.begin(0.1f,700.0f,WAVEFORM_TRIANGLE); // "marker" waveform
    
    #if !defined(SIMPLE_TEST)
      float shapeArray[]={-1.0f,-0.25f,0.0f,0.25f,1.0f};
      waveshape1.shape(shapeArray,5);
      dc1.amplitude(1.0f);
      mixer1.gain(0,1.0f); 
      mixer2.gain(0,1.0f); 
      mixer2.gain(1,1.0f); 
    #endif // !defined(SIMPLE_TEST)
    }
    
    void loop() {
     delay(1000);
     Serial.println("blip!");
    }
    Cheers

    Jonathan
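    The 281-sample figure above appears to have been read off by selecting a region in an audio editor. As an alternative (a swapped-in technique, not what was done in this post), the delay between the two recorded channels could be estimated by brute-force cross-correlation. The sketch below assumes both channels are already loaded as equal-length float vectors. `estimateDelaySamples`, `samplesToMs`, `makeNoise` and `delayBy` are hypothetical helpers, the last two existing only to build a synthetic test signal, and 44100 Hz is an assumed sample rate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lag (in samples, 0..maxLag) at which `delayed` best matches `reference`.
inline std::size_t estimateDelaySamples(const std::vector<float>& reference,
                                        const std::vector<float>& delayed,
                                        std::size_t maxLag) {
    assert(reference.size() == delayed.size());
    assert(maxLag < reference.size());
    std::size_t bestLag = 0;
    double bestScore = -1e300;
    for (std::size_t lag = 0; lag <= maxLag; ++lag) {
        double score = 0.0;  // unnormalised correlation at this lag
        for (std::size_t i = lag; i < delayed.size(); ++i) {
            score += static_cast<double>(delayed[i]) * reference[i - lag];
        }
        if (score > bestScore) {
            bestScore = score;
            bestLag = lag;
        }
    }
    return bestLag;
}

// Convert a sample count to milliseconds (assumed 44100 Hz by default).
inline double samplesToMs(std::size_t samples, double sampleRateHz = 44100.0) {
    return 1000.0 * static_cast<double>(samples) / sampleRateHz;
}

// Deterministic pseudo-random test signal in [-1, 1) from a xorshift32 generator.
inline std::vector<float> makeNoise(std::size_t count) {
    std::vector<float> out(count);
    std::uint32_t state = 2463534242u;
    for (float& sample : out) {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        sample = static_cast<float>(state >> 16) / 32768.0f - 1.0f;
    }
    return out;
}

// Copy of `signal` shifted later by `lag` samples, zero-padded at the start.
inline std::vector<float> delayBy(const std::vector<float>& signal, std::size_t lag) {
    std::vector<float> out(signal.size(), 0.0f);
    for (std::size_t i = lag; i < signal.size(); ++i) {
        out[i] = signal[i - lag];
    }
    return out;
}
```

    For a real recording, the reference would be the channel carrying the generated waveform and the delayed one the channel carrying the looped-back input. A strongly periodic test tone can give ambiguous peaks, which is one reason a modulated waveform like the one in the sketch above is a sensible test signal.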

  9. #9
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,735
    Did you measure it compared to USB? USB has its own latency.

  10. #10
    Senior Member h4yn0nnym0u5e's Avatar
    Join Date
    Apr 2021
    Location
    Cambridgeshire, UK
    Posts
    638
    Quote Originally Posted by Frank B View Post
    Did you measure it compared to USB? USB has its own latency.
    I don’t think I needed to with this technique - USB-right is the reference signal, and USB-left the delayed one. I believe the OP was interested in i2s in to out delay, and I think I’ve measured that. Do show me where my logic falls down, if you think it does.
