Audio Library

With Visual Micro/Studio and other IDEs that offer auto-completion of variable and function names (much like the Linux tab key in shell commands), one doesn't need heavy abbreviation to cut down typing these days!
Verbose is better!
 
@Pete: The update to AudioSynthWaveform appears to limit its output to an absolute max of 16383 - is that intentional? For my current project I need it to push more like 17551 or so, and I can see wanting to drive it higher than that. I can work around it either by hacking or reverting my copy of it, or by sticking an AudioMixer4 between it and its target(s), but it seems a pity to have to.
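For anyone else hitting the same ceiling, the AudioMixer4 workaround looks roughly like this - a minimal sketch using the same object and API style as the test sketch later in this thread; the wiring and the 2.0 gain value are just for illustration, since AudioMixer4 accepts gains above 1.0:
Code:
#include <Audio.h>
#include <Wire.h>
#include <SD.h>

AudioSynthWaveform    wave;
AudioMixer4           boost;
AudioOutputI2S        audioOut;
AudioControlSGTL5000  audioShield;

// Route the waveform through the mixer so its gain can lift the level
// beyond what AudioSynthWaveform puts out on its own.
AudioConnection       patch1(wave, 0, boost, 0);
AudioConnection       patch2(boost, 0, audioOut, 0);
AudioConnection       patch3(boost, 0, audioOut, 1);

void setup() {
  AudioMemory(8);
  audioShield.enable();
  audioShield.volume(60);
  boost.gain(0, 2.0);                    // gains above 1.0 are allowed on each mixer channel
  wave.begin(0.5, 440, TONE_TYPE_SINE);  // amplitude, frequency, waveform type
}

void loop() {
}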
 
Hey Pete, my quick hack was to reduce the final shift by two bits, as in:
Code:
    case TONE_TYPE_SINE:
      for(int i = 0;i < AUDIO_BLOCK_SAMPLES;i++) {
        // The value of ramp_up is always initialized to RAMP_LENGTH and then is
        // decremented each time through here until it reaches zero.
        // The value of ramp_up is used to generate a Q15 fraction which varies
        // from [0 - 1), and multiplies this by the current sample
        if(ramp_up) {
          // ramp up to the new magnitude
          // ramp_mag is the Q15 representation of the fraction
          // Since ramp_up can't be zero, this cannot generate +1
          ramp_mag = ((ramp_length-ramp_up)<<15)/ramp_length;
          ramp_up--;
          // adjust tone_phase to Q15 format and then adjust the result
          // of the multiplication
          // calculate the sample
          tmp_amp = (short)((arm_sin_q15(tone_phase>>16) * tone_amp) >> 17);
//          *bp++ = (tmp_amp * ramp_mag)>>15;
          *bp++ = (tmp_amp * ramp_mag)>>13;
        } 
        else if(ramp_down) {
          // ramp down to zero from the last magnitude
          // The value of ramp_down is always initialized to RAMP_LENGTH and then is
          // decremented each time through here until it reaches zero.
          // The value of ramp_down is used to generate a Q15 fraction which varies
          // from [0 - 1), and multiplies this by the current sample
          // avoid RAMP_LENGTH/RAMP_LENGTH because Q15 format
          // cannot represent +1
          ramp_mag = ((ramp_down - 1)<<15)/ramp_length;
          ramp_down--;
          // adjust tone_phase to Q15 format and then adjust the result
          // of the multiplication
          tmp_amp = (short)((arm_sin_q15(tone_phase>>16) * last_tone_amp) >> 17);
//          *bp++ = (tmp_amp * ramp_mag)>>15;
          *bp++ = (tmp_amp * ramp_mag)>>13;
        } else {
          // adjust tone_phase to Q15 format and then adjust the result
          // of the multiplication
//          tmp_amp = (short)((arm_sin_q15(tone_phase>>16) * tone_amp) >> 17);
          tmp_amp = (short)((arm_sin_q15(tone_phase>>16) * tone_amp) >> 15);
          *bp++ = tmp_amp;
        } 
        
        // phase and incr are both unsigned 32-bit fractions
        tone_phase += tone_incr;
        // If tone_phase has overflowed, truncate the top bit 
        if(tone_phase & 0x80000000)tone_phase &= 0x7fffffff;
I only changed the SINE section because I haven't checked the other waveforms, but I expect the same applies to them. I also haven't checked the waveform very thoroughly after the change but, again, I expect it's OK.
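Just to spell out the arithmetic behind the hack (nothing library-specific here, only how right shifts scale a value):
Code:
// Reducing a right shift by two bits multiplies the result by 2^2 = 4
// (give or take the two low-order bits the larger shift would have discarded):
//   (x >> 13)  ~=  4 * (x >> 15)
// So the >>15 -> >>13 change in the ramped branches and the >>17 -> >>15
// change in the steady branch each raise the output level by 4x, which is
// why the hacked version can exceed the old 16383 ceiling - at the cost of
// headroom before the samples clip.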
 
I've partially fixed it. If the amplitude is set at, or very near, 1.0 the output will flat top. But now it will at least let you exceed 16383. The code is at: https://github.com/el-supremo/Audio
The problem only affects generation of the sine wave.
I'll keep digging into why the amplitude isn't correct near 1.0

Pete
 
I've fixed the amplitude now. I was testing it by playing the audio into the PC, recording it with Goldwave and then looking at the resulting waveform. That isn't the right way to do it.
I wrote a sketch to write the audio buffers to a WAV file and checked that instead. It now outputs the proper amplitude when you specify the amplitude as 1.0.

Pete
 
I have a project where I need to play a .wav from the SD card while, at a fixed rate (maybe 20 times per second or so), updating the positions of two servos based on preprocessed data stored in PROGMEM and synced to the .wav file. The Teensy (3.1 with audio board) also needs to handle a few buttons and LEDs.

I recently ran into problems on another project where SD .wav playback and communication with RF wall-wart relays (via the http://code.google.com/p/rc-switch/ library) didn't play nicely together. I assume it's because the rc-switch library uses delay-based timing, which caused either the RF transmissions or the audio playback to not work correctly. My quick fix was to simply put the RF communication code on another Teensy, but it made me a bit more aware that I need to plan ahead when doing both audio playback and other tasks simultaneously on the same board.

Just reading some data (and possibly doing some simple decoding) plus two analogWrites for the servo positions shouldn't be that demanding, but I wanted to discuss what approaches there are to ensure stable .wav playback alongside other processing at a (nearly) fixed rate.

One idea I had: would it be worthwhile to be able to register a callback that is called when processing of an audio buffer completes? That should give you the maximum stretch of processing time for other tasks uninterrupted by the audio computations. What do you think?
 
Paul: Yes, I was using the audio library for SD wav playback. There's no audio code in the RC-Switch library; I just tried to give a brief example of a real project using the audio library where several parts that tested fine on their own ran into problems once they were put together into one program. The idea was then to discuss strategies for avoiding that in the future by understanding more about how to work with the audio library in programs that do both audio processing and other work.

I could describe that project in more detail to try to understand what went wrong and learn from it, but right now it's not at the top of my list and I no longer have all the hardware to test on; I can revisit it if you think it would be beneficial.

The main thing I see that's different when working with the Audio library is that every ~2.9 ms you have to let the audio code do its processing on the next block, and that is a larger disruption of your normal program flow than most typical interrupts. Even if you use AudioNoInterrupts(), you can only do so for a very short time or the audio playback will suffer. In the case of the RC-Switch library, I believe the problem comes from one message transmission taking longer than the interval at which the audio blocks/buffers need to be updated (http://code.google.com/p/rc-switch/wiki/KnowHow_LineCoding), meaning I probably couldn't have solved the problem on one Teensy without rewriting the RC-Switch transmission code in some clever way.

If code needs fairly exact timing stretching over a period longer than the 2.9 ms of audio chain processing, I guess you have to resort to more advanced strategies like using DMA, or interrupts plus AudioNoInterrupts().
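For the short-critical-section case, the usual pattern is just to bracket the timing-sensitive code (a minimal sketch; sendTimingCriticalBurst() is a made-up placeholder for whatever must not be interrupted):
Code:
void sendTimingCriticalBurst() {
  AudioNoInterrupts();   // hold off the audio update interrupt
  // ... bit-banged transmission or other code that must not be interrupted,
  //     kept well shorter than the ~2.9 ms block period ...
  AudioInterrupts();     // let the audio library catch up
}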

Is it possible to say anything about how long it's safe to disable the audio update interrupt while keeping the audio output intact?

For code that should execute at regular intervals, but doesn't require exact timing over periods stretching across two or more audio blocks, I see two different approaches:

1) Use something like elapsedMillis/elapsedMicros to check when the right amount of time has passed, then execute your code, and either a) don't mind if the audio update jumps in and slightly changes the timing, or b) same as a) but carefully use AudioNoInterrupts() around critical sections.

2) Sync your non-audio code with the audio code, either by executing it during the audio update (by making your own audio object) or by trying to fit it in between audio updates, where a callback for code to be executed at the end of an audio block's processing might be convenient (see the sketch below).
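On (2), here is a rough sketch of what a "tell me when the audio update has run" object could look like. It assumes the AudioStream base class with receiveReadOnly()/transmit()/release() as used by the library's own objects; AudioTapNotify, the wiring and the file name are my own invention, and the player/output object names are as I recall them from the library examples, so adjust to your library version:
Code:
#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>

// Pass-through object that raises a flag each time its update() has run.
// Because the library runs all updates inside one interrupt, loop() only
// sees the flag after the whole chain has finished - i.e. at the start of
// the ~2.9 ms gap before the next block is processed.
class AudioTapNotify : public AudioStream {
public:
  AudioTapNotify() : AudioStream(1, inputQueueArray), blockDone(false) {}
  volatile bool blockDone;
  virtual void update(void) {
    audio_block_t *block = receiveReadOnly(0);
    if (block) {
      transmit(block);   // pass the audio through unchanged
      release(block);
    }
    blockDone = true;
  }
private:
  audio_block_t *inputQueueArray[1];
};

AudioPlaySdWav        wavPlayer;
AudioTapNotify        tap;
AudioOutputI2S        audioOut;
AudioControlSGTL5000  audioShield;
AudioConnection       c1(wavPlayer, 0, tap, 0);
AudioConnection       c2(tap, 0, audioOut, 0);
AudioConnection       c3(tap, 0, audioOut, 1);

void setup() {
  AudioMemory(10);
  audioShield.enable();
  audioShield.volume(60);
  SPI.setMOSI(7);              // SD card pins on the audio adapter board
  SPI.setSCK(14);
  SD.begin(10);
  wavPlayer.play("TEST.WAV");  // hypothetical file name
}

void loop() {
  if (tap.blockDone) {
    tap.blockDone = false;
    // ~2.9 ms largely uninterrupted by audio processing: e.g. every Nth
    // block, look up the next servo positions from PROGMEM and write them.
  }
}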

I guess I'm mostly wondering about, and wanting to discuss, the material that is to come under the heading "Understanding Audio Library Scheduling" at http://www.pjrc.com/teensy/td_libs_AudioProcessorUsage.html ...

Paul again: might that also mean it will find its way to more SparkFun distributors in Europe?
 
Hi,
I'm using a Teensy 3.1 and the Audio board and library to capture audio and run an FFT over it.
This is using the default 44100 (I think?) sampling rate, and the FFT256 with naverage set to various values from 1 to 20.

The results coming back are odd though -- when playing a clean, single frequency from an (outboard) tone generator, I'd expect to see a single FFT bin registering highly, or at most two bins if the frequency falls right between them.

However what I actually get is one bin registering a bit higher than the other bins, but many of the other bins registering a moderately high value as well.
If I play silence, all bins report close to zero. If I play a 1000 Hz beat (metronome style), then you can see ALL the bins bouncing up and down; higher around the 1000 Hz mark, but only by a bit.

Do you have any ideas what could be going wrong?

I've previously used FFTs successfully on other platforms, e.g. libfftw on the BeagleBone, so I believe my basic use/understanding of FFTs is solid.
 
Looking at the current ( ab3b21ec ) version of analyze_fft256.cpp, I'm wondering where the conversion is done from the audio data format to the Q15 data format used by the DSP's FFT routine? (Or if, indeed, any conversion needs to be done at all?)
I see the audio data comes in as a uint16_t data type; is it really *unsigned* (vs the signed Q15 type), or is that not a problem here?
 
Are you referring to copy_to_fft_buffer ?
If so, what is being done in this routine is to copy a buffer of 128 16-bit signed samples to a buffer ready for the FFT routine. The FFT requires that its input be a series of complex numbers of the form real1, imag1, real2, imag2, etc, where realN,imagN are the real and imaginary parts of the Nth complex number. The samples from the Audio library are only real, so the imaginary parts have to be filled with zero.
To do this, the code loads the real sample *src as an unsigned 16-bit integer and then stores it in *dst which is an unsigned 32-bit integer. Because of this, the high order 16-bits of *dst will always be zero (the imaginary part) and the low order 16-bits will be the real part. So it will store the real and imaginary part of one complex number in one instruction.
It's faster doing it this way than storing real and imaginary parts separately as 16-bit signed numbers. The FFT routine won't know the difference ;)
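In outline, the packing looks something like this (a simplified sketch of the idea described above, not a verbatim copy of the library source):
Code:
// Pack 128 real 16-bit samples into 128 complex entries laid out as
// {real, imag} pairs of 16-bit halves. Loading each sample through a
// uint16_t pointer and storing it through a uint32_t pointer writes the
// sample into the low half-word and a zero imaginary part into the high
// half-word with a single store. The bit pattern of a signed sample is
// unchanged, so the FFT still sees valid q15 data in the low half.
static void copy_to_fft_buffer(void *destination, const void *source)
{
  const uint16_t *src = (const uint16_t *)source;
  uint32_t *dst = (uint32_t *)destination;
  for (int i = 0; i < 128; i++) {
    *dst++ = *src++;   // zero-extend: imag = 0, real = sample
  }
}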

Pete
 
Yeah, I was looking at that --
1) why do you use an unsigned-int pointer if the data is signed? That just seems counter-intuitive to me. I'm not an expert in C/C++ though. Does the conversion work OK?
2) Is q15 (the fft input/output) compatible with the signed-16bit-declared-as-unsigned format? I have no idea, but was questioning it, especially when functions like sqrt_uint32() are called on it!
 
Sorry to bother you, but again, is it OK that you're passing q15 data into something like multiply_16tx16t_add_16bx16b() which is asking for uint32_t parameters?

I'm just looking over the code and trying to work out why I'm getting more noise than signal as results :(
 
@wintrmute:
I've written a sketch to test the FFT. It uses AudioSynthWaveform to generate samples of a sinewave of known frequency. The output is passed through AudioAnalyzeFFT256 and the result can be plotted on an LCD or printed on the serial monitor.
The line below is typical serial monitor output for a sine wave with an amplitude of 1.0 and frequency of 1723Hz:
Code:
10, 9, 9, 10, 9, 11, 12, 15, 20, 37, 16380, 28, 12, 6, 4, 0, 0, 0, 0, 0, ...
All remaining bins to the right are zero.
This clearly shows the peak and the rest of the bins are down in the noise.

If you are still getting unexpected results it might help to post your code and describe how you've wired up your audio to the board.

Pete



Code:
#include <Audio.h>
#include <Wire.h>
#include <SD.h>

#include <LiquidCrystal.h>

// Generate audio samples using AudioSynthWaveform, use
// AudioAnalyzeFFT256 to generate a spectrum and output
// it to the LCD or the serial monitor


/*
2a
Typical output (where the frequency of 1723Hz was chosen because it
falls in the centre of a bin):
tone_freq = 1723, sine, amplitude = 0.0500, FFT = 16, Hanning
...
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 820, 3, 0, 0, 0, 0, 0, 0, 0, ....
(all the rest to the right are zero)
...
*/

// 1 = Display the spectrum on the LCD
// 0 = Print the spectrum on the Serial Monitor
const int LCD_DISPLAY = 0;

// Number of Spectra to average for one output of FFT
// and name of window to use (or NULL for the default)
int num_avg = 16;
const int16_t *window = NULL;
AudioAnalyzeFFT256  myFFT(num_avg,window);

// Allows the program to print info about the window being used
struct map_window_name {
  const char *name;
  const int16_t *array;
} map_win_name[] = {
// The first entry MUST be the default
  { "Hanning",         AudioWindowHanning256},
  { "Bartlett",        AudioWindowBartlett256},
  { "Blackman",        AudioWindowBlackman256},
  { "Flattop",         AudioWindowFlattop256},
  { "BlackmanHarris",  AudioWindowBlackmanHarris256},
  { "Nuttall",         AudioWindowNuttall256},
  { "BlackmanNuttall", AudioWindowBlackmanNuttall256},
  { "Welch",           AudioWindowWelch256},
  { "Hamming",         AudioWindowHamming256},
  { "Cosine",          AudioWindowCosine256},
  { "Tukey",           AudioWindowTukey256},
  { "Unknown",         NULL},
};

// Map a window array address into its name
const char *window_name(const int16_t *ar)
{
  int i;
  if(ar == NULL)return(map_win_name[0].name);
  for(i=0;map_win_name[i].array != NULL;i++) {
    if(ar == map_win_name[i].array)return(map_win_name[i].name);
  }
  return(map_win_name[i].name);
}



// Create the Audio components.  These should be created in the
// order data flows, inputs/sources -> processing -> outputs
//

// Type of tone. SINE, SAWTOOTH, SQUARE or TRIANGLE
short type = TONE_TYPE_SINE;
#define TONE_FREQ 1723
#define TONE_LENGTH_MS 2000
#define SILENCE_LENGTH_MS 200
// Number of tones to send
int tone_count = 3;
// state of tone generator
int tone_state = 0;
// tone amplitude
float t_amp = 0.05;
const char *wave_names[4] = {
  "sine",
  "sawtooth",
  "square",
  "triangle"
};
AudioSynthWaveform     myEffect;

AudioOutputI2S      audioOutput;        // audio shield: headphones & line-out

// Send the synthesized tone to the FFT and to the left and
// right audio output channels
AudioConnection c1(myEffect, 0, audioOutput, 0);
AudioConnection c2(myEffect, 0, myFFT, 0);
AudioConnection c3(myEffect, 0, audioOutput, 1);

// Create an object to control the audio shield.
// 
AudioControlSGTL5000 audioShield;


// Use the LiquidCrystal library to display the spectrum
// Define 8 special characters for the bar graph display
LiquidCrystal lcd( 2, 3, 4, 5, 6, 7, 8);
byte bar1[8] = {
  0,0,0,0,0,0,0,255};
byte bar2[8] = {
  0,0,0,0,0,0,255,255};
byte bar3[8] = {
  0,0,0,0,0,255,255,255};
byte bar4[8] = {
  0,0,0,0,255,255,255,255};
byte bar5[8] = {
  0,0,0,255,255,255,255,255};
byte bar6[8] = {
  0,0,255,255,255,255,255,255};
byte bar7[8] = {
  0,255,255,255,255,255,255,255};
byte bar8[8] = {
  255,255,255,255,255,255,255,255};

// Index of first bin to be displayed on the LCD
int start_index = 0;
// character blocks which display the spectrum
char blocks[2][17];

void setup() {
  Serial.begin(115200);
  while(!Serial);
  delay(2000);
  
  // Audio connections require memory to work.  For more
  // detailed information, see the MemoryAndCpuUsage example
  AudioMemory(12);

  // Enable the audio shield and set the output volume.
  audioShield.enable();
  //  audioShield.inputSelect(myInput);
  audioShield.volume(60);

  // I want output on the line out too
  audioShield.unmuteLineout();
  //  audioShield.muteHeadphone();

  blocks[0][16] = 0;
  blocks[1][16] = 0;
  
  lcd.begin(16, 2);
  lcd.print("Audio Spectrum");
  delay(2000);
  lcd.createChar(0, bar1);
  lcd.createChar(1, bar2);
  lcd.createChar(2, bar3);
  lcd.createChar(3, bar4);
  lcd.createChar(4, bar5);
  lcd.createChar(5, bar6);
  lcd.createChar(6, bar7);
  lcd.createChar(7, bar8);

 
  myEffect.set_ramp_length(144);
  // Start the tone generator at the initial amplitude
  myEffect.begin(t_amp,TONE_FREQ,type);
  Serial.print("tone_freq = ");
  Serial.print(TONE_FREQ);
  Serial.print(", ");
  Serial.print(wave_names[type]);
  Serial.print(", amplitude = ");
  Serial.print(t_amp,4);
  Serial.print(", FFT = ");
  Serial.print(num_avg);
  Serial.print(", ");
  Serial.println(window_name(window));
  
  // Only need one tone when printing the results to
  // the serial monitor
  if(!LCD_DISPLAY)tone_count = 1;
}

// Map an amplitude to a special character
char cmap(int i)
{
  if(i == 0)return(' ');
  return(i-1); 
}


// buffer and index for input from serial monitor
char tmp[128];
int idx = 0;

unsigned long last_time;
void loop() {
  int log_mag;
  int op;
  char *p;

  if(tone_count) {
    switch(tone_state) {
    case 0:  // Start the tone generator and timer
      myEffect.amplitude(t_amp);
      last_time = millis();
      // Wait for timer to expire
      tone_state = 1;
      break;

    case 1:  // Wait for timer to expire
      if(millis() - last_time < TONE_LENGTH_MS)break;
      myEffect.amplitude(0);
      // Now "send" silence
      tone_state = 2;
      last_time = millis();
      break;

    case 2:
      if(millis() - last_time < SILENCE_LENGTH_MS)break;
      // count this tone and then reset for next one
      tone_count--;
      tone_state = 0;
      break;
    }
  }

  if(LCD_DISPLAY) {
    if(start_index < 0)start_index = 0;
    if(start_index > 112)start_index = 112;
    if (myFFT.available()) {
      for(int i=0;i < 16;i++) {
        op = myFFT.output[i+start_index];
        if(op < 0)op = 0;
        log_mag = 0;
        while(op) {
          log_mag++;
          op >>= 1;
        }
        //sprintf(tmp,"%4d (%2d),",op,log_mag);
        //Serial.print(tmp);
        //      log_mag = log(myFFT.output[i])/log(1.4);
        if(log_mag > 8) {
          blocks[1][i] = 7;
          if(log_mag >= 16)blocks[0][i] = 7;
          else blocks[0][i] = cmap(log_mag - 8);
        } 
        else {
          blocks[0][i] = ' ';
          blocks[1][i] = cmap(log_mag);
        }
      }
      //Serial.println("");
      lcd.setCursor(0,0);
      lcd.write(blocks[0],16);
      lcd.setCursor(0,1);
      lcd.write(blocks[1],16);
    }
  } 
  else {
    if(tone_count) {
      if (myFFT.available()) {
        for(int i=0;i < 128;i++) {
          Serial.print(myFFT.output[i]);
          Serial.print(", ");
        }
        Serial.println("");
      }
    }
  }
  while(Serial.available() > 0) {
    char c = Serial.read();
    if((c != '\n') && (c != '\r')) {
      tmp[idx++] = c;
      if(idx >= 127) {
        Serial.println("Error: line too long");
        idx = 0;
        break;
      }
      continue;
    }
    tmp[idx] = 0;
    if(idx == 0)continue;
    idx = 0;
    c = tmp[0];
    p = &tmp[1];
    while(*p && *p == ' ')p++;
    if(*p == 0) {
      idx = 0;
      continue;
    }
    switch(tolower(c)) {
    // By default the LCD display shows the first 16 bins.
    // Typing the command 's N' where N is a number from
    // 0 to 119 will display the 16 bins starting at index N
    case 's':
      start_index = atoi(p);
      Serial.println(start_index);
      break;
    }
  }
  // Volume control
  float vol = analogRead(15);
  vol = vol / 10.24;
  audioShield.volume(vol);
}
 
@Paul:
When I first started testing the code in my previous post I got some anomalous results which I haven't been able to reproduce. The signal was spread across at least 6 bins either side of the peak. I will try to reconstruct the code which did that and then try to figure out what was happening.

Pete
 
@wintrmute:
How are you initializing the FFT? If you do it like the Audio library's example:
Code:
AudioAnalyzeFFT256  myFFT(20);
you will get the default Hann(ing) window. When I use that default to analyze an 861Hz tone I get this output:
Code:
3, 0, 3, 5, 1238, 2447, 1223, 2, 0, 0, 0, ...
As you can see, what should be a single peak in one bin is also smeared to the ones on either side.
However, if I specify it like this (no windowing used):
Code:
AudioAnalyzeFFT256  myFFT(num_avg,NULL);
the output is:
Code:
8, 7, 9, 12, 21, 4913, 17, 8, 5, 4, 3, ...
which has an obvious peak which is what I was expecting it to look like.
Maybe try different windows to see if they affect what you are getting.

Pete
@Paul: the above results were the anomaly I mentioned previously. It took me a while to figure out what I had changed in my code to get such different results.
 
Quick reply for now - I've been interstate for a friend's wedding for a few days, and just came home and started reading mail.
I've been using real audio-in rather than the in-library sound generation; I should check my cables and soldering in case there's something adding noise there.
I'm using the default windowing function, yes. I've been trying to use very small n_avg values, as I want real-time visual reactions.
I'll try a few window functions in the morning, including the NULL one. Looks like that might get me the results I see with libfftw in Linux on the BeagleBone, as I don't recall setting any window functions when using it.

Thanks so much for investigating this, and for helping out.
 
By the way - when using the Adafruit_Neopixel library together with the Teensy Audio library, the audio I hear through the audio board's output jack is terribly distorted.
If I comment out strip.show() in my code though, it sounds fine.
I wonder if that has something to do with my audio FFT problems? :(
 
Running a test with no LED libraries running, and an 800 Hz test tone via line-in, I get the following output in the bins when using AudioAnalyzeFFT256 myFFT(15, NULL):
105, 120, 140, 220, 560, 998, 264, 155, 109, 86, 70

So there is a *bit* of a peak around the right place, but it's nowhere near as distinct as your test.
 
In my test I was generating a digital signal which contained only one specific tone at a known amplitude.
Your audio input to line-in may not be at a high enough amplitude. If possible, can you generate 871Hz instead of 800Hz? 871Hz falls in the middle of a bin and should give you a clearer signal.
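For reference, the bin spacing arithmetic behind the "middle of a bin" advice (assuming the nominal 44.1 kHz sample rate and the 256-point FFT):
Code:
// bin spacing = sample_rate / FFT_length = 44100 / 256  ~=  172.3 Hz
// bin centres sit at multiples of that spacing, e.g. bin 10 ~= 1723 Hz,
// which is why the test sketch earlier in the thread used 1723 Hz.
// A tone sitting on a bin centre leaks least into neighbouring bins;
// one that falls between centres (like 800 Hz) smears across several.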

Pete
 
At 871 Hz exactly, at the highest level I can produce from a phone output (i.e. probably sub-line-level), I get:
80, 85, 106, 152, 307, 5881, 341, 162, 104, 79

Curiously, though, if I go to say 899 Hz, I then get a somewhat-smeared output of:
333, 349, 412, 550, 981, 5463, 671, 438, 324, 259

Still with a nice big peak in the middle, but a lot of noise elsewhere.
 
@wintrmute: my phone is a terrible source for such things and it would not surprise me to find yours is too. The majority of low end sound cards on PCs aren't much better.

Have you many other options for a source you feel you could trust? Have you any other equipment with which you could 'rate' the output of your phone?

When I plug the headphone output of my phone into the line-in of the audio adapter, plug headphones into the headphone socket on the audio adapter, and then, with the phone not intentionally outputting anything, turn the gain on the audio adapter up a long way, I hear all sorts of noise (primarily AC leakage and the like from the environment around me). Effectively, Pete's test, using an internal signal generator and feeding that straight to the FFT routine, is in a noiseless environment, while your test is in a very noisy one.
 