AudioPlayMemory array layout?


Gav

Hi All,


I've got some questions about the behaviour of the AudioPlayMemory object, and how it expects the array containing the samples to be laid out.

I've read the old documentation at https://www.pjrc.com/teensy/td_libs_AudioPlayMemory.html, the AudioPlayMemory page in the Teensy audio design tool, some comments in https://forum.pjrc.com/threads/29121-Recording-audio-to-buffer?highlight=2's+complement, and the source code for "wav2sketch".

And I think I've mostly wrapped my head around it, but I can't reconcile some things with the numbers I see in the example sketches.

Here's my understanding of the layout of AudioPlayMemory, please let me know if I've gotten anything wrong:
1) The data is stored as an array of "const unsigned int", each element of which is 32-bits (4 bytes) in size.
2) The data of each sample is _signed_ in a 2's complement form.
3) Different sample rates (44100, 22050, 11025 Hz) can be selected.
4) Samples can be in either an 8-bit compressed format, or uncompressed 16-bit PCM. (16-bit samples are stored in little-endian format, with the low byte containing the least-significant bits.)
5) The high 8-bits of the first word of the array specify the format of the compression, and the low 24-bits specifies the length of the samples. According to the documentation the length is specified in bytes, but I suspect it is actually specified in samples.
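
If that reading is right, the header word can be packed and unpacked like this (a host-side sketch; the helper names are mine, not from the library):

```cpp
#include <cstdint>

// First word of an AudioPlayMemory array: the top byte is the format code,
// the low 24 bits are the length (bytes vs. samples is exactly the question).
constexpr uint32_t make_header(uint8_t format, uint32_t length) {
    return (static_cast<uint32_t>(format) << 24) | (length & 0x00FFFFFFu);
}

constexpr uint8_t  header_format(uint32_t w) { return static_cast<uint8_t>(w >> 24); }
constexpr uint32_t header_length(uint32_t w) { return w & 0x00FFFFFFu; }
```

For example, the kick drum header 0x820013EC unpacks to format 0x82 and length 0x13EC = 5100.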

Here's my diagram:
audioplaymemory_memory_usage_v01.jpg


The problem is when I look at example sketches and see how other people have used the AudioPlayMemory objects, I get different numbers for array size than what I'd predict based on the length specification in the header.
Specifically the numbers are wrong for all the 16-bit formats if I assume the units of length are bytes:

audioplaymemory_usage_v01.png

If I assume the units of length are samples, then it looks much better, but some of the files still don't match the pattern:


audioplaymemory_usage_v02.jpg

Is what I'm seeing just a bug in wav2sketch (or the files "guitar_a2_note" and "guitar_b3_note" in the example sketches), or have I made a mistake somewhere?



Cheers,
Gavin
 
I would suggest that the number in the lower 3 bytes of the first word are not "number of bytes" but "number of samples". Looking at:

// Audio data converted from WAV file by wav2sketch

#include "AudioSampleKick.h"
#include <Arduino.h>
// Converted from kick.wav, using 22050 Hz, 16 bit PCM encoding
PROGMEM
const unsigned int AudioSampleKick[2561] = {
0x820013EC,0xFFDC0027,0xFF710095,0x038DFF4C,0xFBA10105,0x0037FB6E,0x09D2011A,0x007504CA,


The first word is 0x820013EC, so format 0x82 with 0x13EC = 5100 samples. There are two 16-bit samples per word, so of the [2561] words, one is the format/length header, leaving 2560 data words with two samples each: 2560 * 2 = 5120 sample slots, which comfortably holds the 5100 samples that the 0x13EC figure would appear to represent (the last few slots are presumably zero padding).
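
That slack is consistent with wav2sketch padding the data up to a whole number of 128-sample audio blocks (AUDIO_BLOCK_SAMPLES). Assuming that, the generated array size can be predicted from the sample count; the helper name is mine:

```cpp
#include <cstdint>

// Predicted wav2sketch array size for a 16-bit PCM file, assuming the data
// is zero-padded up to a whole number of 128-sample audio blocks.
constexpr uint32_t predicted_words(uint32_t samples) {
    uint32_t padded = ((samples + 127) / 128) * 128; // round up to 128 samples
    return 1 + padded / 2; // 1 header word + 2 samples per 32-bit word
}
```

For the kick: 5100 samples pad up to 5120, giving 1 + 2560 = 2561 words, matching AudioSampleKick[2561].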
 

Attachment: kick.jpg
I would suggest that the number in the lower 3 bytes of the first word are not "number of bytes" but "number of samples". Looking at:

Yep, that certainly looks more consistent, but you can see from my second screenshot that the two example files "guitar_a2_note" and "guitar_b3_note" still don't fit the pattern. (Although other 16-bit files with the same sample rate do).

I'm keen to find out what the proper format for AudioPlayMemory should be, as I'm having a lot of trouble playing arrays that were NOT made by wav2sketch.

I can play all of Paul's premade arrays fine, but I can't seem to get any audible sound out of arrays containing samples I make myself. For example I've made code that (after boot) populates an array with simple samples of a sinewave, but when I call playMem1.play(TX_array) I don't hear anything. Whilst running playMem1.play(AudioSampleSnare) allows me to play the canned snare drum sample with no issues.

I'll make a cut down example sketch demonstrating the issue and post it tonight.
 
OK, I've got it all working now. Turned out my original code was accidentally overwriting the header, which presumably prevented the AudioPlayMemory object from recognizing the array as legitimate.
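
In hindsight, a cheap guard would have caught the clobbered header before play() was ever called. This is just a sketch, and the format codes are my reading of the library source (0x01-0x03 for u-law at 44100/22050/11025 Hz, 0x81-0x83 for 16-bit PCM at the same rates):

```cpp
#include <cstdint>

// Sanity check on the first word of an AudioPlayMemory array: a known
// format code in the top byte and a non-zero length in the low 24 bits.
bool header_looks_valid(uint32_t header) {
    uint8_t format = static_cast<uint8_t>(header >> 24);
    uint32_t length = header & 0x00FFFFFFu;
    bool known_format = (format >= 0x01 && format <= 0x03) ||   // u-law
                        (format >= 0x81 && format <= 0x83);     // 16-bit PCM
    return known_format && length != 0;
}
```

Calling something like header_looks_valid(TX_array[0]) before play() would have flagged my overwritten header immediately.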


In case it's useful to anyone else, here's a quick (and inelegant) sketch which populates an array with a simple calculated sinewave, then plays it through the speaker three times:
Code:
#include <Audio.h>
#include <Wire.h>
#include <SPI.h>
#include <SD.h>
#include <SerialFlash.h>

// GUItool: begin automatically generated code
AudioPlayMemory          playMem1;       //xy=682.6666870117188,276.00000500679016
AudioOutputI2S           audioOutput;    //xy=1079.6666870117188,362.00000500679016
AudioConnection          patchCord1(playMem1, 0, audioOutput, 0);
AudioControlSGTL5000     audioShield;    //xy=913.6666870117188,232.00000500679016
// GUItool: end automatically generated code


const int TX_array_len_samples = 128 * 100 * 3;
const int TX_array_len_words = TX_array_len_samples / 2;
unsigned int TX_array[TX_array_len_words + 1]; // play() takes const unsigned int*, so the array must be unsigned

void setup() {
  AudioMemory(10);
  delay(2000);
  audioShield.enable();
  audioShield.volume(0.5);

  populate_tx_array(1000.0, 5000.0);
  //  print_tx_array();
  //  print_tuba_array();

  for (int i = 0; i < 3; i++) {
    playMem1.play(TX_array);
    while (playMem1.isPlaying()) {};
    delay(500);
  }
}

void loop() {

}


uint16_t make_unsigned_16_bit_sample(float t) {
  // First make the signal as a signed int (scaled to stay well inside int16_t range):
  int sample_a = (cos( 2 * PI * 1000.0 * t) / 2.5 ) * 65535;

  // Truncating to uint16_t preserves the 16-bit 2's complement bit pattern,
  // which is what AudioPlayMemory expects for negative samples:
  return (uint16_t)sample_a;
}

void populate_tx_array(float start_freq, float stop_freq) {
  TX_array[0] = (0x81u << 24) | TX_array_len_samples; // 0x81 = 44100 Hz, 16 bit uncompressed PCM
  for (int i = 1; i <= TX_array_len_words ; i++) {    // word i holds samples 2*(i-1) and 2*(i-1)+1
    float t_a = (2.0 * (i - 1) + 0) / 44100.0; // time in seconds
    float t_b = (2.0 * (i - 1) + 1) / 44100.0; // time in seconds
    uint16_t sample_a = make_unsigned_16_bit_sample(t_a);
    uint16_t sample_b = make_unsigned_16_bit_sample(t_b);

    TX_array[i] = ((unsigned int)sample_b << 16) | sample_a; // two samples per array entry, low sample first
  }
}

void print_tx_array() {
  Serial.print("The array has length: ");
  Serial.print(TX_array_len_words);
  Serial.println(" words");
  for (int i = 1; i <= TX_array_len_words ; i++) { // start at 1 to skip the header word
    int16_t sample_a = (int16_t)(TX_array[i] & 65535);
    int16_t sample_b   = (int16_t)(TX_array[i] >> 16);
    Serial.print(sample_a);
    Serial.print("\t\t");
    Serial.println(sample_b);
  }
  delay(5000);
}

I should probably mention that if you actually need a waveform, this isn't the way to do it; the built-in audio library objects are the way to go.

The reason I'm using it in this case is that I'm preparing an array upfront for transmitting, then recording the received waveform to another array. Once I have both arrays populated I can do cross-correlation on them and figure out sonar type depth scan info.
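
For what it's worth, the cross-correlation step can be sketched on the host side. The function name is mine, and tx/rx are stand-ins for the transmitted and received buffers:

```cpp
#include <cstdint>
#include <cstddef>

// Naive cross-correlation: for each candidate lag, sum tx[i] * rx[i + lag]
// and return the lag with the largest sum. That lag, times the sample
// period, gives the round-trip delay for a sonar-style range estimate.
size_t best_lag(const int16_t *tx, size_t tx_len,
                const int16_t *rx, size_t rx_len) {
    size_t best = 0;
    int64_t best_sum = INT64_MIN;
    for (size_t lag = 0; lag + tx_len <= rx_len; lag++) {
        int64_t sum = 0;
        for (size_t i = 0; i < tx_len; i++)
            sum += (int64_t)tx[i] * rx[i + lag];
        if (sum > best_sum) { best_sum = sum; best = lag; }
    }
    return best;
}
```

The 64-bit accumulator matters: at 44100 Hz a few seconds of 16-bit products overflows 32 bits easily.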
 