
Thread: AudioPlayMemory array layout?

  1. #1
    Junior Member
    Join Date
    Jul 2019
    Location
    Sydney, Australia
    Posts
    11

    AudioPlayMemory array layout?

    Hi All,


    I've got some questions about the behaviour of the AudioPlayMemory object, and how it expects the array containing the samples to be laid out.

    I've read the old documentation here: https://www.pjrc.com/teensy/td_libs_...layMemory.html as well as the AudioPlayMemory page in the Teensy audio tool,
    some comments here:
    https://forum.pjrc.com/threads/29121...27s+complement
    and the source code for "wav2sketch".

    And I think I've mostly wrapped my head around it, but I can't reconcile some things with the numbers I see in the example sketches.

    Here's my understanding of the layout of AudioPlayMemory, please let me know if I've gotten anything wrong:
    1) The data is stored as an array of "const unsigned int", each element of which is 32 bits (4 bytes) in size.
    2) Each sample is _signed_, in two's complement form.
    3) Different sample rates (44100, 22050, 11025 Hz) can be selected.
    4) Samples can be in either an 8-bit compressed format or uncompressed 16-bit PCM. (16-bit samples are stored in little-endian format, with the low byte containing the least-significant bits.)
    5) The high 8 bits of the first word of the array specify the format (sample rate and encoding), and the low 24 bits specify the length of the sample data. According to the documentation the length is specified in bytes, but I suspect it is actually specified in samples.
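
    A minimal sketch of point 5, in plain C++ so it can be compiled off-target: it splits the first word into an 8-bit format code and a 24-bit length, and builds one back. The names PlayMemHeader, decode_header and encode_header are hypothetical and not part of the Audio library, and 0x81000100 is a made-up header value. Whether the length counts bytes or samples is exactly the open question here, so the sketch does not assume either.

```cpp
#include <cstdint>

// Hypothetical helper type; these names are not part of the Audio library.
struct PlayMemHeader {
  uint8_t  format;  // high 8 bits: code for sample rate and encoding
  uint32_t length;  // low 24 bits: length (bytes per the docs, samples per this thread)
};

// Split the first word of an AudioPlayMemory-style array into its two fields.
constexpr PlayMemHeader decode_header(uint32_t word0) {
  return PlayMemHeader{ static_cast<uint8_t>(word0 >> 24), word0 & 0x00FFFFFFu };
}

// Build a header word from a format code and a length (inverse of decode_header).
constexpr uint32_t encode_header(uint8_t format, uint32_t length) {
  return (static_cast<uint32_t>(format) << 24) | (length & 0x00FFFFFFu);
}

// Compile-time checks using a made-up header value (format 0x81, length 256).
static_assert(decode_header(0x81000100u).format == 0x81, "format is the top byte");
static_assert(decode_header(0x81000100u).length == 256,  "length is the low 24 bits");
static_assert(encode_header(0x81, 256) == 0x81000100u,   "encode reverses decode");
```

    Keeping the header in its own helper also makes it harder to clobber word 0 by accident when filling the rest of the array.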

    Here's my diagram:
    [Image: audioplaymemory_memory_usage_v01.jpg]


    The problem is when I look at example sketches and see how other people have used the AudioPlayMemory objects, I get different numbers for array size than what I'd predict based on the length specification in the header.
    Specifically the numbers are wrong for all the 16-bit formats if I assume the units of length are bytes:

    [Image: audioplaymemory_usage_v01.png]

    If I assume the units of length are samples, then it looks much better, but some of the files still don't match the pattern:


    [Image: audioplaymemory_usage_v02.jpg]

    Is what I'm seeing just a bug in wav2sketch (or the files "guitar_a2_note" and "guitar_b3_note" in the example sketches), or have I made a mistake somewhere?



    Cheers,
    Gavin

  2. #2
    Junior Member
    Join Date
    Sep 2020
    Posts
    18
    I would suggest that the number in the lower 3 bytes of the first word is not "number of bytes" but "number of samples". Looking at:

    // Audio data converted from WAV file by wav2sketch

    #include "AudioSampleKick.h"
    #include <Arduino.h>
    // Converted from kick.wav, using 22050 Hz, 16 bit PCM encoding
    PROGMEM
    const unsigned int AudioSampleKick[2561] = {
    0x820013EC,0xFFDC0027,0xFF710095,0x038DFF4C,0xFBA10105,0x0037FB6E,0x09D2011A,0x007504CA,


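    The first word is 0x820013EC, so format 0x82 with a length of 0x13EC = 5100. Of the 2561 words in the array, one is the format/length header, which leaves 2560 words that each hold two 16-bit samples, so there is room for 2560 * 2 = 5120 samples. That is very close to the 5100 in the header, whereas 5100 bytes would need only 1275 words, so the 0x13EC figure appears to be a sample count. The 20 extra samples may be padding: 5120 is 5100 rounded up to the next multiple of 128.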
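
    The array-size arithmetic for the kick sample can be checked with a few lines of plain C++. This is only a sketch: min_words_16bit and padded_words_16bit are hypothetical helpers, and the idea that the sample data is padded up to a multiple of 128 samples is an assumption that happens to fit this one file, not something confirmed in this thread.

```cpp
#include <cstdint>

// Words needed for `samples` 16-bit samples with no padding:
// one header word plus two samples per 32-bit word (rounded up).
constexpr uint32_t min_words_16bit(uint32_t samples) {
  return 1 + (samples + 1) / 2;
}

// Same, but with the sample count first rounded up to a multiple of `block`.
// ASSUMPTION: padding to 128-sample blocks is a guess that fits kick.wav.
constexpr uint32_t padded_words_16bit(uint32_t samples, uint32_t block = 128) {
  return 1 + (((samples + block - 1) / block) * block) / 2;
}

// Header 0x820013EC: the low 24 bits are 0x13EC.
static_assert(0x13EC == 5100, "length field of the kick header");
static_assert(min_words_16bit(5100) == 2551, "unpadded size");
static_assert(padded_words_16bit(5100) == 2561, "matches AudioSampleKick[2561]");
```

    If the length were a byte count, 5100 bytes of 16-bit data would be 2550 samples, and neither helper would give anything near 2561 words.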
    [Attached thumbnail: kick.jpg]

  3. #3
    Junior Member
    Join Date
    Jul 2019
    Location
    Sydney, Australia
    Posts
    11
    Originally posted by wrightflyer:
    "I would suggest that the number in the lower 3 bytes of the first word is not "number of bytes" but "number of samples". Looking at:"

    Yep, that certainly looks more consistent, but you can see from my second screenshot that the two example files "guitar_a2_note" and "guitar_b3_note" still don't fit the pattern (although other 16-bit files with the same sample rate do).

    I'm keen to find out what the proper format for AudioPlayMemory should be, as I'm having a lot of trouble playing arrays that were NOT made by wav2sketch.

    I can play all of Paul's premade arrays fine, but I can't seem to get any audible sound out of arrays containing samples I make myself. For example, I've written code that (after boot) populates an array with simple samples of a sine wave, but when I call playMem1.play(TX_array) I don't hear anything, whereas playMem1.play(AudioSampleSnare) plays the canned snare drum sample with no issues.

    I'll make a cut down example sketch demonstrating the issue and post it tonight.

  4. #4
    Junior Member
    Join Date
    Jul 2019
    Location
    Sydney, Australia
    Posts
    11
    OK, I've got it all working now. Turned out my original code was accidentally overwriting the header, which presumably prevented the AudioPlayMemory object from recognizing the array as legitimate.


    In case it's useful to anyone else, here's a quick (and inelegant) sketch which populates an array with a simple calculated sinewave, then plays it through the speaker three times:
    Code:
    #include <Audio.h>
    #include <Wire.h>
    #include <SPI.h>
    #include <SD.h>
    #include <SerialFlash.h>
    
    // GUItool: begin automatically generated code
    AudioPlayMemory          playMem1;       //xy=682.6666870117188,276.00000500679016
    AudioOutputI2S           audioOutput;    //xy=1079.6666870117188,362.00000500679016
    AudioConnection          patchCord1(playMem1, 0, audioOutput, 0);
    AudioControlSGTL5000     audioShield;    //xy=913.6666870117188,232.00000500679016
    // GUItool: end automatically generated code
    
    
    const int TX_array_len_samples = 128 * 100 * 3;
    const int TX_array_len_words = TX_array_len_samples / 2;
    // One extra word for the header. Unsigned, to match the "const unsigned int" layout.
    unsigned int TX_array[TX_array_len_words + 1];
    
    void setup() {
      AudioMemory(10);
      delay(2000);
      audioShield.enable();
      audioShield.volume(0.5);
    
      populate_tx_array(1000.0, 5000.0);
      //  print_tx_array();
      //  print_tuba_array();
    
      for (int i = 0; i < 3; i++) {
        playMem1.play(TX_array);
        while (playMem1.isPlaying()) {};
        delay(500);
      }
    }
    
    void loop() {
    
    }
    
    
    uint16_t make_unsigned_16_bit_sample(float t) {
      // Signed 1 kHz cosine sample. The peak is 65535 / 2.5 = 26214, which fits in int16_t.
      int16_t sample = (int16_t)((cos(2 * PI * 1000.0 * t) / 2.5) * 65535);
    
      // Casting to uint16_t keeps the same two's complement bit pattern,
      // so no manual negate-and-invert step is needed.
      return (uint16_t)sample;
    }
    
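    // Note: start_freq and stop_freq are currently unused; the tone is fixed at 1 kHz.
    void populate_tx_array(float start_freq, float stop_freq) {
      // Header word: 0x81 = 44100 Hz, 16 bit uncompressed PCM; low 24 bits = length in samples.
      TX_array[0] = (0x81UL << 24) | (uint32_t)TX_array_len_samples;
      // Sample data occupies words 1..TX_array_len_words inclusive (word 0 is the header).
      for (int i = 1; i <= TX_array_len_words; i++) {
        float t_a = (2.0 * (i - 1) + 0) / 44100.0; // time in seconds
        float t_b = (2.0 * (i - 1) + 1) / 44100.0; // time in seconds
        uint16_t sample_a = make_unsigned_16_bit_sample(t_a);
        uint16_t sample_b = make_unsigned_16_bit_sample(t_b);
    
        // Two samples per array entry: earlier sample in the low half, later one in the high half.
        TX_array[i] = ((uint32_t)sample_b << 16) | sample_a;
      }
    }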
    
    void print_tx_array() {
      Serial.print("The array has length: ");
      Serial.print(TX_array_len_words);
      Serial.println(" words");
      // Start at 1 to skip the header word, which does not hold samples.
      for (int i = 1; i <= TX_array_len_words; i++) {
        int16_t sample_a = (int16_t)(TX_array[i] & 65535);
        int16_t sample_b = (int16_t)((uint32_t)TX_array[i] >> 16);
        Serial.print(sample_a);
        Serial.print("\t\t");
        Serial.println(sample_b);
      }
      delay(5000);
    }
    I should probably mention that if you actually need a waveform, this isn't the way you should do it; the built-in audio library objects are the way to go.

    The reason I'm using it in this case is that I'm preparing an array up front for transmitting, then recording the received waveform to another array. Once I have both arrays populated I can do cross-correlation on them and work out sonar-type depth scan information.
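
    For anyone curious about the cross-correlation step mentioned above, here is a rough brute-force sketch in plain C++. It is not the code used in this project: best_lag and its parameters are hypothetical names, and it simply returns the delay (in samples) at which the received array best lines up with the transmitted one.

```cpp
#include <cstdint>

// Return the lag in [0, max_lag] (in samples) at which rx best matches tx,
// using a brute-force cross-correlation. Hypothetical helper, for illustration only.
constexpr int best_lag(const int16_t* tx, int tx_len,
                       const int16_t* rx, int rx_len, int max_lag) {
  int best = 0;
  int64_t best_score = INT64_MIN;
  for (int lag = 0; lag <= max_lag; ++lag) {
    int64_t score = 0;
    // Sum of products over the part of tx that overlaps rx at this lag.
    for (int n = 0; n < tx_len && n + lag < rx_len; ++n) {
      score += static_cast<int64_t>(tx[n]) * rx[n + lag];
    }
    // Keep the first lag with the highest score.
    if (score > best_score) {
      best_score = score;
      best = lag;
    }
  }
  return best;
}
```

    At 44100 Hz, a lag of N samples corresponds to N / 44100 seconds of delay. For arrays as long as the one in the sketch above, an FFT-based correlation would likely be faster; that is a design trade-off rather than something tested here.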
