
Thread: Read ID3 tags from audio files

  1. #1

    Read ID3 tags from audio files

    I would like to read ID3 tags from audio files stored on an SD card, using a Teensy 4.1.

    My first go-to was to look for a handy library, but so far I've had no joy there. The official ID3 website lists a few C/C++ libraries, one of which (id3v2lib) might be light enough to work, but it isn't built to compile with the Arduino IDE, and porting it across is beyond my skills at the moment.

    Does anyone know of any libraries that will read ID3 tags?

    I have also spent some time trying to decode the ID3 header and frames manually, referencing the ID3v2.3 documentation, but I have yet to find an efficient way to search through hundreds of hex bytes. If no libraries exist, then some helpful pointers/resources in this area would also be very useful!
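As a starting point for decoding the header by hand, here is a minimal sketch in plain C++ that parses the 10-byte ID3v2 tag header from a byte buffer. The struct and function names (`Id3v2Header`, `parseId3v2Header`) are made up for illustration and are not from any library. It uses bit masks rather than bit-fields for the flags byte, because bit-field layout is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper type: decoded fields of the 10-byte ID3v2 tag header.
struct Id3v2Header {
    uint8_t  major;         // 3 for ID3v2.3, 4 for ID3v2.4
    uint8_t  revision;
    bool     unsync;        // flags bit 7
    bool     extended;      // flags bit 6
    bool     experimental;  // flags bit 5
    bool     footer;        // flags bit 4 (defined in ID3v2.4)
    uint32_t tagSize;       // bytes that follow the 10-byte header
};

// Returns true and fills `out` if the first 10 bytes of `b` look like an ID3v2 header.
bool parseId3v2Header(const uint8_t* b, size_t n, Id3v2Header& out)
{
    if (n < 10 || std::memcmp(b, "ID3", 3) != 0) return false;
    if (b[3] == 0xFF || b[4] == 0xFF) return false;   // version bytes are never $FF
    for (size_t i = 6; i < 10; ++i)
        if (b[i] & 0x80) return false;                // size bytes always have bit 7 clear

    out.major        = b[3];
    out.revision     = b[4];
    out.unsync       = (b[5] & 0x80) != 0;
    out.extended     = (b[5] & 0x40) != 0;
    out.experimental = (b[5] & 0x20) != 0;
    out.footer       = (b[5] & 0x10) != 0;
    // The size is "syncsafe": four bytes carrying 7 bits each, most significant first.
    out.tagSize = (uint32_t(b[6]) << 21) | (uint32_t(b[7]) << 14) |
                  (uint32_t(b[8]) << 7)  |  uint32_t(b[9]);
    return true;
}
```

On a Teensy the 10 bytes would come from the start of the opened file; the first frame then begins right after them (or after the extended header, if that flag is set).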

  2. #2
    Senior Member (joined Apr 2014; 9,756 posts)
    Here's a snippet from a larger project I wrote:

    Code:
    // https://id3.org/id3v2.4.0-structure
    
    typedef struct
    {
        char id[3];      //"ID3"
        uint8_t version[2]; //04 00 Version 4
        uint8_t flag_unsynchronization: 1;
        uint8_t flag_extendedheader : 1;
        uint8_t flag_experimental : 1;
        uint8_t flag_footerpresent : 1;
        uint8_t flags_zero: 4;
        uint32_t size;
    } __attribute__((packed)) ID3TAG;
    
    
    typedef struct
    {
        uint32_t size;
        uint8_t numFlagBytes;
        uint8_t flag_zero1: 1;
        uint8_t flag_update: 1;
        uint8_t flag_crc: 1;
        uint8_t flag_restrictions: 1;
        uint8_t flag_zero2: 4;
    } __attribute__((packed)) ID3TAG_EXTHEADER;
    
    
    typedef struct
    {
        char id[4];      //Tag ID
        uint32_t size;
        uint8_t flags[2];   //Several bits.. not needed here
    } __attribute__((packed)) ID3FRAME;
    
    
    
    
    static
    uint32_t id3_unsyncsafe(uint32_t data)
    {
        data = __builtin_bswap32(data);
    
    
        auto out = 0;
        auto mask = 0x7f000000ul;
    
    
        do {
            out >>= 1;
            out |= data & mask;
            mask >>= 8;
        } while (mask);
    
    
        return out;
    }
    
    
    /*
    id3_getString
    The first byte tells the encoding:
        $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
        $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
            strings in the same frame SHALL have the same byteorder.
            Terminated with $00 00.
        $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
            Terminated with $00 00.
        $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
    */
    static String id3_getString(File file, unsigned int len)
    {
        if (len > 127) len = 127;
        char buf[len + 1];
        file.readBytes(buf, len);
        buf[len] = 0;
        const auto encoding = buf[0];
        auto *s = &buf[1];
        switch(encoding) {
            case 0x00:
                //log_v("Encoding is ISO-8859-1");
                return latin1UTF8(s);
            case 0x01:
                //log_v("Encoding is UTF-16 with BOM");
                return UTF16UTF8(s, len);
            case 0x02:
                //log_v("Encoding is UTF-16BE without BOM");
                return UTF16UTF8(s, len);
            case 0x03:
                //log_v("Encoding is UTF-8");
                return String(s);
        }
        return String(s);
    }
    
    
    void AudioCodecs::_id3(void)
    {
        //Try to skip or parse ID3 Tags - fill buffer
        ID3present = false;
        auto numTagstoFind = 2;
    
    
        ID3TAG id3;
        auto lenRead = 0;
        lenRead+= file.readBytes((char*)&id3, sizeof(id3));
    
    
        if (strncmp (id3.id, "ID3", 3) != 0)
            return; //Identifier not found
    
    
        //size is a "syncsafe" integer
        id3.size = id3_unsyncsafe(id3.size) + 10;
    
    
       // log_e("ID3 size: %u", id3.size);
        if (id3.version[0] < 2 || id3.version[0] == 0xff || id3.version[1] == 0xff || id3.flags_zero != 0)
            return;
    
    
        log_d("ID3v%u.%u found", (unsigned)id3.version[0], (unsigned)id3.version[1] );
    
    
        //TODO: Handle ID3V2
        if (id3.version[0] == 2 || id3.version[0] > 4 ) {
            log_w("Can't handle this ID3 version");
            goto end;
        }
    
    
        if (id3.flag_extendedheader) {
            log_v("Has extended header. Skipping.");
            ID3TAG_EXTHEADER extheader;
            lenRead += file.readBytes((char*)&extheader, sizeof(extheader));
            extheader.size = id3_unsyncsafe(extheader.size);
            if (extheader.size < 6) goto end; //error
            lenRead += extheader.size;
        }
    
    
        //ID3 Frames:
        file.seek(lenRead);
        do {
            ID3FRAME frameheader;
            lenRead += file.readBytes((char*)&frameheader, sizeof(frameheader));
            frameheader.size = id3_unsyncsafe(frameheader.size);
            if (frameheader.id[0] == 0 || frameheader.id[1] == 0  || frameheader.id[2] == 0 ) break;
            if (lenRead + frameheader.size >= id3.size) break;
    
    
            //log_v("ID3 Frame: %c%c%c%c size:%u", frameheader.id[0], frameheader.id[1], frameheader.id[2] , frameheader.id[3], frameheader.size);
    
    
            if ( strncmp("TIT2", frameheader.id, 4) == 0) {
                //log_e("tit2");
                setTitle( id3_getString(file, frameheader.size).c_str() );
                numTagstoFind--;
            }
            else if ( strncmp("TPE1", frameheader.id, 4) == 0) {
                //log_e("tpe1");
                setDescription( id3_getString(file, frameheader.size).c_str() );
                numTagstoFind--;
            }
    
    
            if (numTagstoFind == 0) break;
            file.seek(lenRead + frameheader.size);
        } while(1);
        //vTaskDelay( pdMS_TO_TICKS(1) );
    
    
    end:
        //Skip to first mp3/aac frame:
        file.seek(id3.size);
    }
    You'll have to write the missing functions. The file has to be open and at position 0.
    It reads "TIT2" and "TPE1"; it should be easy to extend.
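For anyone extending the snippet, the size decoding can also be written byte by byte, as in the sketch below. One detail worth checking against the specs: ID3v2.4 stores frame sizes as syncsafe integers, while ID3v2.3 stores them as plain 32-bit big-endian values, so the two only agree for frames shorter than 128 bytes. The helper names here (`decodeSyncsafe`, `decodeBigEndian32`, `frameSize`) are hypothetical.

```cpp
#include <cstdint>

// Decode a syncsafe integer: four bytes, 7 useful bits each, most significant first.
uint32_t decodeSyncsafe(const uint8_t b[4])
{
    return (uint32_t(b[0] & 0x7F) << 21) | (uint32_t(b[1] & 0x7F) << 14) |
           (uint32_t(b[2] & 0x7F) << 7)  |  uint32_t(b[3] & 0x7F);
}

// Decode a plain 32-bit big-endian integer.
uint32_t decodeBigEndian32(const uint8_t b[4])
{
    return (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16) |
           (uint32_t(b[2]) << 8)  |  uint32_t(b[3]);
}

// Pick the frame-size decoding from the major version byte of the tag header.
uint32_t frameSize(const uint8_t b[4], uint8_t majorVersion)
{
    return majorVersion >= 4 ? decodeSyncsafe(b) : decodeBigEndian32(b);
}
```

The tag size in the 10-byte header is syncsafe in both versions, so only the per-frame size needs the version check.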

  3. #3
    Wow, that's a lot, thank you! It's certainly way above my current knowledge level, but I'll take some time to decipher it.

    Is this for ID3 V2 as well, or just V1?

  4. #4
    Not sure.. *g*... I think it works with v2 only. I have not tested v1 (I have no files with v1).

    I think v1 is not needed anymore.. it's outdated for many many years now.

    (Edit: The comment is misleading.... )

  5. #5
    Quote: "it's outdated for many many years now."
    Yes, for a couple of decades now I think! But it seems to be easier to parse due to its tighter structure, so I've found a few projects that read V1 but not V2.

    I was confused by this, which at first glance seems to stop the whole thing if it finds version 2:
    Code:
    if (id3.version[0] == 2 || id3.version[0] > 4 ) {
            log_w("Can't handle this ID3 version");
            goto end;
        }
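On the point above about ID3v1 being easier to parse: an ID3v1 tag is a fixed 128-byte block at the very end of the file, starting with "TAG", followed by fixed-width text fields. A minimal sketch in plain C++ is below; `Id3v1Tag`, `fixedField` and `parseId3v1` are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical container for the text fields of an ID3v1 tag.
struct Id3v1Tag {
    std::string title, artist, album, year;
};

// Copy a fixed-width field, stopping at the first NUL and trimming trailing spaces.
static std::string fixedField(const uint8_t* p, size_t n)
{
    size_t len = 0;
    while (len < n && p[len] != 0) ++len;
    while (len > 0 && p[len - 1] == ' ') --len;
    return std::string(reinterpret_cast<const char*>(p), len);
}

// `block` must hold the last 128 bytes of the file.
bool parseId3v1(const uint8_t block[128], Id3v1Tag& out)
{
    if (std::memcmp(block, "TAG", 3) != 0) return false;
    out.title  = fixedField(block + 3,  30);
    out.artist = fixedField(block + 33, 30);
    out.album  = fixedField(block + 63, 30);
    out.year   = fixedField(block + 93, 4);
    return true;
}
```

On the Teensy side this would presumably mean seeking to `file.size() - 128` and reading 128 bytes into a buffer first; that part is left out so the sketch stays hardware-independent.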

  6. #6
    Yup, looks like I wrote that for ID3v2.3 (major version 3).
    However, it's still simple.

  7. #7
    That makes sense!

    I've crowbarred the whole thing into an Arduino sketch by replacing all functions that the IDE didn't like (mostly log_ functions). The only thing I couldn't seem to replace is the encoding section:
    Code:
    switch(encoding) {
            case 0x00:
                //log_v("Encoding is ISO-8859-1");
                return latin1UTF8(s);
            case 0x01:
                //log_v("Encoding is UTF-16 with BOM");
                return UTF16UTF8(s, len);
            case 0x02:
                //log_v("Encoding is UTF-16BE without BOM");
                return UTF16UTF8(s, len);
            case 0x03:
                log_v("Encoding is UTF-8");
            return String(s);
        }
    Is it necessary for this use case? And if so, would you happen to know what library is required to provide the missing encoding functions?

  8. #8
    You can replace the log* functions with Serial.printf() or Serial.println().
    And for the first steps, to get it running, you can comment out the switch.
    ID3 supports different string encodings. This is for characters like €, etc.
    Without that, it will still work, but those characters will not be decoded.
    But this topic has nothing to do with ID3... I think you'll find some code on the net that can decode them.
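For the ISO-8859-1 case, the conversion is small enough to write by hand, since every Latin-1 byte maps directly to the Unicode code point of the same value. Below is a minimal sketch in plain C++ using `std::string`; the function name `latin1ToUtf8` is an assumption standing in for the missing `latin1UTF8`, and an Arduino version would return a `String` instead.

```cpp
#include <string>

// Convert ISO-8859-1 (Latin-1) text to UTF-8.
// Bytes below 0x80 are copied unchanged; bytes 0x80..0xFF become two UTF-8 bytes.
std::string latin1ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);   // worst case: every byte doubles
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For example, the Latin-1 byte 0xE9 (é) becomes the two bytes 0xC3 0xA9.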

  9. #9
    Quote, originally posted by Frank B:
    You can replace the log* functions with Serial.printf() or Serial.println().
    And for the first steps, to get it running, you can comment out the switch.
    ID3 supports different string encodings. This is for characters like €, etc.
    Without that, it will still work, but those characters will not be decoded.
    But this topic has nothing to do with ID3... I think you'll find some code on the net that can decode them.

    Btw, the Arduino IDE uses UTF-8 as far as I know.

    https://stackoverflow.com/questions/...o-utf-8-in-c-c

  10. #10
    Perfect, it all compiles with no issues!

    The next problem is actually retrieving the frame data. I've added some more debug Serial.prints and the whole thing appears to run all the way through. However, it returns "PE1" for the TIT2 title, and ends before finding TPE1. It's got me a little stumped! I don't know if you experienced something like this during your other project?
    Or more likely I've just poorly implemented it...

    For reference, sketch:
    Code:
    #include <SD.h>
    #define SDCARD_CS_PIN    BUILTIN_SDCARD
    
    SdFat sd;
    SdFile dir;
    File file;
    
    #include "SPI.h"
    
    char title[256]; 
    char artist[256];
    
    void setup() {
      // put your setup code here, to run once:
      Serial.begin(9600);
    
      if (!(SD.sdfs.begin(SdioConfig(FIFO_SDIO)))) {
        while (1) {
          Serial.println("Unable to access the SD card");
          delay(500);
        }
      }
    
      file = SD.open("/MusicBee/Music/Deadmau5/5 years of mau5/1-04 - Some Chords.mp3");
      if(!file){
        Serial.println("File failed to open");
      }
      
      id3Read();
      file.close();
      
      Serial.print("Title: ");
      Serial.println(title);
      Serial.print("Artist: ");
      Serial.println(artist);
    
    }
    
    void loop(){
      
    }
    
    // https://id3.org/id3v2.4.0-structure
    
    typedef struct
    {
        char id[3];      //"ID3"
        uint8_t version[2]; //04 00 Version 4
        uint8_t flag_unsynchronization: 1;
        uint8_t flag_extendedheader : 1;
        uint8_t flag_experimental : 1;
        uint8_t flag_footerpresent : 1;
        uint8_t flags_zero: 4;
        uint32_t size;
    } __attribute__((packed)) ID3TAG;
    
    
    typedef struct
    {
        uint32_t size;
        uint8_t numFlagBytes;
        uint8_t flag_zero1: 1;
        uint8_t flag_update: 1;
        uint8_t flag_crc: 1;
        uint8_t flag_restrictions: 1;
        uint8_t flag_zero2: 4;
    } __attribute__((packed)) ID3TAG_EXTHEADER;
    
    
    typedef struct
    {
        char id[4];      //Tag ID
        uint32_t size;
        uint8_t flags[2];   //Several bits.. not needed here
    } __attribute__((packed)) ID3FRAME;
    
    
    static uint32_t id3_unsyncsafe(uint32_t data)
    {
        data = __builtin_bswap32(data);
    
    
        auto out = 0;
        auto mask = 0x7f000000ul;
    
    
        do {
            out >>= 1;
            out |= data & mask;
            mask >>= 8;
        } while (mask);
    
    
        return out;
    }
    
    
    /*
    id3_getString
    The first byte tells the encoding:
        $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
        $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
            strings in the same frame SHALL have the same byteorder.
            Terminated with $00 00.
        $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
            Terminated with $00 00.
        $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
    */
    static String id3_getString(File file, unsigned int len)
    {
        if (len > 127) len = 127;
        char buf[len + 1];
        file.readBytes(buf, len);
        buf[len] = 0;
        const auto encoding = buf[0];
        auto *s = &buf[1];
    //    switch(encoding) {
    //        case 0x00:
    //            //log_v("Encoding is ISO-8859-1");
    //            return latin1UTF8(s);
    //        case 0x01:
    //            //log_v("Encoding is UTF-16 with BOM");
    //            return UTF16UTF8(s, len);
    //        case 0x02:
    //            //log_v("Encoding is UTF-16BE without BOM");
    //            return UTF16UTF8(s, len);
    //        case 0x03:
    //            //log_v("Encoding is UTF-8");
    //            return String(s);
    //    }
        return String(s);
    }
    
    
    //class AudioCodecs
    //{
    //  public:
    //    void _id3();
    //  private:
    //    bool ID3present;
    //};
    
    //void AudioCodecs::_id3(void)
    void id3Read()
    {
        Serial.println("Reading ID3 header");
        //Try to skip or parse ID3 Tags - fill buffer
        bool ID3present = false;
        auto numTagstoFind = 2;
    
    
        ID3TAG id3;
        auto lenRead = 0;
        lenRead+= file.readBytes((char*)&id3, sizeof(id3));
    
    
        if (strncmp (id3.id, "ID3", 3) != 0){
            Serial.println("No ID3 header found");
            return; //Identifier not found
        }
    
        //size is a "syncsafe" integer
        id3.size = id3_unsyncsafe(id3.size) + 10;
    
    
       // log_e("ID3 size: %u", id3.size);
        if (id3.version[0] < 2 || id3.version[0] == 0xff || id3.version[1] == 0xff || id3.flags_zero != 0){
            Serial.println("Abort!");
            return;
        }
    
    //    log_d("ID3v%u.%u found", (unsigned)id3.version[0], (unsigned)id3.version[1] );
    
    
        //TODO: Handle ID3V2
        if (id3.version[0] == 2 || id3.version[0] > 4 ) {
    //        log_w("Can't handle this ID3 version");
            Serial.println("Can't handle this ID3 version");
            goto end;
        }
    
    
        if (id3.flag_extendedheader) {
    //        log_v("Has extended header. Skipping.");
            Serial.println("Has extended header. Skipping.");
            ID3TAG_EXTHEADER extheader;
            lenRead += file.readBytes((char*)&extheader, sizeof(extheader));
            extheader.size = id3_unsyncsafe(extheader.size);
            if (extheader.size < 6) goto end; //error
            lenRead += extheader.size;
        }
    
    
        //ID3 Frames:
        file.seek(lenRead);
        do {
            Serial.println("Searching for frames");
            ID3FRAME frameheader;
            lenRead += file.readBytes((char*)&frameheader, sizeof(frameheader));
            frameheader.size = id3_unsyncsafe(frameheader.size);
            if (frameheader.id[0] == 0 || frameheader.id[1] == 0  || frameheader.id[2] == 0 ){ 
              Serial.println("Blank data, break");
              break;
            }
            if (lenRead + frameheader.size >= id3.size){
              Serial.println("End of ID3, break");
              break;
            }
    
    
            //log_v("ID3 Frame: %c%c%c%c size:%u", frameheader.id[0], frameheader.id[1], frameheader.id[2] , frameheader.id[3], frameheader.size);
    
            if ( strncmp("TIT2", frameheader.id, 4) == 0) {
                //log_e("tit2");
    //            setTitle( id3_getString(file, frameheader.size).c_str() ); //save the title somewhere
                Serial.print("Found title: ");
                Serial.println(id3_getString(file, frameheader.size).c_str());
                strcpy(title,id3_getString(file, frameheader.size).c_str());
                numTagstoFind--;
            }
            else if ( strncmp("TPE1", frameheader.id, 4) == 0) {
                //log_e("tpe1");
    //            setDescription( id3_getString(file, frameheader.size).c_str() ); 
                strcpy(artist,id3_getString(file, frameheader.size).c_str());
                numTagstoFind--;
            }
    
    
            if (numTagstoFind == 0){ 
              Serial.println("Found all tags");
              break;
            }
            file.seek(lenRead + frameheader.size);
        } while(1);
        //vTaskDelay( pdMS_TO_TICKS(1) );
    
    
    end:
        //Skip to first mp3/aac frame:
        file.seek(id3.size);
    }
    Serial Monitor output:
    Code:
    Reading ID3 header
    Searching for frames
    Searching for frames
    Found title: ��S
    Searching for frames
    Blank data, break
    Title: PE1
    Artist:
    and audio file: https://drive.google.com/file/d/17Zy...ew?usp=sharing (link may not stay live indefinitely)
    which definitely has both TIT2 and TPE1 fields according to MusicBee:
    [Attached image: Capture some chords.PNG]
    Last edited by thecomfychair; 01-21-2022 at 09:17 PM. Reason: Added link to audio file and picture of file ID3 tags

  11. #11
    Maybe the encoding is UTF16..
    I'll take a look tonight, or tomorrow.

  12. #12
    That would be amazing if you have a spare moment, thank you!

  13. #13
    Code:
    #include <SD.h>
    #define SDCARD_CS_PIN    BUILTIN_SDCARD
    
    
    SdFat sd;
    SdFile dir;
    File file;
    
    
    #include "SPI.h"
    
    
    String title;
    String artist;
    
    
    void setup() {
      // put your setup code here, to run once:
      Serial.begin(9600);
    
    
      if (!(SD.sdfs.begin(SdioConfig(FIFO_SDIO)))) {
        while (1) {
          Serial.println("Unable to access the SD card");
          delay(500);
        }
      }
    
    
      file = SD.open("ForTag.mp3");
      if (!file) {
        Serial.println("File failed to open");
      }
    
    
      id3Read();
      file.close();
    
    
      Serial.print("Title: ");
      Serial.println(title);
      Serial.print("Artist: ");
      Serial.println(artist);
    
    
    }
    
    
    void loop() {
    
    
    }
    
    
    
    
    String UTF16UTF8(const char* buf, const uint32_t len)
    {
      // converts unicode in UTF-8, buff contains the string to be converted up to len
      // range U+1 ... U+FFFF
    
    
      //if no BOM found, BE is default
      String out;
      out.reserve(len);
    
    
      // Worst case: every 2-byte UTF-16 unit expands to 3 UTF-8 bytes
      auto *tmpbuf = (uint8_t*)malloc((len + 1) / 2 * 3 + 1);
      
      if (!tmpbuf)
        return String(); // out of memory;
    
    
      auto *t = tmpbuf;
    
    
      auto bitorder = true; //Default to BE: swap bytes on a little-endian host (e.g. Teensy)
      auto *p = (uint16_t*) buf;
      const auto *pe = (uint16_t*) &buf[len];
      auto code = *p;
    
    
      if (code == 0xFEFF) {
        bitorder = false;
        p++;
      }  // LSB/MSB
      else if (code == 0xFFFE) {
        bitorder = true;
        p++;
      }  // MSB/LSB
    
    
      while (p < pe) {
        code = *p++;
    
    
        if (bitorder == true)
          code = __builtin_bswap16(code);
    
    
        if (code < 0X80) {
          *t++ = code & 0xff;
        }
        else if (code < 0X800) {
          *t++ = ((code >> 6) | 0XC0);
          *t++ = ((code & 0X3F) | 0X80);
        }
        else {
          *t++ = ((code >> 12) | 0XE0);
          *t++ = (((code >> 6) & 0X3F) | 0X80);
          *t++ = ((code & 0X3F) | 0X80);
        }
      }
    
    
      *t = 0;
      out = (char*)tmpbuf;
      free(tmpbuf);
      return out;
    }
    
    
    // https://id3.org/id3v2.4.0-structure
    
    
    typedef struct
    {
      char id[3];      //"ID3"
      uint8_t version[2]; //04 00 Version 4
      uint8_t flag_unsynchronization: 1;
      uint8_t flag_extendedheader : 1;
      uint8_t flag_experimental : 1;
      uint8_t flag_footerpresent : 1;
      uint8_t flags_zero: 4;
      uint32_t size;
    } __attribute__((packed)) ID3TAG;
    
    
    
    
    typedef struct
    {
      uint32_t size;
      uint8_t numFlagBytes;
      uint8_t flag_zero1: 1;
      uint8_t flag_update: 1;
      uint8_t flag_crc: 1;
      uint8_t flag_restrictions: 1;
      uint8_t flag_zero2: 4;
    } __attribute__((packed)) ID3TAG_EXTHEADER;
    
    
    
    
    typedef struct
    {
      char id[4];      //Tag ID
      uint32_t size;
      uint8_t flags[2];   //Several bits.. not needed here
    } __attribute__((packed)) ID3FRAME;
    
    
    
    
    static uint32_t id3_unsyncsafe(uint32_t data)
    {
      data = __builtin_bswap32(data);
    
    
      auto out = 0;
      auto mask = 0x7f000000ul;
    
    
      do {
        out >>= 1;
        out |= data & mask;
        mask >>= 8;
      } while (mask);
      
      return out;
    }
    
    
    
    
    /*
      id3_getString
      The first byte tells the encoding:
        $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
        $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
            strings in the same frame SHALL have the same byteorder.
            Terminated with $00 00.
        $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
            Terminated with $00 00.
        $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
    */
    static String id3_getString(File file, unsigned int len)
    {
      if (len > 127) len = 127;
      char buf[len + 1];
      file.readBytes(buf, len);
      buf[len] = 0;
      const auto encoding = buf[0];
      auto *s = &buf[1];
      len--;
      switch (encoding) {
        case 0x00:
          Serial.println("Encoding is ISO-8859-1");
          break;
        //          return latin1UTF8(s);
        case 0x01:
          Serial.println("Encoding is UTF-16 with BOM");
          return UTF16UTF8(s, len);
        case 0x02:
          Serial.println("Encoding is UTF-16BE without BOM");
          return UTF16UTF8(s, len);
        case 0x03:
          Serial.println("Encoding is UTF-8");
          break;
          //          return String(s);
      }
    
    
      return String(s);
    }
    
    
    
    
    void id3Read()
    {
      Serial.println("Reading ID3 header");
      //Try to skip or parse ID3 Tags - fill buffer
      bool ID3present = false;
      auto numTagstoFind = 2;
    
    
    
    
      ID3TAG id3;
      auto lenRead = 0;
      lenRead += file.readBytes((char*)&id3, sizeof(id3));
    
    
      if (strncmp (id3.id, "ID3", 3) != 0) {
        Serial.println("No ID3 header found");
        return; //Identifier not found
      }
    
    
      //size is a "syncsafe" integer
      id3.size = id3_unsyncsafe(id3.size) + 10;
    
    
      // log_e("ID3 size: %u", id3.size);
      if (id3.version[0] < 2 || id3.version[0] == 0xff || id3.version[1] == 0xff || id3.flags_zero != 0) {
        Serial.println("Abort!");
        return;
      }
    
    
      //    log_d("ID3v%u.%u found", (unsigned)id3.version[0], (unsigned)id3.version[1] );
    
    
      //TODO: Handle ID3V2
      if (id3.version[0] == 2 || id3.version[0] > 4 ) {
        //        log_w("Can't handle this ID3 version");
        Serial.println("Can't handle this ID3 version");
        goto end;
      }
    
    
    
    
      if (id3.flag_extendedheader) {
        //        log_v("Has extended header. Skipping.");
        Serial.println("Has extended header. Skipping.");
        ID3TAG_EXTHEADER extheader;
        lenRead += file.readBytes((char*)&extheader, sizeof(extheader));
        extheader.size = id3_unsyncsafe(extheader.size);
        if (extheader.size < 6) goto end; //error
        lenRead += extheader.size;
      }
    
    
    
    
      //ID3 Frames:
      file.seek(lenRead);
      Serial.println("Searching for frames");
      do {
    
    
        ID3FRAME frameheader;
        lenRead += file.readBytes((char*)&frameheader, sizeof(frameheader));
        frameheader.size = id3_unsyncsafe(frameheader.size);
    
    
        if (frameheader.id[0] == 0 || frameheader.id[1] == 0  || frameheader.id[2] == 0 ) {
          Serial.println("Blank data, break");
          break;
        }
    
    
        if (lenRead + frameheader.size >= id3.size) {
          Serial.println("End of ID3, break");
          break;
        }
    
    
        Serial.printf("ID3 Frame: %c%c%c%c size:%u\n", frameheader.id[0], frameheader.id[1], frameheader.id[2] , frameheader.id[3], frameheader.size);
    
    
        if ( strncmp("TIT2", frameheader.id, 4) == 0) {
          Serial.print("Found TIT2 - ");
          title = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
        else if ( strncmp("TPE1", frameheader.id, 4) == 0) {
          Serial.print("Found TPE1 - ");
          artist = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
    
    
        if (numTagstoFind == 0) {
          Serial.println("Found all searched tags");
          break;
        }
    
    
        lenRead += frameheader.size;
        file.seek(lenRead);
      } while (1);
    
    
    end:
      //Skip to first mp3/aac frame:
      file.seek(id3.size);
    }
    Note, I changed the filename for my test.
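For comparison, here is a hardware-independent sketch of the same UTF-16 to UTF-8 idea in plain C++. It reads the input byte by byte (so it does not depend on host byte order or alignment), honours a BOM if present, stops at the $00 00 terminator, and also handles surrogate pairs above U+FFFF. The names `appendUtf8` and `utf16ToUtf8` and the `bigEndianDefault` parameter are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8 (1 to 4 bytes).
static void appendUtf8(std::string& out, uint32_t cp)
{
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Convert `len` bytes of UTF-16 to UTF-8. A BOM, if present, selects the byte
// order; otherwise `bigEndianDefault` is used (ID3 encoding $02 is big-endian).
std::string utf16ToUtf8(const uint8_t* p, size_t len, bool bigEndianDefault = true)
{
    bool be = bigEndianDefault;
    size_t i = 0;
    if (len >= 2) {
        if (p[0] == 0xFF && p[1] == 0xFE) { be = false; i = 2; }
        else if (p[0] == 0xFE && p[1] == 0xFF) { be = true; i = 2; }
    }
    // Read one 16-bit unit at byte offset k in the selected byte order.
    auto unit = [&](size_t k) -> uint32_t {
        return be ? (uint32_t(p[k]) << 8) | p[k + 1]
                  : (uint32_t(p[k + 1]) << 8) | p[k];
    };
    std::string out;
    while (i + 1 < len) {
        uint32_t u = unit(i);
        i += 2;
        if (u == 0) break;                              // $00 00 terminator
        if (u >= 0xD800 && u <= 0xDBFF && i + 1 < len) {
            uint32_t lo = unit(i);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {         // valid surrogate pair
                u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            }
        }
        if (u >= 0xD800 && u <= 0xDFFF) u = 0xFFFD;     // unpaired surrogate
        appendUtf8(out, u);
    }
    return out;
}
```

Building the result in a growing string avoids having to size an output buffer up front, which matters because a 2-byte UTF-16 unit can expand to 3 UTF-8 bytes.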

  14. #14
    You're a legend, thank you very much! We're definitely getting close. It works perfectly for the file that I was using to test, however it's not consistent with other files.

    This is pretty much identical, except there's a new function for clearing the Strings, opening and closing the file, and printing any matching tag values:
    Code:
    #include <SD.h>
    #define SDCARD_CS_PIN    BUILTIN_SDCARD
    
    
    SdFat sd;
    SdFile dir;
    File file;
    
    
    #include "SPI.h"
    
    char itemToOpen[256];
    String title;
    String artist;
    
    
    void setup() {
      // put your setup code here, to run once:
      Serial.begin(9600);
    
    
      if (!(SD.sdfs.begin(SdioConfig(FIFO_SDIO)))) {
        while (1) {
          Serial.println("Unable to access the SD card");
          delay(500);
        }
      }
      
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/5 years of mau5/1-04 - Some Chords.mp3");
      getTags();
    
    
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/At Play 2/1-01 - Outta My Life (Touch Mix).mp3");
      getTags();
    
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/For Lack Of A Better Name/1-03 - Ghosts 'n Stuff (feat. Rob Swire).mp3");
      getTags();
      
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/Album Title Goes Here/1-03 - The Veldt (feat. Chris James) [8 Minute Edit].mp3");
      getTags();
    }
    
    void getTags(){
      title = ""; //clear tag strings
      artist = "";
      
      Serial.print("Opening file ");
      Serial.println(itemToOpen);
    
      file = SD.open(itemToOpen);
      if (!file) {
        Serial.println("File failed to open");
      }
    
      id3Read();
      file.close();
    
      Serial.print("Title: ");
      Serial.println(title);
      Serial.print("Artist: ");
      Serial.println(artist);
    
    
    }
    
    
    void loop() {
    
    
    }
    
    
    
    
    String UTF16UTF8(const char* buf, const uint32_t len)
    {
      // converts unicode in UTF-8, buff contains the string to be converted up to len
      // range U+1 ... U+FFFF
    
    
      //if no BOM found, BE is default
      String out;
      out.reserve(len);
    
    
      // Worst case: every 2-byte UTF-16 unit expands to 3 UTF-8 bytes
      auto *tmpbuf = (uint8_t*)malloc((len + 1) / 2 * 3 + 1);
      
      if (!tmpbuf)
        return String(); // out of memory;
    
    
      auto *t = tmpbuf;
    
    
      auto bitorder = true; //Default to BE: swap bytes on a little-endian host (e.g. Teensy)
      auto *p = (uint16_t*) buf;
      const auto *pe = (uint16_t*) &buf[len];
      auto code = *p;
    
    
      if (code == 0xFEFF) {
        bitorder = false;
        p++;
      }  // LSB/MSB
      else if (code == 0xFFFE) {
        bitorder = true;
        p++;
      }  // MSB/LSB
    
    
      while (p < pe) {
        code = *p++;
    
    
        if (bitorder == true)
          code = __builtin_bswap16(code);
    
    
        if (code < 0X80) {
          *t++ = code & 0xff;
        }
        else if (code < 0X800) {
          *t++ = ((code >> 6) | 0XC0);
          *t++ = ((code & 0X3F) | 0X80);
        }
        else {
          *t++ = ((code >> 12) | 0XE0);
          *t++ = (((code >> 6) & 0X3F) | 0X80);
          *t++ = ((code & 0X3F) | 0X80);
        }
      }
    
    
      *t = 0;
      out = (char*)tmpbuf;
      free(tmpbuf);
      return out;
    }
    
    
    // https://id3.org/id3v2.4.0-structure
    
    
    typedef struct
    {
      char id[3];      //"ID3"
      uint8_t version[2]; //04 00 Version 4
      uint8_t flag_unsynchronization: 1;
      uint8_t flag_extendedheader : 1;
      uint8_t flag_experimental : 1;
      uint8_t flag_footerpresent : 1;
      uint8_t flags_zero: 4;
      uint32_t size;
    } __attribute__((packed)) ID3TAG;
    
    
    
    
    typedef struct
    {
      uint32_t size;
      uint8_t numFlagBytes;
      uint8_t flag_zero1: 1;
      uint8_t flag_update: 1;
      uint8_t flag_crc: 1;
      uint8_t flag_restrictions: 1;
      uint8_t flag_zero2: 4;
    } __attribute__((packed)) ID3TAG_EXTHEADER;
    
    
    
    
    typedef struct
    {
      char id[4];      //Tag ID
      uint32_t size;
      uint8_t flags[2];   //Several bits.. not needed here
    } __attribute__((packed)) ID3FRAME;
    
    
    
    
    static uint32_t id3_unsyncsafe(uint32_t data)
    {
      data = __builtin_bswap32(data);
    
    
      auto out = 0;
      auto mask = 0x7f000000ul;
    
    
      do {
        out >>= 1;
        out |= data & mask;
        mask >>= 8;
      } while (mask);
      
      return out;
    }
    
    
    
    
    /*
      id3_getString
      The first byte tells the encoding:
        $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
        $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
            strings in the same frame SHALL have the same byteorder.
            Terminated with $00 00.
        $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
            Terminated with $00 00.
        $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
    */
    static String id3_getString(File file, unsigned int len)
    {
      if (len > 127) len = 127;
      char buf[len + 1];
      file.readBytes(buf, len);
      buf[len] = 0;
      const auto encoding = buf[0];
      auto *s = &buf[1];
      len--;
      switch (encoding) {
        case 0x00:
          Serial.println("Encoding is ISO-8859-1");
          break;
        //          return latin1UTF8(s);
        case 0x01:
          Serial.println("Encoding is UTF-16 with BOM");
          return UTF16UTF8(s, len);
        case 0x02:
          Serial.println("Encoding is UTF-16BE without BOM");
          return UTF16UTF8(s, len);
        case 0x03:
          Serial.println("Encoding is UTF-8");
          break;
          //          return String(s);
      }
    
    
      return String(s);
    }
    
    
    
    
    void id3Read()
    {
      Serial.println("Reading ID3 header");
      //Try to skip or parse ID3 Tags - fill buffer
      bool ID3present = false;
      auto numTagstoFind = 2;
    
    
    
    
      ID3TAG id3;
      auto lenRead = 0;
      lenRead += file.readBytes((char*)&id3, sizeof(id3));
    
    
      if (strncmp (id3.id, "ID3", 3) != 0) {
        Serial.println("No ID3 header found");
        return; //Identifier not found
      }
    
    
      //size is a "syncsafe" integer
      id3.size = id3_unsyncsafe(id3.size) + 10;
    
    
      // log_e("ID3 size: %u", id3.size);
      if (id3.version[0] < 2 || id3.version[0] == 0xff || id3.version[1] == 0xff || id3.flags_zero != 0) {
        Serial.println("Abort!");
        return;
      }
    
    
      //    log_d("ID3v%u.%u found", (unsigned)id3.version[0], (unsigned)id3.version[1] );
    
    
      //TODO: Handle ID3V2
      if (id3.version[0] == 2 || id3.version[0] > 4 ) {
        //        log_w("Can't handle this ID3 version");
        Serial.println("Can't handle this ID3 version");
        goto end;
      }
    
    
    
    
      if (id3.flag_extendedheader) {
        //        log_v("Has extended header. Skipping.");
        Serial.println("Has extended header. Skipping.");
        ID3TAG_EXTHEADER extheader;
        lenRead += file.readBytes((char*)&extheader, sizeof(extheader));
        extheader.size = id3_unsyncsafe(extheader.size);
        if (extheader.size < 6) goto end; //error
        lenRead += extheader.size;
      }
    
    
    
    
      //ID3 Frames:
      file.seek(lenRead);
      Serial.println("Searching for frames");
      do {
    
    
        ID3FRAME frameheader;
        lenRead += file.readBytes((char*)&frameheader, sizeof(frameheader));
        frameheader.size = id3_unsyncsafe(frameheader.size);
    
    
        if (frameheader.id[0] == 0 || frameheader.id[1] == 0  || frameheader.id[2] == 0 ) {
          Serial.println("Blank data, break");
          break;
        }
    
    
        if (lenRead + frameheader.size >= id3.size) {
          Serial.println("End of ID3, break");
          break;
        }
    
    
        Serial.printf("ID3 Frame: %c%c%c%c size:%u\n", frameheader.id[0], frameheader.id[1], frameheader.id[2] , frameheader.id[3], frameheader.size);
    
    
        if ( strncmp("TIT2", frameheader.id, 4) == 0) {
          Serial.print("Found TIT2 - ");
          title = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
        else if ( strncmp("TPE1", frameheader.id, 4) == 0) {
          Serial.print("Found TPE1 - ");
          artist = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
    
    
        if (numTagstoFind == 0) {
          Serial.println("Found all searched tags");
          break;
        }
    
    
        lenRead += frameheader.size;
        file.seek(lenRead);
      } while (1);
    
    
    end:
      //Skip to first mp3/aac frame:
      file.seek(id3.size);
    }
    ...which returns:
    Code:
    Opening file /MusicBee/Music/Deadmau5/5 years of mau5/1-04 - Some Chords.mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: TALB size:33
    ID3 Frame: TIT2 size:25
    Found TIT2 - Encoding is UTF-16 with BOM
    ID3 Frame: TPE1 size:19
    Found TPE1 - Encoding is UTF-16 with BOM
    Found all searched tags
    Title: Some Chords
    Artist: deadmau5
    Opening file /MusicBee/Music/Deadmau5/At Play 2/1-01 - Outta My Life (Touch Mix).mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: MCDI size:94
    ID3 Frame: TENC size:32
    ID3 Frame: TSSE size:23
    ID3 Frame: TXXX size:20
    ID3 Frame: TDAT size:5
    ID3 Frame: TXXX size:15
    ID3 Frame: TXXX size:17
    ID3 Frame: TXXX size:28
    ID3 Frame: TXXX size:20
    ID3 Frame: PRIV size:27
    ID3 Frame: TXXX size:19
    ID3 Frame: APIC size:32707
    End of ID3, break
    Title: 
    Artist: 
    Opening file /MusicBee/Music/Deadmau5/For Lack Of A Better Name/1-03 - Ghosts 'n Stuff (feat. Rob Swire).mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: UFID size:92
    ID3 Frame: TIT2 size:34
    Found TIT2 - Encoding is ISO-8859-1
    ID3 Frame: TALB size:26
    ID3 Frame: TCON size:11
    ID3 Frame: TRCK size:2
    ID3 Frame: TYER size:5
    ID3 Frame: TPE2 size:9
    ID3 Frame: TENC size:12
    ID3 Frame: TPOS size:4
    ID3 Frame: APIC size:4574
    End of ID3, break
    Title: Ghosts 'n Stuff (feat. Rob Swire)
    Artist: 
    Opening file /MusicBee/Music/Deadmau5/Album Title Goes Here/1-03 - The Veldt (feat. Chris James) [8 Minute Edit].mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: TALB size:53
    ID3 Frame: TPE1 size:19
    Found TPE1 - Encoding is UTF-16 with BOM
    ID3 Frame: TYER size:11
    ID3 Frame: TPOS size:5
    ID3 Frame: TCON size:13
    ID3 Frame: TIT2 size:93
    Found TIT2 - Encoding is UTF-16 with BOM
    Found all searched tags
    Title: The Veldt (feat. Chris James) [8 Minute Edit]
    Artist: Deadmau5
    ...despite all of these files having both the TPE1 and TIT2 fields. I'm very short of ideas; all I can think of is that we're somehow skipping over some tags.
    I've been trying to wrap my head around bitwise operations etc., but I'm still very much out of my depth.

    Files for reference (includes file from previous post; may not stay here forever): https://drive.google.com/drive/folde...Hf?usp=sharing

  15. #15
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,756
    Looks like it's something with embedded pictures - or, more exactly, with "unsync".
    Have to look closer....
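
    For anyone checking the same suspicion: the "unsync" flag is bit 7 of the flags byte (byte 5) in the 10-byte tag header. A minimal sketch of testing the flags with plain masks follows; the constant and function names are made up for illustration and are not part of the code posted in this thread.

```cpp
#include <cstdint>

// Bit masks for the flags byte (byte 5) of the 10-byte ID3v2 tag header.
// Plain masks sidestep compiler-specific bit-field ordering.
// All names here are hypothetical, chosen for this example only.
constexpr uint8_t ID3_FLAG_UNSYNC       = 0x80; // bit 7: unsynchronisation
constexpr uint8_t ID3_FLAG_EXTHEADER    = 0x40; // bit 6: extended header follows
constexpr uint8_t ID3_FLAG_EXPERIMENTAL = 0x20; // bit 5: experimental tag
constexpr uint8_t ID3_FLAG_FOOTER       = 0x10; // bit 4: footer present (ID3v2.4)
constexpr uint8_t ID3_FLAG_RESERVED     = 0x0f; // bits 3..0 must be zero

constexpr bool id3_usesUnsync(uint8_t flags)
{
  return (flags & ID3_FLAG_UNSYNC) != 0;
}

constexpr bool id3_hasExtendedHeader(uint8_t flags)
{
  return (flags & ID3_FLAG_EXTHEADER) != 0;
}

constexpr bool id3_flagsValid(uint8_t flags)
{
  return (flags & ID3_FLAG_RESERVED) == 0;
}
```

    Printing the raw flags byte next to these three results for a failing file would show quickly whether unsynchronisation is involved at all.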

  16. #16
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,756
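    Looks like frame sizes are "syncsafe integers" in ID3V2.4 only - not in ID3V2.3, where they are plain big-endian 32-bit values. (The size in the 10-byte tag header is syncsafe in both versions.)

    https://id3.org/id3v2.3.0

    Hm.

    Easy fix:
    replace the calls to id3_unsyncsafe() by __builtin_bswap32() for everything except the tag header size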


    But it looks like I have to extend it for ID3V2.4 someday..
    Oh my... why isn't that backwards compatible?
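
    To make the difference concrete, here is a small sketch of both decodings working directly on the four size bytes. The function names are invented for this example. The numbers in the comments come from the logs in this thread: the APIC frame that was reported as 32707 bytes with syncsafe decoding is 131011 bytes when read as a plain integer.

```cpp
#include <cstdint>

// Plain big-endian 32-bit integer: ID3v2.3 frame sizes.
constexpr uint32_t id3_be32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
  return (uint32_t(b0) << 24) | (uint32_t(b1) << 16) |
         (uint32_t(b2) << 8)  |  uint32_t(b3);
}

// "Syncsafe" integer: only the low 7 bits of each byte carry data.
// Used for the tag header size, and for frame sizes in ID3v2.4.
constexpr uint32_t id3_syncsafe32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
  return (uint32_t(b0 & 0x7f) << 21) | (uint32_t(b1 & 0x7f) << 14) |
         (uint32_t(b2 & 0x7f) << 7)  |  uint32_t(b3 & 0x7f);
}

// Hypothetical helper: choose the frame-size decoding from the major
// version byte of the tag header (3 for ID3v2.3, 4 for ID3v2.4).
constexpr uint32_t id3_frameSize(uint8_t majorVersion,
                                 uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
  return majorVersion >= 4 ? id3_syncsafe32(b0, b1, b2, b3)
                           : id3_be32(b0, b1, b2, b3);
}

// Size bytes 00 01 FF C3:
//   plain:    0x0001FFC3                    = 131011
//   syncsafe: (0x01 << 14) | (0x7F << 7) | 0x43 = 32707
```

    The two decodings agree whenever every size byte is below 0x80, which is why small text frames looked fine and only the large picture frames went wrong.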

  17. #17
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,756
    Code:
    #include <SD.h>
    #define SDCARD_CS_PIN    BUILTIN_SDCARD
    
    
    
    
    SdFat sd;
    SdFile dir;
    File file;
    
    
    
    
    #include "SPI.h"
    
    
    String title;
    String artist;
    
    
    void setup() {
      // put your setup code here, to run once:
      Serial.begin(9600);
    
    
    
    
      if (!(SD.sdfs.begin(SdioConfig(FIFO_SDIO)))) {
        while (1) {
          Serial.println("Unable to access the SD card");
          delay(500);
        }
      }
    
    
      getTags("1-04 - Some Chords.mp3");
      getTags("1-01 - Outta My Life (Touch Mix).mp3");
      getTags("1-03 - Ghosts _n Stuff (feat. Rob Swire).mp3");
      getTags("1-03 - The Veldt (feat. Chris James) [8 Minute Edit].mp3");
    }
    
    
    void getTags(const char* itemToOpen) {
      title = ""; //clear tag strings
      artist = "";
    
    
      Serial.print("Opening file ");
      Serial.println(itemToOpen);
    
    
      file = SD.open(itemToOpen);
      if (!file) {
        Serial.println("File failed to open");
        return; //nothing to parse
      }
    
    
      id3Read(file);
      file.close();
    
    
      Serial.println();
      Serial.print("Title: ");
      Serial.println(title);
      Serial.print("Artist: ");
      Serial.println(artist);
      Serial.println();
      Serial.println();
    
    
    }
    
    
    
    
    void loop() {
    }
    
    
    
    
    String UTF16UTF8(const char* buf, const uint32_t len)
    {
      // converts unicode in UTF-8, buff contains the string to be converted up to len
      // range U+1 ... U+FFFF
    
    
    
    
      //if no BOM found, BE is default
      String out;
      out.reserve(len);
    
    
    
    
      //worst case: 3 UTF-8 bytes per 16-bit code unit, plus the terminator
      auto *tmpbuf = (uint8_t*)malloc(((len + 1) / 2) * 3 + 1);
    
    
      if (!tmpbuf)
        return String(); // out of memory;
    
    
    
    
      auto *t = tmpbuf;
    
    
    
    
      auto bitorder = true; //Default to BE: swap bytes on this little-endian CPU
      auto *p = (uint16_t*) buf;
      const auto *pe = (uint16_t*) &buf[len];
      auto code = *p;
    
    
    
    
      if (code == 0xFEFF) {
        bitorder = false;
        p++;
      }  // LSB/MSB
      else if (code == 0xFFFE) {
        bitorder = true;
        p++;
      }  // MSB/LSB
    
    
    
    
      while (p < pe) {
        code = *p++;
    
    
    
    
        if (bitorder == true)
          code = __builtin_bswap16(code);
    
    
    
    
        if (code < 0X80) {
          *t++ = code & 0xff;
        }
        else if (code < 0X800) {
          *t++ = ((code >> 6) | 0XC0);
          *t++ = ((code & 0X3F) | 0X80);
        }
        else {
          *t++ = ((code >> 12) | 0XE0);
          *t++ = (((code >> 6) & 0X3F) | 0X80);
          *t++ = ((code & 0X3F) | 0X80);
        }
      }
    
    
    
    
      *t = 0;
      out = (char*)tmpbuf;
      free(tmpbuf);
      return out;
    }
    
    
    
    
    // https://id3.org/id3v2.4.0-structure
    
    
    
    
    typedef struct
    {
      char id[3];      //"ID3"
      uint8_t version[2]; //04 00 Version 4
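      //GCC on little-endian ARM allocates bit-fields starting at bit 0,
      //so the fields are listed from bit 0 up to bit 7 of the flags byte
      uint8_t flags_zero: 4;
      uint8_t flag_footerpresent : 1;
      uint8_t flag_experimental : 1;
      uint8_t flag_extendedheader : 1;
      uint8_t flag_unsynchronization: 1;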
      uint32_t size;
    } __attribute__((packed)) ID3TAG;
    
    
    
    
    
    
    
    
    typedef struct
    {
      uint32_t size;
      uint8_t numFlagBytes;
      uint8_t flag_zero1: 1;
      uint8_t flag_update: 1;
      uint8_t flag_crc: 1;
      uint8_t flag_restrictions: 1;
      uint8_t flag_zero2: 4;
    } __attribute__((packed)) ID3TAG_EXTHEADER;
    
    
    
    
    
    
    
    
    typedef struct
    {
      char id[4];      //Tag ID
      uint32_t size;
      uint8_t flags[2];
    } __attribute__((packed)) ID3FRAME;
    
    
    
    
    
    
    /*
      id3_getString
      The first byte tells the encoding:
        $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
        $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
            strings in the same frame SHALL have the same byteorder.
            Terminated with $00 00.
        $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
            Terminated with $00 00.
        $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
    */
    static String id3_getString(File file, unsigned int len)
    {
      if (len == 0) return String(); //empty frame: no encoding byte to read
      if (len > 127) len = 127;
      char buf[len + 1];
      file.readBytes(buf, len);
      buf[len] = 0;
      const auto encoding = buf[0];
      auto *s = &buf[1];
      len--;
      switch (encoding) {
        case 0x00:
          Serial.println("Encoding is ISO-8859-1");
          break;
        //          return latin1UTF8(s);
        case 0x01:
          Serial.println("Encoding is UTF-16 with BOM");
          return UTF16UTF8(s, len);
        case 0x02:
          Serial.println("Encoding is UTF-16BE without BOM");
          return UTF16UTF8(s, len);
        case 0x03:
          Serial.println("Encoding is UTF-8"); //ID3V4 only
          break;
          //          return String(s);
      }
    
    
      return String(s);
    }
    
    
    
    
    
    
    
    
    void id3Read(File file)
    {
      Serial.println("Reading ID3 header");
      //Try to skip or parse ID3 Tags - fill buffer
      auto numTagstoFind = 2;
    
    
      ID3TAG id3;
      auto lenRead = 0;
      lenRead += file.readBytes((char*)&id3, sizeof(id3));
    
    
      if (strncmp (id3.id, "ID3", 3) != 0) {
        Serial.println("No ID3 header found");
        return; //Identifier not found
      }
    
    
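      //the tag size is a "syncsafe" integer (7 bits per byte) in v3 as well as v4
      id3.size = __builtin_bswap32(id3.size);
      id3.size = ((id3.size & 0x7f000000ul) >> 3) | ((id3.size & 0x007f0000ul) >> 2)
               | ((id3.size & 0x00007f00ul) >> 1) | (id3.size & 0x0000007ful);
      id3.size += 10;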
    
    
      
      Serial.printf("Found ID3v2.%u.%u\n", (unsigned)id3.version[0], (unsigned)id3.version[1] );
      
      if (id3.version[0] < 2 || id3.version[0] == 0xff || id3.version[1] == 0xff || id3.flags_zero != 0) {
        Serial.println("Invalid Data. Abort!");
        return;
      }
    
    
      //TODO: Handle ID3V2, V4
      if (id3.version[0] != 3) {
        Serial.println("Can't handle this ID3 version");
        goto end;
      }
    
    
      if (id3.flag_extendedheader) {
        Serial.println("Has extended header. Skipping.");
        ID3TAG_EXTHEADER extheader;
        lenRead += file.readBytes((char*)&extheader, sizeof(extheader));
        extheader.size = __builtin_bswap32(extheader.size);
        if (extheader.size < 6) goto end; //error
        lenRead += extheader.size;
      }
    
    
      //ID3 Frames:
      file.seek(lenRead);
      Serial.println("Searching for frames");
      do {
    
    
        ID3FRAME frameheader;
        lenRead += file.readBytes((char*)&frameheader, sizeof(frameheader));
        frameheader.size = __builtin_bswap32(frameheader.size);
    
    
        if (frameheader.id[0] == 0 || frameheader.id[1] == 0  || frameheader.id[2] == 0 ) {
          Serial.println("Blank data, break");
          break;
        }
    
    
        if (lenRead + frameheader.size > id3.size) { //a frame may end exactly at the end of the tag
          Serial.println("End of ID3, break");
          break;
        }
    
    
        Serial.printf("ID3 Frame: %c%c%c%c size:0x%4x\n", frameheader.id[0], frameheader.id[1], frameheader.id[2] , frameheader.id[3], frameheader.size);
    
    
        if ( strncmp("TIT2", frameheader.id, 4) == 0) {
          Serial.print("Found TIT2 - ");
          title = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
        else if ( strncmp("TPE1", frameheader.id, 4) == 0) {
          Serial.print("Found TPE1 - ");
          artist = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
    
    
        if (numTagstoFind == 0) {
          Serial.println("Found all searched tags");
          break;
        }
    
    
        lenRead += frameheader.size;
        file.seek(lenRead);
      } while (1);
    
    
    
    
    end:
      //Skip to first mp3/aac frame:
      file.seek(id3.size);
    }
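
    For reference, the size bookkeeping behind UTF16UTF8() above can be sketched as follows. Each 16-bit code unit in the range U+0001 to U+FFFF becomes one, two or three UTF-8 bytes, so the output can be up to one and a half times as long as the input. The two helper names are invented for this example, and surrogate pairs (code points above U+FFFF) are deliberately left out, as in the code above.

```cpp
#include <cstdint>

// Number of UTF-8 bytes needed for one 16-bit code unit (U+0000..U+FFFF).
// Surrogate pairs are not handled in this sketch.
constexpr unsigned utf8BytesFor(uint16_t code)
{
  return code < 0x80 ? 1u : (code < 0x800 ? 2u : 3u);
}

// Upper bound for the UTF-8 output buffer, terminator included,
// given the length of the UTF-16 input in bytes.
constexpr unsigned utf8WorstCase(unsigned utf16Bytes)
{
  return ((utf16Bytes + 1) / 2) * 3 + 1;
}
```

    The same byte counts apply to ISO-8859-1 text: bytes below 0x80 stay as one byte, and 0x80 to 0xFF need two bytes in UTF-8.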

  18. #18
    replace the calls to id3_unsyncsafe() by __builtin_bswap32()
    Absolutely correct, it works like a charm now!
    Code:
    #include <SD.h>
    #define SDCARD_CS_PIN    BUILTIN_SDCARD
    
    SdFat sd;
    SdFile dir;
    File file;
    
    #include "SPI.h"
    
    char itemToOpen[256];
    String title;
    String artist;
    String tracklen;
    
    
    void setup() {
      // put your setup code here, to run once:
      Serial.begin(9600);
    
    
      if (!(SD.sdfs.begin(SdioConfig(FIFO_SDIO)))) {
        while (1) {
          Serial.println("Unable to access the SD card");
          delay(500);
        }
      }
      
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/5 years of mau5/1-04 - Some Chords.mp3");
      getTags();
    
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/At Play 2/1-01 - Outta My Life (Touch Mix).mp3");
      getTags();
    
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/For Lack Of A Better Name/1-03 - Ghosts 'n Stuff (feat. Rob Swire).mp3");
      getTags();
      
      strcpy(itemToOpen, "/MusicBee/Music/Deadmau5/Album Title Goes Here/1-03 - The Veldt (feat. Chris James) [8 Minute Edit].mp3");
      getTags();
    }
    
    void getTags(){
      title = ""; //clear tag strings
      artist = "";
      tracklen = ""; //otherwise a file without TLEN prints the previous file's value
      
      Serial.print("Opening file ");
      Serial.println(itemToOpen);
    
      file = SD.open(itemToOpen);
      if (!file) {
        Serial.println("File failed to open");
        return; //nothing to parse
      }
    
      id3Read();
      file.close();
    
      Serial.print("Title: ");
      Serial.println(title);
      Serial.print("Artist: ");
      Serial.println(artist);
      Serial.print("Track length (ms): ");
      Serial.println(tracklen);
    
    }
    
    
    void loop() {
    
    }
    
    
    String UTF16UTF8(const char* buf, const uint32_t len)
    {
      // converts unicode in UTF-8, buff contains the string to be converted up to len
      // range U+1 ... U+FFFF
    
    
      //if no BOM found, BE is default
      String out;
      out.reserve(len);
    
    
      //worst case: 3 UTF-8 bytes per 16-bit code unit, plus the terminator
      auto *tmpbuf = (uint8_t*)malloc(((len + 1) / 2) * 3 + 1);
      
      if (!tmpbuf)
        return String(); // out of memory;
    
    
      auto *t = tmpbuf;
    
    
      auto bitorder = true; //Default to BE: swap bytes on this little-endian CPU
      auto *p = (uint16_t*) buf;
      const auto *pe = (uint16_t*) &buf[len];
      auto code = *p;
    
    
      if (code == 0xFEFF) {
        bitorder = false;
        p++;
      }  // LSB/MSB
      else if (code == 0xFFFE) {
        bitorder = true;
        p++;
      }  // MSB/LSB
    
    
      while (p < pe) {
        code = *p++;
    
    
        if (bitorder == true)
          code = __builtin_bswap16(code);
    
    
        if (code < 0X80) {
          *t++ = code & 0xff;
        }
        else if (code < 0X800) {
          *t++ = ((code >> 6) | 0XC0);
          *t++ = ((code & 0X3F) | 0X80);
        }
        else {
          *t++ = ((code >> 12) | 0XE0);
          *t++ = (((code >> 6) & 0X3F) | 0X80);
          *t++ = ((code & 0X3F) | 0X80);
        }
      }
    
    
      *t = 0;
      out = (char*)tmpbuf;
      free(tmpbuf);
      return out;
    }
    
    
    // https://id3.org/id3v2.4.0-structure
    
    
    typedef struct
    {
      char id[3];      //"ID3"
      uint8_t version[2]; //04 00 Version 4
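      //GCC on little-endian ARM allocates bit-fields starting at bit 0,
      //so the fields are listed from bit 0 up to bit 7 of the flags byte
      uint8_t flags_zero: 4;
      uint8_t flag_footerpresent : 1;
      uint8_t flag_experimental : 1;
      uint8_t flag_extendedheader : 1;
      uint8_t flag_unsynchronization: 1;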
      uint32_t size;
    } __attribute__((packed)) ID3TAG;
    
    
    
    
    typedef struct
    {
      uint32_t size;
      uint8_t numFlagBytes;
      uint8_t flag_zero1: 1;
      uint8_t flag_update: 1;
      uint8_t flag_crc: 1;
      uint8_t flag_restrictions: 1;
      uint8_t flag_zero2: 4;
    } __attribute__((packed)) ID3TAG_EXTHEADER;
    
    
    
    
    typedef struct
    {
      char id[4];      //Tag ID
      uint32_t size;
      uint8_t flags[2];   //Several bits.. not needed here
    } __attribute__((packed)) ID3FRAME;
    
    
    
    //for ID3V2.4 only: there the frame sizes are syncsafe too.
    //if using ID3V2.4, replace the calls to __builtin_bswap32() on the frame sizes in id3Read() with calls to this function
    /*static uint32_t id3_unsyncsafe(uint32_t data)
    {
      data = __builtin_bswap32(data);
    
    
      auto out = 0;
      auto mask = 0x7f000000ul;
    
    
      do {
        out >>= 1;
        out |= data & mask;
        mask >>= 8;
      } while (mask);
      
      return out;
    }
    */
    
    /*
      id3_getString
      The first byte tells the encoding:
        $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
        $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
            strings in the same frame SHALL have the same byteorder.
            Terminated with $00 00.
        $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
            Terminated with $00 00.
        $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
    */
    static String id3_getString(File file, unsigned int len)
    {
      if (len == 0) return String(); //empty frame: no encoding byte to read
      if (len > 127) len = 127;
      char buf[len + 1];
      file.readBytes(buf, len);
      buf[len] = 0;
      const auto encoding = buf[0];
      auto *s = &buf[1];
      len--;
      switch (encoding) {
        case 0x00:
          Serial.println("Encoding is ISO-8859-1");
          break;
        //          return latin1UTF8(s);
        case 0x01:
          Serial.println("Encoding is UTF-16 with BOM");
          return UTF16UTF8(s, len);
        case 0x02:
          Serial.println("Encoding is UTF-16BE without BOM");
          return UTF16UTF8(s, len);
        case 0x03:
          Serial.println("Encoding is UTF-8");
          break;
          //          return String(s);
      }
    
    
      return String(s);
    }
    
    
    void id3Read()
    {
      Serial.println("Reading ID3 header");
      //Try to skip or parse ID3 Tags - fill buffer
      bool ID3present = false;
      auto numTagstoFind = 3;
    
    
    
    
      ID3TAG id3;
      auto lenRead = 0;
      lenRead += file.readBytes((char*)&id3, sizeof(id3));
    
    
      if (strncmp (id3.id, "ID3", 3) != 0) {
        Serial.println("No ID3 header found");
        return; //Identifier not found
      }
    
    
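      //the tag size is a "syncsafe" integer (7 bits per byte) in v3 as well as v4
      id3.size = __builtin_bswap32(id3.size);
      id3.size = ((id3.size & 0x7f000000ul) >> 3) | ((id3.size & 0x007f0000ul) >> 2)
               | ((id3.size & 0x00007f00ul) >> 1) | (id3.size & 0x0000007ful);
      id3.size += 10;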
    
    
      // log_e("ID3 size: %u", id3.size);
      if (id3.version[0] < 2 || id3.version[0] == 0xff || id3.version[1] == 0xff || id3.flags_zero != 0) {
        Serial.println("Abort!");
        return;
      }
    
    
      //    log_d("ID3v%u.%u found", (unsigned)id3.version[0], (unsigned)id3.version[1] );
    
    
      //TODO: Handle ID3V2
      if (id3.version[0] == 2 || id3.version[0] > 4 ) {
        //        log_w("Can't handle this ID3 version");
        Serial.println("Can't handle this ID3 version");
        goto end;
      }
    
    
      if (id3.flag_extendedheader) {
        //        log_v("Has extended header. Skipping.");
        Serial.println("Has extended header. Skipping.");
        ID3TAG_EXTHEADER extheader;
        lenRead += file.readBytes((char*)&extheader, sizeof(extheader));
        extheader.size = __builtin_bswap32(extheader.size);
        if (extheader.size < 6) goto end; //error
        lenRead += extheader.size;
      }
    
    
    
      //ID3 Frames:
      file.seek(lenRead);
      Serial.println("Searching for frames");
      do {
    
    
        ID3FRAME frameheader;
        lenRead += file.readBytes((char*)&frameheader, sizeof(frameheader));
        frameheader.size = __builtin_bswap32(frameheader.size);
    
    
        if (frameheader.id[0] == 0 || frameheader.id[1] == 0  || frameheader.id[2] == 0 ) {
          Serial.println("Blank data, break");
          break;
        }
    
    
        if (lenRead + frameheader.size > id3.size) { //a frame may end exactly at the end of the tag
          Serial.println("End of ID3, break");
          break;
        }
    
    
        Serial.printf("ID3 Frame: %c%c%c%c size:%u\n", frameheader.id[0], frameheader.id[1], frameheader.id[2] , frameheader.id[3], frameheader.size);
    
    
        if ( strncmp("TIT2", frameheader.id, 4) == 0) {
          Serial.print("Found TIT2 - ");
          title = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
        else if ( strncmp("TPE1", frameheader.id, 4) == 0) {
          Serial.print("Found TPE1 - ");
          artist = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
        else if ( strncmp("TLEN", frameheader.id, 4) == 0) {
          Serial.print("Found TLEN - ");
          tracklen = id3_getString(file, frameheader.size);
          numTagstoFind--;
        }
    
    
        if (numTagstoFind == 0) {
          Serial.println("Found all searched tags");
          break;
        }
    
    
        lenRead += frameheader.size;
        file.seek(lenRead);
      } while (1);
    
    
    end:
      //Skip to first mp3/aac frame:
      file.seek(id3.size);
    }
    Now returns as expected:
    Code:
    Opening file /MusicBee/Music/Deadmau5/5 years of mau5/1-04 - Some Chords.mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: TALB size:33
    ID3 Frame: TIT2 size:25
    Found TIT2 - Encoding is UTF-16 with BOM
    ID3 Frame: TPE1 size:19
    Found TPE1 - Encoding is UTF-16 with BOM
    ID3 Frame: TPE2 size:19
    ID3 Frame: TRCK size:11
    ID3 Frame: TPOS size:9
    ID3 Frame: APIC size:269246
    ID3 Frame: TYER size:11
    ID3 Frame: TCOP size:67
    ID3 Frame: PRIV size:984
    ID3 Frame: TCON size:15
    ID3 Frame: COMM size:10
    Blank data, break
    Title: Some Chords
    Artist: deadmau5
    Track length (ms): 
    Opening file /MusicBee/Music/Deadmau5/At Play 2/1-01 - Outta My Life (Touch Mix).mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: MCDI size:94
    ID3 Frame: TENC size:32
    ID3 Frame: TSSE size:23
    ID3 Frame: TXXX size:20
    ID3 Frame: TDAT size:5
    ID3 Frame: TXXX size:15
    ID3 Frame: TXXX size:17
    ID3 Frame: TXXX size:28
    ID3 Frame: TXXX size:20
    ID3 Frame: PRIV size:27
    ID3 Frame: TXXX size:19
    ID3 Frame: APIC size:131011
    ID3 Frame: COMM size:95
    ID3 Frame: TIT2 size:26
    Found TIT2 - Encoding is ISO-8859-1
    ID3 Frame: TYER size:5
    ID3 Frame: TIT1 size:10
    ID3 Frame: PRIV size:39
    ID3 Frame: PRIV size:41
    ID3 Frame: PRIV size:31
    ID3 Frame: PRIV size:138
    ID3 Frame: TLAN size:4
    ID3 Frame: TPUB size:13
    ID3 Frame: TCON size:18
    ID3 Frame: TALB size:10
    ID3 Frame: TPE2 size:9
    ID3 Frame: PRIV size:34
    ID3 Frame: PRIV size:39
    ID3 Frame: PRIV size:20
    ID3 Frame: TPOS size:4
    ID3 Frame: TRCK size:2
    ID3 Frame: TCOM size:34
    ID3 Frame: TPE1 size:9
    Found TPE1 - Encoding is ISO-8859-1
    ID3 Frame: TLEN size:7
    Found TLEN - Encoding is ISO-8859-1
    Found all searched tags
    Title: Outta My Life (Touch Mix)
    Artist: Deadmau5
    Track length (ms): 368120
    Opening file /MusicBee/Music/Deadmau5/For Lack Of A Better Name/1-03 - Ghosts 'n Stuff (feat. Rob Swire).mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: UFID size:92
    ID3 Frame: TIT2 size:34
    Found TIT2 - Encoding is ISO-8859-1
    ID3 Frame: TALB size:26
    ID3 Frame: TCON size:11
    ID3 Frame: TRCK size:2
    ID3 Frame: TYER size:5
    ID3 Frame: TPE2 size:9
    ID3 Frame: TENC size:12
    ID3 Frame: TPOS size:4
    ID3 Frame: APIC size:9054
    ID3 Frame: PRIV size:41
    ID3 Frame: PRIV size:39
    ID3 Frame: PRIV size:20
    ID3 Frame: PRIV size:31
    ID3 Frame: PRIV size:34
    ID3 Frame: PRIV size:39
    ID3 Frame: TPUB size:7
    ID3 Frame: PRIV size:138
    ID3 Frame: TCOM size:15
    ID3 Frame: TPE1 size:9
    Found TPE1 - Encoding is ISO-8859-1
    ID3 Frame: COMM size:36
    Blank data, break
    Title: Ghosts 'n Stuff (feat. Rob Swire)
    Artist: Deadmau5
    Track length (ms): 368120
    Opening file /MusicBee/Music/Deadmau5/Album Title Goes Here/1-03 - The Veldt (feat. Chris James) [8 Minute Edit].mp3
    Reading ID3 header
    Searching for frames
    ID3 Frame: TALB size:53
    ID3 Frame: TPE1 size:19
    Found TPE1 - Encoding is UTF-16 with BOM
    ID3 Frame: TYER size:11
    ID3 Frame: TPOS size:5
    ID3 Frame: TCON size:13
    ID3 Frame: TIT2 size:93
    Found TIT2 - Encoding is UTF-16 with BOM
    ID3 Frame: TXXX size:27
    ID3 Frame: TXXX size:31
    ID3 Frame: TRCK size:7
    ID3 Frame: TPE2 size:19
    Blank data, break
    Title: The Veldt (feat. Chris James) [8 Minute Edit]
    Artist: Deadmau5
    Track length (ms): 368120
    And as you can see I've added an extra field to search for (TLEN - track length in milliseconds), and it works perfectly first time. You're an absolute legend Frank, thank you very much!
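
    Since TLEN arrives as a string of decimal digits, a player display will probably want it as minutes and seconds. A rough sketch of that conversion follows; the function names are made up for this example, and it assumes the string holds only ASCII digits, as in the log above.

```cpp
#include <cstdint>

// Parse the leading decimal digits of a TLEN string into milliseconds.
// Stops at the first non-digit; an empty string gives 0.
constexpr uint32_t tlenToMillis(const char* s)
{
  uint32_t ms = 0;
  while (*s >= '0' && *s <= '9') {
    ms = ms * 10 + uint32_t(*s - '0');
    ++s;
  }
  return ms;
}

// Whole minutes of the track length.
constexpr uint32_t tlenMinutes(const char* s)
{
  return tlenToMillis(s) / 60000;
}

// Remaining whole seconds after the minutes have been taken out.
constexpr uint32_t tlenSeconds(const char* s)
{
  return (tlenToMillis(s) / 1000) % 60;
}
```

    With the Arduino String used above this could be called as tlenMinutes(tracklen.c_str()); for "368120" it gives 6 minutes and 8 seconds.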

    why isn't that backwards compatible?
    Yes it's rather a pain. Perhaps this explains why the ID3 homepage says that ID3V4 "...has not achieved popular status due to some disagreements on some of the revisions..."

    It's also just come to my attention that ID3 is only used for MP3s, and that the M4A standard is a whole other Apple-masterminded issue...I think that's a headache for another day.

  19. #19
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,756
    I also don't see any real advantage of ID3V2.4
    It adds another encoding (who needs that? UTF16 can handle everything), and unneeded stuff (unsync).


    ID3 is also used for *.aac.
    M4A is horrible. I've seen files that have several kilobytes of M4A header just for the tags (w/o any embedded pictures!)
    That this comes from Apple seems unreal to me...
    Wherever possible, for microcontrollers, I'd convert that to .aac.

  20. #20
    Yes it seems like ID3V2.4 is the update nobody really asked for.

    M4A is horrible...Wherever possible, for microcontrollers, I'd convert that to .aac
    This is what I'm finding too... Great call on using *.aac instead - I'm syncing files using MusicBee so hopefully I can get some automatic conversion done on export. That would save many headaches!

  21. #21
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,756
    I guess ffmpeg can do this.

  22. #22
    Unfortunately MusicBee won't handle non-standard tags for AAC files so that puts paid to that idea, even if it is possible to shoe-horn ID3 tags into them: https://getmusicbee.com/forum/index....7564#msg197564
    Worth a try though!
    Last edited by thecomfychair; 01-25-2022 at 06:57 AM. Reason: clarity

  23. #23
    Senior Member
    Join Date
    Apr 2014
    Location
    -
    Posts
    9,756
    Yes, often "APE" is used for aac. It stores the information at the end of the file, like ID3v1. However, when I played with this many years ago, I had some aac files with ID3 tags.
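
    A sketch of how tags at the end of a file can be recognised. An ID3v1 tag is the last 128 bytes of the file and starts with "TAG"; an APEv2 footer is a 32-byte block that starts with "APETAGEX". The function and constant names are invented for this example, and reading the bytes from the SD card is left out.

```cpp
#include <cstddef>
#include <cstdint>

// ID3v1: fixed 128-byte block at the very end of the file.
constexpr std::size_t ID3V1_TAG_SIZE      = 128;
constexpr std::size_t ID3V1_TITLE_OFFSET  = 3;   // 30 bytes
constexpr std::size_t ID3V1_ARTIST_OFFSET = 33;  // 30 bytes
constexpr std::size_t ID3V1_ALBUM_OFFSET  = 63;  // 30 bytes
constexpr std::size_t ID3V1_FIELD_LEN     = 30;

// True if the block read from (file size - 128) starts with "TAG".
constexpr bool id3v1_present(const char* last128)
{
  return last128[0] == 'T' && last128[1] == 'A' && last128[2] == 'G';
}

// True if a 32-byte block starts with the APEv2 preamble "APETAGEX".
constexpr bool apev2_present(const char* block)
{
  const char* magic = "APETAGEX";
  for (std::size_t i = 0; i < 8; ++i) {
    if (block[i] != magic[i]) return false;
  }
  return true;
}
```

    On the Teensy the block would presumably be fetched by seeking to file.size() - 128 and reading 128 bytes. ID3v1 text fields are fixed width and not necessarily zero-terminated, so they need to be copied out with an explicit length.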



    So, it's for a music player...

    I'd probably try to port and use taglib, yes.
