Forum Rule: Always post complete source code & details to reproduce any issue!
Page 2 of 4 FirstFirst 1 2 3 4 LastLast
Results 26 to 50 of 87

Thread: Generic data logger object

  1. #26
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    @mborgerson - Looks good - with edits noted in p#24
    Overnight SD card filled - datalogger lib stops near end of Disk - again using 15.4MB PSRAM:
    Code:
    Warning, all data on the card will be erased.
    Enter 'Y' to continue: 
    Card size:   29.723  GiB (GiB = 2^30 bytes)
    Card will be formated exFAT
    Writing FAT .................................
    Writing upcase table
    Writing root
    Format done
    
    Card Type is exFAT
    File system initialization done.
    RTC has set the system time
    
    Starting Logger.
    
    File name is:  <LOG_0313.bin>
    Pre-allocating 1GB file space 
    Allocation succeeded.
    
    Opened  File LOG_0313.bin
     
       4.345,   Records:   1086250   Overflows:    0
       9.480,   Records:   2370000   Overflows:    0
      14.220,   Records:   3555000   Overflows:    0
    …
    7963.990,   Records:1990997500   Overflows:    0
    7969.520,   Records:1992380000   Overflows:    0
    Halting logging.  Getting near end of SD Card.
    
    Stopping Logger.
    Logger Status:
    
    MBytes Written: 30410.35   SdCard free space:     9.78 MBytes
    Collection time: 7972 seconds
    Max Collection delay: 1 microseconds
    Average Write Time: 285.946 milliseconds
    Maximum Write Time: 809.916 milliseconds
    
    Verifying last file
    
    Recnum: 1000000
    Recnum: 2000000
    …
    Recnum: 915000000
    Recnum: 916000000
    …
    Recnum: 1013000000
    Recnum: 1014000000
    …  
    Recnum: 1989000000
    Recnum: 1990000000
    Recnum: 1991000000
    Recnum: 1992000000
    End of file playback.
    
    Verification errors:  0
    Verification complete
    

  2. #27
    Senior Member
    Join Date
    Oct 2012
    Location
    Portland OR
    Posts
    689

    DataLogger.h file capitalization seems to matter, on Linux at least

    Just getting started trying out this interesting project.
    My first issue: provided examples fail to compile because #include <Datalogger.h> file is not found.

    Reason: capitalization. Actual file is named DataLogger.h with a capital "L"
    I'm running this on Ubuntu; maybe Windows ignores capitalization in filenames?

    Next step: realized Fast_LoggerT4.1 may assume the additional 8MB PSRAM chip which I don't have, so I tried Analog4_LoggerTest
    this seems to work, but the status reports some overflows, and a nearly 2 second write time. Is that expected? I am using T4.1 with an empty SanDisk 32GB card.

    Code:
    Starting Logger.
    File name is:  <LOG_1526.csv>
    Pre-allocating 1GB file space 
    Allocation succeeded.
    Opened  File LOG_1526.csv
     
    1589729194,   4960, 4063, 4068, 4067, 4068
    1589729198,   9030, 4066, 4069, 4067, 4072
    1589729203,  14059, 4064, 4067, 4066, 4065
    1589729208,  19088, 4070, 4062, 4068, 4067
    1589729213,  24717, 4060, 4068, 4066, 4064
    1589729218,  29747, 4066, 4070, 4071, 4067
    1589729224,  34776, 4068, 4069, 4074, 4071
    Stopping Logger.
    
    Logger Status:
    Collection time: 40 seconds  MBytes Written:     4.35
    Max chunks ready: 15 Overflows: 14 Max Collect delay: 85 uSecs
    Avg Write Time: 18.149 mSec   Max Write Time: 1800.530 mSec
    Ok, tried again this time first doing the in-program format instead of the pre-existing format done by the SD Card Assoc. formatting program which I ran on Win10.
    This time, the max write time looks a whole lot better:

    Code:
    Starting Logger.
    File name is:  <LOG_1533.csv>
    Pre-allocating 1GB file space 
    Allocation succeeded.
    Opened  File LOG_1533.csv
     
    1589729643,   4960, 4067, 4068, 4068, 4066
    1589729648,   9904, 4066, 4069, 4064, 4068
    1589729653,  14922, 4063, 4067, 4062, 4066
    1589729658,  19962, 4065, 4066, 4065, 4067
    1589729663,  25002, 4067, 4064, 4066, 4065
    1589729668,  30013, 4065, 4067, 4071, 4070
    1589729673,  34933, 4066, 4069, 4066, 4063
    1589729678,  39999, 4067, 4068, 4069, 4063
    1589729683,  44998, 4063, 4069, 4069, 4066
    1589729688,  49998, 4068, 4072, 4071, 4069
    Stopping Logger.
    
    Logger Status:
    Collection time: 53 seconds  MBytes Written:     6.71
    Max chunks ready: 3 Overflows: 0 Max Collect delay: 85 uSecs
    Avg Write Time: 10.584 mSec   Max Write Time: 26.510 mSec
    Last edited by JBeale; 05-17-2020 at 10:39 PM. Reason: update after in-program format

  3. #28
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Quote Originally Posted by JBeale View Post
    Just getting started trying out this interesting project.
    My first issue: provided example "Fast_LoggerT4_1" fails to compile because #include <Datalogger.h> file is not found.

    Reason: capitalization. Actual file is named DataLogger.h with a capital "L"
    I'm running this on Ubuntu; maybe Windows ignores capitalization in filenames?

    Next step: realized Fast_LoggerT4.1 may assume the additional 8MB PSRAM chip which I don't have, so I tried Analog4_LoggerTest
    this seems to work, but the status reports some overflows, and a nearly 2 second write time. Is that expected? I am using T4.1 with an empty SanDisk 32GB card.

    Code:
    Starting Logger.
    File name is:  <LOG_1526.csv>
    Pre-allocating 1GB file space 
    Allocation succeeded.
    Opened  File LOG_1526.csv
     
    1589729194,   4960, 4063, 4068, 4067, 4068
    1589729198,   9030, 4066, 4069, 4067, 4072
    1589729203,  14059, 4064, 4067, 4066, 4065
    1589729208,  19088, 4070, 4062, 4068, 4067
    1589729213,  24717, 4060, 4068, 4066, 4064
    1589729218,  29747, 4066, 4070, 4071, 4067
    1589729224,  34776, 4068, 4069, 4074, 4071
    Stopping Logger.
    
    Logger Status:
    Collection time: 40 seconds  MBytes Written:     4.35
    Max chunks ready: 15 Overflows: 14 Max Collect delay: 85 uSecs
    Avg Write Time: 18.149 mSec   Max Write Time: 1800.530 mSec
    > Yes WINDOWS cares not for CASE - unlike LINUX. To be usable that needs to be consistent.
    > Indeed Fast_LoggerT4.1 is special cased to assume PSRAM
    -- > I just extended that for a 33 Byte Packed struct and found there is some integrity issue I need to find after 15M records :: Teensy-4-1-Beta-Test
    -- > As noted there the packed struct is more abusive and also the added fields can be checked as KNOWN in Verify for integrity … now they fail on ReadBack and I need to see why ????
    > Something very UGLY/SLOW with your SD card :: Max Write Time: 1800.530 mSec - this may be causing logging to bump into writing SD and cause the Overflow.
    -- > my SD writes will be in excess of 1MB each and the above "Maximum Write Time: 809.916 milliseconds" is the highest I have seen it.
    -- > Do a PC Reformat of the card perhaps - or it may be counterfeit or dying and not running up to spec? Change the setup to allow longer buffer time is what I'm using to help FILL PSRAM here of 16 MB:: #define BUFFERMSEC 3600

    Last run shows:
    Code:
    Logger Status:
    
    MBytes Written:   791.26   SdCard free space: 29517.43 MBytes
    Collection time: 205 seconds
    Max Collection delay: 9 microseconds
    Average Write Time: 261.248 milliseconds
    Maximum Write Time: 612.549 milliseconds

  4. #29
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    111
    Thanks for the notes.

    1. I'll definitely check for proper spelling of file names. Windows HAS made me a bit lazy in that respect.

    2. A 33-byte packed structure??? I've always avoided structures with odd lengths since back in the early ARM days when accessing unaligned data had a speed penalty at best and outright failure at worst. I suspect that the problem pops up on replay as the data records are read for playback one 33-byte record at a time. While recording, the records are probably grouped into a chunk that is, at worst, a multiple of four bytes. That possibility is something for me to check out, as there are still some applications out there with odd structure lengths.

    3. One thing to note on the in-program format option: I think it is set up to program all size cards as ExFAT. You can do that with the SDFat 2.0 formatter. ExFaT is great for large data logger files, but wastes a lot of disk space with large cluster sizes if you are recording a lot of small files. In some recent tests, I noted very long write times when getting to the last 5% of the free space on a card. That's another thing I need to investigate. On a 2GB card formatted as FAT32, I did see write times over 600mSec on a regular basis even when not near the end of the card.


    4. There is a hidden sanity check in the buffer initialization code that limits the buffering time to 4095mSec. I will make that a #define MAXMSECLEN 4096
    and group all those sanity check defines in one area for easier maintenance.

  5. #30
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Yes, packed 33 byte struct on purpose to see if it would FAULT the processor or fail - it does not - it does seem to take extra time in early test pre-library, but that is okay just needed to know it worked. Did the same in early beta test of PSRAM - started the PSRAM pointer at 0x???001 and then did either BYTE or DWORD write and read compare of the rest of the chip with no problem.

    It fails similarly when packed is removed and the 33 byte struct gets padded. The weird thing is the FAIL can happen after recnum 4,000,000 or much higher - and all tests well up to there:
    Code:
    //unpacked
    Recnum: 33000000
    Record number 2 error at 33233123 [*>0x70DC57EC]  File: 33665123  local: 33233123
    Code:
    //unpacked
    Recnum: 28000000
    End of file playback.
    Verification errors:  0
    Code:
    //packed
    Recnum: 15000000
    Record number error at 15857784  File: 15857784  local: 15425784
    In Code Formatter is handy - not having to remove the SD to resume testing.

    > It would be handy for the 'Dir List' to show Free Space.

    ...

  6. #31
    Senior Member
    Join Date
    Jul 2014
    Posts
    2,665
    Quote Originally Posted by mborgerson View Post
    My long-term plan is to be able to have the logger become an MTP responder, so that you can upload the files from the Teensy as if it were an attached as an MTP USB disk. I've gotten the basic MTP test programs to work on the T3.6, but not the T4.That means that full MTP capability awaits my education and, probably a lot of work by other programmers working on the MTP functionality.
    That is what I have done for one of my application (as you are/were in the field, datalogger on Slocum glider, sorry its not a SeaGliderr).
    One Issue I faced:
    MTP keeps picture of disk in memory. So logging data does not automatically be made visible on PC.

    Could not get notification events working, so I settled on following solution:
    user disconnects MTP before starting data logging, connects again after stopping data logging
    when opening session, MTP responder clears and resets the local disc structure.

  7. #32
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    111
    Another thought just occurred to me: How does the PSRAM handle storage of 33-byte structures? Is there some mechanism unknown to me that doesn't cope well with odd-sized writes?

  8. #33
    Senior Member
    Join Date
    Oct 2012
    Location
    Portland OR
    Posts
    689
    Two possibly worthwhile changes I made to the basic logger. In the setup() part:
    Code:
      pinMode(A0, INPUT_DISABLE);  // 4 ADC inputs without digital "keeper" drive on pin
      pinMode(A1, INPUT_DISABLE);
      pinMode(A2, INPUT_DISABLE);
      pinMode(A3, INPUT_DISABLE);
    This made a BIG difference to sensor accuracy in my particular test, with C-bypassed but fairly high DC impedance inputs, nominally centered around Vdd/2 midpoint. I was really confused about it, until I remembered the Input Pin Keeper situation. Even though I think I was just posting about this in another thread a few days ago.

    In the GetStatus() function, I think it's fun to see the current system time, just to confirm it's correct.
    Code:
    void GetStatus(void) {
      TLoggerStat *tsp;
      float mbytes;
      char stime[25];       // string for YYYY-MM-DD HH:MM:SS system time
      time_t t;             // current C time, seconds since epoch
      tsp =  mydl.GetStatus();
    
      mbytes = tsp->byteswritten / (1024.0 * 1024.0);
      Serial.println("\nLogger Status:");
      t = now();
      sprintf(stime, "%04d-%02d-%02d %02d:%02d:%02d", year(t),month(t),day(t),hour(t),minute(t),second(t) );
      Serial.printf("Current time is: %s\n", stime);
     ...
    Code:
    Logger Status:
    Current time is: 2020-05-18 12:40:03
    Collection time: 554 seconds  MBytes Written:     2.27
    Max chunks ready: 1 Overflows: 0 Max Collect delay: 332 uSecs
    Avg Write Time:  1.778 mSec   Max Write Time: 91.201 mSec

  9. #34
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Quote Originally Posted by mborgerson View Post
    Another thought just occurred to me: How does the PSRAM handle storage of 33-byte structures? Is there some mechanism unknown to me that doesn't cope well with odd-sized writes?
    PSRAM I/O all handled by 1062 processor. It maps into memory space just like RAM for direct memory access. Odd boundary offset and lengths seem to test fine.

    I used this:
    Code:
    struct datrec {
      uint32_t microtime;    //  time in microseconds
      uint32_t millitime;   // millis() since collection start at time collection occurred
      uint32_t numrecords;  // number of structures collected
      uint8_t *cptr;        // pointer to last storage location to check out-of-bounds writes
      uint8_t aByte;
      uint32_t microtime2;    //  time in microseconds
      uint32_t millitime2;   // millis() since collection start at time collection occurred
      uint32_t numrecords2;  // number of structures collected
      uint32_t microtime3;    //  time in microseconds
    };
    //} __attribute__((packed));
    Then on myCollector() the values were just duplicated into the #2 and #3 itmes to compare to the primary stored values.
    Code:
    void myCollector(  void* vdp) {
      volatile struct datrec *dp;
      // Note: logger status keeps track of max collection time
      dp = (volatile struct datrec *)vdp;
      dp->millitime = filemillis; // save mSec wev'e been collecting
      dp->microtime = filemicros;
    
      dp->aByte = dp->microtime&0xff;
      dp->microtime2 = dp->microtime;
      dp->millitime2 = dp->millitime;
      dp->numrecords2 = numrecs;
      dp->microtime3 = dp->microtime;
    
      dp->numrecords = numrecs;
      dp->cptr = (uint8_t *)dp; // save the address at which data is being stored
      numrecs++;
    }
    Then during Verify I could assert that they read back the same from SD as they were written.
    Code:
      if (dp->aByte != (dp->microtime & 0xff)) {
      if (dp->numrecords2 != pbrecnum) {
      if ( pbmillitime != dp->millitime2  ) {
      if ( dp->microtime != dp->microtime2  ) {
      if ( dp->microtime != dp->microtime3 ) {
    And it works for some many Millions of records ... then triggers an error. For this test I need to see if these same tests pass in the buffer to be written to SD to find if they corrupt in PSRAM on write/read of the SD.

    And as noted if fails with or without being packed.

    I started printing the PSRAM address stored in dp->cptr - and it is not always the same address. I tested on a T_4.1 with 8 and 16 MB of PSRAM and both showed similar failures ????

    Need to make time to get back into the code. I could just make a CRC of the preceding bytes and check that - but this quickly showed checking multiple copies of the same changing number caught the failure.

  10. #35
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    111
    Errors that pop up just a few times in millions of records are hardest to troubleshoot.

    I can think of several reasons that things like this might happen:

    * Something strange in interrupt land that makes the timer irq handler take more than 4uSec (if you are still collecting at 250KSamples/second).
    * Read or write errors in either PSRAM or SDC. Those might be resolved by a test program that uses onboard static RAM to see if the same
    errors occur.

    Of course, there is the old standby: different Teensy hardware and new SD card. I always buy at least two of a new board when I intend to push things near the limits. (Most often the problem is letting out the magic smoke by clumsy probing with the O'scope or voltmeter).

    I got the T4.1 high-speed logger to go through a 16GB card without errors when I stopped at 95% of card capacity.


    I like the idea of adding free space available to the SD card directory output. I'll get that into the next version of the library---along with properly spelled file names in the examples. The analog logger example will also benefit from the "pinMode(A3, INPUT_DISABLE);" addition.

  11. #36
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    111
    Hmm, if the issue is something happening at interrupt time, then increasing the IntervalTimer priority might have an effect. I think I will also
    investigate the code for reading the elapsedMillis() and elapsedMicros() timers that are read inside the timer handler.

    I think I'll also add a command in the program to do a write and verify in sequence so that I can let it run overnight and check it in the morning.

  12. #37
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    I have two T_4.1's with their own SD and one 8MB PSRAM and the other 16MB

    Just put VERIFY on write during logging and there are no errors with first copy of code.

    Then quit logging and PlayBack Verify finds errors Here is the end of logging where @V is when the code below is run on the struct and the checks show the PSRAM buffer data is good. Then stopping logger and doing Verify triggers errors as shown:
    Code:
     V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  304.474,   Records:  38059200   Overflows:    0
     V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  V@  309.312,   Records:  38664000   Overflows:    0
     V@  V@  V@  V@  V@  V@ Stopping Logger.
    
    Verifying last file
    
    Recnum: 1000000
    Recnum: 2000000
    // GOOD
    Recnum: 15000000
    Record number error at 15529856  File: 15529856  local: 15097856
    millitime number error at 15529856  File: 124239  local: 120783
    millitime2 2 number error at 15529856 [*>0x70E12400]  File: 124239  local: 120783
    millitime number error at 15529857  File: 124239  local: 120783
    millitime2 2 number error at 15529857 [*>0x70E12424]  File: 124239  local: 120783
    millitime number error at 15529858  File: 124239  local: 120783
    millitime2 2 number error at 15529858 [*>0x70E12448]  File: 124239  local: 120783
    millitime number error at 15529859  File: 124239  local: 120783
    millitime2 2 number error at 15529859 [*>0x70E1246C]  File: 124239  local: 120783
    millitime number error at 15529860  File: 124239  local: 120783
    millitime2 2 number error at 15529860 [*>0x70E12490]  File: 124239  local: 120783
    millitime number error at 15529861  File: 124239  local: 120783
    millitime2 2 number error at 15529861 [*>0x70E124B4]  File: 124239  local: 120783
    millitime number error at 15529862  File: 124239  local: 120783
    millitime2 2 number error at 15529862 [*>0x70E124D8]  File: 124239  local: 120783
    millitime number error at 15529863  File: 124239  local: 120783
    millitime2 2 number error at 15529863 [*>0x70E124FC]  File: 124239  local: 120783
    millitime number error at 15529864  File: 124239  local: 120783
    millitime2 2 number error at 15529864 [*>0x70E12520]  File: 124239  local: 120783
    millitime number error at 15529865  File: 124239  local: 120783
    millitime2 2 number error at 15529865 [*>0x70E12544]  File: 124239  local: 120783
    millitime number error at 15529866  File: 124239  local: 120783
    millitime2 2 number error at 15529866 [*>0x70E12568]  File: 124239  local: 120783
    millitime number error at 15529867  File: 124239  local: 120783
    millitime2 2 number error at 15529867 [*>0x70E1258C]  File: 124239  local: 120783
    millitime number error at 15529868  File: 124239  local: 120783
    millitime2 2 number error at 15529868 [*>0x70E125B0]  File: 124239  local: 120783
    millitime number error at 15529869  File: 124239  local: 120783
    millitime2 2 number error at 15529869 [*>0x70E125D4]  File: 124239  local: 120783
    millitime number error at 15529870  File: 124239  local: 120783
    millitime2 2 number error at 15529870 [*>0x70E125F8]  File: 124239  local: 120783
    End of file playback.
    
    Verification errors:  31
    Verification complete
    Code:
    struct datrec {
      uint32_t microtime;    //  time in microseconds
      uint32_t millitime;   // millis() since collection start at time collection occurred
      uint32_t numrecords;  // number of structures collected
      uint8_t *cptr;        // pointer to last storage location to check out-of-bounds writes
      uint8_t aByte;
      uint32_t microtime2;    //  time in microseconds
      uint32_t millitime2;   // millis() since collection start at time collection occurred
      uint32_t numrecords2;  // number of structures collected
      uint32_t microtime3;    //  time in microseconds
    };
    //} __attribute__((packed));
    
    // write data chunks as binary data  writechunk has index
    // of first ready-to-write chunk
    void DataLogger::WriteBinaryChunks(uint16_t numready) {
      uint16_t i;
      uint32_t etime;
      float wravg;
    
      struct datrec *dp;
      uint32_t verifyerrors=0;
      Serial.printf(" V@ ");
    
      LEDON
      for (i = 0; i < numready; i++) {
        etime = micros();
        writePtr = chunkPtrs[writechunk];
    
      if (verifyerrors < 30) {
      dp = (struct datrec *)writePtr;
        if (dp->aByte != (dp->microtime & 0xff)) {
        Serial.printf("aByte error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->aByte, dp->microtime & 0xff);
        verifyerrors++;
      }
      if (dp->numrecords2 != dp->numrecords) {
        Serial.printf("Record number 2 error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->numrecords2, dp->numrecords);
        verifyerrors++;
      }
      if ( dp->millitime != dp->millitime2  ) {
        Serial.printf("millitime2 2 number error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->millitime2, dp->millitime);
        verifyerrors++;
      }
      if ( dp->microtime != dp->microtime2  ) {
        Serial.printf("microtime2 2 error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->microtime2, dp->millitime);
        verifyerrors++;
      }
    
      if ( dp->microtime != dp->microtime3 ) {
        Serial.printf("microtime3 3 error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->microtime3, dp->millitime);
        verifyerrors++;
      }
      }
    
        dataFile.write((uint8_t *)writePtr, chunksize); // write the data to file
    Just did a second VERIFY after the first and the data coming off the SD shows the same corruption detail.

    SKETCH with 16 MB PSRAM using:
    Code:
    #define SAMPLERATE 120000
    #define BUFFERMSEC 3600
    #define MAXBUFFER 15900000
    uint8_t *mybuffer = (uint8_t *)0x70000000;

  13. #38
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Above suggests the PSRAM is not the issue. The _isr() rate dropped to 120K/sec when the Struct size went from 16 to 36 in this case

    Perhaps the _isr() interrupting the SD card writes is getting corrupt data from that?

    Going to start and let it run to fill the rest of the SD card - 20 GB - with packed struct to confirm.

    <edit>
    Something odd with that code when PACKED - going to look ...:
    Code:
     V@ microtime2 2 error at 0 [*>0x70000000]  File: 139874045  local: 0
    microtime3 3 error at 0 [*>0x70000000]  File: 134217728  local: 0
     V@ Record number 2 error at 43200 [*>0x7017BB00]  File: 3221225473  local: 43200
    millitime2 2 number error at 43200 [*>0x7017BB00]  File: 1509950790  local: 346
    microtime2 2 error at 43200 [*>0x7017BB00]  File: 139340334  local: 346
    microtime3 3 error at 43200 [*>0x7017BB00]  File: 134217896  local: 346
     V@ Record number 2 error at 86400 [*>0x702F7600]  File: 2147483650  local: 86400
    millitime2 2 number error at 86400 [*>0x702F7600]  File: 3003124364  local: 691
    microtime2 2 error at 86400 [*>0x702F7600]  File: 145583788  local: 691
    microtime3 3 error at 86400 [*>0x702F7600]  File: 134218065  local: 691
     V@ Record number 2 error at 129600 [*>0x70473100]  File: 1073741828  local: 129600
    millitime2 2 number error at 129600 [*>0x70473100]  File: 218107858  local: 1037
    microtime2 2 error at 129600 [*>0x70473100]  File: 135957162  local: 1037
    microtime3 3 error at 129600 [*>0x70473100]  File: 134218234  local: 1037
     V@ Record number 2 error at 172800 [*>0x705EEC00]  File: 5  local: 172800
    millitime2 2 number error at 172800 [*>0x705EEC00]  File: 1711281432  local: 1382
    ...

  14. #39
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    111
    I note that all the errors shown post #37 above occur when the buffer pointer is ~0x70E12400. I think this is in the memory range occupied by the 2nd PSRAM, which is soldered onto pads normally occupied by an SPI flash chip. The schematic shows that both chips use the same data and clock lines. Does the T4.1 driver initialize that segment driver differently than the 0x7000000 block?

    It also seems significant that all the errors in #37 occurred within a 512-byte block from 0x70E12400 to 0x70E125FF. That may indicate an issue with SD card storage or playback.

  15. #40
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Post #37 verifies the items before the dataFile.write() with the same checks done later on playBack Verify.

    Indeed showing and seeing those address made me wonder - that is why I put the verify code there in the library.

  16. #41
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Problem below(above p#38) was I only changed the struct{} to PACKED in one of the two locations since the lib needed a private copy.

    That proves is fails when wrong - except now I see it is only checking the first record in each CHUNK/buffer.
    Need to add a run through all elements to call it verified.

    Quote Originally Posted by defragster View Post
    Above suggests the PSRAM is not the issue. The _isr() rate dropped to 120K/sec when the Struct size went from 16 to 36 in this case

    Perhaps the _isr() interrupting the SD card writes is getting corrupt data from that?

    Going to start and let it run to fill the rest of the SD card - 20 GB - with packed struct to confirm.

    <edit>
    Something odd with that code when PACKED - going to look ...:
    Code:
     V@ microtime2 2 error at 0 [*>0x70000000]  File: 139874045  local: 0
    microtime3 3 error at 0 [*>0x70000000]  File: 134217728  local: 0
     V@ Record number 2 error at 43200 [*>0x7017BB00]  File: 3221225473  local: 43200
    millitime2 2 number error at 43200 [*>0x7017BB00]  File: 1509950790  local: 346
    microtime2 2 error at 43200 [*>0x7017BB00]  File: 139340334  local: 346
    microtime3 3 error at 43200 [*>0x7017BB00]  File: 134217896  local: 346
     V@ Record number 2 error at 86400 [*>0x702F7600]  File: 2147483650  local: 86400
    millitime2 2 number error at 86400 [*>0x702F7600]  File: 3003124364  local: 691
    microtime2 2 error at 86400 [*>0x702F7600]  File: 145583788  local: 691
    microtime3 3 error at 86400 [*>0x702F7600]  File: 134218065  local: 691
     V@ Record number 2 error at 129600 [*>0x70473100]  File: 1073741828  local: 129600
    millitime2 2 number error at 129600 [*>0x70473100]  File: 218107858  local: 1037
    microtime2 2 error at 129600 [*>0x70473100]  File: 135957162  local: 1037
    microtime3 3 error at 129600 [*>0x70473100]  File: 134218234  local: 1037
     V@ Record number 2 error at 172800 [*>0x705EEC00]  File: 5  local: 172800
    millitime2 2 number error at 172800 [*>0x705EEC00]  File: 1711281432  local: 1382
    ...

  17. #42
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Updated code running PACKED not finding error testing all "PACKED" structs in buffer to write - with added FOR() loop.

    The added test still completes in time to prevent overflow. Leaving to fill SD ... now at record #38+ MILLION
    Code:
      dp = (struct datrec *)writePtr;
    for ( int jj=0; jj< chunksize / sizeof(datrec); jj++) {
      if (verifyerrors < 30) {
        if (dp->aByte != (dp->microtime & 0xff)) {
        Serial.printf("aByte error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->aByte, dp->microtime & 0xff);
        verifyerrors++;
      }
      if (dp->numrecords2 != dp->numrecords) {
        Serial.printf("Record number 2 error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->numrecords2, dp->numrecords);
        verifyerrors++;
      }
      if ( dp->millitime != dp->millitime2  ) {
        Serial.printf("millitime2 2 number error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->millitime2, dp->millitime);
        verifyerrors++;
      }
      if ( dp->microtime != dp->microtime2  ) {
        Serial.printf("microtime2 2 error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->microtime2, dp->microtime);
        verifyerrors++;
      }
    
      if ( dp->microtime != dp->microtime3 ) {
        Serial.printf("microtime3 3 error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  File: %lu  local: %lu\n", dp->microtime3, dp->microtime);
        verifyerrors++;
      }
      }
        dp++;
    }
    if (verifyerrors!=0)
      Serial.printf("| V@ \n");

  18. #43
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    As note p#37 - that FIRST copy only validated first record.

    PSRAM read data for write is corrupted.

    So that write can be removed to make sure there isn't overflow data loss from ver

    Will take out the SD write and try again unpacked and see if anything else useful to print ...

  19. #44
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    @mborgerson: THIS IS ODD::
    Code:
     V@  V@  V@ Record 2nd number error at 419920454 [*>0x70084906]  PSRAM: 420352454  local: 420352454
    | V@
    The PSRAM and LOCAL MATCH at '420352454' - but do not agree with error at '419920454 '

    When in fact the code is testing the two same values where one tested was the same as the one indicating the error! ????

    Code:
      if (dp->numrecords2 != dp->numrecords) {
        Serial.printf("Record 2nd number error at %lu [*>0x%lX]", dp->numrecords, dp->cptr);
        Serial.printf("  PSRAM: %lu  local: %lu\n", dp->numrecords2, dp->numrecords);
        verifyerrors++;
      }
    The same pattern appears in all the other errors found - and it always the write before the end of the group write >> "| V@ "

    It seems the value of numready in :: for (i = 0; i < numready; i++) {
    is including a changing buffer?

    That might be a problem with the __STREXW getting interrupted and not properly handling the _isr() changing count of readychunks ???
    >> Will try with restored noInterrupts() code

  20. #45
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Ran first time with no SD writes - it stopped when write count peaked after 138,110,400 records unpacked without issue?

    Formatted drive and restarted could also stop the byteswritten update - Avg write time with Verify is 79.1 ms - so I could up the sample time and reduce the BUFFERMSEC cause more havoc.
    Will let this run unpacked some time - then redo packed test with no SD writes. Now at 78M Records.

    Still using the __STREXW for now. Will add print of the helper vars to see when it fires.

  21. #46
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Still chasing ... no errors detected on data to write packed or unpacked ... unpacked recorded about 25 GB ... checked but not written. None of the p#44 ghost issues appeared?

    Turned on write and write time verify passed, but stopped log Verify found issues on playback?
    Code:
     V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ 472.781,   Records:  59097600   Overflows:    0
     V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_ V@ _V#100_Stopping Logger.
     V@ _V#100_Verifying last file
    
    Recnum: 1000000
    // All Good
    Recnum: 38000000
    Record number 2 error at 38415018 [*>0x70C8EBEA]  File: 38847018  local: 38415018
    millitime2 2 number error at 38415018 [*>0x70C8EBEA]  File: 310649  local: 307321
    microtime3 3 error at 38415018 [*>0x70C8EBEA]  File: 310776153  local: 307321
    Record number error at 38847019  File: 38847019  local: 38415019
    millitime number error at 38847019  File: 310777  local: 307321
    millitime2 2 number error at 38847019 [*>0x70C8EC0B]  File: 310777  local: 307321
    millitime number error at 38847020  File: 310777  local: 307321
    millitime2 2 number error at 38847020 [*>0x70C8EC2C]  File: 310777  local: 307321
    millitime number error at 38847021  File: 310777  local: 307321
    millitime2 2 number error at 38847021 [*>0x70C8EC4D]  File: 310777  local: 307321
    millitime number error at 38847022  File: 310777  local: 307321
    millitime2 2 number error at 38847022 [*>0x70C8EC6E]  File: 310777  local: 307321
    millitime number error at 38847023  File: 310777  local: 307321
    millitime2 2 number error at 38847023 [*>0x70C8EC8F]  File: 310777  local: 307321
    millitime number error at 38847024  File: 310777  local: 307321
    millitime2 2 number error at 38847024 [*>0x70C8ECB0]  File: 310777  local: 307321
    millitime number error at 38847025  File: 310777  local: 307321
    millitime2 2 number error at 38847025 [*>0x70C8ECD1]  File: 310777  local: 307321
    millitime number error at 38847026  File: 310777  local: 307321
    millitime2 2 number error at 38847026 [*>0x70C8ECF2]  File: 310777  local: 307321
    millitime number error at 38847027  File: 310777  local: 307321
    millitime2 2 number error at 38847027 [*>0x70C8ED13]  File: 310777  local: 307321
    millitime number error at 38847028  File: 310777  local: 307321
    millitime2 2 number error at 38847028 [*>0x70C8ED34]  File: 310777  local: 307321
    millitime number error at 38847029  File: 310777  local: 307321
    millitime2 2 number error at 38847029 [*>0x70C8ED55]  File: 310777  local: 307321
    millitime number error at 38847030  File: 310777  local: 307321
    millitime2 2 number error at 38847030 [*>0x70C8ED76]  File: 310777  local: 307321
    millitime number error at 38847031  File: 310777  local: 307321
    millitime2 2 number error at 38847031 [*>0x70C8ED97]  File: 310777  local: 307321
    millitime number error at 38847032  File: 310777  local: 307321
    millitime2 2 number error at 38847032 [*>0x70C8EDB8]  File: 310777  local: 307321
    End of file playback.
    
    Verification errors:  32
    Verification complete

  22. #47
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Left it alone 4066 seconds to fill the SD with 507,254,400 records and no errors in verify at time of write detected ... after adding this to myCollector() :: arm_dcache_flush( vdp, sizeof(datrec) );
    Then starting playBack verify it finds errors reading from the SD???
    Code:
    Recnum: 247000000
    Record number 2 error at 247260392 [*>0x704EBFE8]  File: 247692392  local: 247260392
    microtime3 3 error at 247260392 [*>0x704EBFE8]  File: 1981539145  local: 1978083
    Record number error at 247692393  File: 247692393  local: 247260393
    millitime number error at 247692393  File: 1981539  local: 1978083
    millitime2 2 number error at 247692393 [*>0x704EC009]  File: 1981539  local: 1978083
    millitime number error at 247692394  File: 1981539  local: 1978083
    millitime2 2 number error at 247692394 [*>0x704EC02A]  File: 1981539  local: 1978083
    millitime number error at 247692395  File: 1981539  local: 1978083
    millitime2 2 number error at 247692395 [*>0x704EC04B]  File: 19815
    That is on 5MB into PSRAM [5,161,035] so not near the boundary. Of course that value is read from a file that just showed it is corrupted.

    And Recnum: 247,000,000 is farther into the logging process than normal - just under half of the nearly 9GB file.

    Just formatted 32GB SD and restarted with arm_dcache_flush_delete( vdp, sizeof(datrec) ); - the _delete really slows down the process (LED never off)! Now getting OVERFLOW It shouldn't help as the 32K cache should easily wipe as it gathers 1.5MB between writes?

    So no quick sure repro yet - just recurring 1 in a million things.

  23. #48
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,826
    Looking at this off and on - it seems the trouble here is from the __STREXW code

    Have filled SD cards a couple times on two T_4.1's using the nointerrupt() prior code:
    Code:
        noInterrupts();
        numready = readychunks;
        readychunks = 0;  // clear number ready
        mystatus.bufferoverflows = bufferoverflows;
        mystatus.maxcdelay = maxcdelay;
        interrupts();
    instead of:
    Code:
      do { // account for interrupts that may alter vars during read.
        __LDREXW(&logger_safe_read);
        readyHold = numready;
        numready += readychunks;
        readychunks -= (numready - readyHold);  // account for number found
        mystatus.bufferoverflows = bufferoverflows;
        mystatus.maxcdelay = maxcdelay;
        readyHoldLast++;
      } while ( __STREXW(1, &logger_safe_read));
    Tried to monitor the times the do{}while(); repeats and saw no trend or trigger.
    Often the PSRAM would verify fine before write - but find corruption in the file on playback, sometimes Verify woudl fail - but printed values that trigger fail show correct???
    REPRO test takes an hour+ and millions of records to test - and results not repeatable - even adding PSRAM cache_flush and cache_flush_delete

  24. #49
    Tried to compile the default_loggertest example - and got a couple of errors.

    One of them : C:\Users\ACER\AppData\Local\Temp\arduino_build_199 688\sketch\Datalogger.h:86:2: error: 'SdFs' does not name a type

    SdFs sdx; // for fat32/ExFat SD Card

    I use sdfat (greiman). Do I need something more ?

  25. #50
    Senior Member
    Join Date
    Jul 2014
    Posts
    2,665
    Quote Originally Posted by hw999 View Post
    I use sdfat (greiman). Do I need something more ?
    For ExFat you need SdFat-beta

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •