Fast Data Logger

Perhaps. Let's see what a one-hour run looks like with a proper card. The dual buffer doesn't seem to hurt, so I may keep it in any case. It's there, might as well use it.

That's a nice design. However, his SDRAM uses up the bus I need for my 32-bit data input!

Here is my board. SD Card is on the back. The 'wings' are to attach batteries.
[Photo: chip side of the board]
 
I'm confused about your results. If your max buffer usage is 132 KB, what is your actual data rate? If you were getting 40-ms delays, that would imply your data is only about 2.5 MB/sec.
 
Oh sorry, my 40 ms was based on what others had measured. I am seeing much less than that, so far. My data rate is 18 MB/s, so I'm seeing about 7 ms.
I'll do a full one-hour test tomorrow with a fast card and let's see what I get there.
 
I was able to get 439,296 bytes in RAM1 and 505,856 bytes in RAM2, for a total of 945,152 bytes.
Not the best design: it provides a maximum of 439,296 bytes of buffer space while the other buffer is being written. Smaller buffers are better here, although write performance is usually better with larger buffers, so it is a balancing act.

By using 32KB buffers, for example, you could double the maximum space available when there is a card stall.
 
I may have described things poorly. I don’t send the entire buffer to the SD write routine.
I send data to the SD card in 512 byte blocks.
I treat the two memories as one large buffer. The input data clock causes an interrupt that saves the 32 bits of data to the next spot in memory and increments the buffer 'in' pointer as well as a word counter. If the 'in' pointer reaches the top of the RAM1 buffer it is set to the beginning of the RAM2 buffer; if it reaches the top of the RAM2 buffer it is reset to the beginning of the RAM1 buffer.

In the main program I wait until that word counter meets or exceeds 512 bytes (128 words) and then call the SD card write routine with 512 bytes of data. When it returns I subtract the 128 words from my word counter and advance the 'out' pointer, wrapping it the same way as the 'in' pointer.
I hope that makes sense.
 
Perhaps. Let's see what a one hour run looks like with a proper card.

My own bench.ino tests with a SanDisk XC card show ~23 MB/s over SDIO, 2x faster than HC:

FreeStack: 449400
Type is exFAT
Card size: 63.86 GB (GB = 1E9 bytes)
Manufacturer ID: 0X3
OEM ID: SD
Product: SP64G
Version: 8.0
Serial number: 0X28D120F8
Manufacturing date: 1/2012
FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes
Starting write test, please wait.
write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
21095.70,13964,22,23
22320.00,833,22,22
Starting read test, please wait.
read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
22829.59,23,22,22
22829.59,23,22,22
I learned that only high-speed mode is supported, which runs at 50 MHz. With 4 bits of data per clock, that's a maximum transfer rate of 25 MB/s, which is pretty close.
 
Yes, I was seeing similar numbers on my cards, even Class I. The problem is that it's an average. The card datasheets don't tell you the maximum latency. That is, they may plow along at 25 MB/s but then do housekeeping for 50 ms. That means you need to buffer about 1.25 MB of data if you are streaming and can't just hold off until the card is able to store data again.
I have the additional problem of my data interrupt, which takes about 45 ns. That seems minuscule, but it occurs every 222 ns, which is a significant hit on the time allowed for the SdFat routines to run.
Frankly, I am super impressed by the performance of the library. I basically took the bench.ino and reshaped it for my needs. Every problem I’ve had so far has been my fault.
I appreciate all the comments. I hope this can be a good guide to fast data logging in the end.
 
If you have time, I wonder if you would mind running the program below and sharing your results? What it does is log for 20 seconds at 5 MB/s, with each write being 512 bytes, and keeps track of buffer usage, time to return from SdFat write() calls, and how long the SD is busy after each write. Here are the results with a 32 GB Sandisk Ultra SD that I've been using. SD write() calls always take about 5 us, and the SD is busy for an average of ~17 us after each write, but a max of 41 ms. I'm curious what you get with the same settings. You can change the macros at the top to do shorter or longer runs with higher or lower data rates, with the constraint being there has to be room in RAM1 for a 50-ms buffer for whatever you choose as a data rate.

Feb 2 2026 19:26:43
Teensy 4.1
Teensyduino version 160
SdFat version 2.1.2
Type any character to begin
Log for 20 seconds at 5.00 MB/s (256 bytes per interrupt)
Pre-allocated file 104857600 bytes
RingBuf 262144 bytes
Start dataTimer (period = 48 us)
...................
Stop dataTimer
204800 writes in 19.661 s (0.951 s writing to file = 4.839 %)
File is full
fileSize = 104857600 before sync()
rb.bytesUsed = 0 before sync()
fileSize = 104857600 after sync()
rb.bytesUsed = 0 after sync()
rbMaxUsed = 218368
avg write us = 4.65
min write us = 4.57
max write us = 5.20
avg busy us = 20.30
min busy us = 17.34
max busy us = 40958.95
file.close()
Code:
// Test Teensy SDIO with write busy in a data logger demo.
//
// The driver writes to the uSDHC controller's FIFO then returns while the
// controller writes the data to the SD.  The first sector puts the controller
// in write mode and takes about 11 usec on a Teensy 4.1. About 5 usec is
// required to write a sector when the controller is in write mode.

#include <SdFat.h>
#include <RingBuf.h>

//******************************************************************************
// configuration macros
//******************************************************************************
#define SD_CONFIG    (SdioConfig(FIFO_SDIO))        // use Teensy SDIO
#define LOGGING_TIME_S    (20)                // s
#define DATA_RATE_BPS    ((5)*(1024*1024))        // bytes/s
#define LOG_FILE_SIZE    (DATA_RATE_BPS * LOGGING_TIME_S)// total bytes
#define RING_BUF_SIZE    (DATA_RATE_BPS / 20)        // 50-ms buffer at BPS
#define BUFLEN        (256)                // bytes per write
#define C2US(c)        ((c)*(1E6/F_CPU))        // CPU cycles to us

//******************************************************************************
// global variables
//******************************************************************************
IntervalTimer dataTimer;        // IntervalTimer for ISR-level writes
RingBuf<FsFile,RING_BUF_SIZE> rb;       // ISR --> RingBuf --> loop --> SD file
SdFs     sd;                // SdFat type
FsFile   file;                // SdFat file type
size_t   rbMaxUsed = 0;            // RingBuf max bytes (useful diagnostic)
char     buf[BUFLEN];            // test buffer
uint32_t error;                // RingBuf/file error code

//******************************************************************************
// IntervalTimer callback -- write BUFLEN bytes to RingBuf
//******************************************************************************
void dataTimerCallback( void )
{
#if (SD_FAT_VERSION == 20102)        // #if SdFat 2.1.2
  rb.memcpyIn( buf, BUFLEN );        //   write to RingBuf via rb.memcpyIn()
#elif (SD_FAT_VERSION >= 20202)        // #elif SdFat >= 2.2.2
  rb.beginISR();            //   begin interrupt access
  rb.write( buf, BUFLEN );        //   write to RingBuf via rb.write()
  rb.endISR();                //   end interrupt access
#endif                    // #endif
  if (rb.getWriteError())        // if write error occurred
    error = 1;                //   set global error code
} 

//******************************************************************************
// setup()
//******************************************************************************
void setup()
{
  Serial.begin(9600);
  while (!Serial && millis() < 3000) {}

  Serial.printf( "%s  %s\n", __DATE__, __TIME__ );
  Serial.printf( "Teensy %s\n",
#if defined(ARDUINO_TEENSY35)
  "3.5" );
#elif defined(ARDUINO_TEENSY41)
  "4.1" );
#endif
  Serial.printf( "Teensyduino version %1lu\n", TEENSYDUINO );
  Serial.printf( "SdFat version %s\n", SD_FAT_VERSION_STR );

  // Initialize the SD.
  if (!sd.begin(SD_CONFIG)) {
    sd.initErrorHalt(&Serial);
  }
 
  // these 2 lines are necessary to enable cycle counting on T3.x
  #if (defined(KINETISL) || defined(KINETISK))        // if Teensy LC or 3.x
  ARM_DEMCR    |= ARM_DEMCR_TRCENA;            //   enable debug/trace
  ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;        //   enable cycle counter
  #endif
}

//******************************************************************************
// loop()  open file, preAllocate, init RingBuf, log data, print results/stats
//******************************************************************************
void loop()
{
  while (Serial.available()) { Serial.read(); }
  Serial.println( "Type any character to begin" );
  while (!Serial.available()) {}
 
  Serial.printf( "Log for %1lu seconds at %1.2f MB/s (%1lu bytes per interrupt)\n",
        LOGGING_TIME_S, (float)(DATA_RATE_BPS/(1024*1024.0)), BUFLEN );
  Serial.printf( "Pre-allocated file %1lu bytes\n", LOG_FILE_SIZE );
  Serial.printf( "RingBuf %1lu bytes\n", RING_BUF_SIZE );

  // Open or create file - truncate existing file.
  if (!file.open( "logfile.txt", O_RDWR | O_CREAT | O_TRUNC )) {
    Serial.println( "open failed\n" );
  }
  // File must be pre-allocated to avoid huge delays searching for free clusters.
  else if (!file.preAllocate( LOG_FILE_SIZE )) {
     Serial.println( "preAllocate failed\n" );
     file.close();
  }
  // init the file and RingBuf
  else {
    rb.begin(&file);
  }
 
  // Init data buffer with random data
  randomSeed( micros() );
  for (int i=0; i<BUFLEN; i++)
    buf[i] = 0x30+random( 10 );
  buf[BUFLEN-1] = '\n';
 
  uint32_t timer_period_us = 1E6 * BUFLEN / DATA_RATE_BPS;
  Serial.printf( "Start dataTimer (period = %1lu us)\n", timer_period_us );
  dataTimer.begin( dataTimerCallback, timer_period_us );

  uint32_t count = 0, start_ms = millis(), start_busy = 0;
  bool busy = false;
  error = 0;
  rbMaxUsed = 0;
  elapsedMillis ms = 0;
  uint32_t sum_busy=0, min_busy=0xFFFFFFFF, max_busy=0;
  uint32_t sum_write=0, min_write=0xFFFFFFFF, max_write=0;
  while (error == 0 && millis() - start_ms < LOGGING_TIME_S*1000) {
      
    if (ms >= 1000) { Serial.print( "." ); ms -= 1000; }
    
    // number of bytes in RingBuf
    size_t n = rb.bytesUsed();
    if (n > rbMaxUsed) {
      rbMaxUsed = n;
    }

    // bytes in RingBuf now will fit, but any more will exceed file size
    if ((n + file.curPosition()) > (LOG_FILE_SIZE - BUFLEN/*_MAX*/)) {
      error = 2; // file full
    }

    // write one sector (512 bytes) from RingBuf to file
    // Not busy only allows one sector before possible busy wait
    if (file.isBusy()) {
      if (!busy) {
        busy = true;
        start_busy = ARM_DWT_CYCCNT;
      }
    }
    else if (busy) {
      busy = false;
      uint32_t busy_cyc = ARM_DWT_CYCCNT - start_busy;
      sum_busy += busy_cyc;
      if (busy_cyc < min_busy) min_busy = busy_cyc;
      if (busy_cyc > max_busy) max_busy = busy_cyc;
    }
    if (n >= 512 && !busy) {
      uint32_t start_write = ARM_DWT_CYCCNT;
      if (512 != rb.writeOut(512)) {
        error = 3; // write from RingBuf failed
      }
      uint32_t write_cyc = ARM_DWT_CYCCNT - start_write;
      sum_write += write_cyc;
      if (count > 0) {
        if (write_cyc < min_write) min_write = write_cyc;
        if (write_cyc > max_write) max_write = write_cyc;
      }
      count++;
    }
  }
 
  uint32_t duration_ms = millis() - start_ms;
  Serial.printf( "\nStop dataTimer\n" );
  dataTimer.end();
 
  double duration_s = duration_ms/1000.0;
  double write_s = C2US(sum_write)/1E6;
  double write_percent = 100*(write_s/duration_s);
  Serial.printf( "%1lu writes in %1.3lf s (%1.3lf s writing to file = %1.3lf %c)\n",
        count, duration_s, write_s, write_percent, '%' );
  switch (error) {
    case 0:   Serial.printf( "No error\n" );            break;
    case 1:   Serial.printf( "Not enough space in RingBuf\n" );    break;
    case 2:   Serial.printf( "File is full\n" );        break;
    case 3:   Serial.printf( "Write from RingBuf failed\n" );    break;
    default:  Serial.printf( "Undefined error %1lu\n", error );    break;
  }

  // write any remaining RingBuf data to file
  Serial.printf( "fileSize     = %10lu before sync()\n", (uint32_t)file.fileSize() );
  Serial.printf( "rb.bytesUsed = %10lu before sync()\n", (uint32_t)rb.bytesUsed() );
  rb.sync();
 
  // file and buffer stats
  Serial.printf( "fileSize     = %10lu after sync()\n", (uint32_t)file.fileSize() );
  Serial.printf( "rb.bytesUsed = %10lu after sync()\n", (uint32_t)rb.bytesUsed() );
  Serial.printf( "rbMaxUsed    = %10lu\n", (uint32_t)rbMaxUsed );
  Serial.printf( "avg write us = %10.2lf\n", C2US(sum_write)/count );
  Serial.printf( "min write us = %10.2lf\n", C2US(min_write) );
  Serial.printf( "max write us = %10.2lf\n", C2US(max_write) );
  Serial.printf( "avg busy  us = %10.2lf\n", C2US(sum_busy)/count );
  Serial.printf( "min busy  us = %10.2lf\n", C2US(min_busy) );
  Serial.printf( "max busy  us = %10.2lf\n", C2US(max_busy) );
 
  // print first N line(s) of file.
  int lines=0;
  if (lines > 0) {
    Serial.printf( "First %1d line(s) of file\n", lines );
    file.truncate();
    file.rewind();
  }
  for (int n=0; n < lines && file.available();) {
    int c = file.read();
    if (c < 0) break;
    Serial.write(c);
    if (c == '\n') n++;
  }
 
  // close file
  file.close();
  Serial.printf( "file.close()\n\n" );
}
 
That's good. Your total buffer is 945,152 bytes, so when you say max buffer usage was 132,227, does that mean you don't need both buffers after all?

Not a short-term fix, but if you want to get the RAM limit out of the way entirely, see the link below and the work by @Dogbone06 on DIY boards with 32 MB of SDRAM.

https://forum.pjrc.com/index.php?threads/diy-teensy-sdram-solder-yourself.76887/#post-357855
The OP may indeed need only one buffer---until the internal SD card controller decides it needs to swap out a failing block and handle wear leveling within a few milliseconds of each other.

What is the power penalty for adding SDRAM? (I'm assuming that the Teensy SDRAM design doesn't use the same DRAM that the major PC vendors and AI server companies can't get enough of right now.)
 
In response to the SDRAM power question: I don't know. In my particular case, though, I have neither the area nor (probably) the free pins to add it.

On the test, I will run the tests in the morning.
 
Ok, so I ran three cards. Results below. The three were:

SanDisk Extreme PLUS 256GB V30
SanDisk Industrial XI 128GB V10
Riverdi 'Grade A Micro SD' 4GB V10

All were fully formatted using the SD Association's formatting tool.

Most notable were differences in sizes of the buffer used and the max busy us. It definitely pays to use a V30 card as it uses a third of the buffer and also has a third of the max busy time. I think this is the most important criterion in logging high speed streaming data.

=====================

SanDisk Extreme PLUS 256GB V30
------------------------------
Teensy 4.1
Teensyduino version 158
SdFat version 2.3.0
Type any character to begin
Log for 20 seconds at 5.00 MB/s (256 bytes per interrupt)
Pre-allocated file 104857600 bytes
RingBuf 262144 bytes
Start dataTimer (period = 48 us)
...................
Stop dataTimer
204800 writes in 19.660 s (0.967 s writing to file = 4.921 %)
File is full
fileSize = 104857600 before sync()
rb.bytesUsed = 0 before sync()
fileSize = 104857600 after sync()
rb.bytesUsed = 0 after sync()
rbMaxUsed = 33536
avg write us = 4.72
min write us = 4.72
max write us = 5.25
avg busy us = 17.99
min busy us = 17.35
max busy us = 6136.50
file.close()

================================

SanDisk Industrial XI 128GB V10
--------------------------------
Teensy 4.1
Teensyduino version 159
SdFat version 2.1.2
Type any character to begin
Log for 20 seconds at 5.00 MB/s (256 bytes per interrupt)
Pre-allocated file 104857600 bytes
RingBuf 262144 bytes
Start dataTimer (period = 48 us)
...................
Stop dataTimer
204800 writes in 19.661 s (0.956 s writing to file = 4.864 %)
File is full
fileSize = 104857600 before sync()
rb.bytesUsed = 0 before sync()
fileSize = 104857600 after sync()
rb.bytesUsed = 0 after sync()
rbMaxUsed = 119296
avg write us = 4.67
min write us = 4.67
max write us = 5.23
avg busy us = 19.76
min busy us = 17.35
max busy us = 22363.88
file.close()

====================================

Riverdi 'Grade A Micro SD' 4GB V10
----------------------------------
Teensy 4.1
Teensyduino version 159
SdFat version 2.1.2
Type any character to begin
Log for 20 seconds at 5.00 MB/s (256 bytes per interrupt)
Pre-allocated file 104857600 bytes
RingBuf 262144 bytes
Start dataTimer (period = 48 us)
...................
Stop dataTimer
158349 writes in 20.000 s (0.735 s writing to file = 3.677 %)
No error
fileSize = 104857600 before sync()
rb.bytesUsed = 0 before sync()
fileSize = 104857600 after sync()
rb.bytesUsed = 0 after sync()
rbMaxUsed = 262144
avg write us = 4.64
min write us = 4.58
max write us = 5.22
avg busy us = 17.34
min busy us = 17.48
max busy us = 171061.15
file.close()
 
That's great info, thanks. Yes, the max buffer used is simply (max busy time * data rate). The average busy times are all about the same, and those are the best indicator of the data rate you can achieve.

EDIT: the total time to write 512 bytes is the 5 us for the write(), plus 17-18 us of additional busy time, for a total of 22-23 us per 512 byte write. That implies a max write speed of about 22 MB/s, which matches pretty well with what people get from bench.ino.

I'll have to get one of the Extreme Plus cards. 7 ms max busy time is quite an improvement over the 40 ms I get with 32 GB Sandisk Ultra.

It would be interesting to see your results with this program at 10, 15, 18 MB/s, and see what % of CPU is used to write. You can reduce the buffer size from MB/s/20 (50 ms) to MB/s/100 (10 ms) and I think everything will fit in RAM1 (TCM).
 
5us to write 512 bytes. Excellent trick. What interface are you using?
@UhClem, you asked this question back in April. It's the SDIO interface. 5 us is the time from call to return of SdFat write(). The SD card remains busy completing the write for 17-18 us after the return, so the total time to write 512 bytes is 22-23 us. That implies a little more than 20 MB/s, which I think matches what people get from the bench.ino test.
 

@slash2, with your test and this card:
Type is exFAT
Card size: 63.86 GB (GB = 1E9 bytes)
Manufacturer ID: 0X3
OEM ID: SD
Product: SP64G
Version: 8.0
Serial number: 0X28D120F8
Manufacturing date: 1/2012

I get about 2x your max busy us:
Teensy 4.0
Teensyduino version 159
SdFat version 2.1.2
Type any character to begin
Log for 10 seconds at 1.00 MB/s (256 bytes per interrupt)
Pre-allocated file 10485760 bytes
RingBuf 52428 bytes
Start dataTimer (period = 244 us)
.........
Stop dataTimer
20480 writes in 9.995 s (0.096 s writing to file = 0.962 %)
File is full
fileSize = 10485760 before sync()
rb.bytesUsed = 0 before sync()
fileSize = 10485760 after sync()
rb.bytesUsed = 0 after sync()
rbMaxUsed = 13056
min write us = 4.68
max write us = 5.54
min busy us = 17
max busy us = 12448
file.close()
 
I ran the test 7 times and saw quite a variation in max busy:

run    rbMaxUsed    min write us    max write us    min busy us    max busy us
1      1280         4.68            5.54            18             1390
2      12032        4.68            5.53            17             11674
3      1280         4.68            5.54            18             1380
4      768          4.68            5.54            18             822
5      6656         4.68            5.54            18             6412
6      768          4.68            5.69            18             823
 
Those are really good times. And, yes, I see a lot of variation as well.

I bought several new cards and tested them. All were supposed to be 30 MB/s, though one had conflicting information (as I understand it). I also re-ran the Extreme Plus several times.

I discovered one interesting thing today while testing the Extreme Plus on my board. I was able to track the latency of block writes to the SD card: after the first 512-byte block was sent, there was a long delay before the second block, about 6 ms, which is about what I was seeing for its max latency. Perhaps when that first write takes place there is initial housekeeping that must be done. If so, it might be good to write a dummy block and pause before starting streaming data. I may try to look at this in more detail. The long delays may also take place at other times.

I have yet to see an SD Card data sheet that specifies maximum latency. It's all about sustained writes and reads, so I guess they assume you have a huge buffer.

Teensy 4.1
Teensyduino version 158
SdFat version 2.3.0

============================
SanDisk Industrial 256GB Speed Class V30 UHS 30
----------------------------
(new)
max busy us = 21434.20

(then do a full format)
max busy us = 26457.90

(run again without format)
max busy us = 20087.90

============================
Micron 256GB Speed class 10 UHS 30
----------------------------
(new)
max busy us = 15815.10

============================
innodisk 256GB Speed Class 10 no UHS speed class
----------------------------
(new)
max busy us = 25335.20

============================
SanDisk Extreme Plus 256GB Speed Class 30 UHS 30
----------------------------
(retest, not reformatted)
max busy us = 19769.79

(then do a full format)
max busy us = 5287.18

(run again without format)
max busy us = 30353.52

(run again without format)
max busy us = 11535.56

(run again without format)
max busy us = 9236.56
 
Perhaps, when that first write takes place, there is initial housekeeping that must be done.
Keep in mind that you have a layer of software between your write call and the SD card.

One thing that SdFat does to help with write speed is to use the multi-sector write command. So that first call has to transfer the write command to the card (over the single-bit command channel) before transferring the data. Subsequent sequential writes just transfer data. At some point, perhaps on a non-sequential write, SdFat will close out that SD multi-sector write command.

File system updates will of course add extra delays whenever SdFat decides one is needed. The maximum time for a FAT update according to the SD specification is 750 ms.

And SD cards keep changing as well. I was just looking at the speed class information in the SD specification (v9) and there is a lot of it, such as the new CMD20, where you tell the card you are about to start streaming and it can take up to one second to get ready (section 4.13.2.8.1).
 
I was able to track the latency of block writes to the SD Card and after the first 512 block was sent, and the second block was sent, there was a long delay - about 6msec ...

I've done some tracking of when the long delays occur, and I don't remember ever seeing that they occur as soon as the 2nd 512-byte write, but I can't say for sure. My main concern has been max busy time, because that's what determines buffer size. For me, max busy time has been very consistent for a given card type.

With 32 GB Sandisk Ultra, I get max busy time of ~40 ms. I've done many logging runs with a buffer sized for 50 ms of data, and never had a buffer overrun, so I'm pretty sure it never goes much over 40. I can't say it could never occur, but it never has in 100s of GB of data collection.

With 128 GB Sandisk Extreme Plus, which I got after seeing @slash2's results, I get max busy time of ~17 ms, so I can use a buffer sized for 25 ms and support twice the data rate if I need it.

One caveat is that the total data written has to be pretty sizeable to get close to a true max. I typically use 30 sec @ 5 MB/sec = 150 MB.
 
I ran a bunch of long tests today. I saved 30 minutes of data at 18 MB/s. This is on the order of what I need; I am aiming for 1 hour at 18 MB/s, but the tests take a long time to run and analyze, so I cut it back some.

Some notes about my code:
- I am reading 32 bits at a time, as that is the most efficient way to get data into the T4.
- My buffer allows overruns. I can't stop the data coming in, so I just try to keep overruns from happening. Also, by allowing overruns and tracking how large they are, I can see what I'm up against.
- For this test, I saved the value of how much buffer was used (or needed) in place of the incoming data. This preserved the timing of the code I actually want to run.
- Theoretically, one can save data at about 22 MB/s (according to bench.ino tests with a fast card). However, reading data in takes about a fifth of the available time, so I think 18 MB/s may be a practical upper limit (for short runs; see below).

At three times during the test, the buffer overran - by a lot. It only happened three times during the test, but it is a big problem. The chart below shows it. The horizontal line is the biggest buffer I could make in the T4.

[Chart: buffer usage over the 30-minute run at 18 MB/s; the horizontal line marks the largest buffer available on the T4]

So this says one would need at least 6 MB of buffer, which I don't have. However, the time the demand is above the available buffer is only about 0.5 s over the entire test. For my application, I think it may be OK to just lose that data. The other option is to store fewer channels. So how many fewer?

I ran at 1.5 Msamples/s (6 MB/s) with the same number of samples, so this run lasted three times longer.

This was successful:

[Chart: buffer usage for the 6 MB/s run; the maximum stays just below the available buffer]

You can see it did come close; maximum buffer used was 924,380 out of 944,128 available. But this is a third of the throughput I would like. I believe we will decide to risk losing a bit of data rather than having to choose which two-thirds of the channels not to save.

I also ran at 3.5 Msamples/s. It failed, but less badly than at 4.5 Msamples/s.

An interesting observation is that the 'hiccups' must be related to a certain number of samples, as they are regular: I get three in each of the two tests, about 10 GB apart. The SD card must be doing some serious housekeeping during that time.

I had done testing at the outset of this project and ran a lot of shorter tests, which looked good. I must have missed the short loss of data that shows up on longer tests.

Anyway, I hope this is useful to others wanting to do fast data streaming to an SD card.
 
One file, 32 GB, pre-allocated. Quick format each time (a full format on 256 GB takes forever, and I was running a lot of tests).
 
We would like an hour (64 GB), but at the moment it looks like our battery setup will only give us 45 minutes, which is usable. We are limited by size and weight and cannot increase the battery.

I've been pondering how to handle a buffer overrun. My initial idea is to have the processor signal the FPGA when the buffer is near full. The FPGA will complete the current frame of data (600 bytes/frame), then send a frame with all data set to a single value (perhaps -0). It will then stop sending data and start counting how many frames are being dropped. When the near-full signal is lowered, the FPGA will send another fixed frame with the lost count added in, then go back to regularly scheduled programming. This will let me correctly reconstruct the data timing, with a gap of null data added in by the post-processing software.
 
My main concern has been max busy time, because that's what determines buffer size. For me, max busy time has been very consistent for a given card type.
IIRC, many of the larger SD cards have erase block sizes of 4-16 MB. Depending on the prior write history, it may be hard to predict when a block erasure will be scheduled by the card's internal processor. When it does happen, you will see a spike in the busy time. The busy time was not an issue in most of our loggers, as data rates were slow and we had sufficient buffer for up to 100 ms of busy time.

What did bother us was the spike in card current and the generated RFI when the card turned on its internal voltage multiplier to generate the 15-20 V needed to erase the block of NAND flash. Many of our oceanographic loggers had piezoelectric shear sensors with output impedances of 500 to 800 megohms. Not usually a problem, as the transimpedance amplifier was close to the sensor and a few meters of salt water damps out environmental RFI. However, we did require lots of ferrite filters on all the shielded power supply lines, even with separate battery packs for analog and digital circuitry.

We worked to minimize the issues by always deploying the loggers with new SanDisk Ultra 32 GB cards. Thirty-two GB is the minimum for six months at about 2 KB/s. Three months was the normal deployment time, but ship schedules were variable.

About 5 years ago we ran into an issue with counterfeit Sandisk Cards. We learned to spot the packaging anomalies and always checked the metadata when formatting the cards.
 