SD logger issues

Bodger62

Member
Hello All,

I am logging CAN messages and storing them on an SD card in the BusMaster format shown below. We log approximately 1Gbyte of data in an hour.

10:04:34:4234 Rx 1 0x0c060000 x 8 00 09 00 11 30 95 58 45

The data is stored every 4K, or once 64 CAN messages have been received. The reason is that we need to interrogate the messages to detect any sudden or unexpected errors in the system which causes it to crash.

The problem is that sometimes messages are missed, but there doesn’t seem to be a pattern to it. The time logging stops can be as much as 2 seconds. On a 2.9Gbyte file the following was found

Args 2 String 240309_1653.log

17:02:37:9210 Rx 1 0x00040139 x 8 bb b7 bb b7 bb b7 bb b7
17:02:39:8485 Rx 1 0x04240041 x 4 ac fd ff ff messages missed

17:08:00:8298 Rx 1 0x00220001 x 8 2f 01 00 00 99 02 00 00
17:08:02:7611 Rx 1 0x00800102 x 8 68 c5 68 c5 68 c5 00 00 messages missed

17:18:22:5850 Rx 1 0x00800102 x 8 68 c5 68 c5 68 c5 00 00
17:18:24:5169 Rx 1 0x04000022 x 8 2a 4e 0e 00 03 01 00 00 messages missed

17:34:05:1581 Rx 1 0x00800202 x 8 00 00 00 00 00 00 00 00
17:34:07:0816 Rx 1 0x02800069 x 4 0a 00 00 00 messages missed

17:39:28:1262 Rx 1 0x04000022 x 8 2a 4e 0e 00 f1 00 00 00
17:39:30:0517 Rx 1 0x00800202 x 8 00 00 00 00 00 00 00 00 messages missed

17:44:50:7280 Rx 1 0x00220001 x 8 90 02 00 00 5b fb ff ff
17:44:52:6527 Rx 1 0x00260001 x 8 f4 ff f4 ff ea fc 00 00 messages missed

17:55:07:8407 Rx 1 0x02800069 x 4 0b 00 00 00
17:55:09:7673 Rx 1 0x04220041 x 4 19 06 00 00 messages missed

18:00:30:7010 Rx 1 0x00800002 x 8 68 c5 68 c5 68 c5 68 c5
18:00:32:6259 Rx 1 0x00200001 x 8 ed ff fe ff 00 00 00 00 messages missed

18:05:53:7022 Rx 1 0x00040239 x 8 bb b7 bb b7 bb b7 bb b7(Us)
18:05:55:6252 Rx 1 0x00800202 x 8 00 00 00 00 00 00 00 00 messages missed

18:11:16:7530 Rx 1 0x00200001 x 8 fd ff 03 00 00 00 00 00
18:11:18:6806 Rx 1 0x02800069 x 4 0b 00 00 00 messages missed

18:21:33:9236 Rx 1 0x00220001 x 8 1b 02 00 00 59 fa ff ff
18:21:35:8467 Rx 1 0x00260001 x 8 f1 ff f4 ff e0 fc 00 00 messages missed

18:26:56:7541 Rx 1 0x046a0131 x 8 f0 ff ff ff f3 ff ff ff
18:26:58:6797 Rx 1 0x00260001 x 8 f3 ff f9 ff e2 fc 00 00 messages missed

18:32:19:5836 Rx 1 0x00200001 x 8 00 00 01 00 ff ff 00 00
18:32:21:5073 Rx 1 0x00240001 x 8 c8 43 00 00 36 43 00 00 messages missed



18:42:36:5115 Rx 1 0x00200001 x 8 fc ff fc ff 00 00 00 00
18:42:38:4367 Rx 1 0x04620031 x 8 78 f6 11 15 2b c1 81 ba messages missed

18:58:45:1170 Rx 1 0x02800069 x 4 0c 00 00 00
18:58:47:0420 Rx 1 0x02800069 x 4 0c 00 00 00 messages missed

19:09:28:5426 Rx 1 0x00800102 x 8 68 c5 68 c5 68 c5 00 00
19:09:30:4686 Rx 1 0x00220001 x 8 43 06 00 00 3d ee ff ff messages missed

19:14:25:6602 Rx 1 0x00200001 x 8 05 00 fc ff 00 00 00 00
19:14:27:5857 Rx 1 0x00240001 x 8 52 e3 ff ff f4 e2 ff ff messages missed

19:19:48:6378 Rx 1 0x00260001 x 8 c4 ff ec ff d6 fc 00 00
19:19:50:5631 Rx 1 0x00260001 x 8 ce ff ea ff ef fc 00 00 messages missed

19:25:11:7046 Rx 1 0x00020039 x 5 00 00 00 00 00
19:25:13:6285 Rx 1 0x00220001 x 8 35 06 00 00 3b ee ff ff messages missed

19:35:29:2773 Rx 1 0x00260001 x 8 cf ff ef ff e8 fc 00 00
19:35:31:2047 Rx 1 0x00200001 x 8 00 00 01 00 fd ff 00 00 messages missed

19:40:52:4892 Rx 1 0x00200001 x 8 03 00 06 00 02 00 00 00
19:40:54:4141 Rx 1 0x02800069 x 4 0e 00 00 00 messages missed

19:46:15:6871 Rx 1 0x04240041 x 4 b2 f6 ff ff
19:46:17:6104 Rx 1 0x00200001 x 8 fe ff 04 00 00 00 00 00 messages missed

File total line count 44761677 data line count 44761664



Other processing
on the file shows that the time the messages are missed grows with the memory size used

16:54:08:1062 Rx 1 0x02800069 x 4 0d 00 00 00
16:54:08:1063 Rx 1 0x00000202 s 7 00 00 00 00 00 00 00
line count 16 time diff(Us) 100

16:54:08:1065 Rx 1 0x02800069 x 4 0d 00 00 00
16:54:08:1077 Rx 1 0x00000302 s 5 30 06 db cb 01
line 19 time diff(Us) 1200

16:54:08:1083 Rx 1 0x0c060000 x 8 06 45 4e 27 00 12 00 27
16:54:08:1103 Rx 1 0x02800069 x 4 0d 00 00 00
line 24 count time diff(Us) 2000

16:54:08:1117 Rx 1 0x00000302 s 5 30 06 db ca 01
16:54:08:1153 Rx 1 0x02800069 x 4 0d 00 00 00
line count 27 time diff(Us) 3600

16:54:08:1157 Rx 1 0x00000302 s 5 30 06 db ca 01
16:54:08:1195 Rx 1 0x00000202 s 7 00 00 00 00 00 00 00
line count 30 time diff(Us) 3800

16:54:08:1355 Rx 1 0x00000302 s 5 2d 06 db ca 01
16:54:08:1394 Rx 1 0x00000202 s 7 00 00 00 00 00 00 00
line count 44 time diff(Us) 3900

16:54:08:4503 Rx 1 0x02800069 x 4 0c 00 00 00
16:54:08:4553 Rx 1 0x02800069 x 4 0d 00 00 00
line count 274 time diff(Us) 5000

16:57:14:0504 Rx 1 0x00800002 x 8 68 c5 68 c5 68 c5 68 c5
16:57:15:9799 Rx 1 0x00260001 x 8 05 00 00 00 e6 fc 00 00
line count 744399 time diff(Us) 1929500

17:08:00:8298 Rx 1 0x00220001 x 8 2f 01 00 00 99 02 00 00
17:08:02:7611 Rx 1 0x00800102 x 8 68 c5 68 c5 68 c5 00 00
line count 3471887 time diff(Us) 1931300

17:12:58:7548 Rx 1 0x00260001 x 8 09 00 fd ff e4 fc 00 00
17:13:00:6862 Rx 1 0x00000302 s 5 e7 05 db ca 01
line count 4727631 time diff(Us) 1931400

17:18:22:5850 Rx 1 0x00800102 x 8 68 c5 68 c5 68 c5 00 00
17:18:24:5169 Rx 1 0x04000022 x 8 2a 4e 0e 00 03 01 00 00
line 6093391 time diff(Us) 1931900

File total line count 44761677 data line count 44761664

So I assume that it’s something to do with the way that the data is written to the SD card. I’ve tried opening new files every 24KB and that seemed to work, but is not an acceptable solution.

Could anyone suggest how I can find solution so that I can record all the messages without missing any. Would Raw writing work?

Regards
Ray

Using

Teensy 4.1,
Arduino compiler 2.0.4
SD card 16Gbyte configured as exFAT
 

Attachments

  • CAN_Logger_Renew.ino
    60 KB · Views: 21
SD card write times can vary and with large files it gets worse. The best method for dealing with this is to separate the SD writing process from the data collection using buffers. The data collection process stores data into buffers and when a buffer is full the SD write process writes it to the SD card. More buffers allows more time for the write to complete before running out of buffer space.
The details of how you do that depend on what you are doing. It could be as simple as doing data collection in an interrupt routine or as complicated as an RTOS.
 
Could you be missing CAN messages during the times your SD writes are blocking? Check out the SdFat example program TeensySdioLogger. It shows how to do a few things that will help:

1) Pre-allocate the file to minimize searching for free sectors
2) Use SdFat's built-in RingBuf
3) Use SdFat's isBusy() and write in 512-byte chunks to avoid blocking

Using the method shown in the example, SD writes will never block, with typical write time of 5 us. 1 GB/hour is only 300 KB/sec, and T4.1 can easily sustain that with plenty of time left over to do your CAN processing. With the method outlined above you'll need a RingBuf of as little as 16 KB, or 32 KB to be very safe.
 
Thank you for your prompt replies.

I have tried to use the code from TeensySdioLogger in the attached sketch, but am having some trouble to get it to work. The data doesn't seem to be saved in the file correctly and I was wondering if you could shed some light on the matter.

I am trying to get the code to work with one write every 512 and will then try and expand the sketch to use more buffers.

I look forward to any advice you can give me.
 

Attachments

  • Data_Logger_New_File_Format.ino
    6.1 KB · Views: 16
  • SdioLogger.txt
    512 bytes · Views: 16
  • SdioLogger.txt
    512 bytes · Views: 13
  • Data_Logger_New_File_Format.ino
    6.1 KB · Views: 12
Your RingBuf is only 512 bytes now. Make it larger, something like 32768 or 65536, just to be sure it's not an issue.

After setup(), assuming open() and preAllocate() are successful, you will have a 1GB file open and ready to write. Your state machine closes the file every time it writes and then re-opens it, which you should not do. The reason is that when you close the file and re-open, you no longer have the 1GB pre-allocated file. For now, while you're testing, I recommend you comment out the calls to close and re-open the file, and we can talk more about alternatives after you are able to write to the file.

Your call to rb.memcpyIn() to write data to the RingBuf looks good.

Where you write to the file, you have this:

Code:
        if (512 != rb.writeOut(512))
        {
          Serial.println("writeOut failed");
        }
        else
        {
          digitalToggle(LED);
        }

And you need this:

Code:
    // if RingBuf has at least 512 bytes and SD is not busy...
    if (rb.bytesUsed() >= 512 && !file.isBusy())
    {
        // Write one sector from RingBuf to file.
        if (512 != rb.writeOut(512))
        {
          Serial.println("writeOut failed");
        }
        else
        {
          digitalToggle(LED);
        }
    }

The purpose of !file.isBusy() is to avoid blocking. Sometimes a write to SD will require a search or erase, which can take a relatively long time. The call to write() that triggers the search/erase will return immediately, but if your code calls write() again while the SD is busy, it will block until the search/erase is complete. In my testing that longest delays have been about 40 ms, but some people say longer. I think it depends on the type of SD card you're using, how old it is, etc. In any case, as long as you never call write() while file.isBusy() is true, your code will never block. As long as your RingBuf is large enough to hold as much data as can accumulate during the delay, you will be able to write them to the RingBuf and not miss any due to being blocked in the SD write(). When the delay is over, let's say you have 4K of CAN data in RingBuf, your code would do 8 x 512-byte writes on successive passes through your state machine.

Let us know how it goes.
 
Looking at your sketch a little more, RingBuf can replace all of your existing buffering. You don't need any arrays or indexes. The sketch below always opens and writes to the same file, but it shows the basics of writing to the RingBuf from an ISR and reading from the RingBuf and writing to the file in loop(). The CAN interrupt is simulated with an IntervalTimer that runs at 1 kHz, and one 32-byte CAN message is written to the RingBuf on each interrupt. Data logging starts and stops when press the Enter button on the serial monitor. I deleted the actual CAN stuff to make it easier to see how the logging works.

Code:
#include <SdFat.h>
#include <RingBuf.h>
#include <IntervalTimer.h>

#define OPEN_LOG_FILE         0
#define WAIT_START_LOGGING    1
#define LOGGING_DATA          2
#define CLOSE_LOG_FILE        3

#define SD_CONFIG SdioConfig(FIFO_SDIO)

#define CAN_MESSAGE_FREQ_HZ    (1000U)
#define CAN_RECORD_SIZE_BYTES  (32U)
#define BYTES_PER_SECOND       (CAN_MESSAGE_FREQ_HZ * CAN_RECORD_SIZE_BYTES)
#define RING_BUF_CAPACITY      (BYTES_PER_SECOND / 10)   // 100-ms buffer
#define LOG_FILE_SIZE          (1024U * 1024U * 1024U)   // 1Gbyte

#define LOG_FILENAME "SdioLogger.csv"

SdFs sd;
FsFile file;

// RingBuf for File type FsFile.
RingBuf<FsFile, RING_BUF_CAPACITY> rb;

typedef struct {
  uint32_t      us;      //  4 bytes
  uint32_t      spare;   //  4 bytes
  uint8_t       msg[24]; // 24 bytes
} SD_Message_t;

uint8_t logging_state;

IntervalTimer timer;

/* ------------------------------------------------------------------ */
/* IntervalTimer handler to simulate receiving CAN messages           */
/* ------------------------------------------------------------------ */
void canSniff( void )
{
  SD_Message_t SD_msg = {
    micros(), // microseconds
    0,        // spare
    // msg
    {  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12,
      13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 }
  };

  if (logging_state == LOGGING_DATA)          // if logging
    rb.memcpyIn( &SD_msg, sizeof(SD_msg) );   //   write to RingBuf
}


void setup() {

  Serial.begin(115200);  //delay(400);
  while (!Serial) {}

  pinMode( LED_BUILTIN, OUTPUT );
 
  // Initialize the SD.
  if (!sd.begin(SD_CONFIG)) {
    sd.initErrorHalt(&Serial);
  }
  // Open or create file - truncate existing file.
  if (!file.open(LOG_FILENAME, O_RDWR | O_CREAT | O_TRUNC))
  {
    Serial.println("open failed\n");
    return;
  }
  else {
    //Serial.println("open same file?\n");
  } 

  // File must be pre-allocated to avoid huge
  // delays searching for free clusters.
  if (!file.preAllocate(LOG_FILE_SIZE)) {
     Serial.println("preAllocate failed\n");
     file.close();
     return;
  }

  // initialize the RingBuf.
  rb.begin(&file);

  timer.begin( canSniff, 1'000'000/CAN_MESSAGE_FREQ_HZ );
}

void loop() {

  switch (logging_state)      //Put into a function
  {
    case OPEN_LOG_FILE:
      Serial.println( "Type any character to start logging" );
      while (!Serial.available()) {
      }
      while (Serial.available()) {
        Serial.read();
      }
      Serial.println( "Type any character to stop logging" );
      logging_state =  LOGGING_DATA;   
      break;

    case LOGGING_DATA:   
      if (Serial.available()) {
        while (Serial.available())
          Serial.read();
    rb.sync();
    logging_state = CLOSE_LOG_FILE;
      }
      else if (rb.bytesUsed() >= 512 && !file.isBusy()) {
        // Write one sector from RingBuf to file.
        if (512 != rb.writeOut(512)) {
          Serial.println("writeOut failed");
        }
        else {
          digitalToggleFast( LED_BUILTIN );
        }
      }
      break;

    case CLOSE_LOG_FILE:
      rb.sync();
      digitalWriteFast( LED_BUILTIN, LOW );
      logging_state = OPEN_LOG_FILE;
      break;

    default:
      break;
  }
}
 
Hello,

Thank you for your sketch, which I really appreciate.

I have tested it and changed it as attached so that every 4 CAN messages the data is stored to the SD card. A close command would then be sent over CAN at the end of the test so that the file can be closed correctly and the data examined.

The reason the data is stored periodically is that in event of a power or system failure, when the file won't be closed correctly, we can look over the message log to try and determine the cause.

I need to do some more testing and look at some different scenarios but I think I'm now on the right track to manage the SD card so that no messages are missed.

Thanks again.

Ray
 

Attachments

  • Data_Logger_New_File_Format.ino
    6 KB · Views: 20
1733736227946.png


I apologise for the delay in sending this data, but I've been busy on another project. I hope it's what you were asking for.

Regards,
Ray
 

Attachments

  • BusMaster_CAN1.zip
    284.1 KB · Views: 16
One thing to add to the previous comments, if you're logging a busy bus (it looks like your bus load is fairly high) then you may want to look at reducing the number of calls to snprintf() in your CAN to text output formatting.
This is more an issue if you start looking at CAN-FD packets but doesn't hurt for normal CAN. We had a system logging some heavily loaded high speed CAN busses. Like you we'd done the obvious and had a for loop printing each data byte using printf. That combined with the time / ID information ended up being a lot of calls to printf, especially for 64 byte FD packets. If we dialed the data rate right up the system ended up spending 40% of the time creating the text strings we wanted to log. Simply replacing the printf calls for the data bytes with memcpys from a lookup table was a significant CPU saving. In the end we replaced all the printf calls with custom value to text routines and the time spend converting the packets from CAN to text went from ~400 us per ms to 15 us.

Beyond that, all I can recommend is basically what has already been said, split the CAN reception and the SD writing into two processes with a big buffer in between.
We actually went with 3 stages: High priority interrupt on can Rx copies the packet into a buffer waiting to be processed and performans any time critical actions based on the the packet. A low priority timer (set to the lowest possible priority) kicks off every 5 ms, handles non-time critical CAN processing and converts any waiting CAN packets into text format, the results are and written into a big text buffer. Background loops waits for there to be 512 bytes in the text buffer (a multiple of the SD card block size) and then writes to the SD card.
This way SD card writes can be as slow as they like without impacting any other performance as long as the buffers are big enough.
 
Back
Top