Help with Large Array

Status
Not open for further replies.

GOB

Hi Folks,

I am working on a project that saves a GPS path on a Teensy3.1 so that it can be compared to a current location. To maintain high accuracy, a large number of the points from the GPS path must be saved.

The issue I am running into is that I am running out of RAM trying to save many points from the GPS path.

I have tried a number of methods to save the data. The first was simply creating a matrix where the first column was a North position and the second column was an East position.

After reading through the forums a bit, I tried this current method using a data structure.

Code:
#include "struct.h"

int tmp_N, tmp_E;


void setup() {
  Serial.begin(9600);
  
}

void loop() { 
  for(int i=0; i<memory; i++)
  { 
    
    //temporary data for test
    tmp_N = 100;
    tmp_E = 200;
    
    
    //assign temporary data into data_array fields
    data_array[i].Lnorth = tmp_N;
    data_array[i].Least = tmp_E;
    delay(100);
    Serial.print(data_array[i].Lnorth); Serial.print(" , "); Serial.println(data_array[i].Least);
  }
}

struct.h:
Code:
#include <Arduino.h>
const uint16_t memory = 7000;

struct history {
  int32_t Lnorth;
  int32_t Least;
  
};

extern history data_array[];

struct.cpp:

Code:
#include "struct.h"



history data_array[memory];


I am trying to save more than 10,000 points because I need a long history of the original track.

Any recommendations?

Thanks.
 
Each element of the array takes 8 bytes. If you bump up the array size to 10,000, that is 80,000 bytes. The Teensy 3.1/3.2 only has 64 KB (65,536 bytes) of read/write RAM, and other things besides your array must go into that RAM, such as the stack frame for the program, buffers for USB/serial devices, and any dynamically allocated memory. Even at 7,000 points, that is 56,000 bytes, which may be too close. And the Teensy LC is even worse, with only 8K of RAM.
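
The arithmetic above can be checked at compile time. This is only a minimal sketch: the 8 KB reserve for the stack and buffers is an assumed figure, not a measured one, and the names bytesFor and fitsInRam are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Same layout as the struct in the original post.
struct history {
  int32_t Lnorth;
  int32_t Least;
};

constexpr std::size_t kRamBytes     = 64 * 1024;  // Teensy 3.1/3.2 RAM (65,536 bytes)
constexpr std::size_t kReserveBytes = 8 * 1024;   // ASSUMED headroom for stack, USB/serial buffers

// Bytes needed to hold `points` entries of the history struct.
constexpr std::size_t bytesFor(std::size_t points) {
  return points * sizeof(history);
}

// True if the array plus the assumed reserve fits in RAM.
constexpr bool fitsInRam(std::size_t points) {
  return bytesFor(points) + kReserveBytes <= kRamBytes;
}

static_assert(sizeof(history) == 8, "two int32_t fields, no padding");
```

With these numbers, 7,000 points leaves a little over 1 KB beyond the assumed reserve, and 10,000 points does not fit at all.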

As stevech says, the only solution is to store the data in either SPI flash or a micro-SD card. You would then need to process this data in chunks (you read a chunk of memory, process it, and then read the next chunk -- for advanced usage, you would have two buffers, and start the reading on one buffer while processing the other).
 
Use differential readings to get data compression

The Teensy3.1 has 64k RAM, of which one can count on using all but a few kbytes. Each pair of int32_t values will use 8 bytes, so your 7000 is close to the maximum given the structure that you have chosen.

Try using a differential approach: if you are confident that your vehicle (or whatever object you're tracking) moves by no more than 32,767 position units in either direction between recorded samples (roughly 1/131072 of the total range of your uint32_t values), then it is possible to store only the difference between each successive measurement as an int16_t (signed 16-bit) number. At the beginning of a series of readings, use a pair of uint32_t values to describe a starting position, then make each successive reading a difference from the last.

With this technique, it's a good idea to ensure synchronization every so often by building a structure that contains a limited number of readings. During recording, it's necessary to record the complete position only at the start of each block, followed by a set of differences between successive readings in each differential array entry.

There are gotchas here... If your vehicle is close to either the north or south poles, then the Easterly difference from one reading to the next could exceed a 16-bit signed value. In that situation, a coordinate transform allows preservation of data compression properties with some arithmetic complexity.

Code:
#define DIFF_MAX   512

struct position_block
{
    uint32_t      initial_N;                // Northerly starting position of this block
    uint32_t      initial_E;                // Easterly starting position of this block
    int16_t       diff_count;               // Number of differential readings in the block

    int16_t       diff_N[DIFF_MAX];
    int16_t       diff_E[DIFF_MAX];
};

This is a classical form of numeric data compression; other techniques are available that give better results.
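
As a rough illustration of how the block above might be filled and read back, here is a minimal sketch. The helper names block_start, block_append and block_get are hypothetical, and the caller is assumed to remember the previous absolute position.

```cpp
#include <cstdint>

#define DIFF_MAX   512

struct position_block
{
    uint32_t      initial_N;                // Northerly starting position of this block
    uint32_t      initial_E;                // Easterly starting position of this block
    int16_t       diff_count;               // Number of differential readings in the block

    int16_t       diff_N[DIFF_MAX];
    int16_t       diff_E[DIFF_MAX];
};

// Hypothetical helper: begin a block at an absolute position.
void block_start(position_block &b, uint32_t n, uint32_t e)
{
    b.initial_N = n;
    b.initial_E = e;
    b.diff_count = 0;
}

// Hypothetical helper: store (n, e) as a difference from (prev_n, prev_e).
// Returns false if the block is full or the step does not fit in 16 bits;
// the caller should then start a new block at (n, e).
bool block_append(position_block &b, uint32_t prev_n, uint32_t prev_e,
                  uint32_t n, uint32_t e)
{
    if (b.diff_count >= DIFF_MAX) return false;
    const int64_t dn = static_cast<int64_t>(n) - static_cast<int64_t>(prev_n);
    const int64_t de = static_cast<int64_t>(e) - static_cast<int64_t>(prev_e);
    if (dn < INT16_MIN || dn > INT16_MAX) return false;
    if (de < INT16_MIN || de > INT16_MAX) return false;
    b.diff_N[b.diff_count] = static_cast<int16_t>(dn);
    b.diff_E[b.diff_count] = static_cast<int16_t>(de);
    b.diff_count++;
    return true;
}

// Hypothetical helper: rebuild the absolute position after `index` differences
// (index 0 is the starting position). Indexes past diff_count are clamped.
void block_get(const position_block &b, int index, uint32_t &n, uint32_t &e)
{
    n = b.initial_N;
    e = b.initial_E;
    for (int i = 0; i < index && i < b.diff_count; i++) {
        // Unsigned wrap-around makes adding a negative difference work correctly.
        n += static_cast<uint32_t>(static_cast<int32_t>(b.diff_N[i]));
        e += static_cast<uint32_t>(static_cast<int32_t>(b.diff_E[i]));
    }
}
```

Each reading after the first costs 4 bytes instead of 8, so a full block holds 513 positions in roughly half the space of the original struct array.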
 
There are many ways to compress a sequence of positions.

The changes from one reading to another are usually limited by physics, i.e. the GPS sensor has a maximum speed of say, 20m/s. Making an assumption like this would allow you to calculate the maximum change between readings. This would allow you to calculate the maximum number of bits required for the largest delta. You could store the absolute position for only one out of K readings and store the smaller delta from previous reading in between.
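
The bit-count calculation described above could look something like this. It is only a sketch: the function names are made up, and the sample interval and position unit used in the example are assumptions, not values taken from this thread's hardware.

```cpp
#include <cstdint>

// Largest possible change between two samples, in position units, rounded up.
// speed_mm_per_s: assumed top speed, interval_ms: time between samples,
// unit_mm: size of one position unit in millimetres.
uint32_t maxDeltaUnits(uint32_t speed_mm_per_s, uint32_t interval_ms, uint32_t unit_mm)
{
    const uint64_t mm = (static_cast<uint64_t>(speed_mm_per_s) * interval_ms + 999) / 1000;
    return static_cast<uint32_t>((mm + unit_mm - 1) / unit_mm);
}

// Bits needed for a signed delta whose magnitude never exceeds max_delta,
// including the sign bit.
int bitsForSignedDelta(uint32_t max_delta)
{
    int bits = 1;                 // sign bit
    while (max_delta > 0) {       // one bit per binary digit of the magnitude
        bits++;
        max_delta >>= 1;
    }
    return bits;
}
```

For example, at 20 m/s with an assumed sample every 100 ms and an assumed unit of 1 cm, maxDeltaUnits(20000, 100, 10) is 200 units, and bitsForSignedDelta(200) is 9 bits per coordinate.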

How accurate do your readings need to be? If the bottom few bits of each reading are noise or the resolution you need doesn't include the bottom few bits, don't store them.
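
A minimal sketch of dropping the low bits, assuming signed 32-bit readings like the ones in the original post. The names dropLowBits and restoreLowBits are hypothetical.

```cpp
#include <cstdint>

// Discard the lowest `bits` bits of a reading (floor division, so negative
// values round in the same direction as positive ones).
int32_t dropLowBits(int32_t value, int bits)
{
    const int32_t step = static_cast<int32_t>(1) << bits;
    int32_t q = value / step;          // truncates toward zero
    if (value % step < 0) q--;         // adjust to floor for negative values
    return q;
}

// Scale a stored value back up; the result is within one step of the original.
int32_t restoreLowBits(int32_t stored, int bits)
{
    return stored * (static_cast<int32_t>(1) << bits);
}
```

Dropping 3 bits, for instance, means a 16-bit difference can cover 8 times the distance, at the cost of 8 units of resolution.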

You only need lots of samples when the path is complex. If large parts of the path can be described by a smooth curve, you could store the path in sections where some of the sections are represented as parameters of a parametric curve such as a Bezier polynomial.

The Ramer-Douglas-Peucker algorithm can be used to discard samples while minimizing the distortion to the path.
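
For reference, a compact sketch of that algorithm is below. The Point struct, the function names and the epsilon tolerance (in position units) are assumptions for illustration. It is recursive, so on a microcontroller with a small stack it may be better suited to short segments or to preprocessing on a PC.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point { int32_t north; int32_t east; };

// Perpendicular distance from p to the line through a and b
// (falls back to point-to-point distance when a and b coincide).
static double distanceToLine(const Point &p, const Point &a, const Point &b)
{
    const double dx = double(b.east) - double(a.east);
    const double dy = double(b.north) - double(a.north);
    const double len = std::sqrt(dx * dx + dy * dy);
    if (len == 0.0) {
        const double ex = double(p.east) - double(a.east);
        const double ny = double(p.north) - double(a.north);
        return std::sqrt(ex * ex + ny * ny);
    }
    const double cross = dx * (double(a.north) - double(p.north))
                       - dy * (double(a.east) - double(p.east));
    return std::fabs(cross) / len;
}

// Mark the points between first and last that must be kept.
static void simplifyRange(const std::vector<Point> &pts, std::size_t first,
                          std::size_t last, double epsilon, std::vector<bool> &keep)
{
    double maxDist = 0.0;
    std::size_t index = first;
    for (std::size_t i = first + 1; i < last; i++) {
        const double d = distanceToLine(pts[i], pts[first], pts[last]);
        if (d > maxDist) { maxDist = d; index = i; }
    }
    if (maxDist > epsilon) {
        keep[index] = true;                       // farthest point survives
        simplifyRange(pts, first, index, epsilon, keep);
        simplifyRange(pts, index, last, epsilon, keep);
    }
}

// Return the path with points closer than epsilon to the simplified line removed.
std::vector<Point> simplifyPath(const std::vector<Point> &pts, double epsilon)
{
    if (pts.size() < 3) return pts;
    std::vector<bool> keep(pts.size(), false);
    keep[0] = true;
    keep[pts.size() - 1] = true;
    simplifyRange(pts, 0, pts.size() - 1, epsilon, keep);

    std::vector<Point> out;
    for (std::size_t i = 0; i < pts.size(); i++) {
        if (keep[i]) out.push_back(pts[i]);
    }
    return out;
}
```

An L-shaped path of nine points (straight north, then straight east) reduces to its three corner points for any small epsilon.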

If you're willing to do some research, there's tons of work you could leverage for this.

Or you could throw hardware at it as Steve suggested. :)
 
Thanks for all the responses and advice folks.
I'm going to try the SD card approach first, and see if I can read and write fast enough for my application. If not, I'll try SPI flash.

Are there any good examples out there of the two-buffer method, reading into one buffer while processing the other?
 
I'm going to try the SD card approach first, and see if I can read and write fast enough for my application. If not, I'll try SPI flash.

It's hard to imagine SD will be too slow. SD is plenty fast enough to write a stereo 16 bit, 44.1 kHz audio stream in real time (eg, File > Examples > Audio > Recorder).

However, as you can see in that recorder example if you uncomment the lines to print write latency, occasionally writing does take several milliseconds. If you've got data arriving rapidly, you need an interrupt or DMA to pile it up in a buffer while you're busy with the SD card (that's the purpose of the "queue" objects in the audio lib... to automatically queue up data if your sketch sometimes needs more time than the rapid audio rate would allow). The Arduino SD library waits for the write to complete, which makes collecting fast data and getting it all written to the card pretty challenging.

The SPI flash chips are somewhat easier, even though they don't write as fast, because the SerialFlash library returns immediately and allows the write to happen as your code runs. It can even suspend & resume the write if you need to read other data before the write completes, though there's a performance penalty for doing so. SerialFlash only waits if you do another write or some other operation that can't occur until the write completes. Those features, and the slower but much more consistent write speed of SPI flash would make collecting incoming data simpler, if it arrives while the physical media is busy writing.
 
Thanks, Paul.

I guess I have to optimize the following process and I don't have the knowledge yet on how to do this well with an SD card:

1. Data comes in (via xBee over serial) from a GPS unit with a value for North and a value for East (N1,E1). This is coming in rather quickly, every 15 - 100 milliseconds.

2. I need to save the history of this data (N1,E1) for some time

3. At the same time, the teensy is reading over serial data from a second GPS at a high rate - call this (N2,E2)

4. I then search through the (N1,E1) history for an entry whose north value matches the current N2.

I did this earlier with arrays, loops and structures but was limited by the amount of history I could save of N1,E1. Now that I want to do it with an SD card I could use some help writing that code well. I imagine I don't want to be opening and closing the SD file every loop through?

Thanks so much for all your help.
 
You can probably just build your program in the simplest, easiest way possible. The serial ports have buffers to capture incoming data. If you run into trouble with lost data during SD writes, just edit serial1.c, serial2.c or serial3.c to increase the receive buffer.

This is coming in rather quickly, every 15 - 100 ms.

Try to keep timing in perspective.

The audio library captures 16 bit data 44100 times per second, or every 22.7 us. Every 15-100 ms may seem fast, but that's ~1000 times slower than audio!

If you're concerned about timing, use elapsedMicros or micros() to measure the amount of time the SD file write() actually takes. Likewise, you can measure the actual elapsed time between data arrival.

Teensy 3.1 is incredibly fast. Data every 15 ms is actually quite slow, relative to what can be accomplished on Teensy. There's unlikely to be any need to go to extra effort to optimize for data arriving and logging this slowly.
 
Thanks, Paul.

So opening and closing the file every loop for writing and reading shouldn't be a problem?

Also, what is the best way for searching through the text on the SD card?
 
Well depending on how much processing you need to do, and whether you need to do it in real time, one way of processing it would be to take out the SD card and move it to a PC or Raspberry Pi, possibly switching cards to minimize down time.

Alternatively, perhaps you don't really need to keep all 10,000 points, but instead do rolling sums with a larger data type (like int64_t), and just keep the last 200 or so in memory using a circular buffer to overwrite the earlier entries.
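
A minimal sketch of that circular buffer idea, with 64-bit rolling sums kept alongside it. The class name RecentPoints and its methods are made up for illustration. The history struct is the one from the original post.

```cpp
#include <cstddef>
#include <cstdint>

struct history {
  int32_t Lnorth;
  int32_t Least;
};

// Keeps the newest N points; once full, each push overwrites the oldest entry.
template <std::size_t N>
class RecentPoints {
public:
  void push(const history &p) {
    if (count_ == N) {
      // Buffer is full: the slot at head_ holds the oldest point, so remove
      // it from the rolling sums before overwriting it.
      sumNorth_ -= data_[head_].Lnorth;
      sumEast_  -= data_[head_].Least;
    } else {
      count_++;
    }
    data_[head_] = p;
    sumNorth_ += p.Lnorth;
    sumEast_  += p.Least;
    head_ = (head_ + 1) % N;
  }

  std::size_t size() const { return count_; }
  int64_t sumNorth() const { return sumNorth_; }
  int64_t sumEast() const { return sumEast_; }

  // age 0 is the newest point; the caller must keep age < size().
  const history &fromNewest(std::size_t age) const {
    return data_[(head_ + N - 1 - age) % N];
  }

private:
  history data_[N] = {};
  std::size_t head_ = 0;     // next slot to write
  std::size_t count_ = 0;    // number of valid entries
  int64_t sumNorth_ = 0;
  int64_t sumEast_ = 0;
};
```

A RecentPoints<200> instance would hold the last 200 points in 1,600 bytes plus a few bytes of bookkeeping.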
 
So opening and closing the file every loop for writing and reading shouldn't be a problem?

Perhaps you could measure with elapsedMicros or micros() on your actual card?

The timing will depend on how many other files are in the directory, and if you're accessing the file in a subdir, how many others it has to scan to reach your file. The actual speed of the SD card also matters....

I can't say exactly how fast your card and its data structures are. But some very simple code can tell you exactly how many microseconds are needed. Maybe if you do this, you could share your results here, for others who wish to do similar GPS logging with SD cards?
 
Thanks, Paul -- I'll share results once I have it up and running.

Right now I am struggling more with the search through the csv file aspect.
 
OK, here's the code I have implemented to do the search algorithm.

This is for doing one search through a large file (100,000 rows and 2 columns) whose data looks like this: (LEADnorth_history, LEADeast_history)

10 , 5
11 , 5
12 , 5
13 , 5

etc. in a csv file.

I have simulated the current position that is looking for a match in this data as a constant (north,east).

Code:
/*
 
* SD card attached to SPI bus as follows:
** MOSI - pin 11
** MISO - pin 12
** CLK - pin 13
** CS - pin 4
 
*/

#include <SD.h>
#include <SPI.h>

File myFile;

const byte numChars = 32;
char receivedChars[numChars];	// an array to store the received data
int LEADnorth_hist = 0;
int LEADeast_hist = 0;

boolean newData = false;

int north = 10000;    //simulation of current north
int east = 0;      //simulation of current east

boolean matchflag = false;    //flag to check if there is a match, exit loop


void setup()
{
    // Open serial communications and wait for port to open:
     Serial.begin(9600);
      while (!Serial) {
       ; // wait for serial port to connect. 
     }
    
    
     Serial.print("Initializing SD card...");
     // On the Ethernet Shield, CS is pin 4. It's set as an output by default.
     // Note that even if it's not used as the CS pin, the hardware SS pin 
     // (10 on most Arduino boards, 53 on the Mega) must be left as an output 
     // or the SD library functions will not work. 
      pinMode(10, OUTPUT);
      
     if (!SD.begin(4)) {
       Serial.println("initialization failed!");
       return;
     }
     Serial.println("initialization done.");  
       
     unsigned long starttime = micros();   // micros() returns an unsigned long
     // re-open the file for reading:
     myFile = SD.open("data.csv");
     if (myFile)
       {
         while (myFile.available()) {
           recvWithEndMarker();   //gets the raw serial data from the csv and puts into an array
           showNewData();         //resets new data to false and can display raw char to serial monitor
           parseData();           //parses the raw data into two ints
          
           
           MatchSearch();         //searches for match of north to LEADnorth and also compares easts at that LEADnorth
           if (matchflag == true)
           {
             break;
           }
           
         }
       }  
     else 
       {
         // if the file didn't open, print an error:
         Serial.println("error opening data.csv");
       }
       
      
     // close the file:
     myFile.close();
     Serial.print("elapsed time: ");
     Serial.println(micros()-starttime);
 
}


void loop()
{
	// nothing happens after setup
}

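void recvWithEndMarker()
{
        static byte ndx = 0;
        const char endMarker = '\n';

        while (newData == false)
        {
          int rc = myFile.read();   // read() returns -1 at end of file

          if (rc >= 0 && rc != endMarker)
          {
            receivedChars[ndx] = (char)rc;
            ndx++;
            if (ndx >= numChars) {
              ndx = numChars - 1;   // line too long: keep overwriting the last slot
            }
          }
          else
          {
            // end of line, or end of file without a trailing newline
            receivedChars[ndx] = '\0';   //terminate the string
            ndx = 0;
            newData = true;
          }
        }
}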

void showNewData()
{
      if (newData == true)
      {
          //Serial.print("This just in ...");
          //Serial.println(receivedChars);
          newData = false;
      
      }
}

void parseData()
{
      //split the data into its parts

      char * strtokIndx;    //pointer to the current token returned by strtok()

      strtokIndx = strtok(receivedChars,",");    //get the first part
      if (strtokIndx == NULL) return;             //empty line: nothing to parse
      LEADnorth_hist = atoi(strtokIndx);          //convert to an integer

      strtokIndx = strtok(NULL, ","); // this continues where the previous call left off
      if (strtokIndx == NULL) return;             //no second column on this line
      LEADeast_hist = atoi(strtokIndx);     // convert this part to an integer
}



void  MatchSearch()
{
       if (LEADnorth_hist == north)
       {
           Serial.println("MATCH!");
           Serial.print("north: ");
           Serial.print(north);
           Serial.print(" LEADnorth: ");
           Serial.println(LEADnorth_hist);
           Serial.print("east: ");
           Serial.print(east);
           Serial.print(" LEADeast at this north: ");
           Serial.println(LEADeast_hist);
           Serial.print("correction factor: ");
           Serial.println(LEADeast_hist - east);
           matchflag = true;
       }
       
}

The system works, but unfortunately is somewhat slow. Here is sample output for when I set north to be 10,000:

Initializing SD card...initialization done.
MATCH!
north: 10000 LEADnorth: 10000
east: 0 LEADeast at this north: 5
correction factor: 5
elapsed time: 586730


note that elapsed time is in microseconds.

Ideally I would want to be running this search function many times a second, while also writing new values to the data file many times a second. Any ideas on how I could improve things?

some details: I am using this card: Samsung micro sd (amazon link)

and the SD reader that is on the back of the TFT display from PJRC: PJRC site (Color 320x240 TFT Display, ILI9341 Controller Chip)

Thanks!
 
Hi folks,

Sorry to revive an old thread of mine, but I am getting back onto this project after a long hiatus.

Could anyone help me figure out how to shorten these search times through a CSV file with many rows and 2 columns?

The code is the same as in my earlier post above.

Thanks!
 
Ok, a suggestion:
Don't use an ASCII file, use a binary file instead.
That means: write two unsigned ints per "line" (NOT the ASCII digits). Don't write the comma or the end-of-line marker.
2 unsigned ints = 8 bytes per dataset

If you want to read the 10000th line, do a seek(10000*8) and read both values.

https://www.arduino.cc/en/Reference/FileSeek

You can optimize this further if you don't write the first value...
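
Here is a rough sketch of the fixed-size record idea. So that it runs on a PC, it uses a std::stringstream as a stand-in for the SD file. On the Teensy the corresponding calls would be the SD library File's write(), seek() and read(). The Record struct and the function names are assumptions for illustration.

```cpp
#include <cstdint>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>

// One dataset: two unsigned 32-bit values, no text, no separators.
struct Record { uint32_t north; uint32_t east; };
static_assert(sizeof(Record) == 8, "record must be exactly 8 bytes");

// Append one record as raw bytes (native byte order).
void writeRecord(std::ostream &out, const Record &r)
{
    out.write(reinterpret_cast<const char *>(&r), sizeof r);
}

// Read record number `line` (0-based) by jumping straight to its byte offset.
bool readRecord(std::istream &in, uint32_t line, Record &r)
{
    const std::streamoff offset =
        static_cast<std::streamoff>(line) * static_cast<std::streamoff>(sizeof(Record));
    in.clear();
    in.seekg(offset, std::ios::beg);
    in.read(reinterpret_cast<char *>(&r), sizeof r);
    return in.gcount() == static_cast<std::streamsize>(sizeof r);
}

// Demo: write 100 records (north = 10 + i, east = 5) into an in-memory
// "file", then fetch line 42 without reading anything before it.
Record demoLookup()
{
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    for (uint32_t i = 0; i < 100; i++) {
        writeRecord(file, Record{10 + i, 5});
    }
    Record r{0, 0};
    readRecord(file, 42, r);
    return r;
}
```

Because every record has the same size, finding line n costs one seek and one 8-byte read, no matter how large the file is.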
 
Thanks! Will try this now.

I need both values, as this is essentially a lookup table. When I find the first entry (first column) I need to know what the corresponding second entry is (second column).
 
Okay, having some issues with this.

1. I am not seeking a certain line, I am seeking to match a value.
2. How would you go about creating this binary file?
 
?? OK, then why this example file? It makes no sense: the first column in your example file is just consecutive line numbers.
Please post a real file.

Use write() to create a binary file.
 
Hi Frank - the example file makes it very easy to test the search algorithm and see if it is correct. It also makes it easy to see how much the time to find a solution changes when the match in the data is at the beginning of the dataset vs. at the end.

I don't have a real dataset yet, I am writing this to make it work with simulated data before I attach real sensors.
 
OK, in this case the only hint I can give is: don't use single-byte reads; read 512-byte blocks at once. This is the internal block size the SD library uses, and it will be much faster.

But still, try to use binary data - the file will be much smaller - and therefore less data to read.
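
Putting both hints together, a value search over binary records read 512 bytes at a time might look like the sketch below. A std::stringstream stands in for the SD file so the code runs on a PC. On the Teensy, the SD library File's buffer read would take the place of in.read(). The Record struct and the function names are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ios>
#include <istream>
#include <sstream>

struct Record { uint32_t north; uint32_t east; };
static_assert(sizeof(Record) == 8, "64 records fit exactly in one 512-byte block");

// Scan the stream one 512-byte block at a time. On the first record whose
// north value matches, store its east value and return true.
bool findEast(std::istream &in, uint32_t north, uint32_t &east)
{
    char block[512];
    for (;;) {
        in.read(block, sizeof block);
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got < sizeof(Record)) return false;   // end of data, no match
        for (std::size_t off = 0; off + sizeof(Record) <= got; off += sizeof(Record)) {
            Record r;
            std::memcpy(&r, block + off, sizeof r);   // avoids alignment problems
            if (r.north == north) {
                east = r.east;
                return true;
            }
        }
    }
}

// Demo with an in-memory stream holding 1000 records (north = 10 + i, east = 5).
bool demoSearch(uint32_t north, uint32_t &east)
{
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    for (uint32_t i = 0; i < 1000; i++) {
        const Record r{10 + i, 5};
        file.write(reinterpret_cast<const char *>(&r), sizeof r);
    }
    return findEast(file, north, east);
}
```

Compared with the text version, each block read delivers 64 complete records and there is no strtok() or atoi() work per line.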
 