Teensy database solution guidance

Status
Not open for further replies.

JWizard

Member
Hello,

I need to implement some form of on-board database connected to either a Teensy 3.2 or 4.0.

Some preliminary information:
  • 100,000 entries transferred when initiated by user
  • ~100 bytes per entry
  • Finding and updating an entry needs to be virtually immediate
  • Non-Volatile

I have been playing with a Teensy 3.2 connected to a standard SDHC card formatted as FAT32, using the SD library. My thought was that if finds and updates are to be virtually immediate, I should write one file per entry; that way my "key" is the filename. The issue I am having is that it takes upwards of 20 minutes just to transfer the first 13,000 entries, which is unacceptable. My upper-limit target is 5 minutes to transfer the full 100,000 entries. However, I do not yet know for certain why it is so slow.

The other idea I want to explore is FRAM. I am thinking I can implement a hash table directly in FRAM, ensuring fast finds and updates. I am not certain how long the initial write will take, but since FRAM can be addressed at byte granularity instead of in large blocks, I expect it to be faster.
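To make the idea concrete, here is a minimal sketch of a fixed-record hash table laid out the way it might sit in FRAM, using open addressing with linear probing. This is an illustration, not the poster's actual implementation: a plain byte buffer stands in for the FRAM chip (on real hardware the memcpy calls would become FRAM read/write transactions, e.g. over SPI), the table is scaled down from 100,000 entries, and all names and sizes here are assumptions.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Simulated FRAM layout: NUM_SLOTS fixed-size records, each a 4-byte key
// followed by 100 bytes of payload. Sizes are scaled down for illustration.
constexpr size_t RECORD_SIZE = 104;   // 4-byte key + 100-byte entry
constexpr size_t NUM_SLOTS   = 128;   // power of two so we can mask
constexpr uint32_t EMPTY_KEY = 0;     // key 0 reserved to mark empty slots

std::vector<uint8_t> fram(NUM_SLOTS * RECORD_SIZE, 0);  // stands in for FRAM

uint32_t hash_key(uint32_t key) {
    // Simple multiplicative hash (Knuth); any decent mixer works here.
    return (key * 2654435761u) & (NUM_SLOTS - 1);
}

// Linear probing: walk slots until we find the key or hit an empty slot.
long find_slot(uint32_t key, bool for_insert) {
    uint32_t start = hash_key(key);
    for (size_t i = 0; i < NUM_SLOTS; ++i) {
        size_t slot = (start + i) & (NUM_SLOTS - 1);
        uint32_t stored;
        memcpy(&stored, &fram[slot * RECORD_SIZE], 4);
        if (stored == key) return (long)slot;
        if (stored == EMPTY_KEY) return for_insert ? (long)slot : -1;
    }
    return -1;  // table full (insert) or key absent (lookup)
}

bool put(uint32_t key, const uint8_t* payload) {
    long s = find_slot(key, true);
    if (s < 0) return false;
    size_t off = (size_t)s * RECORD_SIZE;
    memcpy(&fram[off], &key, 4);           // write key
    memcpy(&fram[off + 4], payload, 100);  // rewrite 100-byte entry in place
    return true;
}

bool get(uint32_t key, uint8_t* out) {
    long s = find_slot(key, false);
    if (s < 0) return false;
    memcpy(out, &fram[(size_t)s * RECORD_SIZE + 4], 100);
    return true;
}
```

The point of the layout: an update is just a seek to `slot * RECORD_SIZE` and a ~104-byte write, with no block-erase or sector rewrite, which is exactly where FRAM's byte-addressability pays off.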

Any thoughts on such a system? Does anyone have experience in dealing with >=100,000 database entries with Teensy?
 
While you might be able to do it with a Teensy, I suspect you would be better served by a Raspberry Pi.

However, I suspect that ultimately you will be limited by external factors (the speed of the USB link, of the SDHC media, of the network, etc.). That is at least 10 megabytes of data you are asking to move.

The Teensy 3.x boards transfer USB data at USB 1.1 speed (12 Mbit/s, or ~1 MB/s). The Teensy 4.0 transfers USB data at USB 2.0 speed (480 Mbit/s, or ~48 MB/s). However, that is a theoretical maximum; it is unlikely you will get anywhere near it without serious optimization on both the Teensy and the receiving computer. So, before even thinking about Teensy (or Raspberry Pi, etc.), you need to think about how you are transferring the data and where the various bottlenecks are.
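A quick back-of-envelope check of those numbers against the dataset size (names here are illustrative, not from any library):

```cpp
// Best-case transfer time, ignoring all protocol overhead.
double seconds_to_transfer(double megabytes, double megabits_per_sec) {
    return megabytes * 8.0 / megabits_per_sec;
}
// Dataset: 100,000 entries x ~100 bytes = ~10 MB.
// At  12 Mbit/s (USB 1.1):  ~6.7 s best case.
// At 480 Mbit/s (USB 2.0): ~0.17 s best case.
```

In other words, the raw USB link is nowhere near the bottleneck for 10 MB; the 20-minute figure has to be coming from somewhere else (per-file overhead, SD write latency, etc.).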

Good luck.
 
Keying on the filename sounds like a good way to segment and parse the data. Paying attention to the directory structure and the file system in use will make a difference. It is not clear whether you mean a large number of entries in a few files, or a large number of files with a few entries in each. Directory and subdirectory organization can make a huge difference in the file system overhead needed to index to and open a given file. In particular, the root directory should not be overloaded.

Instantaneous access/update might be possible - how long between subsequent updates? Read access to open a file and recall the data for an entry may be fast, but the write-back of the update may take some hundreds of milliseconds while the SD card saves the data.

There is a USBHost thread (T_3.6 or T_4.0) that would allow putting a flash drive, hard disk, or SSD on the Teensy. It is not clear whether that added complexity is workable or would result in better throughput. As noted, the T_4 device USB is 480 Mbit/s, so the incoming data should arrive quickly, and USBHost on both of those Teensys is also 480 Mbit/s. USBHost on the T_3.6 is easier, and the T_3.6 also has a 4-bit SDIO interface to the SD card that should improve throughput.
 

But note my results over in this thread. I was surprised that while SDIO helped a little bit (1.76 MB/s vs. 1.19 MB/s), it wasn't the 2-4x improvement I thought it might be:
 
@MM: I had just read that - indeed, surprised it wasn't at least 2-3x faster? Maybe something in the way the library makes the calls?

The post #3 USBHost reference was to this thread: USBHost_t36-USB-Mass-Storage-Driver-Experiments

This post shows writes at a few speeds:
SDIO FAT32 :: 1.75 MB/s
SDIO FAT16 :: 7.67 MB/s
HDD FAT32 :: 8.15 MB/s
HDD FAT16 :: 13.82 MB/s

Other posts on that thread typically get 8-15 MB/s on write, where the file system makes a difference; presumably reads would be faster, but that wasn't a tested sketch.

This post shows writes with ExFAT - on a HDD it seems:
File System FS_EXFAT (7296990 - 17.962475 MB/s)
File System FS_FAT32 (22447996 - 5.838918 MB/s)

But that thread and sample also support the T_3.6 or T_4.0 SDIO-connected SD card, which shows better write throughput than 1.75 MB/s - maybe only when FAT16-formatted?

<edit> There is also a BETA Greiman SD FS thread - IIRC it was showing 20 MB/s read speeds: github.com/greiman/SdFat-beta
>> Thread for :: pjrc.com/threads/57669-SdFat-SDIO-for-Teensy-4-0
 
Thank you @MichaelMeissner for your responses regarding transfer times.

You have helped point out to me that transfer of the initial database contents into my system is more than likely the largest bottleneck here. I have received confirmation from my team that it would be okay if initialization took even 10 to 20 minutes.

After that I only need to access the data (give or take 100 bytes per entry) by the key. Since I have no need to search the database by anything other than the official key, I believe I can make this happen with a Teensy and an SD card. I will try FRAM first, though: this is not intended to be removable media, FRAM's effectively unlimited write endurance removes any wear-out concern, and its byte-level access granularity will speed up updates since they will not require the 512-byte minimum block size of an SD card.

I do still get recommendation after recommendation to swap the Teensy for a Pi. I think this is the wrong move for the reasons above. The speeds mentioned in y'all's posts should be more than adequate for this specific use case, given that I can take even 20 minutes to import and store 10 MB. I have absolutely no need to hold chunks of the database in RAM for any purpose other than efficient transfer; when I look at the data I will always be looking at one isolated entry.

@defragster special thanks for your thoughts on using the key as the filename in an SD card solution. To clarify, each entry (or file, in this case) is just ~100 bytes of contiguous binary data that my MCU application will interpret. When I update an entry I rewrite all 100 bytes in a complete rewrite of the entry.
 
If I were going to tackle this problem, I would first get a better understanding of how the file allocation table (the "FAT" in FAT32) can impact performance. If you hold, say, several thousand DB entries per file, and each entry is a fixed size, simple math finds a given ID within that file. That may be faster than one ID per file, where FAT has to search through thousands of files to find it. Multiple entries per file also means you can read and write in multiples of 512 bytes for best performance.
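The arithmetic behind that layout can be sketched in a few lines. Everything here is illustrative: the record size, entries-per-file count, and file-naming scheme are assumptions, not anything from the thread.

```cpp
#include <cstdint>

// "Many fixed-size entries per file" layout: locating an entry is pure
// arithmetic, with no per-entry directory scan.
constexpr uint32_t RECORD_SIZE      = 128;   // 100 bytes of data, padded so
                                             // 4 records fill a 512-byte sector
constexpr uint32_t ENTRIES_PER_FILE = 4096;  // 100,000 entries -> 25 files

struct Location {
    uint32_t file_index;   // which file, e.g. "db0000.bin", "db0001.bin", ...
    uint32_t byte_offset;  // seek offset within that file
};

Location locate(uint32_t entry_id) {
    return { entry_id / ENTRIES_PER_FILE,
             (entry_id % ENTRIES_PER_FILE) * RECORD_SIZE };
}
```

On the Teensy you would then open the computed file, `seek()` to the offset, and read or rewrite one 128-byte record; padding the record size to a divisor of 512 keeps every record inside whole sectors, so each update touches exactly one sector.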

Having said that, this would take about 30 minutes to set up and have working on any Linux computer such as a Pi, and you would get networking etc. for free.
 