LittleFS port to Teensy/SPIFlash

I think I have everything synced up and built again. I also edited the example sketch so that, at the end, it lets me choose a quickFormat, a lowLevelFormat, or neither...

Running on my Beta Faster board...
Code:
C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 25 2020 17:25:21
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes......................
C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 26 2020 06:54:59
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes.........................
Big write took 62.44 Sec for 8280000 Bytes : file3.size()=0
	Big write KBytes per second 132.62 
Bytes Used: 8192, Bytes Total:16777216

C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 26 2020 06:59:44
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes.........................
Big write took 60.74 Sec for 8280000 Bytes : file3.size()=0
	Big write KBytes per second 136.32 
Bytes Used: 8306688, Bytes Total:16777216

: q-quickformat, l-lowlevel format, else no format
Bytes Used: 8192, Bytes Total:16777216

C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 26 2020 06:59:44
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes.........................
Big write took 64.65 Sec for 8280000 Bytes : file3.size()=0
	Big write KBytes per second 128.08 
Bytes Used: 8306688, Bytes Total:16777216

: q-quickformat, l-lowlevel format, else no format
.........................................................................................................................
Bytes Used: 8192, Bytes Total:16777216

C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 26 2020 06:59:44
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes.........................
Big write took 62.15 Sec for 8280000 Bytes : file3.size()=0
	Big write KBytes per second 133.23 
Bytes Used: 8306688, Bytes Total:16777216

: q-quickformat, l-lowlevel format, else no format
So it stays at about 128-136 KB per second. Now to try again on the new chip:
First with the older one I did: even after a low-level format, about 78 KB per second...
Ditto for the new one...

Just to make sure the same memory is on each board, I updated LittleFS to print out the chip's identification data...

screenshot.jpg
You can see the details in the screenshot of three TyCommander windows, where all three chips were programmed at the same time :D

It is almost as if the original chip had something run on it that switched it to Quad mode, while the newer boards are still running in single- or dual-pin SPI? Not sure if that makes sense?
 
The only big difference I can think of is the time needed for erasing.
Does the program erase anything?

The Teensy waits for the chip. Oh, and yes, maybe it waits when a block gets written (don't remember - must look) .. but this difference shouldn't be that big?

Edit: ehh...are we speaking about Flash or PSRAM?
 
This is about the Winbond QSPI flash chips. All three should be the same.

As I mentioned, some MTP transfers were failing when I tried to copy a 300KB file from Windows to the QSPI flash on my newer boards, but the same copy worked on the T4.1 beta board.

So the write of data is slower. Again not sure where yet. But as I mentioned I did verify that the system thinks they are the same chip.
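
For reference, this kind of identification check compares the three JEDEC ID bytes the chip returns; the `Chip Lookup: ef 40 18` line in the log output further down shows that ID. Here is a minimal sketch of such a lookup in plain C++. The `KnownChip` struct, the table, and `lookupChip()` are hypothetical names made up for illustration, not the LittleFS source, and the single entry uses only values that appear in the logs (ID `ef 40 18`, 16777216 bytes total):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical table entry: three JEDEC ID bytes plus the size the log reports.
struct KnownChip {
  uint8_t id[3];
  uint32_t totalBytes;
  const char *label;
};

static const KnownChip kKnownChips[] = {
  {{0xEF, 0x40, 0x18}, 16777216u, "Winbond 128 Mbit (ID ef 40 18)"},
};

// Returns the matching entry, or nullptr when the ID is not in the table.
const KnownChip *lookupChip(const uint8_t id[3]) {
  for (size_t i = 0; i < sizeof(kKnownChips) / sizeof(kKnownChips[0]); i++) {
    const KnownChip &c = kKnownChips[i];
    if (c.id[0] == id[0] && c.id[1] == id[1] && c.id[2] == id[2]) return &c;
  }
  return nullptr;
}
```

Because a lookup like this is keyed only on the ID bytes, chips from different production runs can match the same entry, which is consistent with all three boards reporting the same chip here.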
 
@defragster - I found the issue with files showing a size of 0 in the version I was using.

The size was being printed with %u, but it is a 64-bit value. When I changed it to %llu, it printed values like:
Code:
C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 26 2020 09:12:28
LittleFS - Chip Lookup: ef 40 18
  24 - 100 1000 1000000 bb8 61a80 402a4000
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes......
C:\Users\kurte\Documents\Arduino\bar\bar.ino Nov 26 2020 09:12:55
LittleFS - Chip Lookup: ef 40 18
  24 - 100 1000 1000000 bb8 61a80 402a4000
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes.........................
Big write took 103.66 Sec for 8280000 Bytes : file3.size()=8280000
	file3.position()=8280000
	Big write KBytes per second 79.88 
Bytes Used: 8306688, Bytes Total:16777216

: q-quickformat, l-lowlevel format, else no format
I was also curious, so I also printed out the file position to make sure it was at the end...
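
The format-specifier fix can be sketched in plain C++ as follows. Passing a 64-bit value where `%u` expects an `unsigned int` is undefined behaviour, which is presumably why the size showed up as 0. The helper name `formatSize` is made up for this illustration:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit size with a matching 64-bit conversion specifier.
// The cast keeps the call correct even where uint64_t is 'unsigned long'.
std::string formatSize(uint64_t size) {
  char buf[32];
  snprintf(buf, sizeof(buf), "%llu", (unsigned long long)size);
  return std::string(buf);
}
```

The same cast-plus-`%llu` pattern works directly inside `Serial.printf()` for `size()`, `position()`, `usedSize()` and `totalSize()`.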
 
Just out of curiosity, I ran @defragster's sketch from post #422 using a fresh PSRAM chip on a new T4.1. Some interesting results:
Code:
D:\Users\Merli\Documents\Arduino\T41\BIgFileTest_defrag\BIgFileTest_defrag.ino Nov 26 2020 11:27:07
Started QSPI_DISK

Bytes Used: 1716224, Bytes Total:16777216

Start Big write of 7428000 Bytes
.  Bytes To Write:7112000	Bytes Used: 2039808, totalSize()=16777216 {diff=14737408)
.  Bytes To Write:72000	Bytes Used: 9093120, totalSize()=16777216 {diff=7684096)

Big write took 55.10 Sec for 7428000 Bytes : file3.size()=7428000
	Big write KBytes per second 134.82 
Bytes Used: 9162752, Bytes Total:16777216
		myfs.quickFormat() Completed
Bytes Used: 8192, Bytes Total:16777216


AFTER QUICK FORMAT
Big write took 61.73 Sec for 8280000 Bytes : file3.size()=8280000
	Big write KBytes per second 134.14 
Bytes Used: 8306688, Bytes Total:16777216
		myfs.quickFormat() Completed
Bytes Used: 8192, Bytes Total:16777216


After LOWLEVEL FORMAT
Big write took 62.24 Sec for 8280000 Bytes : file3.size()=8280000
	Big write KBytes per second 133.03
Here is a comparison:
Code:
New chip, LFS format:  55.10 seconds
QuickFormat:           61.73 seconds
LLFormat:              62.24 seconds

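Seems like something changes after a quickFormat or LLFormat to slow the chip down, although the first pass wrote fewer bytes (7428000 vs 8280000) and the KB/s figures are nearly the same (134.82, 134.14, 133.03).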

On the Beta T4.1 gets even more interesting:
Code:
1st pass:     71.75 seconds
QuickFormat:  70.25 seconds
LLFormat:     69.38 seconds
In this case the times go down.
 
What I was wondering is whether any of our earlier tests, before LittleFS, somehow configured the chips differently?
 
I don't think any of our earlier tests were using the extmem_malloc implementation. Right now in LittleFS:
Code:
	bool begin(uint32_t size) {
#if defined(__IMXRT1062__)
		return begin(extmem_malloc(size), size);
#else
		return begin(malloc(size), size);
#endif
	}
which I don't think we were doing when we were originally playing with PSRAM. I think this is what I was using when we started with PSRAM: https://github.com/PaulStoffregen/teensy41_extram/blob/master/extRAM_t4/extRAM_t4.cpp.

Other than that, I believe all the chip configuration has remained the same, but I didn't look at those changes closely.
 
@mjs513 - I am not sure the PSRAM is used when testing the QSPI flash, except maybe for buffers. But I still wonder why one of my chips runs at the same speed as yours while the other two are roughly half the speed.

Here are the three boards... The one in the center with the SD card is the beta.
IMG_0283.jpg

Note the chips do look slightly different... All three PSRAMs came from PJRC. Of the two Winbonds on the outside boards, one came from PJRC and the other from Digikey.
 
Yeah, you are right - that was just me getting myself confused about QSPI. In terms of the flash, I'm not sure anything changed with the LUTs - I saw one difference, but it shouldn't impact performance.
 
The only big difference I can think of is the time needed for erasing.
Does the program erase anything?

The Teensy waits for the chip. Oh, and yes, maybe it waits when a block gets written (don't remember - must look) .. but this difference shouldn't be that big?

Yes, currently the flash writing & erasing waits for the flash chip to be completely finished before returning.

In the future I would like to change this, so writes are put into a queue and performed by the hardware "in the background". We might even support suspending writes or erases, so reads don't suffer long waits. Maybe. But that will make the code much more complicated. I'd like to at least release 1.54 using the simple code, so we have at least 1 stable release before attempting something so complex.

For PSRAM and reads from program flash, the M7's caches also come into play. Writes to the PSRAM will usually go into the cache and seem to complete very quickly. Many reads might also come from the cache. The cache also can be used when reading program flash. But QSPI isn't using the cache at all.
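
The blocking behaviour described above amounts to polling the chip's status register until the BUSY bit clears. Here is a minimal host-side sketch, assuming a hypothetical `readStatus` callback that stands in for the real register read (on Winbond W25Q parts this is command 0x05, with BUSY in bit 0). It is an illustration only, not the LittleFS implementation:

```cpp
#include <cstdint>
#include <functional>

// Polls until the BUSY bit (bit 0 of status register 1) clears.
// Returns the number of polls made, or -1 if maxPolls was reached first.
long waitWhileBusy(const std::function<uint8_t()> &readStatus, long maxPolls) {
  for (long polls = 1; polls <= maxPolls; polls++) {
    if ((readStatus() & 0x01) == 0) return polls;
  }
  return -1;
}
```

The time spent in a loop like this depends entirely on how long the chip takes to finish its program or erase, so a chip with slower internal timing would show lower write throughput with identical code.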
 
I am wondering if maybe I have the wrong Winbond part?

The one I have is the: https://www.digikey.com/en/products/detail/winbond-electronics/W25Q128JVSIQ-TR/7087212

@mjs513 - I believe in another thread you said you purchased the one: https://www.digikey.com/en/products/detail/winbond-electronics/W25Q128JVSIQ/5803943

I am wondering what the TR at the end implies. Maybe I need to purchase some more of the ones you mentioned.

I wonder if the ones that Paul shipped are the TR ones as well? Not sure...

May need to play. Might be good candidate to go into external test board when I get one.
 
That seems to be a digikey packaging notation :: W25Q128JVSIQTR-ND - Tape & Reel (TR)

The rest of the main part# and info appears the same.
 
:eek: - Yep, I am used to CT and the like, but TR was not something I normally order... Especially since they actually ship CT from this page. It is interesting that the page @mjs513 posted is about $.05 cheaper than the CT version.

So still wondering why my newer ones are running slower?

Bad soldering? I may touch one of these boards again with iron to see if that makes any difference?

Maybe running some of our early tests back with the teensy41_extram project changed the original one?

Note: I am trying to run with up-to-date stuff. That is, in my Arduino install I renamed hardware\teensy\avr\cores to some other name and made a link to the current cores project (mklink /D cores d:\github\cores),
which I try to keep in sync. And if something changed there, I would think it would affect all three of my setups.

Wondering if I maybe should try to solder my last T4.1 up with just the Flash and see if it acts any different.

Other suggestions? Has anyone else seen something like this?
 
@KurtE - others
Not sure why your new ones are slower - the timings I showed in post 429 indicate that the beta is running slower than my fresh flash. Maybe touch up the solder, as you said.
 
Another long shot: my original one from Paul has three lines of text: WinBond, 25Q128JVSQ, 1905
On the newer ones, whose cases look a bit different, the first two lines are the same and the last line is 2028.
Note: this is the same for the ones I ordered and the ones from PJRC.

My guess is that the last line is probably something like a manufacturing run? Curious what yours says?
 
Just checked, and mine shows 1949 - so it very well could be a run number.
 
Thanks. For the heck of it I ordered 5 more with the part number mentioned in your earlier post. They should be the same parts, but I'm curious to see what the run number will be.

I also tried to touch up solder. Had to use some more flux so washed off and sitting above stove to dry... Don't think this will help as I previously checked all of the pins with our HiLow test and it showed all of the pins had connectivity and no shorts... But...

@Paul - I know you are up to your eyeballs with stuff like keeping things in stock and suppliers happy, but if you do get a bit of time, maybe try a board with some of the same chips that are in my new boards and see if you get the same performance?

Code:
#include <LittleFS.h>

#define HALFCUT  // Comment this to see failed LARGE FILE
// HALFCUT defined will show invalid file size

//#define TEST_RAM
//#define TEST_SPI
#define TEST_QSPI
//#define TEST_PROG

// Set for SPI usage
const int FlashChipSelect = 6; // digital pin for flash chip CS pin

#ifdef TEST_RAM
LittleFS_RAM myfs;
// RUNTIME :: extern "C" uint8_t external_psram_size;
EXTMEM char buf[8 * 1024 * 1024];  // USE DMAMEM for more memory than ITCM allows - or remove
//DMAMEM char buf[490000];  // USE DMAMEM for more memory than ITCM allows - or remove
char szDiskMem[] = "RAM_DISK";
#elif defined(TEST_SPI)
//const int FlashChipSelect = 21; // Arduino 101 built-in SPI Flash
#define FORMATSPI
//#define FORMATSPI2
LittleFS_SPIFlash myfs;
char szDiskMem[] = "SPI_DISK";
#elif defined(TEST_PROG)
LittleFS_Program myfs;
char szDiskMem[] = "PRO_DISK";
#else // TEST_QSPI
LittleFS_QSPIFlash myfs;
char szDiskMem[] = "QSPI_DISK";
#endif

File file3;

void setup() {
  while (!Serial) ; // wait
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);

#ifdef TEST_RAM
  if (!myfs.begin(buf, sizeof(buf))) {
#elif defined(TEST_SPI)
#ifdef FORMATSPI
  if (!myfs.begin( FlashChipSelect )) {
#elif defined(FORMATSPI2)
  pinMode(FlashChipSelect, OUTPUT);
  digitalWriteFast(FlashChipSelect, LOW);
  SPI2.setMOSI(50);
  SPI2.setMISO(54);
  SPI2.setSCK(49);
  SPI2.begin();
  if (!myfs.begin(51, SPI2)) {
#endif
#elif defined(TEST_PROG)
  if (!myfs.begin(1024 * 1024 * 4)) {
#else
  if (!myfs.begin()) {
#endif
    Serial.printf("Error starting %s\n", szDiskMem);
  }
  else
    Serial.printf("Started %s\n", szDiskMem);

  bigFile();

  Serial.printf("Bytes Used: %llu, Bytes Total:%llu\n", myfs.usedSize(), myfs.totalSize());
  Serial.println("\n: q-quickformat, l-lowlevel format, else no format");
  while (!Serial.available()) ;
  int ch = Serial.read();
  if (ch == 'q') myfs.quickFormat(); // quick format on exit for next run
  if (ch == 'l') myfs.lowLevelFormat('.'); // Low level format;
  Serial.printf("Bytes Used: %llu, Bytes Total:%llu\n", myfs.usedSize(), myfs.totalSize());
}



void loop() {
  // put your main code here, to run repeatedly:

}

void bigFile(  ) {
  char myFile[] = "/bigfile.txt";

  // FILL DISK
  lfs_ssize_t resW = 1;
  char someData[4000];
  //uint32_t xx, toWrite = (myfs.totalSize()) - myfs.usedSize() - 40960; // allow for slack space :: WORKS on FLASH?
  uint32_t xx, toWrite = (myfs.totalSize()) - myfs.usedSize() - 202400; // allow for slack space
  Serial.printf("\nBytes Used: %llu, Bytes Total:%llu\n\n", myfs.usedSize(), myfs.totalSize());
#ifdef HALFCUT
  toWrite /= 2; // cutting to this works on LittleFS_RAM myfs - except reported file3.size()=2054847098
#endif
  xx = toWrite;
  memset( someData, 'z', 4000 );
  Serial.printf( "\nStart Big write of %u Bytes", xx);
  uint32_t timeMe = micros();
  file3 = myfs.open(myFile, FILE_WRITE);
  int hh = 0;
  while ( toWrite > 4000 && resW > 0 ) {
    resW = file3.write( someData , 4000 );
    hh++;
    if ( !(hh % 80) ) Serial.print('.');
    toWrite -= 4000;
  }
  xx -= toWrite;
  file3.close();
  timeMe = micros() - timeMe;
  file3 = myfs.open(myFile, FILE_WRITE);
  if ( resW < 0 ) {
    Serial.printf( "\nBig write ERR# %i 0x%X \n", resW, resW );
  }
  Serial.printf( "\nBig write took %5.2f Sec for %u Bytes : file3.size()=%llu", timeMe / 1000000.0, xx, file3.size() );
  Serial.printf( "\n\tfile3.position()=%llu", file3.position());
  Serial.printf( "\n\tBig write KBytes per second %5.2f \n", xx / (timeMe / 1000.0) );
  file3.close();
}
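
The byte counts and dot counts in the logs follow directly from the loop in bigFile(). Here is a worked sketch in plain C++ with the same arithmetic, assuming no write error occurs (resW stays positive). The `WritePlan` struct and `plannedWrite` function are names made up for illustration:

```cpp
#include <cstdint>

struct WritePlan {
  uint32_t requested;  // value printed as "Start Big write of ... Bytes"
  uint32_t written;    // bytes actually written in 4000-byte chunks
  uint32_t dots;       // one '.' per 80 chunks
};

// Mirrors the sketch: slack of 202400 bytes, halved (HALFCUT), 4000-byte writes.
WritePlan plannedWrite(uint64_t totalSize, uint64_t usedSize) {
  uint32_t toWrite = (uint32_t)(totalSize - usedSize - 202400) / 2;
  WritePlan p = {toWrite, 0, 0};
  uint32_t chunks = 0;
  while (toWrite > 4000) {
    chunks++;
    if (chunks % 80 == 0) p.dots++;
    toWrite -= 4000;
  }
  p.written = p.requested - toWrite;
  return p;
}
```

With totalSize 16777216 and usedSize 8192 this gives 8283312 requested, 8280000 written (2070 chunks) and 25 dots, which matches the output shown earlier in the thread.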
 
maybe try a board with some of the same chips that are in my new boards and see if you get the same performance?


With W25Q128JVSIQ (date code 2028):

Code:
/tmp/arduino_modified_sketch_28042/sketch_nov27a.ino Nov 27 2020 07:34:56
Started QSPI_DISK

Bytes Used: 12288, Bytes Total:16777216


Start Big write of 8281264 Bytes.........................
Big write took 101.95 Sec for 8280000 Bytes : file3.size()=8280000
	file3.position()=8280000
	Big write KBytes per second 81.21 
Bytes Used: 8310784, Bytes Total:16777216

: q-quickformat, l-lowlevel format, else no format


With W25Q128JVSIQ (date code 1905):

Code:
/tmp/arduino_modified_sketch_28042/sketch_nov27a.ino Nov 27 2020 07:34:56
Started QSPI_DISK

Bytes Used: 8192, Bytes Total:16777216


Start Big write of 8283312 Bytes.........................
Big write took 62.56 Sec for 8280000 Bytes : file3.size()=8280000
	file3.position()=8280000
	Big write KBytes per second 132.35 
Bytes Used: 8306688, Bytes Total:16777216

: q-quickformat, l-lowlevel format, else no format


With W25Q512JVEIM (date code 1950):

Code:
/tmp/arduino_modified_sketch_28042/sketch_nov27a.ino Nov 27 2020 07:34:56
Started QSPI_DISK

Bytes Used: 12288, Bytes Total:67108864


Start Big write of 33447088 Bytes........................................................................................................
Big write took 383.32 Sec for 33444000 Bytes : file3.size()=33444000
	file3.position()=33444000
	Big write KBytes per second 87.25 
Bytes Used: 33525760, Bytes Total:67108864

: q-quickformat, l-lowlevel format, else no format
 
Thanks Paul,

So your results confirm that the different date codes and the like do make a significant difference, and it was not my bad soldering ;)
Your 1905 - Big write KBytes per second 132.35
Your 2028 - Big write KBytes per second 81.21

Looks like the new W25Q512JVEIM is also slower: Big write KBytes per second 87.25

Looks like I may need to try to update my order to Digikey to have some of the newer ones :D
 
@KurtE
Out of curiosity I decided to give the BigFile sketch a try using SPIFFS, just as a comparison. If I converted the sketch correctly, these are the results with a date code of 1905 on my flash chip:
Code:
Bytes Used: 0, Bytes Total:15414161

Start Big write of 7504000 Bytes

Big write took 14.23 Sec 
	Big write KBytes per second 527.32 
bigfile.txt [0001] size:7504000

Couldn't resist this next one, looks like the max file size with SPIFFS is 14MB, otherwise I get an error on write, so:
Code:
Start Big write of 13912000 Bytes

Big write took 26.37 Sec 
	Big write KBytes per second 527.47 
bigfile.txt [0001] size:13912000

This is the sketch I used if you are interested:
Code:
/*
   This test uses the optional quad spi flash on Teensy 4.1
   https://github.com/pellepl/spiffs/wiki/Using-spiffs
   https://github.com/pellepl/spiffs/wiki/FAQ

   ATTENTION: Flash needs to be empty before first use of SPIFFS


   Frank B, 2020
*/
extern "C" uint8_t external_psram_size;

#include <spiffs_t4.h>
#include <spiffs.h>

spiffs_t4 myfs;

//Setup files IO
spiffs_file file1, file3;



void setup() {
  while (!Serial);
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
  Serial.printf("PSRAM: %d MB\n", external_psram_size);

  Serial.println("\n Enter 'y' in 6 seconds to format FlashChip - other to skip");
  uint32_t pauseS = millis();
  char chIn = 9;
  while ( millis() - pauseS < 6000 && 9 == chIn ) { // subtraction form is safe across millis() rollover
    if ( Serial.available() ) {
      do {
        if ( chIn != 'y' )
          chIn = Serial.read();
        else
          Serial.read();
      }
      while ( Serial.available() );
    }
  }
  if ( chIn == 'y' ) {
    myfs.begin(); // result was stored but never used
    myfs.eraseFlashChip();

  }

  myfs.begin();

  Serial.println();
  Serial.println("Mount SPIFFS:");
  myfs.fs_mount();

  Serial.println();
  Serial.println("Directory contents:");
  myfs.fs_listDir();

  bigFile();

    myfs.fs_listDir();

}

void loop() {}

void bigFile(  ) {
  char myFile[] = "bigfile.txt";
  int resW;

  uint32_t totalSize, usedSize;
  char someData[4000];
  uint32_t xx;
  Serial.println("Getting space"); Serial.flush();
  myfs.fs_space(&totalSize, &usedSize);
  uint32_t toWrite = ((totalSize - usedSize)/2) - 202400; // allow for slack space
  Serial.printf("\nBytes Used: %lu, Bytes Total:%lu\n\n", usedSize, totalSize);
  Serial.flush();
  xx = toWrite;
  memset( someData, 'z', 4000 );
  Serial.printf( "\nStart Big write of %lu Bytes\n", (toWrite - toWrite % 4000));
  uint32_t timeMe = micros();
  resW = myfs.f_open(file3, myFile, SPIFFS_CREAT | SPIFFS_TRUNC | SPIFFS_RDWR);
  Serial.println(resW);
  int hh = 0;
  while ( toWrite > 4000 && resW > 0 ) {
    resW = myfs.f_write(file3, someData , 4000 );
    hh++;
    if ( !(hh % 80) ) {
      Serial.print('.');
      myfs.fs_space(&totalSize, &usedSize);
      Serial.printf("  Bytes To Write:%lu\tBytes Used: %lu, totalSize =%lu {diff=%lu)\n", (toWrite - toWrite % 4000), usedSize, totalSize, totalSize - usedSize );
    }
    toWrite -= 4000;
  }
  xx -= toWrite; 
  myfs.f_close(file3);
  timeMe = micros() - timeMe;
  myfs.f_open(file3, myFile, SPIFFS_RDWR);
  if ( resW < 0 ) {
    Serial.printf( "\nBig write ERR# %i 0x%X \n", resW, resW );
  }
  Serial.printf( "\nBig write took %5.2f Sec ", timeMe / 1000000.0 );
  Serial.printf( "\n\tBig write KBytes per second %5.2f \n", xx / (timeMe / 1000.0) );
  myfs.f_close(file3);

}
 