Additional PSRAM ID that works plus goodies

My samples of the IS66WVS16M8FBLL (3.3V version) came in and I put two 16MB chips (32MB total) on a Teensy 4.1. The startup.c in cores was updated to the new version that is going into Beta 5.

View attachment 37817

Code:
32MB default 88MHz
test ran for 144.97 seconds. All memory tests passed :-)

32MB CCM_CBCMR=95AE8304 (105.6 MHz)
test ran for 123.66 seconds.  All memory tests passed :-)

32MB CCM_CBCMR=B5AE8204 (110.8 MHz)
test ran for 118.56 seconds.  All memory tests passed :-)

32MB CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 110.65 seconds.  All memory tests passed :-)

32MB CCM_CBCMR=55AE8004 (132.0 MHz) - FAILED
testing with fixed pattern 0F0F0F0F
Error at 710016C0, read 0F0F1F0F but expected 0F0F0F0F

I reran the same test and got exactly the same failure. This failed at a lower speed than xxxajk saw, so I removed one PSRAM chip to see whether that was the difference, or whether it was the fact that I was using 3.3V rather than 1.8V parts.

Code:
16MB CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 55.27 seconds.  All memory tests passed :-)

16MB CCM_CBCMR=55AE8004 (132.0 MHz)
test ran for 51.07 seconds.  All memory tests passed :-)

16MB CCM_CBCMR=95AE8104 (144.0 MHz)
test ran for 47.09 seconds.  All memory tests passed :-)

16MB CCM_CBCMR=75AE8204 (166.2 MHz)
testing with fixed pattern 5A698421.
Error at 70000000, read DF6B9C23 but expected 5A698421

Now the results match xxxajk's, so it appears that the issue is likely bus loading. These PSRAMs are made of two stacked 8MB parts, so I'm guessing that two of them load the bus about as much as four of the regular 8MB parts.

To check this further, I added a 2Gb NAND Flash chip, and the failure point returned to the same speed as with the 32MB PSRAM setup.

View attachment 37818

Code:
16MB + 2Gb Flash CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 55.27 seconds.  All memory tests passed :-)

16MB + 2Gb Flash CCM_CBCMR=55AE8004 (132.0 MHz)
testing with fixed pattern 5A698421
Error at 700F7700, read 5A698423 but expected 5A698421

It would have been nice if they worked at the 132MHz that everything else seems to run at OK, but 120MHz is still pretty decent.

Perhaps bypass capacitor distance could be a factor too?
C20 is quite a bit away from the second chip, and it failed within the second 16MB.


As a final set of tests, I tried to run my combined PSRAM and NAND Flash test at the default 88MHz bus speed. The PSRAM test passed as expected, but I received an error starting the QSPI disk, which I think was expected since LittleFS needs to be updated to work.

I then downloaded @h4yn0nnym0u5e's revised LittleFS.cpp file and tried it again. https://github.com/h4yn0nnym0u5e/LittleFS/tree/main/src

It compiled and downloaded but got no output. The PSRAM test didn’t even seem to start. I did not see a revised LittleFS_NAND.cpp file so perhaps that was the issue? Not sure if both files need to be modified for NAND or I was just pulling from the wrong place.
 
SPI is in SLAVE MODE and not in control of its own clocking, so it needs to have something ready ahead of time, and RAM may be BUSY (bus contention) since it MAY be writing from the other DMA channel.

There's also addressing overhead, caching, etc.
Same thing happens when using DMAMEM, the TX DMA instance doesn't get the byte out of RAM in time.
Sounds like yet another case for using pre-emptible DMA channels, so the writing can be interrupted at any moment by the reading...

(Prefetching might work too, but it's doubtful it will make any noticeable difference if DMAMEM has the same problem.)
 
That still wouldn't fix the problem, since it would need to be able to post the next byte within one 8 MHz HALF-CYCLE, i.e. in less than 62.5 ns.

88 MHz gives us 11.36 ns per clock. Transmitting the action and the address, then reading or writing the 64 bits to/from SPI's cache, in that window ain't happening, and it may also have to finish a transaction that is already in progress. If the AHB isn't ready, it can stall as well.

From a cold start...
Code:
FLEXSPI2_FLSHA1CR1 = FLEXSPI_FLSHCR1_CSINTERVAL(2)
                | FLEXSPI_FLSHCR1_TCSH(3) | FLEXSPI_FLSHCR1_TCSS(3);
CSINTERVAL(2): 22.72 ns
plus
FLEXSPI_FLSHCR1_TCSH(3): 34.08 ns
plus
FLEXSPI_FLSHCR1_TCSS(3): 34.08 ns
equals 90.88 ns in wait states alone, not including any command/address/data/turn-around cycles
21 cycles to get to the data to read = 238.56 ns
90.88 + 238.56 = 329.44 ns PLUS the actual data reads (8 of them??) 90.88 ns
329.44 + 90.88 = 420.32 ns
Writes are quicker: 13 cycles, with no turn-around delay, 147.68 ns
147.68 + 90.88 = 238.56 ns

Perfect scenario, one DMA writing and another one reading:
420.32 + 238.56 = 658.88 ns


99 MHz gives us 10.10 ns, so:
20.20 + 30.30 + 30.30 = 80.80 ns in wait states
21 cycles to get to the data to read = 212.10 ns
80.80 + 212.10 = 292.90 ns PLUS the actual data reads (8 of them??) 80.80 ns
292.90 + 80.80 = 373.70 ns
13 cycles for write 131.30 ns
131.30 + 80.80 = 212.10 ns

Perfect scenario, one DMA writing and another one reading:
373.70 + 212.10 = 585.80 ns

Those 73.08 ns make all the difference.

EDIT: ideally I'd want the latency down to under 500 ns
 
I then downloaded @h4yn0nnym0u5e revised LittleFS.cpp file and tried it again. https://github.com/h4yn0nnym0u5e/LittleFS/tree/main/src

It compiled and downloaded but got no output. The PSRAM test didn’t even seem to start. I did not see a revised LittleFS_NAND.cpp file so perhaps that was the issue? Not sure if both files need to be modified for NAND or I was just pulling from the wrong place.
Wrong place - you need to pull from https://github.com/h4yn0nnym0u5e/LittleFS/tree/feature/big-PSRAM. I’m one of those odd people who uses git as intended, by developing on branches and keeping main aligned with the upstream repo :)
 
SPEED TESTING
I set up another Teensy, this time with a 16MB PSRAM and a 16MB NOR Flash. Since the 2Gb parts use dual dies internally, I figured that the single-die NOR flash, with less bus loading, would allow the setup to run faster. It did: it passed PSRAM testing at 132MHz and then failed at 144MHz.

I think jmarsh's idea to play with drive parameters probably makes sense to try to push faster speeds, though I haven’t found any info on easily changing the SDIO drive parameters. Not having a software background, trying to understand the 3000+ page programming document makes my head hurt.

16MB PSRAM with Flash

@h4yn0nnym0u5e, I downloaded the new LittleFS.cpp and LittleFS_NAND.cpp files you pointed to and I still get the same symptom, where the Teensy just hangs at start and requires a Program button push to be recognized on the port again. This is with both the NAND flash and the NOR flash. I also tried a setup with the standard 8MB PSRAM and 2Gb Flash and it behaves the same. I made sure to force a recompile when the files were changed.

Once I replaced the LittleFS files with released versions, the old style board again worked. You mentioned that you tested with Flash, so I assume I am probably doing something wrong on my end, but I haven't figured out what.
 
Hmm, that's odd. Here's my test program: it still has the PSRAM_IDs array, which Paul has removed, so if you're using his version you'll have to delete those references. It may be worth your using my code for the time being, as the output could perhaps reveal useful information.
C++:
#include <LittleFS.h>

extern "C" uint8_t external_psram_size;
extern "C" uint32_t PSRAM_IDs[2];
extern "C" const uint32_t qspi_memory_base;

//LittleFS_SPINAND   myNANDfs;
LittleFS_QSPIFlash myFlashFS;

const char* testFile = "testFile2.txt";

void setup()
{
  while (!Serial)
    ;
  pinMode(LED_BUILTIN,OUTPUT);   
  Serial.printf("PSRAM %dMB: IDs %08x, %08x; QSPI base: %08X (%dM)\n",
                external_psram_size,
                PSRAM_IDs[0],
                PSRAM_IDs[1],
                qspi_memory_base,
                qspi_memory_base / 0x100000
                );

  Serial.flush();
 
  if (myFlashFS.begin())
  {
    File f;
    
    Serial.printf("LittleFS_QSPIFlash on %s\n", myFlashFS.getMediaName());
    Serial.printf("Bytes Used: %llu, Bytes Total:%llu\n", myFlashFS.usedSize(), myFlashFS.totalSize());
    
    f = myFlashFS.open(testFile,FILE_READ);
    if (!f)
    {
      Serial.println("Attempt to create test file");
      f = myFlashFS.open(testFile,FILE_WRITE);
      if (f)
      {
        f.printf("Test file exists\n");
        f.close();
      }
      f = myFlashFS.open(testFile,FILE_READ);
    }

    if (f)
    {
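      char buf[50];
      // readBytesUntil() does not null-terminate, so terminate at the count it returns
      size_t n = f.readBytesUntil('\n',buf,sizeof(buf) - 1);
      buf[n] = 0;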
      Serial.printf("%s contains '%s'\n",testFile,buf);
      f.close();
    }
  }
  else
    Serial.println("No QSPI Flash detected");
}

bool pinState;
void loop()
{
  digitalWrite(LED_BUILTIN,pinState);
  pinState = !pinState;
  delay(250);
}

The output I get is:
Code:
PSRAM 16MB: IDs a9835d9d, 00000000; QSPI base: 01000000 (16M)
LittleFS_QSPIFlash on W25Q256JV-Q
Bytes Used: 131072, Bytes Total:33554432
testFile2.txt contains 'Test file exists'
 
Hmm, that's odd. Here's my test program:
I tried your test program with LittleFS updated with your files and I still get the same hanging condition.

If I comment out all the LittleFS stuff, the program will run and gives me this output which looks correct.
Code:
PSRAM 16MB: IDs a9835d9d, 00000000; QSPI base: 01000000 (16M)

If I then just uncomment //LittleFS_QSPIFlash myFlashFS; the program hangs.

Trying to think what else may be different on my setup. I am using the startup.c that is in Beta 04. Should I be using something different?
 
You might be seeing the strange constructor crash problem. Could really use some help from anyone more familiar with the finer points of C++.

Today I made a commit to remove the hard coded 8 Mbyte offset. In theory it should automatically work with 16 MByte PSRAM, but I don't have that new chip to actually test.
As far as the "strange crash" thing, I've seen it happen when loop() doesn't contain any code, and adding even a simple delay(1) prevents the crash. It almost smells like GCC is optimizing loop() out, but not out of main, which then tries to call a nonexistent loop() and causes a fault with reboot.
 
I just put some work into the strange crash. At least for the cases I was using, it turned out to be much earlier than I thought. Why exactly is still a mystery (have a few theories but no solid evidence), but adding a few more cycles to the startup delay makes the problem go away.

I'll package up another beta installer this weekend, since several files have changed. If you still see this mysterious crash with the latest code, please post a test case so I can try to reproduce it.
 
It doesn’t by any chance also go away if you instead implement my PR#673, does it? I usually have it in place for using TeensyDebug. There’s definitely something odd happening in reset_PFD() - my recollection is I put a ‘scope on the Teensy, and the function never returned when the “strange crash” was being triggered.
 
It would still be best to initialize FLEXSPI2_FLSHA1CR0 and FLEXSPI2_FLSHA2CR0 to default values even when no PSRAM is detected, since otherwise the offset for a solitary flash chip in the second position may be beyond the default 32MB mapped range.
 
Thanks Paul. I pulled your LittleFS and cores changes, and started experiencing the startup hang again. Gah. Then I thought, well, if reset_PFD() is still the culprit, maybe it needs one of those magic ARM barrier instructions in it...

Sure enough, adding asm("dsb"); at the end stops the hang:
C:
FLASHMEM void reset_PFD()
{  
    //Reset PLL2 PFDs, set default frequencies:
    CCM_ANALOG_PFD_528_SET = (1 << 31) | (1 << 23) | (1 << 15) | (1 << 7);
    CCM_ANALOG_PFD_528 = 0x2018101B; // PFD0:352, PFD1:594, PFD2:396, PFD3:297 MHz    
    //PLL3:
    CCM_ANALOG_PFD_480_SET = (1 << 31) | (1 << 23) | (1 << 15) | (1 << 7);  
    CCM_ANALOG_PFD_480 = 0x13110D0C; // PFD0:720, PFD1:664, PFD2:508, PFD3:454 MHz
    asm("dsb");
}

I can even remove the PR#673 stuff now. EDIT: nope, turns out that's still needed for TeensyDebug+Audio.

More testing needed, but I thought I'd get my observation in early to avoid excessive head-scratching and allow others to test. My updated quick test sketch looks like this - just changes to adapt to my debug variables having been removed:
C++:
#define noDEBUG_INFO
#include <LittleFS.h>

extern "C" uint8_t external_psram_size;
#if defined(DEBUG_INFO)
extern "C" uint32_t PSRAM_IDs[2];
extern "C" const uint32_t qspi_memory_base;
#else
uint32_t PSRAM_IDs[2]{0xDEADBEEF,0xCAFEBABE};
uint32_t qspi_memory_base;
#endif // defined(DEBUG_INFO)

//LittleFS_SPINAND   myNANDfs;
LittleFS_QSPIFlash myFlashFS;

const char* testFile = "testFile2.txt";

void setup()
{
  qspi_memory_base = (FLEXSPI2_FLSHA1CR0 & 0x7FFFFF) << 10;
  while (!Serial)
    ;
  pinMode(LED_BUILTIN,OUTPUT);  
  Serial.printf("PSRAM %dMB: IDs %08x, %08x; QSPI base: %08X (%dM)\n",
                external_psram_size,
                PSRAM_IDs[0],
                PSRAM_IDs[1],
                qspi_memory_base,
                qspi_memory_base / 0x100000
                );

  Serial.flush();
 
  if (myFlashFS.begin())
  {
    File f;
   
    Serial.printf("LittleFS_QSPIFlash on %s\n", myFlashFS.getMediaName());
    Serial.printf("Bytes Used: %llu, Bytes Total:%llu\n", myFlashFS.usedSize(), myFlashFS.totalSize());
   
    f = myFlashFS.open(testFile,FILE_READ);
    if (!f)
    {
      Serial.println("Attempt to create test file");
      f = myFlashFS.open(testFile,FILE_WRITE);
      if (f)
      {
        f.printf("Test file exists\n");
        f.close();
      }
      f = myFlashFS.open(testFile,FILE_READ);
    }

    if (f)
    {
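      char buf[50];
      // readBytesUntil() does not null-terminate, so terminate at the count it returns
      size_t n = f.readBytesUntil('\n',buf,sizeof(buf) - 1);
      buf[n] = 0;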
      Serial.printf("%s contains '%s'\n",testFile,buf);
      f.close();
    }
  }
  else
    Serial.println("No QSPI Flash detected");
}

bool pinState;
void loop()
{
  digitalWrite(LED_BUILTIN,pinState);
  pinState = !pinState;
  delay(250);
}
 
and started experiencing the startup hang again. Gah.

Yup, I hit it again here too. Looks like reset_PFD() has been causing a lot of subtle problems. It came some time ago from a pull request that I really should have tested more.

I made yet another attempt to fix this problem, hopefully for good. This also removes both of those startup delays, because we really should have this fixed "properly" and not be depending on weird delays. I tried both of my troublesome test cases with all optimize settings. All start up ok.

Please let me know if you see the startup problem, and save the code so I can use it as a test case.
 
After much more experimenting, I'm starting to feel there's no safe way to set the PFD gate bits, especially for PFD0 in the 480 MHz PLL. Every combination of delays and various ways of gating crashes with at least 1 of several programs, usually with 1 or more of the optimization options with LTO, but sometimes with the others.

My best guess is NXP wired that clock to something important but undocumented inside the chip. Adding different delays at startup or compiling with different optimization just shifts the brief moments we're messing with the PFD gate bits, perhaps relative to something else going on within the hardware, so we usually avoid crashing. At least that's my best guess about what may really be going on. It's still pretty much a mystery.

Even though it seems to go against the official way, just directly writing the PFD settings early without gating seems to avoid the strange crashes, or at least it does so far as I've been able to test. So here's yet another commit. I moved this stuff as early as possible, so we get those PFDs set up before anything else may need them stable. Hopefully this will be the final change...
 
Tested as mentioned in post #98, now with commit a78bc7c (cf. post #97). Still good. If I manage to find a new and exciting way of locking up at startup, I'll supply a copy by whatever means seems best at the time ... might be an issue, or on a related thread here, depending.
 
Thanks for looking into this - definitely looks like it's been a bear to debug, and as you say quite likely an "undocumented feature".
 