Additional PSRAM ID that works plus goodies

Looks like my changes to cores have been pulled in :) Guess it’s easy to tell that it hasn’t broken use of existing PSRAM, even if you don’t have the new one to try… hopefully LittleFS is pending acceptance of your offer, @KenHahn, or maybe the arrival of Paul’s own set of samples.
 
I don't like the fact that this introduces more memory usage, even if it's only a handful of bytes; as the notes I left on the commit explain, there seems to be no use for the PSRAM_IDs array and using qspi_memory_base to pass the base address of the second chip is unnecessary when it's specified by a hardware register that can be accessed from anywhere. It's also flawed: there are separate C and C++ definitions and one is defined const, meaning the compiler is able to use the const value directly in the code and optimize the variable out completely.
 
I can never find notes left on a commit :(
  • Good point on the PSRAM_IDs array - it was only for debug. I'd remove it, but Paul's now pulled the changes.
  • I feel there's enough obscure register bashing in the code already without doing weird register reads just to save 4 bytes for people who don't use LittleFS (i.e. probably most people - I don't). There was already a (completely unused) variable in LittleFS!
  • I've made the qspi_memory_base linkage "C" in the LittleFS PR
  • The cores definition can't be const, as it has to be set at startup. In LittleFS I don't think it matters about optimisation; it can't be made into a literal constant, and declaring it const ensures no future maintainer accidentally tries to write to it.
 
it was only for debug. I'd remove it, but Paul's now pulled the changes
With the number of changes, the next beta is not likely to be the last, so an updated PR might still go in if it safely removes the debug code?

The initial define followed by extern const caught my eye too, but I figured it worked with the various old and new size combos [8, 2*8, 16, 24, 32]? Not sure when Ken will get his samples or whether one might make it my way for testing ... if it does, testing will be done.
 
  • I feel there's enough obscure register bashing in the code already without doing weird register reads just to save 4 bytes for people who don't use LittleFS (i.e. probably most people - I don't). There was already a (completely unused) variable in LittleFS!
That's why I don't really understand why you made a new variable and added a weak linkage hack; all that was necessary was to initialize flashBaseAddr based on the value in FLEXSPI2_FLSHA1CR0 and plug it into the code instead of always using 0x00800000.
Now the variable has been moved from LittleFS to cores where it will take up space in every program whether LittleFS is used or not...
  • The cores definition can't be const, as it has to be set at startup. In LittleFS I don't think it matters about optimisation; it can't be made into a literal constant, and declaring it const ensures no future maintainer accidentally tries to write to it.
I'm saying it shouldn't be declared const anywhere. If it's declared const then the compiler is allowed to assume it never changes and use the value directly rather than loading it from memory.
 
That's why I don't really understand why you made a new variable and added a weak linkage hack
I did that so any combination of cores and LittleFS will work - people get into so much trouble by having a forgotten tweaked library installed!
  • old+old: obviously OK
  • old cores + new LittleFS: no variable in cores, LittleFS falls back to weakly-defined one which has the original hard-coded value
  • new cores + old LittleFS: will work with 8MB PSRAM, or none; will fail with 16MB PSRAM and QSPI Flash, nothing we can do here
  • new+new: should work, but it's only had limited testing here

initialize flashBaseAddr based on the value in FLEXSPI2_FLSHA1CR0 and plug it into the code instead of always using 0x00800000
True ... but where and when to initialise it? I really don't care enough about LittleFS and saving 4 bytes of RAM to make the effort. Sorry.

I'm saying it shouldn't be declared const anywhere. If it's declared const then the compiler is allowed to assume it never changes and use the value directly rather than loading it from memory.
Sure, it can validly assume it never changes, but it has to get the value at some time from somewhere. LittleFS can't get it at compile-time, because it's extern; maybe LTO could try to optimise it away, and then discover it's defined inconsistently, or not notice but yield non-functional code. I just tried, and nope, no errors or warnings (about that, anyway), and the code still works.

TBH I wouldn't mind too much if the const got removed, as future maintainers of LittleFS probably won't screw it up by modifying it. Probably.
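
For anyone following the weak-linkage discussion, here is a minimal sketch of the fallback pattern, assuming GCC/Clang's `__attribute__((weak))`. The names `demo_qspi_memory_base`, `DEMO_FALLBACK_BASE` and `readQspiBase()` are hypothetical stand-ins and not the code from the PR. The value 0x00800000 is the hard-coded figure quoted above; whether the real variable holds an offset or an absolute address is an assumption here. The sketch uses the non-const form argued for in the thread, so the value is always loaded from memory at run time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fallback: the hard-coded value quoted in the thread.
#define DEMO_FALLBACK_BASE 0x00800000u

extern "C" {
// Weak, non-const definition with C linkage. If another object file supplies a
// strong definition of the same symbol, the linker uses that one instead.
__attribute__((weak)) uint32_t demo_qspi_memory_base = DEMO_FALLBACK_BASE;
}

// Read the base through the variable rather than through a literal constant.
inline uint32_t readQspiBase() { return demo_qspi_memory_base; }

// Holds only when no strong definition has been linked in.
inline void checkFallbackInUse() { assert(readQspiBase() == DEMO_FALLBACK_BASE); }
```

With no strong definition linked in, `readQspiBase()` returns the fallback. A newer cores would supply a strong, writable definition and set it at startup.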
 
I did that so any combination of cores and LittleFS will work - people get into so much trouble by having a forgotten tweaked library installed!
  • old+old: obviously OK
  • old cores + new LittleFS: no variable in cores, LittleFS falls back to weakly-defined one which has the original hard-coded value
  • new cores + old LittleFS: will work with 8MB PSRAM, or none; will fail with 16MB PSRAM and QSPI Flash, nothing we can do here
  • new+new: should work, but it's only had limited testing here
Any core + LittleFS that takes the base by looking at the hardware register = no shared dependency so no chance of a conflict.
True ... but where and when to initialise it?
LittleFS_QSPI::begin() would be the obvious place. Buuuut I would bet any money there are people out there using the LittleFS_QSPIFlash and LittleFS_QPINAND classes directly, so probably better to put it in both of their begin functions; having two separate initializations isn't going to matter since the value will always be the same.
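
A minimal sketch of the register-based idea, written so it can be checked on a desktop compiler. It assumes the size field of FLEXSPI2_FLSHA1CR0 is in KB and occupies the low 23 bits, so the second chip starts that many KB into the FlexSPI2 address space. The function name `flashBaseFromFlshA1Cr0` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the low 23 bits of FLEXSPI2_FLSHA1CR0 give the size of the first
// chip in KB. The second chip then begins at that offset within FlexSPI2.
constexpr uint32_t flashBaseFromFlshA1Cr0(uint32_t flsha1cr0) {
    return (flsha1cr0 & 0x7FFFFFu) * 1024u;
}

// Quick host-side sanity check of the two PSRAM sizes discussed in the thread.
inline void checkFlashBase() {
    assert(flashBaseFromFlshA1Cr0(0x2000u) == 0x00800000u);  // 8MB first chip
    assert(flashBaseFromFlshA1Cr0(0x4000u) == 0x01000000u);  // 16MB first chip
}
```

On a Teensy the call would be something like `flashBaseAddr = flashBaseFromFlshA1Cr0(FLEXSPI2_FLSHA1CR0);` inside each `begin()`, which reproduces 0x00800000 for the existing 8MB case.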

Sure, it can validly assume it never changes, but it has to get the value at some time from somewhere. LittleFS can't get it at compile-time, because it's extern;
I wouldn't count on that. It has a local definition and const value. The fact that it's weak just means the linker wouldn't throw a wobbly about a symbol redefinition.
maybe LTO could try to optimise it away, and then discover it's defined inconsistently, or not notice but yield non-functional code. I just tried, and nope, no errors or warnings (about that, anyway), and the code still works.

TBH I wouldn't mind too much if the const got removed, as future maintainers of LittleFS probably won't screw it up by modifying it. Probably.
I'm honestly surprised LTO doesn't have an issue with it, since it typically has problems with variables that don't follow the one-definition-rule (const vs non-const are distinct definitions) and especially because a const variable would be placed in a different program section to a non-const.
 
If the software looks like a go, I can work with one of ISSI's distributors to bring in an MOQ order

Yes, software support will happen. #784 already merged. It will be in 1.60-beta5.

I might rename "qspi_memory_base" before updating LittleFS, and minor changes like removing the unused PSRAM_IDs[] array might still happen, but small details aside, yes, since these chips are now real, software support will happen.
 
Yes, good find.

My samples are coming out of Taiwan, so probably 2 weeks out. @defragster I can hook you up with a part when they come in. I sent a note to Digikey as they seem to be the most active ISSI distributor to see about getting an initial order going.

I received the ISSI SerialRam/QuadRam roadmap. The 128Mb/16MB part we are discussing here is the largest size and the last one on their roadmap that goes out several years. It looks like they are focusing on the higher performance OctalRam / HyperRam for any larger sizes going forward.
 
Paul is posting at 2AM which can only mean he is deep into software mode. Hardware engineers knock off by 7PM ;).

Here are the roadmaps they said can be shared. They are checking to see if they can share their complete slide deck which has additional info.

QuadRAM Roadmap.png


OctalRAM Roadmap.png
HyperRAM Roadmap.png
 
The SOIC 16MB parts aren't in stock anywhere but you can get the BGA version. Since the software changes look to be minor, I sent a note to ISSI to see what the deal on availability is as a larger PSRAM option would be a nice upgrade for Teensy. I'll post here if I hear back.

Have you by chance tested these at the higher 133MHz rate?
Just for giggles, I did.
Note that technically clocking the PSRAM I have at > 104MHz is already considered to be overclocking.
Note also that I am using a 1.8v part here, which still has not smoked or even gotten warm...
I suspect a proper 3.3v version at 200MHz would work as well as the stock ones, I can get up to 180MHz on IPUS chips, so...

Code:
ISSI 1.8v out of spec PSRAM TESTS
PSRAM0 ID 00605d9d 
CCM_CBCMR=95AE8304 (105.6 MHz)
test ran for 62.45 seconds

CCM_CBCMR=B5AE8204 (110.8 MHz)
test ran for 59.96 seconds

CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 55.83 seconds

CCM_CBCMR=55AE8004 (132.0 MHz)
test ran for 51.55 seconds

CCM_CBCMR=95AE8104 (144.0 MHz)
test ran for 47.68 seconds

CCM_CBCMR=75AE8204 (166.2 MHz)
Error at 7000FF00, read FF47FF21 but expected 5A698421 (BOOOO! HISSS!)
 
Thanks for doing that. When my 3.3V samples finally arrive, I will do some testing to see how it compares.

Edited: Out of curiosity I just tried two different Teensys, each with 16MB made up of two of the standard 8MB PSRAM chips, at 166.2MHz and both passed OK. A Teensy with 8MB PSRAM and 128Mb NOR Flash also passed OK at that speed for both PSRAM and basic Flash testing.

Two Teensys with 8MB PSRAM and 2Gb NAND Flash chips would immediately fail the PSRAM test, apparently due to the presence of the Flash on the bus even though it wasn't being accessed. They also failed at 144MHz, but would work for both PSRAM and Flash testing at 132MHz, where I normally test.

Is there a cheat sheet for the corresponding CCM_CBCMR values somewhere for different valid QSPI bus speeds?
 
Is there a cheat sheet for the corresponding CCM_CBCMR values somewhere for different valid QSPI bus speeds?
Speeds are pretty easy to set.
There are four clocks and 8 divisors.
Choose a clock:
0 = 396 MHz
1 = 720 MHz
2 = 664.62 MHz
3 = 528 MHz
and a divisor field value from 0 to 7 (0 = divide by 1, 7 = divide by 8).
Then it's a simple matter of:
Code:
  CCM_CCGR7 &= ~CCM_CCGR7_FLEXSPI2(CCM_CCGR_ON); // gate only the FlexSPI2 clock off while changing it
  CCM_CBCMR = (CCM_CBCMR & ~(CCM_CBCMR_FLEXSPI2_PODF_MASK | CCM_CBCMR_FLEXSPI2_CLK_SEL_MASK))
                | CCM_CBCMR_FLEXSPI2_PODF(3)     // field value 3 = divide by 4
                | CCM_CBCMR_FLEXSPI2_CLK_SEL(3); // 528 MHz clock / 4 = 132 MHz
  CCM_CCGR7 |= CCM_CCGR7_FLEXSPI2(CCM_CCGR_ON);  // turn the FlexSPI2 clock back on
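
As a rough cheat sheet, the CCM_CBCMR values in the test logs can be decoded with a few lines that run on a desktop compiler. This is only a sketch: the helper names are made up, and the field positions (FLEXSPI2_PODF in bits 31:29, FLEXSPI2_CLK_SEL in bits 9:8) are assumed from the way the logged values decode. That makes the divider field 3 bits wide, so it covers divide by 1 through divide by 8.

```cpp
#include <cassert>
#include <cstdint>

// Source clocks in MHz, indexed by the FLEXSPI2_CLK_SEL field (list from the post above).
constexpr double kFlexspi2ClockMHz[4] = {396.0, 720.0, 664.62, 528.0};

// Assumed field positions: PODF in bits 31:29, CLK_SEL in bits 9:8.
constexpr uint32_t podfField(uint32_t cbcmr)   { return (cbcmr >> 29) & 0x7u; }
constexpr uint32_t clkSelField(uint32_t cbcmr) { return (cbcmr >> 8) & 0x3u; }

// FlexSPI2 clock in MHz: selected source clock divided by (PODF + 1).
constexpr double flexspi2MHz(uint32_t cbcmr) {
    return kFlexspi2ClockMHz[clkSelField(cbcmr)] / (podfField(cbcmr) + 1);
}

// Return cbcmr with new FlexSPI2 clock fields, leaving every other bit untouched.
constexpr uint32_t withFlexspi2Clock(uint32_t cbcmr, uint32_t clkSel, uint32_t podf) {
    return (cbcmr & ~((0x7u << 29) | (0x3u << 8)))
         | ((podf & 0x7u) << 29) | ((clkSel & 0x3u) << 8);
}

// Sanity check against two values from the test logs in this thread.
inline void checkAgainstThreadLogs() {
    assert(podfField(0x55AE8004u) == 2 && clkSelField(0x55AE8004u) == 0);  // 396 / 3 = 132 MHz
    assert(podfField(0x95AE8304u) == 4 && clkSelField(0x95AE8304u) == 3);  // 528 / 5 = 105.6 MHz
}
```

For example, `flexspi2MHz(0xB5AE8104)` works out as 720 / 6 = 120 MHz and `flexspi2MHz(0x95AE8104)` as 720 / 5 = 144 MHz, in line with the figures printed next to those values in the logs.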
 
Additional notes, and the relationship to my large DMA transfers that happen 1µs apart...
88MHz wasn't enough to sustain 8MHz DMA R/W from the SPI port, and not wanting to break specs, I bumped it up by 11MHz.
99MHz (clock 396/4) is able to cope with it using the same SPI FIFO watermark settings that were successful with DMAMEM, which are:
Code:
spi_0_regs->FCR = LPSPI_FCR_RXWATER(0) | LPSPI_FCR_TXWATER(1);
using byte-at-a-time transfers to a circular 200,000 byte buffer in PSRAM that cycles continually every 200ms.
No corruption after hours of looping with random stop/restarts.
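
The figures above hang together, which a few lines of arithmetic can confirm. This is just a sketch with made-up helper names, assuming the buffer is filled exactly once per cycle at one byte per transfer.

```cpp
#include <cassert>
#include <cstdint>

// Sustained byte rate needed to fill the buffer once per cycle.
constexpr uint64_t bytesPerSecond(uint64_t bufferBytes, uint64_t cycleMs) {
    return bufferBytes * 1000u / cycleMs;
}

// Same rate expressed as SPI bits per second (8 bits per byte).
constexpr uint64_t bitsPerSecond(uint64_t bufferBytes, uint64_t cycleMs) {
    return bytesPerSecond(bufferBytes, cycleMs) * 8u;
}

// Time available per byte, in nanoseconds.
constexpr uint64_t nsPerByte(uint64_t bufferBytes, uint64_t cycleMs) {
    return 1000000000u / bytesPerSecond(bufferBytes, cycleMs);
}

// 200,000 bytes every 200 ms is 1 MB/s, i.e. 8 Mbit/s, i.e. one byte per microsecond.
static_assert(bytesPerSecond(200000, 200) == 1000000, "1 MB/s");
static_assert(bitsPerSecond(200000, 200) == 8000000, "8 Mbit/s");
static_assert(nsPerByte(200000, 200) == 1000, "1 us per byte");
```

So the 8MHz SPI rate, the 1µs spacing and the 200,000 bytes per 200ms all describe the same 1 MB/s stream that the PSRAM has to keep up with one byte at a time.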
 
That's a bit confusing... why does the memory speed need to be nearly 10x the SPI speed, when SPI operates at 1 bit/cycle and the memory is roughly 4 bits/cycle?
 
My samples of the IS66WVS16M8FBLL (3.3V version) came in and I put two 16MB chips (32MB total) on a Teensy 4.1. The Startup.c in cores was updated with the new version that is going into Beta 5.

1751582788263.jpeg


Code:
32MB default 88MHz
test ran for 144.97 seconds. All memory tests passed :-)

32MB CCM_CBCMR=95AE8304 (105.6 MHz)
test ran for 123.66 seconds.  All memory tests passed :-)

32MB CCM_CBCMR=B5AE8204 (110.8 MHz)
test ran for 118.56 seconds.  All memory tests passed :-)

32MB CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 110.65 seconds.  All memory tests passed :-)

32MB CCM_CBCMR=55AE8004 (132.0 MHz) - FAILED
testing with fixed pattern 0F0F0F0F
Error at 710016C0, read 0F0F1F0F but expected 0F0F0F0F

Reran same test and got exact same failure. This failed at a lower speed than xxxajk saw, so I removed 1 PSRAM to see if that was the difference or if it was the fact that I was using 3.3V vs 1.8V parts.

Code:
16MB CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 55.27 seconds.  All memory tests passed :-)

16MB CCM_CBCMR=55AE8004 (132.0 MHz)
test ran for 51.07 seconds.  All memory tests passed :-)

16MB CCM_CBCMR=95AE8104 (144.0 MHz)
test ran for 47.09 seconds.  All memory tests passed :-)

16MB CCM_CBCMR=75AE8204 (166.2 MHz)
testing with fixed pattern 5A698421. 
Error at 70000000, read DF6B9C23 but expected 5A698421

Now the results matched xxxajk's results, so it appears that the issue is likely bus loading. These PSRAM are made of two stacked 8MB parts, so I'm guessing that two of them load the bus about like four of the regular 8MB parts.

To further check this, I added a 2Gb NAND Flash chip and the failure point returned to the same as with the 32MB PSRAM setup.

1751583793663.jpeg


Code:
16MB + 2Gb Flash CCM_CBCMR=B5AE8104 (120.0 MHz)
test ran for 55.27 seconds.  All memory tests passed :-)

16MB + 2Gb Flash CCM_CBCMR=55AE8004 (132.0 MHz)
testing with fixed pattern 5A698421
Error at 700F7700, read 5A698423 but expected 5A698421

Would have been nice if they worked at the 132MHz that everything else seems to be OK at, but 120MHz is still pretty decent.

As a final set of tests, I tried to run my combined PSRAM and NAND Flash test at the default 88MHz bus speed. The PSRAM test passed as expected, but I received an error starting QSPI Disk, which I think was expected since LittleFS needs to be updated to work.

I then downloaded @h4yn0nnym0u5e's revised LittleFS.cpp file and tried it again. https://github.com/h4yn0nnym0u5e/LittleFS/tree/main/src

It compiled and downloaded but got no output. The PSRAM test didn’t even seem to start. I did not see a revised LittleFS_NAND.cpp file so perhaps that was the issue? Not sure if both files need to be modified for NAND or I was just pulling from the wrong place.
 
Maybe it's worth experimenting with the pin configs (e.g. drive strength, remove pull-ups...) to see if it would stabilise two chips at high speeds.
 
That's a bit confusing... why does the memory speed need to be nearly 10x the SPI speed, when SPI operates at 1 bit/cycle and the memory is roughly 4 bits/cycle?
SPI is in SLAVE MODE and not in control of the clocking to it, so it needs to have something ready ahead of time, and RAM may be BUSY (bus contention) since it MAY be writing from the other DMA channel.

There's also addressing overhead, caching, etc.
Same thing happens when using DMAMEM, the TX DMA instance doesn't get the byte out of RAM in time.
 