Well, long story short, I have two Teensy 4.1s:
Unit1 - the one without the Ethernet chip, with two Adafruit PSRAMs installed; it is currently soldered to a prototype. The PSRAM test sketch passes at every frequency I tried: 133 MHz, 120 MHz, 110.77 MHz, and the default 88 MHz.
Unit2 - the one with the Ethernet chip (I don't think it makes a difference, though), with one Adafruit PSRAM and a W25Q01JVZEIQ (1 Gbit NOR flash) installed. It was powered via USB during the tests. The PSRAM test sketch fails at 133 MHz but passes at lower frequencies.
Story: as I posted, when I changed the frequency to 120 MHz on Unit1, I was getting failed reads (bad data) from an array in PSRAM. As I mentioned, I do not use Arduino in this project, so I patched the startup code that sets the PSRAM clock. I only realized what was actually happening once I had run out of ideas and had verified every bit of my code: I made a copy of the data in tightly coupled fast RAM and compared what was read from both locations, flashing the LED if they were not equal. It was a surprise, as this unit passes the test sketch at 133 MHz, yet in my project it was stable only at 110.77 MHz or less. The other unit, Unit2, which can't pass the test sketch at 133 MHz, gave no issues at 120 MHz.
Story 2: I tried to make an Arduino sketch that reproduces this issue, with no luck. After quite a while I got an idea. As I said, the PSRAM clock code in startup.c was patched directly to set the higher frequency. I reverted that change and instead changed the clock later, just before I use PSRAM in my project. Damn! The issue disappeared! By some bad luck, it happens only when I set 120 MHz directly in the startup code, and only on one of my two Teensies...
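For anyone who wants to try the same experiment, here is a minimal sketch of what such a clock change boils down to. It is not my exact patch. It assumes the FlexSPI2 (PSRAM) clock is controlled by the divider and source-select fields of CCM_CBCMR, and the field positions below are assumptions that should be checked against imxrt.h or the reference manual. The helper names `flexspi2_cbcmr` and `flexspi2_mhz` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: field positions of CCM_CBCMR on the i.MX RT1062; verify
// against imxrt.h or the reference manual before using on hardware.
static const uint32_t FLEXSPI2_PODF_SHIFT    = 29;          // 3-bit divider field
static const uint32_t FLEXSPI2_PODF_MASK     = 0x7u << 29;
static const uint32_t FLEXSPI2_CLK_SEL_SHIFT = 8;           // 2-bit source select
static const uint32_t FLEXSPI2_CLK_SEL_MASK  = 0x3u << 8;

// Hypothetical helper: returns the CCM_CBCMR value with only the FlexSPI2
// source select and divider replaced. `divider` is the real division
// factor (1..8); the register field stores divider - 1.
static inline uint32_t flexspi2_cbcmr(uint32_t cbcmr, uint32_t clk_sel,
                                      uint32_t divider) {
    assert(clk_sel <= 3);
    assert(divider >= 1 && divider <= 8);
    cbcmr &= ~(FLEXSPI2_PODF_MASK | FLEXSPI2_CLK_SEL_MASK);
    cbcmr |= (divider - 1) << FLEXSPI2_PODF_SHIFT;
    cbcmr |= clk_sel << FLEXSPI2_CLK_SEL_SHIFT;
    return cbcmr;
}

// Resulting PSRAM clock in MHz for a given source clock and divider.
static inline double flexspi2_mhz(double source_mhz, uint32_t divider) {
    return source_mhz / divider;
}
```

On the Teensy the result would be written back with something like `CCM_CBCMR = flexspi2_cbcmr(CCM_CBCMR, sel, 6);`, either in startup.c or later, before the first PSRAM access. As far as I can tell, the default 88 MHz corresponds to a 528 MHz source divided by 6, and 120 MHz to a 720 MHz source divided by 6. Whether the FlexSPI2 clock gate has to be toggled around the change is not covered here.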
So, given that I have already spent a lot of time on this and it is hard to reproduce, I think I will leave this mystery without further investigation. The smallest code that reproduces it is below; it only does so if you patch startup.c directly to set 120 MHz, and only on one of my two units.
C:
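#include <Arduino.h>  // Teensy core: EXTMEM, pinMode, digitalWriteFast, delayMicroseconds

EXTMEM uint8_t rom_e[16 * 1024];   // copy under test, placed in PSRAM
extern uint8_t rom[16 * 1024];     // reference data, defined elsewhere

extern "C" int main(void) {
    uint8_t data, data2;

    // Copy the reference data into PSRAM.
    for (int i = 0; i < 16 * 1024; ++i) {
        rom_e[i] = rom[i];
    }

    // Barriers added "just in case" (see the PS below).
    __asm volatile ( "dsb \n" );
    __asm volatile ( "isb \n" );
    __asm volatile ( "dmb \n" );

    // Read both copies back and compare byte by byte.
    for (int i = 0; i < 16 * 1024; ++i) {
        data2 = rom[i];
        data = rom_e[i];
        if (data != data2) {
            // Mismatch: blink the LED on pin 13 forever.
            pinMode(13, OUTPUT);
            while (1) {
                digitalWriteFast(13, LOW);
                delayMicroseconds(500000);
                digitalWriteFast(13, HIGH);
                delayMicroseconds(500000);
            }
        }
    }
    return 0;
}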
PS: I'm not sure which of the asm instructions is enough to reset/flush the caches (maybe only one is needed; I added all I found just in case). Without those, this code works well.
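On the cache question: as far as I know, dsb, isb and dmb are barriers. They order instructions and memory accesses, but they do not write back or invalidate the Cortex-M7 data cache by themselves. If the goal is to make the comparison read the PSRAM itself rather than cached lines, explicit cache maintenance is needed. Below is a minimal sketch, assuming the Teensy 4 core provides `arm_dcache_flush_delete()` (I believe it lives in imxrt.h) and that the EXTMEM region is cached. `sync_buffer` and `first_mismatch` are hypothetical helper names, and the cache call is compiled out when not building for Teensy 4.x.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__IMXRT1062__)
#include <Arduino.h>   // Teensy 4.x core; assumed to provide arm_dcache_flush_delete()
#endif

// Write back and invalidate the data cache for a buffer, so that the next
// reads come from the memory itself. No-op on anything but Teensy 4.x.
static inline void sync_buffer(void *addr, size_t size) {
#if defined(__IMXRT1062__)
    arm_dcache_flush_delete(addr, (uint32_t)size);
#else
    (void)addr;
    (void)size;
#endif
}

// Hypothetical helper: index of the first differing byte, or -1 if the
// two buffers are equal over n bytes.
static inline long first_mismatch(const uint8_t *ref,
                                  const volatile uint8_t *probe, size_t n) {
    assert(ref != nullptr && probe != nullptr);
    for (size_t i = 0; i < n; ++i) {
        if (ref[i] != probe[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

In the test above this would be `sync_buffer(rom_e, sizeof(rom_e));` after the copy loop, followed by `first_mismatch(rom, rom_e, sizeof(rom_e))`. Cache maintenance works on whole 32-byte cache lines, so the buffer is best aligned to 32 bytes.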