but what is it that we are trying to do defining PROGMEM to do something?
PROGMEM means the variable or function is only placed in the external QSPI flash chip. F() and PSTR() are supposed to do this for string constants (or "string literals" in compiler jargon), but I'm not sure if we have those working yet.
FASTRUN is meant to be used only on functions, meaning the function is placed in fast ITCM. Today, all non-PROGMEM functions go into ITCM by default, but that may not always be the default in the future. So FASTRUN is good to use on interrupt handlers, even though right now it has no effect. In the future we may have an option to make functions default to PROGMEM, which trades a slight performance hit for more fast RAM for variables. It'll probably be needed when/if people write very large programs, or programs needing nearly 500K for variables.
DMAMEM is meant only for variables, and really only for variables without initialization. By default variables are placed into fast DTCM, which is not optimized for sharing with DMA. Using DMAMEM puts the variable into OCRAM (on the AXI bus), which is slightly slower but better suited to sharing with the DMA controller and with peripherals like USB which have their own DMA built in. Memory obtained by malloc() and C++ new also comes from OCRAM.
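To make the three keywords concrete, here is a rough sketch of how they might appear together in a sketch. The names gamma_table, rx_buffer, sample_count and on_sample_ready are made up for illustration, and the empty fallback macros are only an assumption so the fragment also compiles off-target; on the real board the core headers provide the actual definitions.

```cpp
#include <cstdint>
#include <cstddef>

#ifdef ARDUINO
#include <Arduino.h>   // on the board, the core supplies the real macros
#endif

// Fallbacks (assumption): empty macros so this also builds on a PC.
#ifndef PROGMEM
#define PROGMEM
#endif
#ifndef FASTRUN
#define FASTRUN
#endif
#ifndef DMAMEM
#define DMAMEM
#endif

// Hypothetical constant table: kept only in the external flash.
PROGMEM const uint8_t gamma_table[8] = {0, 1, 4, 9, 16, 25, 36, 49};

// Hypothetical buffer meant to be shared with DMA: no initializer.
DMAMEM uint8_t rx_buffer[512];

// Ordinary variable: lands in the default (fast) RAM.
volatile uint32_t sample_count = 0;

// Hypothetical interrupt-style handler, explicitly requested in fast memory.
FASTRUN void on_sample_ready(uint8_t raw) {
    // Plain array indexing works no matter where each object is placed.
    rx_buffer[sample_count % sizeof(rx_buffer)] = gamma_table[raw & 7];
    sample_count = sample_count + 1;
}
```

Note the DMAMEM buffer is written before it is ever read, since it has no initializer and its starting contents should not be relied on.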
No matter where anything is placed, it's always accessible with ordinary C/C++ semantics. Macros like pgm_read_byte() are never needed, but they are provided as do-nothing macros so legacy code for AVR still compiles and works.
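As a small illustration of that point, the sketch below reads the same PROGMEM table two ways: with ordinary indexing and with the AVR-style pgm_read_byte(). The table and function names are hypothetical, and the fallback definition of pgm_read_byte() is my assumption of what a pass-through macro looks like, included only so the fragment builds off-target.

```cpp
#include <cstdint>
#include <cstddef>

#ifdef ARDUINO
#include <Arduino.h>   // on the board, the core supplies the real macros
#endif

// Fallbacks (assumption) so this also builds on a PC.
#ifndef PROGMEM
#define PROGMEM
#endif
#ifndef pgm_read_byte
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

// Hypothetical table placed in flash.
PROGMEM const uint8_t quarter_sine[8] = {0, 49, 97, 141, 180, 212, 235, 250};

// New-style code: just index the array.
uint32_t sum_direct() {
    uint32_t total = 0;
    for (size_t i = 0; i < sizeof(quarter_sine); i++) {
        total += quarter_sine[i];
    }
    return total;
}

// Legacy AVR-style code: still compiles and gives the same answer.
uint32_t sum_legacy() {
    uint32_t total = 0;
    for (size_t i = 0; i < sizeof(quarter_sine); i++) {
        total += pgm_read_byte(&quarter_sine[i]);
    }
    return total;
}
```

Both functions should return the same value, which is the whole point: old AVR code keeps working, and new code does not need the macro.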
PROGMEM and DMAMEM are cached, which makes them very fast if you're accessing anything within the same 32 byte range as a prior access that's still in the cache. If you care quite a lot about performance, best to align data to 32 bytes. Code uses a completely separate instruction cache, so there's no worry about simply running other code evicting your data from the data cache. But cache misses are quite slow.
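If you want to try the 32 byte alignment suggestion, something along these lines should do it using standard C++ alignas. This is only a sketch: SampleBlock, blocks, kCacheLine and is_cache_line_aligned are names I invented here, and the empty DMAMEM fallback is just so it compiles off-target.

```cpp
#include <cstdint>
#include <cstddef>

#ifdef ARDUINO
#include <Arduino.h>   // on the board, the core supplies the real macro
#endif

// Fallback (assumption) so this also builds on a PC.
#ifndef DMAMEM
#define DMAMEM
#endif

// Cache line size mentioned above.
constexpr size_t kCacheLine = 32;

// Hypothetical data block sized and aligned to exactly one cache line,
// so touching any sample pulls the whole block into the cache at once.
struct alignas(kCacheLine) SampleBlock {
    int16_t samples[16];   // 16 * 2 bytes = 32 bytes
};
static_assert(sizeof(SampleBlock) == kCacheLine,
              "SampleBlock should fill exactly one cache line");

// Hypothetical array in the cached, DMA-friendly RAM (no initializer).
DMAMEM SampleBlock blocks[4];

// Helper to check a pointer's alignment at run time.
bool is_cache_line_aligned(const void *p) {
    return (reinterpret_cast<uintptr_t>(p) % kCacheLine) == 0;
}
```

Because the struct size is a multiple of its alignment, every element of the array starts on its own 32 byte boundary, not just the first one.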
So hoping we are not redefining things like this to go counter to expectations
Me too. I believe this use of PROGMEM is pretty close to what people expect from AVR.
Then again, nobody else (as far as I know) has really tried to map M7's memory architecture onto the Arduino world's conventions. I believe some of Adafruit's SAMD51 boards (which are M4) may have the ability to put variables and maybe even code into a QSPI flash chip. It would be interesting to look at what they're doing, if anyone knows or has time to investigate. Maybe the STM32 folks have looked at this? I don't follow STM32 closely.
But my gut feeling is we're probably the first to really tackle how to use M7 with Arduino's conventions and legacy of AVR code.