dirkenstein
Active member
I am trying to modify bmillier's partitioned convolution filter to store its large internal buffers in EXTMEM arrays on Teensy 4.1, but I am running into severe performance problems.
I can move the IR buffer and maskgen arrays to EXTMEM with no significant performance impact, basically because these are only used to compute the fmask content.
However, any attempt to use EXTMEM for the fftout and fmask buffers fails once the buffers exceed 64K: the filter then needs more CPU time than is available.
Is there any way to improve EXTMEM performance? Is Dual SPI/QSPI enabled by default for EXTMEM?
Is there some way to cache the contents of these buffers in fast main memory that would reduce/eliminate the use of EXTMEM on every read/write as part of the partitioned convolution algorithm? I don't understand the algorithm well enough to see how to introduce caching.