Does anyone have any practical experience with DMA, priorities and preemption on Teensy 4/4.1?
I have a process that reads out the FlexIO shift registers with DMA into one of several buffers in local memory. Once a buffer is full (an interrupt is triggered), I switch to the next one and copy the filled readout buffer into a larger one living in EXTMEM. Reading directly into the EXTMEM buffer doesn't really work; I believe EXTMEM is just too slow to keep up with the way the data is read.
Inside the interrupt handler that fires when the readout buffer is full, I can just do a memcpy to that large buffer in EXTMEM and it works fine. But since I also need to do some processing on the data in that buffer elsewhere, I would prefer not to stall the CPU and to use another DMA channel for that copy instead. This is where things start to break.
I can kick off that copy DMA, but when I do, the one that reads the data from FlexIO starts hitching - it skips some beats and misses some data. The main suspect is how DMA channels are handled internally - they aren't really running concurrently; rather, there's an arbitration mechanism that picks which channel gets to execute in a given cycle. My thinking is that the copy DMA is running at the moment the readout should be happening, which makes the readout miss some of the data.
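To make that suspicion concrete, here is a toy host-side model (plain C++, not Teensy code). It assumes that the engine re-arbitrates only when a channel finishes its current minor loop, that the readout channel wins whenever it has a request pending, and that a readout request not serviced before the next one arrives is lost. The function name, the "one step = one beat" timing and all parameters are made up for illustration and are not measured from hardware.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, one loop iteration = one bus beat.
// - The readout channel raises a request every `period` beats and needs one
//   beat of service before its next request, otherwise that sample is lost.
// - The copy channel always has work; once granted, it holds the engine for
//   `copy_minor_loop` beats and cannot be interrupted (no preemption modeled).
uint32_t lost_readouts(uint32_t total_beats, uint32_t period,
                       uint32_t copy_minor_loop) {
    assert(period > 0 && copy_minor_loop > 0);
    uint32_t lost = 0;
    bool readout_pending = false;
    uint32_t copy_beats_left = 0;

    for (uint32_t t = 0; t < total_beats; ++t) {
        if (t % period == 0) {
            if (readout_pending) ++lost;  // previous request was never served
            readout_pending = true;
        }
        if (copy_beats_left > 0) {        // copy minor loop still running
            --copy_beats_left;
            continue;
        }
        if (readout_pending) {            // readout wins arbitration
            readout_pending = false;
        } else {                          // engine idle, copy takes this beat
            copy_beats_left = copy_minor_loop - 1;
        }
    }
    return lost;
}
```

In this model nothing is lost as long as the copy's minor loop is no longer than the readout period, and samples start disappearing once it is longer. If the real hardware behaves similarly, the size of the copy channel's minor loop would matter at least as much as its priority.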
Technically, the IMXRT1060 has two schemes for arbitrating DMA requests: priority-based (on by default) and round-robin. On top of that, the priority-based scheme allows configuring preemption: whether a channel can be suspended by a higher-priority request, and whether it can suspend lower-priority channels. But that doesn't seem to do much: setting the "copy-out" DMA to a lower priority than the "readout" one, and setting its ECP bit (so that it can be suspended by higher-priority channels), doesn't change the timings much - they *are* different, that's for sure, but the readouts still hitch.
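For reference, this is roughly how I understand the per-channel priority byte (DCHPRIn) to be composed. It is a minimal sketch assuming the usual eDMA layout: ECP in bit 7, DPA in bit 6, CHPRI in bits 3..0, with a higher CHPRI meaning higher priority. The constant and function names are my own, not the Teensy core's, so check the bit positions against the reference manual and imxrt.h before relying on them.

```cpp
#include <cassert>
#include <cstdint>

// Assumed DCHPRIn layout (verify against the reference manual):
//   bit 7     ECP   - this channel may be suspended by a higher-priority one
//   bit 6     DPA   - this channel may NOT suspend lower-priority channels
//   bits 3..0 CHPRI - arbitration priority, higher value wins
const uint8_t kEcp = 1u << 7;
const uint8_t kDpa = 1u << 6;
const uint8_t kChpriMask = 0x0F;

// Builds the value to write into a channel's DCHPRIn register.
inline uint8_t make_dchpri(uint8_t priority, bool can_be_preempted,
                           bool can_preempt_others) {
    assert(priority <= kChpriMask);
    uint8_t value = priority & kChpriMask;
    if (can_be_preempted) value |= kEcp;
    if (!can_preempt_others) value |= kDpa;  // DPA is a "disable" bit
    return value;
}
```

With this helper, the readout channel would get something like `make_dchpri(15, false, true)` and the copy-out channel `make_dchpri(0, true, false)`. As far as I can tell, in fixed-priority mode the CHPRI values must stay unique within a channel group, so changing one channel's priority means swapping it with whichever channel currently holds that value.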
Has anyone had any more luck with playing with the priority and preemption settings and would like to share their experiences?