Simple explanation of DMA?

Status
Not open for further replies.
ILI9341_t3 using serial SPI on Teensy 3.2 at 96 MHz performs slightly faster than your STM32 parallel DMA-based library!

Yeah, I know that's hard to believe, but here are the benchmarks. Only 1 test is slightly slower. The rest are all faster, even though it uses serial SPI.

Code:
Benchmark                Time (microseconds)
Screen fill              281693
Text                     17493
Lines                    73333
Horiz/Vert Lines         23077
Rectangles (outline)     14655
Rectangles (filled)      579250
Circles (filled)         91425
Circles (outline)        76072
Triangles (outline)      17770
Triangles (filled)       195227
Rounded rects (outline)  34128
Rounded rects (filled)   633834
 
ILI9341_t3 performs way better than my library and I don't understand why! It could be because I haven't implemented DMA, but does DMA really have that much of an impact on performance?
Here are my results:
Code:
ILI9341 Test!
Display Power Mode: 0x9C
MADCTL Mode: 0x48
Pixel Format: 0x5
Image Format: 0x0
Self Diagnostic: 0xC0
Device ID: 0x9341
Benchmark                Time (microseconds)
Screen fill              975217
Text                     425722
Horiz/Vert Lines         69597
Rectangles (outline)     44820
Rectangles (filled)      1998472
Circles (filled)         247083
Circles (outline)        183649
Triangles (outline)      134024
Triangles (filled)       618424
Rounded rects (outline)  86982
Rounded rects (filled)   2092058
Done!
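For a sense of scale, the two result sets can be compared test by test. The following is a minimal sketch, assuming the first table in the thread holds the ILI9341_t3 timings and the results directly above are the slower library's. The names `BenchRow`, `kRows`, `speedup` and `print_speedups` are made up for illustration, and the "Lines" test is omitted because it does not appear in the slower result set.

```cpp
#include <cstdio>

// One benchmark row: test name and both timings in microseconds.
struct BenchRow {
    const char* name;
    double slower_us;  // results posted directly above
    double faster_us;  // ILI9341_t3 results from the first table (assumed)
};

// "Lines" is left out: it has no entry in the slower result set.
static const BenchRow kRows[] = {
    {"Screen fill",             975217,  281693},
    {"Text",                    425722,  17493},
    {"Horiz/Vert Lines",        69597,   23077},
    {"Rectangles (outline)",    44820,   14655},
    {"Rectangles (filled)",     1998472, 579250},
    {"Circles (filled)",        247083,  91425},
    {"Circles (outline)",       183649,  76072},
    {"Triangles (outline)",     134024,  17770},
    {"Triangles (filled)",      618424,  195227},
    {"Rounded rects (outline)", 86982,   34128},
    {"Rounded rects (filled)",  2092058, 633834},
};

// How many times longer the slower run took for one test.
double speedup(const BenchRow& row) {
    return row.slower_us / row.faster_us;
}

// Print one "name  ratio" line per test.
void print_speedups() {
    for (const BenchRow& row : kRows) {
        std::printf("%-24s %5.1fx\n", row.name, speedup(row));
    }
}
```

Calling `print_speedups()` from `main()` prints one line per test. Screen fill comes out at roughly 3.5x and Text at roughly 24x, so the gap is far from uniform across tests.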
 
Oh, but it's nowhere near as fast as this version:

Code:
Screen fill              62999
Text                     20483
Lines                    173830
Horiz/Vert Lines         7604
Rectangles (outline)     5958
Rectangles (filled)      148717
Circles (filled)         91783
Circles (outline)        75096
Triangles (outline)      55138
Triangles (filled)       90553
Rounded rects (outline)  27257
Rounded rects (filled)   186639
 
Do you think accessing hardware registers instead of using digitalWriteFast() would improve the performance by a lot? What about DMA?
 
ILI9341_t3 does not use DMA. But it does take advantage of the SPI FIFO and hardware-based chip select logic. It also has a lot of careful optimization work, which is ultimately what makes all the difference...
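One way to see why keeping the SPI bus busy matters is to work out the floor that the bus itself sets. The sketch below is a rough estimate, not a measurement. It assumes the ILI9341's 240x320 panel and the 16-bit pixel format (0x5) reported earlier in the thread, and it ignores command bytes and any gaps between transfers. The function names and the clock parameter are hypothetical.

```cpp
#include <cstdint>

// ILI9341 panel geometry and the 16-bit pixel format (0x5) reported above.
constexpr std::uint32_t kWidth = 240;
constexpr std::uint32_t kHeight = 320;
constexpr std::uint32_t kBitsPerPixel = 16;

// Bits that must cross the bus to paint every pixel once.
constexpr std::uint64_t bits_per_fill() {
    return static_cast<std::uint64_t>(kWidth) * kHeight * kBitsPerPixel;
}

// Lower bound, in microseconds, for one full-screen fill over a serial
// link clocked at spi_clock_hz, assuming the bus never sits idle.
double min_fill_us(double spi_clock_hz) {
    return static_cast<double>(bits_per_fill()) / spi_clock_hz * 1e6;
}
```

With an example clock of 24 MHz (a value chosen only for illustration, not a figure from this thread), `min_fill_us(24e6)` gives 51,200 microseconds for a single fill. The benchmark's "Screen fill" test may paint the screen more than once, so its total is not directly comparable to this single-fill figure. The point is only that once the FIFO keeps the bus saturated, DMA has little left to speed up.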
 
Similar optimization work was done on UTFT long ago. You could look at the UTFT code, or find those old conversations on this forum with a little Google searching. Add "site:forum.pjrc.com" to the query to tell Google to only search this forum.

There's also a tutorial about native parallel I/O.

As for helping with your code, I personally can't get involved. Maybe someone else will. But understand that really good optimization takes a lot of work.
 