ST7796 Teensyduino support

I think you had something similar happen when you were making some connector mods and the CS pin was left disconnected.

Along similar lines, when I first modified the prototype to switch from the ILI9341 to the new ST7796, I forgot to cut an inner trace. That left the display SPI MOSI pin connected to the display touch I2C SDA pin. Both the touch and display worked fine despite the combined signal looking like this, which is another head-scratcher.

[Attached image: 1749680336034.png]
 
I forgot to change the CS pin from 22 to 10.
I've pushed up a commit which gives you a way to switch the hardware over to the Mini Platform using one macro; change #define notMINI_PLATFORM to #define MINI_PLATFORM. If I've missed anything or got it wrong, please drop me a PR and I'll fix it.
 
I had to track down ST7735_t3_font_OpenSans.h:

"C:\Users\MEMEME\AppData\Local\Arduino15\packages\teensy\hardware\avr\0.60.4\libraries\ST7735_t3\fonts.zip"

Builds and runs with these notes:
Code:
1.8.19\\hardware\\teensy\\avr\\libraries\\SD\\src" "-IT:\\T_Drive\\arduino-1.8.19\\hardware\\teensy\\avr\\libraries\\SdFat\\src" "-IT:\\T_Drive\\arduino-1.8.19\\hardware\\teensy\\avr\\libraries\\SerialFlash" "-IT:\\T_Drive\\arduino-1.8.19\\hardware\\teensy\\avr\\libraries\\Wire" "T:\\TEMP\\arduino_build_250266\\sketch\\80x80v3.cpp" -o "T:\\TEMP\\arduino_build_250266\\sketch\\80x80v3.cpp.o"
T:\TEMP\arduino_modified_sketch_751414\Audio_TFT_Display_torture_test.ino:107: warning: "INVERT_DISPLAY" redefined
  107 |   #define INVERT_DISPLAY true
      |
T:\TEMP\arduino_modified_sketch_751414\Audio_TFT_Display_torture_test.ino:95: note: this is the location of the previous definition
   95 |   #define INVERT_DISPLAY false
      |
Compiling libraries...
Compiling library "ST7735_t3_H4"

...

T:\T_Drive\tCode\libraries\ST7735_t3_H4\src\ST7735_t3.cpp: In member function 'void ST7735_t3::process_dma_interrupt()':
T:\T_Drive\tCode\libraries\ST7735_t3_H4\src\ST7735_t3.cpp:4163:41: warning: variable 'trigSrc' set but not used [-Wunused-but-set-variable]
 4163 |                                 uint8_t trigSrc = _spi_hardware->tx_dma_channel; // assume framebuffer -> screen
      |                                         ^~~~~~~


And a serial monitor snippet:
Code:
========================
Check fillRect: 52
Check drawChar_bg: 473
Check drawChar: 55
Check drawFontChar_bg: 699
Check drawFontChar: 457
Check drawAAChar_bg: 1839
Check writeRect: 287
Check writeSubImageRect: 91
Check writeSubImageRectBytesReversed: 84
Check writeRect1BPP: 483
Check writeRect2BPP: 447
Check writeRect4BPP: 501
Async start was OK
Check fillRect: 48
Check fillRect: 49
Check fillRect: 48
Check fillRect: 48
Check fillRect: 49
Took 1ms to stop async update
========================
Check fillRect: 49
Check drawChar_bg: 538
Check drawChar: 54
Check drawFontChar_bg: 699
Check drawFontChar: 394
Check drawAAChar_bg: 1903
Check writeRect: 287
Check writeSubImageRect: 92
Check writeSubImageRectBytesReversed: 83
Check writeRect1BPP: 425
Check writeRect2BPP: 446
Check writeRect4BPP: 500
Async start was OK
Check fillRect: 49
Check fillRect: 48
Check fillRect: 47
Check fillRect: 51
Check fillRect: 48
Took 0ms to stop async update
========================
 
Oops - F_CPU was left at 150 MHz for this and the prior post! Recalibrating.

NOTE: Audio works at ALL SPI speeds, but the display and serial monitor output work ONLY at 16 and 24 MHz
//#define ST7735_SPICLOCK 80'000'000
//#define ST7735_SPICLOCK 40'000'000
#define ST7735_SPICLOCK 24'000'000
//#define ST7735_SPICLOCK 16'000'000

Serial monitor output at 24 MHz SPI:
Code:
========================
Check fillRect: 49
Check drawChar_bg: 473
Check drawChar: 54
Check drawFontChar_bg: 704
Check drawFontChar: 393
Check drawAAChar_bg: 1897
Check writeRect: 292
Check writeSubImageRect: 92
Check writeSubImageRectBytesReversed: 84
Check writeRect1BPP: 421
Check writeRect2BPP: 448
Check writeRect4BPP: 564
Async start was OK
Check fillRect: 48
Check fillRect: 51
Check fillRect: 51
Check fillRect: 49
Check fillRect: 48
Took 1ms to stop async update
========================
 
80 MHz SPI works at F_CPU 600 MHz, as does audio playback:

Code:
========================
Check fillRect: 21
Check drawChar_bg: 119
Check drawChar: 13
Check drawFontChar_bg: 176
Check drawFontChar: 100
Check drawAAChar_bg: 462
Check writeRect: 81
Check writeSubImageRect: 27
Check writeSubImageRectBytesReversed: 24
Check writeRect1BPP: 106
Check writeRect2BPP: 113
Check writeRect4BPP: 126
Async start was OK
Check fillRect: 22
Check fillRect: 22
Check fillRect: 22
Check fillRect: 21
Check fillRect: 21
Took 0ms to stop async update
========================
 
New update pushed. We now have the ability to update a clipped area of the display asynchronously:
  • create a framebuffer
  • create an intermediate buffer with useIntermediateBuffer(#bytes) - I used 1920 bytes, enough for 2 full screen lines
  • write to the framebuffer
  • set the clip area
  • call updateScreenAsync(false,true,true) - the last true says to respect the clip area
  • reset the clip area before doing further output
Example is in the test as USAGE_MODE 8.
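The 1920-byte figure in the list above is consistent with two 480-pixel lines at 2 bytes per pixel. A minimal sketch of that arithmetic, assuming RGB565 pixels and a 480-pixel line width (neither is stated above); intermediateBufferBytes is a made-up helper for illustration, not a library function:

```cpp
#include <cstddef>
#include <cstdint>

// Assumptions, not taken from the library: 480 pixels per screen line
// and RGB565, i.e. one uint16_t per pixel.
constexpr std::size_t kLineWidthPixels = 480;
constexpr std::size_t kBytesPerPixel = sizeof(std::uint16_t);

// Bytes needed to hold the given number of full screen lines.
constexpr std::size_t intermediateBufferBytes(std::size_t lines) {
    return lines * kLineWidthPixels * kBytesPerPixel;
}

// Two full lines match the 1920 bytes quoted above.
static_assert(intermediateBufferBytes(2) == 1920, "2 lines x 480 px x 2 bytes");
```

The result would then be the value passed to useIntermediateBuffer(); other line counts scale linearly.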

For audio to work you need a modified version of the DMA utilities, which can be found here; with luck this will get into Teensyduino at some point. In most cases you will also need to call forceDMAinterruptPriority(224) after the first async update has been started, to ensure the interrupt priority is low. I found that because the 32 DMA channels share 16 interrupts, if the SPI and audio hardware end up on the same interrupt then the priority won't necessarily be correct. It's all a bit messy...
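To illustrate the shared-interrupt point: a small sketch, assuming channel n and channel n+16 are the pair that lands on one interrupt (that is my understanding of the i.MX RT1062 eDMA, but check the reference manual). The helper names are made up for illustration and are not part of the DMA utilities:

```cpp
#include <cstdint>

constexpr std::uint8_t kDmaChannels = 32;    // from the post above
constexpr std::uint8_t kDmaInterrupts = 16;  // from the post above

// Index (0..15) of the interrupt a channel uses, ASSUMING n and n+16 pair up.
constexpr std::uint8_t sharedInterruptIndex(std::uint8_t channel) {
    return channel % kDmaInterrupts;
}

// True if two channels would end up on the same interrupt, and therefore
// share a single priority setting.
constexpr bool shareInterrupt(std::uint8_t a, std::uint8_t b) {
    return sharedInterruptIndex(a) == sharedInterruptIndex(b);
}

static_assert(shareInterrupt(3, 19), "channels 3 and 19 pair up");
static_assert(!shareInterrupt(3, 4), "neighbouring channels do not");
```

If the SPI and audio channels come out as a pair like this, whichever code sets the priority last wins for both, which is why forcing it after the async update has started can matter.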
 
Sure, here you go. Note this is a "snapshot in time", so it could go out of date if @jmarsh finds bugs or needs to make changes. The .zip includes a link to the exact commit - obviously don't put that in your cores/teensy4, just the code!
 

Attachment: pre-emptible DMA.zip
I'm hoping to make some changes to streamline the user experience with this, but those are unlikely to touch the basic principles of how the clipped async updates are done.
 
Thanks! Had to force a clean build as it was using cached parts that were missing updates :( and failing :(

It is running the demo here, except the annoying RED LED on the MINI is lit :) and the display is rotated 180° the wrong way :)

The updates are done in visible parts and not really fast? Not sure what to expect - but the result images just cycle colors and are sensible looking.

Oops - EDIT: I forgot to switch to MINI after the library rewrite :( Rotation is now correct :)
The background is now black with white bars instead of white with black bars?
 
That sounds about right. I tried to follow what @KenHahn said about the changes he had to make for the MINI, so it should be rotated correctly, not have the red LED illuminated, and have the inversion set correctly.

Indeed, the clip update is done as a weird sort of mosaic with all different-sized rectangles, with 50ms delay between each one; the actual microseconds taken for each rectangle is output on the serial monitor. Seems to be about 1.12µs/pixel, so 12% extra time but of course the CPU can largely be getting on with other stuff while the DMA and interrupt shenanigans proceed :).
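The "12% extra" figure works out if the baseline is 1µs/pixel. A minimal sketch of that arithmetic, assuming 16 bits per pixel at a 16 MHz SPI clock (the SPI clock used for the 1.12µs measurement is not stated above); both function names are made up for illustration:

```cpp
// Ideal transfer time per pixel in microseconds, ignoring all overhead.
constexpr double idealMicrosPerPixel(double spiClockHz, int bitsPerPixel) {
    return bitsPerPixel * 1.0e6 / spiClockHz;
}

// Extra time of a measured figure over the ideal one, as a percentage.
constexpr double overheadPercent(double measuredMicros, double idealMicros) {
    return (measuredMicros / idealMicros - 1.0) * 100.0;
}

// Assumed baseline: 16 bits per pixel at 16 MHz gives 1 us per pixel.
constexpr double kIdeal = idealMicrosPerPixel(16.0e6, 16);
// Measured 1.12 us/pixel from the post above, i.e. roughly 12% over ideal.
constexpr double kOverhead = overheadPercent(1.12, kIdeal);
```

At a different SPI clock the ideal figure changes, so the same measured time would give a different overhead percentage.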

Yes, I added a grid of white lines on a black background.
 
New commit, can do async update of the changed (rectangular) area of the framebuffer, which saves you keeping track yourself. See update mode 9 for an example.

Thus far this can’t also be continuous, but who knows what tomorrow will bring?
 
Mini looks very good! Smoother update across the blocks! Nice at 16 MHz SPI and all good at 80 MHz SPI

16 MHz SPI:
Code:
========================
Async check fillRect: 22; took 7230us
Async check drawChar_bg: 120; took 6927us
Async check drawChar: 14; took 8629us
Async check drawFontChar_bg: 177; took 11854us
Async check drawFontChar: 99; took 9720us
Async check drawAAChar_bg: 462; took 10395us
Async check writeRect: 81; took 114us
Async check writeSubImageRect: 27; took 7237us
Async check writeSubImageRectBytesReversed: 20; took 7207us
Async check writeRect1BPP: 106; took 7239us
Async check writeRect2BPP: 114; took 7205us
Async check writeRect4BPP: 158; took 7200us
========================

80 MHz SPI:
Code:
========================
Async check fillRect: 22; took 1867us
Async check drawChar_bg: 120; took 1785us
Async check drawChar: 14; took 2211us
Async check drawFontChar_bg: 176; took 3002us
Async check drawFontChar: 99; took 2471us
Async check drawAAChar_bg: 462; took 2651us
Async check writeRect: 81; took 114us
Async check writeSubImageRect: 20; took 1872us
Async check writeSubImageRectBytesReversed: 27; took 1871us
Async check writeRect1BPP: 106; took 1845us
Async check writeRect2BPP: 112; took 1877us
Async check writeRect4BPP: 126; took 1877us
========================
 
Note two other examples on Mini working properly!

Notably the one of earlier focus: P3_03_TFT_DispBig.ino running faster than ever with rare updates taking double digits for the bars - with min update 2 ms and MAX at 12 ms (SPI at 80 MHz)
 
Earlier runs were probably at the default 16 MHz; at that speed the current library gives MIN=12ms and MAX=56ms, and audio plays with no issues.
 
Polling for thoughts here, plus maybe a bit of help
  • any opinions on the asynchronous callbacks, particularly the “half-done” one? I think it’s impossible to be 100% accurate with that, so it’s likely to happen at just over the halfway point. Anyone with real-life use cases?
  • can anyone check I’ve not broken Teensy 3.x support? I don’t have any here
  • …I’m sure I had something else to ask … it’ll have to be another post, when I remember what it was
 
No help from here... I only have one of the ST7796s.
Of course the library also affects other similar displays such as the ST7735; I assume that is the concern? I'm not sure I have any of those here either.

@KurtE or @mjs513 might be inclined to check, given their prior work; I'm not sure how much of this ever had T_3.x support.
 
There's a ton of conditional compilation relating to Teensy models (3.5, 3.6, 4.x), which I've tried to avoid altering. The ST7789 and ST7796 files are pretty much stubs, with just the hardware-specific initialisation code; the meat is all in the ST7735 files which I've been editing.

EDIT: so, in theory these changes should be good for other displays with T4.x; just not sure about 3.x
 
The halfway interrupt was optional, and was added for use cases like FrankB's, where he was using continuous frame updates. When the update got halfway through the frame, he would load that half of the next frame into the top half of the buffer, and when it completed the bottom half he would then fill in that part. Exactness of the interrupt at the exact middle was not important; being able to fill in the new frame half before the screen update cycled back was the only requirement.
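A rough sketch of that refill pattern, as I understand the description above. Everything here is hypothetical (the FrameEvent enum and refillOnEvent are made up for illustration, and this is not FrankB's code or the library's callback API); it only shows which half gets refilled on which event:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical event types a half-frame / end-of-frame callback might report.
enum class FrameEvent { HalfDone, FrameDone };

// Copy the half of `next` that the display update has just finished sending
// into the live framebuffer, so it is ready before the update cycles back.
void refillOnEvent(FrameEvent ev,
                   std::vector<std::uint16_t>& live,
                   const std::vector<std::uint16_t>& next) {
    assert(live.size() == next.size());
    const std::size_t half = live.size() / 2;
    // HalfDone: top half has been sent, so it is safe to overwrite.
    // FrameDone: bottom half has been sent, so refill that part.
    const std::size_t begin = (ev == FrameEvent::HalfDone) ? 0 : half;
    const std::size_t end = (ev == FrameEvent::HalfDone) ? half : live.size();
    for (std::size_t i = begin; i < end; ++i) {
        live[i] = next[i];
    }
}
```

Note that nothing here depends on the "half" event arriving at exactly the midpoint, only on it arriving after the top half has gone out and before the update wraps around.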
 
Thanks. My new scheme can, for some settings, finish DMA, interrupt and relinquish the SPI bus momentarily, which allows the stock audio playback from SD card to work, at the cost of a slightly reduced update speed. But if that happened every 3 lines on a display with 320 lines, then halfway is after 53.333 chunks, which can't (easily) be done; best to leave it until 54 chunks, or 162 lines, have been output.
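The chunk arithmetic above can be sketched as follows. This is just the rounding described in the post, not the library's implementation; HalfwayPoint and halfwayBoundary are names made up for illustration:

```cpp
#include <algorithm>
#include <cassert>

struct HalfwayPoint {
    int chunks;  // whole chunks output when the "half done" point is reached
    int lines;   // display lines output at that point
};

// Integer division rounding up, for positive operands.
constexpr int ceilDiv(int a, int b) { return (a + b - 1) / b; }

// First whole-chunk boundary at or just past the half-frame point.
HalfwayPoint halfwayBoundary(int totalLines, int linesPerChunk) {
    assert(totalLines > 0 && linesPerChunk > 0);
    // ceil((totalLines / 2) / linesPerChunk), done without fractions.
    const int chunks = ceilDiv(totalLines, 2 * linesPerChunk);
    // Clamp in case a single chunk is larger than the whole frame.
    const int lines = std::min(chunks * linesPerChunk, totalLines);
    return {chunks, lines};
}
```

For 320 lines in 3-line chunks this gives 54 chunks and 162 lines, matching the figures above; with 4-line chunks the boundary falls exactly on the midpoint (40 chunks, 160 lines).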
 
Pushed some changes:
  • continuous updates of a pre-set clip rectangle now working
  • callback on async end-of-frame implemented; should work for whole screen, clipped area, and with continuous updates
Things I think should still be on the roadmap:
  • need to think about changing the update area during continuous updates, including using the "changed area" capability
  • callback at half-frame point
NOTE: pre-emptible DMA as mentioned above / posted in #108 is still required. Due to issues with interrupt clashes, it attempts to claim a unique pre-emptible DMA channel and the corresponding non-pre-emptible one. This wastes one DMA channel, but on the up-side I found the original code was squatting on a bunch of DMA channels it might never use, and fixed that, so there's very likely a net gain!
 