ST7796 Teensyduino support

I had an inspiration, and lied about the screen resolution, which made the asynchronous update spring into life. The truth is 320x480, but if I say 320x240 it works OK. I'd have a crack at fixing it myself, but the DMA code makes my head spin :eek:
Join the crowd. It does for me as well, every time I look at it. Especially since DMA works differently for T4.x, 3.5, 3.6, LC, ...
On T4.x if I remember correctly: With something this big you can not do it all with one simple DMA structure, but need to chain them:
I believe you can do a max of 32K per chunk: 320*240*2 = 153600; 153600 / 32768 = 4.69... round up to 5
With 320*480*2 = 307200 you now need: 10 (307200 / 32768 = 9.4, rounded up)... And all of the stuff to set all of that up...

Right now playing with zephyr...

This 4" display with more area and resolution is quite nice, and it seems it has been running the example for 10 hours since I posted above at "80 MHz". I'm not sure what that really resolves to: something faster than 40 for sure, based on the numbers, if not all of 80! It also has a very readable, wide off-center viewing angle, as was noted before.

Yep - same resolution as the ILI9488 we have had around for awhile now... One nice thing is that it does support 16 bit mode by SPI whereas
the ILI9488 does not.

Will play more with it at some point.
 
That would be good.

I've followed the code just far enough to determine that there's an assumption that if 2 chunks would be too big, then 3 will be fine. However, this isn't true, and it attempts to do 3x transfers of 51200 pixels, which stuffs up the word count because it's still over 32768...
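
To make the chunk arithmetic concrete, here is a minimal sketch of picking the chunk count by searching rather than assuming "3 will be fine". It is not the library's code: the function name and the 32767 limit are assumptions for illustration (the thread only says that a count of 32768 or more breaks), and it assumes the chunks must all be the same size.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limit on the transfer count a single DMA setting can carry.
// The thread reports trouble at 32768 and above, so 32767 is used here.
constexpr uint32_t kMaxCountPerSetting = 32767;

// Return the smallest number of equal-sized chunks that covers totalPixels
// with every chunk at or below maxPerChunk. Returns 0 only if totalPixels is 0.
uint32_t fewestEqualChunks(uint32_t totalPixels,
                           uint32_t maxPerChunk = kMaxCountPerSetting) {
    assert(maxPerChunk > 0);
    if (totalPixels == 0) return 0;
    // Start at the ceiling of totalPixels / maxPerChunk: fewer chunks cannot fit.
    uint32_t n = (totalPixels + maxPerChunk - 1) / maxPerChunk;
    // Step up until the split is exact; n == totalPixels always divides evenly.
    while (totalPixels % n != 0) ++n;
    return n;
}
```

With these assumptions, a 320x240 screen (76,800 pixels) needs 3 chunks of 25,600 pixels, while 480x320 (153,600 pixels) needs 5 chunks of 30,720 pixels, because 3 chunks would be 51,200 pixels each and over the limit.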

When you get a chance to have a play, it would perhaps be cleaner to interrupt out at the end of a chunk to set up the next DMA transfer. Otherwise there's kind of a risk that the number of DMASettings in the ST7735DMA_Data just keeps growing as displays get bigger!

If the max chunk size could be set by the user, that would potentially be an opportunity to modify the ISR with a "mid transaction break" such as I'm looking at. I believe (without evidence, currently) that this could be useful in scenarios other than SD audio playback.
 
Will play more with it at some point.
This may be awhile:
I am not suggesting that this will be today or tomorrow or ... Could be weeks, months...

I am not overly inspired... Hopefully that will come back. And if it does I will probably mainly fix the things that are broken, and not rewrite
large portions of it.

Side comment:
I still don't understand why some of the code is the way it is.
That is, why does the Audio library cause the SPI code on all SPI objects to go into this mode where it is screwing around with the
beginTransaction/endTransaction processing for every SPI object, regardless of whether it applies to that object?

That is, if the SD is on SDIO, it should not touch any of the SPI processing. If it is on an SPI object, it should only impact the code
using that SPI object. i.e. why is it static methods/data?

One answer to that is that you don't know what the SD is on.... BUT the SD code, the SPI code and the AUDIO code are all
owned by Paul, so one should be able to add something to make it possible...

Ability to get called back while doing DMA:

Code:
void setFrameCompleteCB(void (*pcb)(), bool fCallAlsoHalfDone = false);
Not sure how this is implemented right now... Mostly was used for continuous.
But did have the ability to interrupt at half way or so... On some this may interrupt on each setting completed...
 
I still don't understand why some of the code is the way it is.
That is, why does the Audio library cause the SPI code on all SPI objects to go into this mode where it is screwing around with the
beginTransaction/endTransaction processing for every SPI object, regardless of whether it applies to that object?

That is, if the SD is on SDIO, it should not touch any of the SPI processing. If it is on an SPI object, it should only impact the code
using that SPI object. i.e. why is it static methods/data?

One answer to that is that you don't know what the SD is on.... BUT the SD code, the SPI code and the AUDIO code are all
owned by Paul, so one should be able to add something to make it possible...
Because of this thread, as mentioned in post #26. Yes, it's fixable. Yes, in post #3 of that thread, dated June 18th 2023, Paul stated:
"Today I do not have a solution for this problem. I just want to confirm it is on my list of high priority issues to fix for 1.59 or 1.60."
Obviously a lot of water has passed under the bridge since that date nearly two years ago ... but he's neither fixed it, nor posted anything further to suggest that this is in fact a high priority but that, due to calls on his time, he'd like to delegate the fix to someone else.

I will say again ... the audio is a metric, but it is not the only reason we might want to adopt a practice of not hogging the SPI bus for long periods.

I am not suggesting that this will be today or tomorrow or ... Could be weeks, months... I am not overly inspired...
Understood. If I happen to be inspired myself, and don't get distracted, I may delve into it. Please don't get the impression I think you're solely responsible for fixing this, and derelict if you don't! I'm not thinking that way at all.
 
Asynchronous updates are now working for a 480x320 display, on this branch - not the same as previous, but branched from it. These are not finished for my purposes - I'd need to add the mid-transaction break, which is going to take some thinking about. A corollary of that will probably be the ability to asynchronously update only part of the display, which will also be non-trivial, I suspect.
 
Out of curiosity, I went back to the original problem statement from when Defragster first detected this issue with the Audio Tutorial 3-3. I put everything back to original except I increased SPI clock to 80MHz and that also resolved that particular issue just because the display wasn't tying up the bus for as long.

I am not suggesting that as the fix in lieu of the library improvements being made, but this display can obviously handle pretty high data rates and being able to increase the display clock can be handy in many display intensive applications. It would be really nice to add a user function to be able to set the TFT SPI speed without having to hack the ST7735 library which is difficult for many people. Something along the lines of tft.setSPISpeed(80000000);
 
The LPSPI modules aren't meant to be driven anywhere close to that though, their documented max is only around 30MHz.
 
The LPSPI modules aren't meant to be driven anywhere close to that though, their documented max is only around 30MHz.
Good point. Your comment led me to this post you were on which I had not previously seen that has some good background info on LPSPI bus speed for anyone else interested in such things: http://forum.pjrc.com/index.php?threads/lpspi-in-teensy-4-0-4-1-maximum-clock-frequency.76495/

It may make sense to increase the default from 16MHz to 24/30MHz in the library to stay within the safe bounds of the NXP spec, but I am still a fan of giving the user the option to adjust up from there if their setup will handle it.
 
It would be really nice to add a user function to be able to set the TFT SPI speed without having to hack the ST7735 library which is difficult for many people. Something along the lines of tft.setSPISpeed(80000000);
Done on the big-screen-t4 branch. A bit hacky because the underlying SPISettings class doesn't provide a method just to change the clock, so I've had to assume SPI_MODE0. This is consistent with the existing library code, though not with some example code.

It may make sense to increase the default from 16MHz to 24/30MHz in the library
I won't do that. If this ever gets as far as a PR, Paul will not want anything in there which suddenly breaks existing code / hardware.
 
I'd need to add the mid-transaction break, which is going to take some thinking about. A corollary of that will probably be the ability to asynchronously update only part of the display, which will also be non-trivial, I suspect.
Congrats on getting stuff to work...

As I mentioned earlier, most of the drivers had the option to do a callback when the frame completes and, optionally, when it is half complete. With
some of them I think it gives finer-resolution partial frames... But this still left the transfer going on (it continued to the next item in the
DMA chain). You can of course simply break up your transfers: add the disable on completion, do a callback or whatever code you
want, then reinitialize to do the next portion of your update.

As for partial frames: I have a version in ILI9341_t3n where I was experimenting with this... but in a sort of hacky way, as the
memory for the transfer is not contiguous... In that version I have two temporary buffers, where I copy in as many of the
actual pixels to be output as will fit, and start it up. It also initialized the 2nd buffer with the next blob. I had it
interrupt at completion of each blob and fill in the next memory region... It works. It was based off of the ILI9486, where I
had to do something like this, as with that display you can not output RGB565 over SPI. So instead of just copying the data
into the temporary buffer, it converted each one to RGB888 (or 18 bit which is same format of data transferred)...
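
For anyone unfamiliar with that conversion step, here is a minimal sketch of expanding RGB565 pixels into 3 bytes each while filling a temporary buffer. It is not taken from the ILI9486 driver: the function names are made up for illustration, and replicating the high bits into the low bits is one common choice (a plain shift would also do for an 18-bit panel that ignores the low two bits of each byte).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expand one RGB565 pixel into three bytes (R, G, B).
// High bits are replicated into the low bits so full scale maps to 0xFF.
void rgb565ToRgb888(uint16_t px, uint8_t out[3]) {
    const uint8_t r5 = static_cast<uint8_t>((px >> 11) & 0x1F);
    const uint8_t g6 = static_cast<uint8_t>((px >> 5) & 0x3F);
    const uint8_t b5 = static_cast<uint8_t>(px & 0x1F);
    out[0] = static_cast<uint8_t>((r5 << 3) | (r5 >> 2));
    out[1] = static_cast<uint8_t>((g6 << 2) | (g6 >> 4));
    out[2] = static_cast<uint8_t>((b5 << 3) | (b5 >> 2));
}

// Fill a temporary output buffer from `count` source pixels.
// `dest` must hold at least 3 * count bytes; returns bytes written.
size_t fillBlobRgb888(const uint16_t* src, size_t count, uint8_t* dest) {
    assert(src != nullptr && dest != nullptr);
    for (size_t i = 0; i < count; ++i) {
        rgb565ToRgb888(src[i], dest + 3 * i);
    }
    return 3 * count;
}
```

In the two-buffer scheme described above, a routine like fillBlobRgb888 would be called from the blob-complete interrupt to refill whichever buffer has just finished being sent.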

As an alternative, one way I would like to try is to set it up so that each DMASetting only outputs one row of data, and try
to configure that setting such that instead of resetting the address to the start of where it output, it would update it to skip N bytes forward.
You could potentially get away with just two settings, but I would probably want more, so as not to need an interrupt per line, especially
in the worst case where you are only outputting 1 pixel per row. But maybe something like 8 DMA settings that are chained.
So the skip forward would need to be for 8 rows of pixels minus how much you output per row... And probably interrupt on the
8th one, where it would decrement how many more rows are left and if necessary update the chain to interrupt and disable after
the correct number of rows were output...

Not sure if that makes sense... But could be interesting.
 
You could do that with two DMASettings chained to each other; the first one would loop over the first X-8 bytes in a line, and the second one would handle the remaining 8 bytes with the loop count set to the total number of lines to process.
 
Thanks folks. Much to consider here, though I'm going to be AFK for a little while now, at least as far as coding is concerned...

I've not re-instated the [half-]complete callbacks as yet, because it's not been 100% clear to me how the final structure is going to look. I do intend to do so, assuming the rest of it gets to a place where it looks coherent.

As noted by @KurtE, the partial frame capability is not easy, because the worst case is 1 pixel wide and thus roughly 1µs per DMA (at 16MHz SPI clock), and not actually worth the effort to set up. Even chaining 8 TCD settings together is only fractionally better - I've no idea what the typical (let alone worst case) interrupt latency might be, but I'd've thought well over 8µs. The current 480x320 full-screen update is divided into 5 chain links of 30,720 pixels each, using the existing 3-setting structure, so there's lots of time to update a TCD before it gets chained to.

I very much like the idea of an intermediate buffer, though. A 300kB buffer is pretty expensive in RAM, but much less so in PSRAM, which I have tested and works OK. I was messing about with minor loops, and came up against the SPI FIFO limit of 16 words; I could use it for the full-screen update (16 pixels in the minor loop, major loop count of 9,600), but couldn't fathom a way of making it work for partial frames without still needing ridiculously frequent interrupts or large numbers of TCD settings. However, given an intermediate buffer, I think one could set up the first TCD to DMA the frame buffer to the intermediate buffer, doing the interleaving, and then the second to stream the intermediate buffer out to the display.
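
As a rough illustration of the intermediate-buffer idea, the sketch below gathers a sub-rectangle of the frame buffer into a contiguous buffer that could then be streamed out in one go. A plain CPU copy stands in for the memory-to-memory DMA step here, and the function name and parameters are hypothetical, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy a w x h rectangle whose top-left corner is (x, y) out of a row-major
// frame buffer into a contiguous destination. Returns the pixels copied.
size_t gatherRect(const uint16_t* frame, size_t frameWidth,
                  size_t x, size_t y, size_t w, size_t h,
                  uint16_t* dest) {
    assert(frame != nullptr && dest != nullptr);
    assert(x + w <= frameWidth);  // rectangle must not wrap past the row end
    for (size_t row = 0; row < h; ++row) {
        // Each source row starts frameWidth pixels after the previous one,
        // while destination rows are packed back to back.
        const uint16_t* src = frame + (y + row) * frameWidth + x;
        std::memcpy(dest + row * w, src, w * sizeof(uint16_t));
    }
    return w * h;
}
```

The caller is responsible for keeping y + h within the frame height and for sizing dest to at least w * h pixels. The gap of (frameWidth - w) pixels between source rows is the same "skip forward" that a per-row DMA setting would have to apply.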
 
The big-screen-t4 branch mostly works for me with the 4" ST7796! Async update wasn't working in the normal library, which is what I needed. However, it doesn't work if turning on continuous updates. Only one-shot. That's fine for me, but just FYI.
 
Thanks. I've switched issue reporting on for that repo, and added that to it. There's a lot of housekeeping stuff like that which will get done either if (a) someone needs it urgently, or (b) I get everything else to a state where I think it might be useful to others. So far it's all very much "in progress", and vying for my attention with other distracting stuff!
 
OK, I've pushed a bunch of changes to the big-screen-t4 branch, which should include continuous updates now working again - use
UPDATE_MODE values of 3 or 7 in Audio_TFT_Display_torture_test.ino to test. Apart from those, there are some preparatory changes in there:
  • updateScreenAsync() now has an optional second boolean parameter, which if set to true stops DMA after each transaction and interrupts; the ISR then starts a new DMA transaction, until the whole screen update is complete
  • setMaxDMAlines(int) takes a line count, which sets the size of each DMA transaction. The count must divide exactly into the number of lines the screen has (for now). If it's not physically possible for the requested number of lines to be output in a single DMA transaction, then the code is intended to use the fewest number of valid transactions
  • attachInterrupt(int) actually just sets the DMA interrupt priority (I should, and probably will, rename this). This is in preparation for ending and re-starting the SPI transaction in the ISR, which if the priority is 224 (numerically higher, and therefore lower priority, than the Audio priority of 208) should allow the audio update to run. This should also work for any other ISR-based SPI clients which have used SPI.usingInterrupt() to avoid bus conflicts
 
More updates...
  • attachInterrupt() renamed to setDMAinterruptPriority()
  • mid-transaction break added in DMA+IRQ asynchronous updates, i.e. those started with updateScreenAsync([true|false], true);
  • updates to torture test
Brief testing so far seems to show we can now get clean audio playback from SD, despite the buggy AudioPlaySdWav code, while using asynchronous screen updates - provided you get your settings right! Key points:
  • use setDMAinterruptPriority(224) (or 240) to lower the DMA interrupt priority below that of the audio update (208)
  • pick a value for setMaxDMAlines() to suit your SPI bus speed. For example, 16MHz gives 1µs/pixel; 480 pixel lines means 480µs/line; so setMaxDMAlines(2) will interrupt every two lines, or 960µs, and that will therefore be the maximum latency the audio update interrupt will experience.
  • use updateScreenAsync(?, true); - ?=false (non-continuous updates) is probably better for finer control of SPI bus use
Next ... asynchronous updates of only part of the screen :) wish me luck!
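
The latency arithmetic in the key points can be written down as a small helper for trying other bus speeds and line counts. This is only a back-of-envelope sketch: the function name is made up, it assumes 16 bits per pixel, and it ignores any setup overhead between transactions.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds the SPI bus is held by one DMA transaction of `lines` rows,
// assuming the bus clocks out one bit per SPI clock with no gaps.
double transactionMicros(uint32_t spiHz, uint32_t pixelsPerLine,
                         uint32_t lines, uint32_t bitsPerPixel = 16) {
    assert(spiHz > 0);
    const double bits = static_cast<double>(pixelsPerLine) * lines * bitsPerPixel;
    return bits * 1e6 / static_cast<double>(spiHz);
}
```

For example, transactionMicros(16000000, 480, 2) gives 960µs, matching the setMaxDMAlines(2) figure above; at 30MHz the same two lines would take about 512µs.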
 
@h4yn0nnym0u5e I finally had a chance to download your big-screen-t4 branch. With no changes to my test sketch, it compiled and ran fine and audio was good.

I then ran your Audio_TFT_Display_torture_test. It initially ran OK but there was no audio after a power cycle. I traced that down to
sgtl5000_1.setAddress(HIGH);. I am using a D2 type SGTL5000 with one address, so this command causes confusion. Setting it to LOW or commenting out the line fixed the problem.

You mentioned to use Update Mode 7 or 3. In my setup modes 0, 1, 6 and 7 worked OK. The others were bad in some fashion.

While testing, I did occasionally see the display image freeze immediately after downloading. The static image left on the screen during the software update remained on the screen after the download was completed though audio and serial output restarted or in some cases the screen would just remain black. Redownloading the program had no effect. Power cycling would always clear.

Seemed like something in the display wasn't getting reinitialized properly on occasion, perhaps depending on what state it was left in when the download was started. Fairly reproducible on my setup by repeatedly pressing the program button. Noticed this on both update modes 0 and 7.
 
Ah, yes, sorry, I’m always doing that. My poor long-suffering test audio adaptor has had its I2C address changed - I really should change it back, or fit a jumper.

I haven’t been rigorous about re-testing all the update modes as I go along; when I get back to this I’ll aim to do that, it’s quite likely I’ve broken some things.

Could the screen freeze be to do with the fact I haven’t specified a /RST pin?
 
Still between blocking issues here.

I did run the display with audio the other day at 150 MHz for a test and it worked ... in the end.

First build and upload ran off color on END of display (wrong orientation and green? IIRC) ? Then changed sketch and speed and that ran fine - then back to the test and it ran fine at 150 MHz with working audio. It went away with that so not repeated - not sure if power cycled? But it was on Ken's Mini hardware w/ST7796 and updated version that day?

So, there is some instance of 'sticky setup' in display?

Had to restart today and not opened IDE's again so some 9+ days ago?
 
I've pushed another commit which should fix the mode 4 (whole screen in one DMA) updates, and has the sgtl5000.setAddress(HIGH) commented out; I've jumpered that back to the default address on my hardware, so it shouldn't be a problem for future commits.

Note that some update modes do still interfere with audio - that's to be expected.
Still between blocking issues here.

I did run the display with audio the other day at 150 MHz for a test and it worked ... in the end.

First build and upload ran off color on END of display (wrong orientation and green? IIRC) ? Then changed sketch and speed and that ran fine - then back to the test and it ran fine at 150 MHz with working audio. It went away with that so not repeated - not sure if power cycled? But it was on Ken's Mini hardware w/ST7796 and updated version that day?

So, there is some instance of 'sticky setup' in display?

Had to restart today and not opened IDE's again so some 9+ days ago?
I'm not planning on testing outrageous SPI bus speeds any time soon, as (a) there's the potential to make it appear audio is working when in fact it's marginal, and (b) if it stops working I don't know if it's a bug or just out-of-spec speed.

As noted above, I don't have /RST wired or configured in the torture test, so yes, "sticky" behaviour is entirely possible, I've certainly seen a need to cycle power if e.g. the Teensy had ILI9341 code in it when powered up - that seems to get the ST7796 wedged.
 
Could the screen freeze be to do with the fact I haven’t specified a /RST pin?
On my setup, reset is pulled high and I don't specify a RST pin in my software and I have not come across the image freezing during download issue before. That includes multiple downloads with your current library to try to duplicate the issue.

It feels like the torture test is touching something in either the display or Teensy that isn't getting cleaned up after a soft reset. Once in this state, redownloading the torture test does not clear it. However if I then download my software without a power cycle it does clear it.

BTW, all my testing is being done at default 16MHz SPI speeds to remove that as a variable. I also verified that screen rotation made no difference since I normally use rotation 3.

Here is a case where it looks like the image slid sideways and then froze when this happened.
[Attached image: 1749653575679.jpeg]
 
It feels like the torture test is touching something in either the display or Teensy that isn't getting cleaned up after a soft reset. Once in this state, redownloading the torture test does not clear it. However if I then download my software without a power cycle it does clear it.
Hmm. I'm not aware of anything I'm omitting from the torture test startup that I should be including. One would hope that tft.init() would include everything needed, but maybe not. Is your software online somewhere I can look at it to see what might be different?

Here is a case where it looks like the image slid sideways and then froze when this happened.
That definitely looks like it was halfway through an update from framebuffer to screen when the firmware download kicked in. Seems possible that a write got corrupted and maybe set an offset register value, and maybe a bunch of other registers which aren't supposed to be touched. TBH I haven't looked at the datasheet - perhaps the duff write could have put it into I2C or parallel mode? But then ... why would writing your software recover it?

150 MHz was F_CPU! It was in response to a post about failing audio at some speeds changed during operation with set_arm_clock
Ah, that makes sense ... I know people like to push SPI speeds, but 150MHz did seem a tad ambitious :ROFLMAO:
 
why would writing your software recover it?
Never mind.... I was being an idiot. When I updated the torture test, I forgot to change the CS pin from 22 to 10. After the change, I cannot duplicate the error.

How it worked at all with the wrong CS pin specified is a whole different question.
 