ILI9488_t3 - Support for the ILI9488 on T3.x and beyond...

The library with which the display works on the Teensy contains the following:

Code:
inline void lcdWriteReg (uint8_t reg)
{
digitalWrite (LCD_DC, LOW);
SPI.transfer (0);
SPI.transfer (reg);
}

How can I write 0 in t3n lib?
I just tried it like this:
Code:
void writecommand_cont(uint8_t c) __attribute__((always_inline)) {
		uint16_t d = c;
		maybeUpdateTCR(LPSPI_TCR_PCS(0) | LPSPI_TCR_FRAMESZ(15) | LPSPI_TCR_CONT);
		_pimxrt_spi->TDR = d;
		pending_rx_count++;	//
		waitFifoNotFull();
	}

EDIT: I added the working lib to GitHub https://github.com/sepp89117/Waveshare_ILI9486_t4
I would be happy if you could help me get your buffered lib to operate the display.
 
Looks like you may want to do the following but @KurtE can confirm:
Code:
		beginSPITransaction();
		setAddr(x, y, x, y);
		writecommand_cont(ILI9488_RAMWR);   /// this ILI9488_RAMWR would be replaced with REG or use LAST depending on the function
		write16BitColor(color, true);
		endSPITransaction();
This from the drawpixel function.

You might want to look at it from a function comparison basis or compare registers.

Now if it's really an ILI9486, it might be more compatible with the ILI9488 library. You might want to see if that works out of the box. Pretty much the same functions that are in the ILI9341 library are in the ILI9488 library.
 
@mjs513 - Sounds like you are having some fun... Hopefully will get back to this display soon!

@sepp89117 - As we mentioned, no idea. What version of Teensyduino are you using? The latest released code has an SPI update that allows a longer time from CS assert to the start of the signal. Maybe that is needed?

Maybe some of their init commands run slower in Python, whereas on the T4 they run faster... Note: with most of our displays there are delays built in after some of the init strings, to give the display time to process the previous command... You might want to compare your init to the existing ones and see if you need to build in similar delays. Or sometimes it is just bad contacts on jumper wires that give you a lot of grief.

I am now using the latest version of Teensyduino and Arduino IDE.
There are no delays in the working lib that I have not transferred to the t3n.
When I insert delays in beginSPITransaction or endSPITransaction or writecommand_cont or waitTransmitComplete nothing works anymore.
The wiring is good.
 
You don't need delays like they do. Their implementation is for the RPi and is completely different. Adding delays will do exactly what you are seeing.
 
I have tested ILI9488 and HX8357 lib out of the box and with customized init.
I also noticed the RAMWR in the Waveshare lib. But only in the following functions:
Code:
inline void lcdWriteDataRepeat(uint16_t data, unsigned long count)
	{
		lcdWriteReg(0x2C); //RAMWR
		digitalWrite(LCD_DC, HIGH);

		for (unsigned long i = 0; i < count; i++)
		{
			SPI.transfer16(data);
		}

	}

	inline void lcdWriteDataCount(uint16_t *pData, unsigned long count)
	{
		lcdWriteReg(0x2C); //RAMWR
		digitalWrite(LCD_DC, HIGH);

		while (count--)
		{
			SPI.transfer16(*pData++);
		}
	}

EDIT: I made the following adjustment and since then the display reacts to the t3n lib:
Code:
void writecommand_cont(uint8_t c) __attribute__((always_inline)) {
		maybeUpdateTCR(LPSPI_TCR_PCS(0) | LPSPI_TCR_FRAMESZ(7) | LPSPI_TCR_CONT);
		_pimxrt_spi->TDR = 0;
		maybeUpdateTCR(LPSPI_TCR_PCS(0) | LPSPI_TCR_FRAMESZ(7) | LPSPI_TCR_CONT);
		_pimxrt_spi->TDR = c;
		pending_rx_count++;	//
		waitFifoNotFull();
	}
	void writedata8_cont(uint8_t c) __attribute__((always_inline)) {
		maybeUpdateTCR(LPSPI_TCR_PCS(1) | LPSPI_TCR_FRAMESZ(7) | LPSPI_TCR_CONT);
		_pimxrt_spi->TDR = 0;
		maybeUpdateTCR(LPSPI_TCR_PCS(1) | LPSPI_TCR_FRAMESZ(7) | LPSPI_TCR_CONT);
		_pimxrt_spi->TDR = c;
		pending_rx_count++;	//
		waitFifoNotFull();
	}
void writecommand_last(uint8_t c) __attribute__((always_inline)) {
		maybeUpdateTCR(LPSPI_TCR_PCS(0) | LPSPI_TCR_FRAMESZ(7));
		_pimxrt_spi->TDR = 0;
		maybeUpdateTCR(LPSPI_TCR_PCS(0) | LPSPI_TCR_FRAMESZ(7));
		_pimxrt_spi->TDR = c;
		pending_rx_count++;	//
		waitTransmitComplete();
	}
	void writedata8_last(uint8_t c) __attribute__((always_inline)) {
		maybeUpdateTCR(LPSPI_TCR_PCS(1) | LPSPI_TCR_FRAMESZ(7));
		_pimxrt_spi->TDR = 0;
		maybeUpdateTCR(LPSPI_TCR_PCS(1) | LPSPI_TCR_FRAMESZ(7));
		_pimxrt_spi->TDR = c;
		pending_rx_count++;	//
		waitTransmitComplete();
	}
 
okay

I forgot to mention that no read is possible. The display has no MISO. Is that a problem with your lib?
 
I have it! :D
The RPi display now runs on Teensy 4.0 with the t3n lib.
I will now clean up the lib and make it available afterwards.
 
I thought I would try a speed-up that was recently applied to the ili9341_t3 library: when we call setAddr, which sets the rectangle of the screen that we are going to write pixels to, we check whether the Xs or the Ys are the same as in the previous call, and if so we don't output those bytes. Since I already had a T4.1 hooked up to an ILI9488 display, I thought I would try this change on this library to see how much help it would be...

So I ran the graphic test program before I made the change.

Code:
::begin 200010f0 255 255 255
ILI9488_t3::begin - End
ILI9488 Test!
Display Power Mode: 0x0
MADCTL Mode: 0x0
Pixel Format: 0x0
Image Format: 0x0
Self Diagnostic: 0x0
Benchmark                Time (microseconds)
Screen fill              615266
Text                     12121
Lines                    154779
Horiz/Vert Lines         51432
Rectangles (outline)     28580
Rectangles (filled)      1487031
Circles (filled)         182145
Circles (outline)        124185
Triangles (outline)      33620
Triangles (filled)       491821
Rounded rects (outline)  55229
Rounded rects (filled)   1626714
Done!
Then after the change. And there are some improvements.

Code:
::begin 200010f0 255 255 255
ILI9488_t3::begin - End
ILI9488 Test!
Display Power Mode: 0x0
MADCTL Mode: 0x0
Pixel Format: 0x0
Image Format: 0x0
Self Diagnostic: 0x0
Benchmark                Time (microseconds)
Screen fill              615251
Text                     10900
Lines                    154565
Horiz/Vert Lines         51149
Rectangles (outline)     28389
Rectangles (filled)      1487036
Circles (filled)         177769
Circles (outline)        104362
Triangles (outline)      33453
Triangles (filled)       487330
Rounded rects (outline)  54478
Rounded rects (filled)   1626238
Done!

Quick check to see how much:
Code:
	Orig	updated	delta	delta/orig*100
Screen fill            	615266	615251	15	0.00243797
Text                   	12121	10900	1221	10.07342628
Lines                  	154779	154565	214	0.13826165
Horiz/Vert Lines       	51432	51149	283	0.550241095
Rectangles (outline)   	28580	28389	191	0.66829951
Rectangles (filled)    	1487031	1487036	-5	-0.00033624
Circles (filled)       	182145	177769	4376	2.402481539
Circles (outline)      	124185	104362	19823	15.96247534
Triangles (outline)    	33620	33453	167	0.496728138
Triangles (filled)     	491821	487330	4491	0.913137097
Rounded rects (outline)	55229	54478	751	1.359792862
Rounded rects (filled) 	1626714	1626238	476	0.029261444
Sum of times	4862923	4830920	32003	0.658102133

As I suspected, the main speed-up is in text output. I was surprised by the circles... Will play some more.
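For anyone who wants to redo the last column with their own timings, here is a minimal sketch of the "delta/orig*100" calculation. The function name is made up for illustration; it is not part of any of the libraries discussed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: percentage improvement of `updatedMicros` relative to
// `origMicros`, i.e. the "delta/orig*100" column of the table above.
// A positive result means the updated run was faster.
double percentImprovement(uint32_t origMicros, uint32_t updatedMicros) {
    assert(origMicros != 0);  // avoid dividing by zero
    const double delta =
        static_cast<double>(origMicros) - static_cast<double>(updatedMicros);
    return delta / static_cast<double>(origMicros) * 100.0;
}
```

With the Text row (12121 before, 10900 after) this gives roughly 10.07, and with the filled rectangles row (1487031 before, 1487036 after) it comes out slightly negative, matching the table.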
 
Any chance to port these changes to the HX8357 library once you’re satisfied with the results?
Happy to test on the 4.0/4.1. I'm using triangles (the gauge sketch you put together for me) and text on the display.
 
Once @KurtE finishes porting and testing the changes, the same changes will wind up in the HX8357 and ILI9488 libraries as well.
 
I can potentially help out with porting my optimizations to the ILI9488/ILI9486. At the moment I only have a 3.5" Kuman Uno shield that seems to want 5 V signals. My initial attempt to make it work on the Teensy 4 failed. Has anyone here gotten this display (link below) to work on the Teensy 4, maybe with some mod to the power circuit?

https://www.amazon.com/gp/product/B075FP83V5

Larry B.
 
So far the code change is pretty simple. In ILI9488_t3.h I changed setAddr like this:
Code:
 	uint16_t _previous_addr_x0 = 0xffff; 
 	uint16_t _previous_addr_x1 = 0xffff; 
 	uint16_t _previous_addr_y0 = 0xffff; 
 	uint16_t _previous_addr_y1 = 0xffff; 

	void setAddr(uint16_t x0, uint16_t y0, uint16_t x1, uint16_t y1)
	  __attribute__((always_inline)) {
	  	if ((x0 != _previous_addr_x0) || (x1 != _previous_addr_x1)) {
			writecommand_cont(ILI9488_CASET); // Column addr set
			writedata16_cont(x0);   // XSTART
			writedata16_cont(x1);   // XEND	
			_previous_addr_x0 = x0;
			_previous_addr_x1 = x1;
	  	}
	  	if ((y0 != _previous_addr_y0) || (y1 != _previous_addr_y1)) {
			writecommand_cont(ILI9488_PASET); // Row addr set
			writedata16_cont(y0);   // YSTART
			writedata16_cont(y1);   // YEND
			_previous_addr_y0 = y0;
			_previous_addr_y1 = y1;
		}
	}
I did a quick PR of it, but more things can probably be done.
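To get a feel for how many SPI bytes the check can save, here is a rough host-side sketch that only counts bytes instead of talking to a display. `FakeDisplay` and `bytesForVerticalLine` are invented for illustration, and the byte counts are assumptions: one command byte each for CASET, PASET and RAMWR, and a 2-byte pixel (the ILI9488 in SPI mode sends more per pixel, so the real ratio will differ).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the driver: it only counts the bytes that would
// go out over SPI, so the effect of the cache can be estimated without hardware.
struct FakeDisplay {
    bool useCache;
    uint32_t bytesSent = 0;
    uint16_t prevX0 = 0xffff, prevX1 = 0xffff;
    uint16_t prevY0 = 0xffff, prevY1 = 0xffff;

    explicit FakeDisplay(bool cache) : useCache(cache) {}

    void setAddr(uint16_t x0, uint16_t y0, uint16_t x1, uint16_t y1) {
        assert(x0 <= x1 && y0 <= y1);
        if (!useCache || x0 != prevX0 || x1 != prevX1) {
            bytesSent += 1 + 2 + 2;  // CASET + XSTART + XEND
            prevX0 = x0;
            prevX1 = x1;
        }
        if (!useCache || y0 != prevY0 || y1 != prevY1) {
            bytesSent += 1 + 2 + 2;  // PASET + YSTART + YEND
            prevY0 = y0;
            prevY1 = y1;
        }
    }

    void drawPixel(uint16_t x, uint16_t y) {
        setAddr(x, y, x, y);
        bytesSent += 1 + 2;  // RAMWR + one pixel (assumed to be 2 bytes)
    }
};

// Draw a vertical line one pixel at a time and return the SPI byte count.
uint32_t bytesForVerticalLine(bool useCache, uint16_t x, uint16_t length) {
    FakeDisplay d(useCache);
    for (uint16_t y = 0; y < length; ++y) {
        d.drawPixel(x, y);
    }
    return d.bytesSent;
}
```

Under these assumptions a 100-pixel vertical line drawn pixel by pixel costs 1300 bytes without the check and 805 with it, because the column window only has to be sent once. Code paths that already set one window for a whole rectangle gain nothing, which fits the fill benchmarks barely moving.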
 
Looks like this might be an ILI9486 display. Also it may not support SPI, but instead parallel data mode.

We should probably integrate SPI versions of the ILI9486 somewhere. I know some others, like @sepp89117, have something similar. What I don't know about these is whether they support 16-bit colors in SPI mode or 18-bit like the 88. That would make a difference as to which code base it might be integrated with.

Of the two 88 ones I have, one is from eBay and the other from BuyDisplay. We also have a real funky one for the RPi (that we wasted far more time on than it was worth!)
 
It is the ILI9486 in 8-bit parallel mode. I would like to see how fast I can push it on the Teensy 4. The ATmega328 speed controlling it doesn't impress me much, but at least it performs better than an SPI display on slow CPUs.
 
@bitbank - I have not done much in the parallel mode for display. Did some a long long time ago with AVR board.

As mentioned in a different thread (https://forum.pjrc.com/threads/61121-Teensy-4-1-LCD-output?p=242322&viewfull=1#post242322) There are examples of using FlexIO for parallel mode.

However with the T4, I am not sure how easy it would be to get enough FlexIO pins configured with the different objects and the like.
With the T4.1 there are more FlexIO pins, but many of the new ones are on the FlexIO3 controller, which I don't believe has DMA support. At least not the normal DMA stuff.
 
I just pushed up a change back to @mjs513... Again not breathtaking speed ups, but...

Code:
	Orig	updated	delta	delta/orig*100
Screen fill            	512766	512757	9	0.001755187
Text                   	113393	97927	15466	13.6392899
Lines                  	166179	166406	-227	-0.136599691
Horiz/Vert Lines       	43599	43266	333	0.763778986
Rectangles (outline)   	24406	24182	224	0.917807097
Rectangles (filled)    	1239425	1239577	-152	-0.012263751
Circles (filled)       	169231	164399	4832	2.855268834
Circles (outline)      	136611	112142	24469	17.91144198
Triangles (outline)    	35298	35191	107	0.303133322
Triangles (filled)     	421949	416593	5356	1.269347717
Rounded rects (outline)	55128	53980	1148	2.082426353
Rounded rects (filled) 	1365167	1364834	333	0.02439262
Sums	4283152	4231254	51898	1.211677755
With T4.1
 
Thanks Mike,

Not sure at times how much these changes help or not... Often it depends on how much time the extra code overhead takes versus how much it saves by reducing the bytes sent out over SPI. For example, I made the same code change in the ST7735/89_t3 code base and tried it on a T4 with the 2" ST7789 version, with and without the change:
Code:
#if 1
  uint16_t _previous_addr_x0 = 0xffff; 
  uint16_t _previous_addr_x1 = 0xffff; 
  uint16_t _previous_addr_y0 = 0xffff; 
  uint16_t _previous_addr_y1 = 0xffff; 

  void setAddr(uint16_t x0, uint16_t y0, uint16_t x1, uint16_t y1)
    __attribute__((always_inline)) {
      if ((x0 != _previous_addr_x0) || (x1 != _previous_addr_x1)) {
        writecommand(ST7735_CASET); // Column addr set
        writedata16(x0+_xstart);   // XSTART 
        writedata16(x1+_xstart);   // XEND
        _previous_addr_x0 = x0;
        _previous_addr_x1 = x1;
      }
      if ((y0 != _previous_addr_y0) || (y1 != _previous_addr_y1)) {
        writecommand(ST7735_RASET); // Row addr set
        writedata16(y0+_ystart);   // YSTART
        writedata16(y1+_ystart);   // YEND
        _previous_addr_y0 = y0;
        _previous_addr_y1 = y1;
      }
  }
#else
  void setAddr(uint16_t x0, uint16_t y0, uint16_t x1, uint16_t y1)
    __attribute__((always_inline)) {
        writecommand(ST7735_CASET); // Column addr set
        writedata16(x0+_xstart);   // XSTART 
        writedata16(x1+_xstart);   // XEND
        writecommand(ST7735_RASET); // Row addr set
        writedata16(y0+_ystart);   // YSTART
        writedata16(y1+_ystart);   // YEND
  }
#endif
Using a version of the graphic test that works in all 4 orientations and prints out timings...
And not much difference:

Code:
	BEFORE			AFTER	
Rotations	0	1		0	1
tftPrintTest: 	1611	1611		1610	1610
testlines: 	536	535		535	535
tftPrintTest: 	73	73		73	74
testdrawrects: 	65	71		65	70
tftPrintTest: 	593	1199		592	1199
testfill/drawcircles: 	204	205		191	190
testroundrects: 	124	123		121	122
testtriangles: 	73	80		74	79
mediabuttons: 	1115	1114		1115	1115
Totals:	4394	5012		4376	4995

I show two of the orientations. I find that landscape versus portrait is a lot different in timing on at least one of the tests...
But the difference between before and after is not big.
 
Awesome. Will update this week and test.
Have there been any plans to utilise the extra PSRAM on the 4.1 to rebuild the frame buffer or perhaps expand to a dual frame buffer? Would display performance benefit from either change?
 
For my own LCD libraries, this change makes no difference because I cache the data in RAM and try to write everything in one SPI transaction. At the other extreme is the old Adafruit GFX library which draws 1 pixel at a time for everything. If the drawing occurs along a single axis, then this change will make a measurable difference. If the library you're testing already groups pixels together when possible, then you will not see much benefit.
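A back-of-the-envelope sketch of why grouping matters more than the address check. The function names and byte counts are assumptions for illustration only: 10 bytes to set the window (CASET and PASET with four data bytes each), 1 byte for RAMWR and 2 bytes per pixel.

```cpp
#include <cassert>
#include <cstdint>

// Assumed costs in SPI bytes; real controllers and pixel formats may differ.
constexpr uint32_t kWindowBytes = 10;  // CASET + 4 data bytes, PASET + 4 data bytes
constexpr uint32_t kRamwrBytes = 1;    // RAMWR command
constexpr uint32_t kPixelBytes = 2;    // one pixel, assumed 16-bit colour

// Pixel-at-a-time drawing: every pixel pays for its own window and RAMWR.
uint32_t bytesPerPixelDrawing(uint32_t pixelCount) {
    return pixelCount * (kWindowBytes + kRamwrBytes + kPixelBytes);
}

// Grouped drawing: one window and one RAMWR for the whole run of pixels.
uint32_t bytesGroupedRun(uint32_t pixelCount) {
    assert(pixelCount > 0);
    return kWindowBytes + kRamwrBytes + pixelCount * kPixelBytes;
}
```

For a run of 100 pixels this gives 1300 bytes pixel by pixel against 211 bytes grouped. In the grouped case the window is sent once per run, so skipping half of it can save at most 5 bytes per run, which is why such libraries see little benefit.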
 
My project with the ILI9341 (320x240) and the _t3n lib (SPI @ 80 MHz) runs super fast on the Teensy 4.0! No parallel mode necessary.
My new project with the ILI9486 (480x320 @ 4 inch) is slower because of the higher resolution, but thanks to the "_updateChangedAreasOnly" function it is also acceptable. I am curious how high I will get the SPI clock once I have my board to connect the Teensy and the display. Currently, with cables, there is interference above 60 MHz.
 