tgx: a 2D/3D graphics library for Teensy.

I've been playing around with this library for a bit, with thoughts of using it with a T4.1 and an 800x480 display; impressive library, excellent work.

Whilst playing with the Mars demo, I noted a fatal error within loadFalconTexture() where it assigns the Falcon mesh data (falcon_vs_1). The 'while ... mesh = mesh->next' loop relies upon a NULL for completion, standard stuff for running through a list, but nowhere is the final 'next' actually set to NULL, thus leading to unpredictable results (hard fault).
My fix was to escape on the 60th iteration.

Impressive demo, I'll try to get a video together.
 
I thought this might have been related to a suspected bug I've been tracking (extmem_calloc does not zero the returned memory like a regular call to calloc should), but it's not: the copyMeshEXTMEM function in tgx doesn't assign a value to the next pointer of the final mesh.
(The bug only appears when using a Teensy 4.1 with extmem, since otherwise the mesh is accessed directly from progmem.)
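
To illustrate the failure mode described above, here is a minimal sketch of copying a chained list into memory that may not be zeroed. The Node struct and the copyChain/chainLength helpers are hypothetical stand-ins, not the actual tgx::Mesh3D or copyMeshEXTMEM code; the only point is that the copy has to write the final 'next' itself rather than rely on the allocator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a chained mesh; NOT the real tgx::Mesh3D.
struct Node {
    int id;
    const Node* next;
};

// Copy a chain into caller-provided storage that may contain garbage.
// Returns the number of nodes copied.
std::size_t copyChain(const Node* src, Node* dst, std::size_t capacity) {
    assert(dst != nullptr || capacity == 0);
    std::size_t n = 0;
    while (src != nullptr && n < capacity) {
        dst[n].id = src->id;
        dst[n].next = nullptr;                 // always terminate the newest node
        if (n > 0) dst[n - 1].next = &dst[n];  // then link the previous node to it
        src = src->next;
        ++n;
    }
    return n;
}

// Walk the chain and count nodes; only safe when the last 'next' is nullptr.
std::size_t chainLength(const Node* head) {
    std::size_t n = 0;
    for (const Node* p = head; p != nullptr; p = p->next) ++n;
    return n;
}
```

With this shape, a 'while (mesh) { ...; mesh = mesh->next; }' loop over the copy terminates without needing an iteration cap.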
 
I investigated this a little further this morning and discovered the root cause.
In Mesh3D.h, struct Mesh3D defines .next as "const Mesh3D *next", but this should actually be "Mesh3D *next". This then allows us to remove the const from "const tgx::Mesh3D<tgx::RGB565> falcon_vs_60 = " from falcon_vs.h (line 15896). Finally, in loadFalconTexture(), the .next cast ("(tgx::Mesh3D<tgx::RGB565>*)") is no longer necessary, thus the line should be "mesh = mesh->next;".
This resolves the type issue with nullptr.
 
Also, if you receive a compiler warning regarding "MAXVIEWPORTDIMENSION", move the "static const int MAXVIEWPORTDIMENSION = ..." line in Renderer3D.h (inside "class Renderer3D") to the public: section and remove "static".
 
Still having some fun playing with some of this.

Something I have meant to do for a while with ILI9341_t3n (and others based on it) is to support an updateScreenAsync that can optionally use the clip rectangle or dirty-area rect to limit how much of the screen is updated using DMA.

So I have been hacking on it. So far I have only done T4.x, and it does not support continuous updates. But it appears to be working (needs some cleanup); next I will try it with T3.5/3.6.
https://github.com/KurtE/ILI9341_t3n/tree/try_clipped_async
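
For anyone wondering what tracking a dirty area involves, here is a rough sketch of the general idea. The Rect and DirtyArea types are hypothetical and are not taken from ILI9341_t3n or the branch above: every draw call marks the region it touched, and the update step then only needs to transfer the accumulated bounding box.

```cpp
#include <algorithm>
#include <cassert>

// Half-open rectangle: covers x in [x0, x1) and y in [y0, y1).
struct Rect {
    int x0, y0, x1, y1;
    bool empty() const { return x1 <= x0 || y1 <= y0; }
    int width() const { return empty() ? 0 : x1 - x0; }
    int height() const { return empty() ? 0 : y1 - y0; }
};

// Hypothetical helper: bounding box of everything drawn since the last update.
class DirtyArea {
public:
    DirtyArea(int screenW, int screenH) : w_(screenW), h_(screenH), r_{0, 0, 0, 0} {
        assert(screenW > 0 && screenH > 0);
    }

    // Call after the screen update has been started.
    void clear() { r_ = Rect{0, 0, 0, 0}; }

    // Call from each drawing primitive with the region it touched.
    void mark(Rect d) {
        // Clip to the screen first so the transfer never leaves the frame buffer.
        d.x0 = std::max(d.x0, 0);
        d.y0 = std::max(d.y0, 0);
        d.x1 = std::min(d.x1, w_);
        d.y1 = std::min(d.y1, h_);
        if (d.empty()) return;
        if (r_.empty()) { r_ = d; return; }
        r_.x0 = std::min(r_.x0, d.x0);
        r_.y0 = std::min(r_.y0, d.y0);
        r_.x1 = std::max(r_.x1, d.x1);
        r_.y1 = std::max(r_.y1, d.y1);
    }

    const Rect& rect() const { return r_; }

private:
    int w_, h_;
    Rect r_;
};
```

A single bounding box is the simplest choice; two small changes in opposite corners still force an almost full-screen transfer, which is the trade-off against keeping a list of rectangles.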

First tried it with the crazy clock setup.

Then tried the Borg 3d...
On the Borg one I have it such that I can compile for _t3n _t4...

One of the things I have not done yet is to see what you are doing with the FPS output, which I am not emulating yet.

Anyway having some fun!
 

Attachments

  • CrazyClock_ili9341_t3x-230420a.zip (270.3 KB)
  • borg_cube_ili9341_t3n-230420a.zip (3.1 KB)
 
Depends, if you look at other thread today on ili9488 I have hacked up one example and it runs with problems…

But you cannot put both frame buffers in DMAMEM; they won't fit. One in DMAMEM and the other in DTCM (i.e. RAM) does fit. And with a T4.1 you can solder memory to the bottom and use that.

I have been busy with other things, but finally have some time to look into this some more.

I was playing around today with Vindar's ili9341_t4 driver, and I noticed that putting the 'internal frame buffer' onto PSRAM actually works pretty well when using double buffering and differential buffers. Looks like there might be a chance of adapting his library for the ili9488.

Failing that, I think I will have to scrap his library and drawing to a framebuffer.
 
I have been using this library on a different project without problems, it has been a big help, thanks!

Now I have another project that uses a 48x7 array of WS2812 LEDs with Adafruit GFX via NeoMatrix (NeoMatrix lets you use the LED array as a "screen" with the usual line, circle, etc.). That works great, but I would like to use it with TGX to get transparency and nicer effects, and I cannot figure out what to do about the frame buffer in this:

Code:
 tgx::Image<tgx::RGB565> tg(???, 48, 7);
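
One possible approach, untested on this hardware: own a plain 48x7 buffer of 16-bit pixels, hand its address to the image object, and after drawing convert each pixel to 8 bits per channel for the LED driver. The sketch below only covers the buffer and the conversion. It assumes the tgx::Image constructor accepts a pointer to a caller-owned pixel buffer (not verified here), and the names fb, fbIndex and rgb565ToRgb888 are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr int LED_W = 48;
constexpr int LED_H = 7;

// Caller-owned frame buffer: one RGB565 value per LED, stored row by row.
// Assumption: this is what would be passed in place of the ??? above.
static uint16_t fb[LED_W * LED_H];

struct RGB888 { uint8_t r, g, b; };

// Expand RGB565 to 8 bits per channel by replicating the high bits into the
// low bits, so that full-scale 5/6-bit values map to 255.
inline RGB888 rgb565ToRgb888(uint16_t c) {
    const uint8_t r5 = (c >> 11) & 0x1F;
    const uint8_t g6 = (c >> 5) & 0x3F;
    const uint8_t b5 = c & 0x1F;
    return RGB888{
        static_cast<uint8_t>((r5 << 3) | (r5 >> 2)),
        static_cast<uint8_t>((g6 << 2) | (g6 >> 4)),
        static_cast<uint8_t>((b5 << 3) | (b5 >> 2))
    };
}

// Row-major index of LED (x, y) in the frame buffer.
inline int fbIndex(int x, int y) {
    assert(x >= 0 && x < LED_W && y >= 0 && y < LED_H);
    return y * LED_W + x;
}
```

After drawing into the buffer, a loop over x and y would read fb[fbIndex(x, y)], convert it, and pass the result to whatever per-pixel call the LED library provides. The physical wiring order of the matrix (zigzag or progressive) still has to be handled by that library or by remapping the index.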



For anyone that cares:
The array uses the new 1mm square WS2812B LEDs on a 4-layer board that is only 77x14mm (soldered by JLC). I will put it up on GitHub eventually as "fakeVFD", which is what it was originally made for.

Screenshot 2023-12-15 154629.png
 
Hi,

I am just bumping this thread to let you know that I updated the library (by merging the "improved-drawing-primitives" branch)...

There are a bunch of new features:
  • Rewrote most of the 2D drawing methods (which unfortunately implies some API-breaking changes). Added drawing primitives to render polygons, curves, arcs and pies with anti-aliasing and thickness. Line and curve endings can now be flat, rounded or have an arrowhead.
  • Some speed improvement in the 3D rasterizer.
  • Updated the demos/examples. Added a demo showcasing all the 2D drawing methods.
  • Added bindings to the OpenFontRender library to make it easy to draw text with any TrueType font (the library natively supports drawing with GFXFont and ILI9341_t3 fonts, either anti-aliased or plain).
  • Added bindings to the PNGdec, JPEGDEC and AnimatedGIF libraries to make it easy to display images in PNG, JPEG and GIF format (stored in memory or on an external device such as an SD card).
  • Last but not least: an extensive documentation is now available at: https://vindar.github.io/tgx/html/index.html ! All methods in TGX are now fully documented and I also wrote a little explanation on basic usage of the library and the 2D API. Hopefully, when I have some free time I will do the same for the 3D API...

Also, I finally added the library to Arduino's library manager so it can be installed directly from within the Arduino IDE...
 
I've been using your library (the improved-drawing-primitives branch) for more than a year now. It is amazing!

After updating, I noticed a substantial regression in performance (up to 50% slower) for the drawing mode I use -- drawing triangles using SHADER_GOURAUD to an RGB565 framebuffer, on an ARM Cortex-M33 (Pi Pico2). The cause seems to be the new dynamic scaling in Shaders.h to prevent overflow during color interpolation. Would you have any ideas for speeding that up?

Thanks for all your work on this impressive library.
 
Hi,

Such a performance impact seems strange indeed. I have a Pico2 lying around. Could you provide me with a complete sketch that exhibits the slowdown so I can experiment and see what I can do ?

BTW: What display/library combos would you suggest on a Pico 2 ? It seems that TFT_eSPI (my go-to library for non-teensy projects) does not support RP2350 yet, does it ?
 
@sublinear

I just pushed a small change that should mostly cancel the slowdown the color interpolation bugfix may have created... Can you pull the newest version of the library from github and let me know if it is better ? Thanks.
 
Hello,
Yes! I confirmed your latest version (with templated shiftC) restored the performance, while still preventing overflows in the interpolated shading. Thank you!!

Here's my profile of the 3 versions, where "Draw:" column shows the rasterization (in MHz consumed). You can see it's fixed now.

Code:
// old version (improved-drawing-primitives)
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 7  Xform: 12  Draw: 158  Update: 0  Audio: 4  Wait: 87 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 7  Xform: 12  Draw: 155  Update: 0  Audio: 4  Wait: 110 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 7  Xform: 11  Draw: 135  Update: 0  Audio: 4  Wait: 131 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 10  Draw: 121  Update: 0  Audio: 4  Wait: 147 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 10  Draw: 112  Update: 0  Audio: 4  Wait: 156 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 9  Draw: 106  Update: 0  Audio: 4  Wait: 162 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 9  Draw: 96  Update: 0  Audio: 4  Wait: 174 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 8  Draw: 92  Update: 0  Audio: 4  Wait: 178 

// new version, with overflow scaling
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 7  Xform: 12  Draw: 239  Update: 0  Audio: 4  Wait: 5 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 7  Xform: 12  Draw: 229  Update: 0  Audio: 4  Wait: 35 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 7  Xform: 11  Draw: 198  Update: 0  Audio: 4  Wait: 68 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 10  Draw: 179  Update: 0  Audio: 4  Wait: 88 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 10  Draw: 166  Update: 0  Audio: 4  Wait: 102 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 9  Draw: 159  Update: 0  Audio: 4  Wait: 109 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 9  Draw: 144  Update: 0  Audio: 4  Wait: 126 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 8  Draw: 141  Update: 0  Audio: 4  Wait: 129

// new version, with templated shiftC
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 12  Draw: 161  Update: 0  Audio: 4  Wait: 85 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 6  Xform: 12  Draw: 157  Update: 0  Audio: 4  Wait: 110 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 11  Draw: 136  Update: 0  Audio: 4  Wait: 131 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 10  Draw: 122  Update: 0  Audio: 4  Wait: 146 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 10  Draw: 113  Update: 0  Audio: 4  Wait: 156 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 9  Draw: 108  Update: 0  Audio: 4  Wait: 162 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 8  Draw: 97  Update: 0  Audio: 4  Wait: 173 
CLK:300  SPI:75  FPS:31  Other: 1  Clear: 5  Anim: 5  Xform: 8  Draw: 93  Update: 0  Audio: 4  Wait: 178
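
For readers curious why templating a shift can matter, here is a generic sketch of the technique. It is not the actual code from Shaders.h and the function names are invented; it only shows the difference between a shift amount that is a runtime variable inside the inner loop and one that is a template parameter, selected once outside the loop.

```cpp
#include <cassert>
#include <cstdint>

// Runtime variant: the shift amount is a variable carried into every call.
inline int32_t scaleRuntime(int32_t v, int shift) {
    return v >> shift;
}

// Compile-time variant: the shift is a template parameter, so each
// instantiation has a constant shift the compiler can fold into the code.
template <int SHIFT>
inline int32_t scaleTemplated(int32_t v) {
    static_assert(SHIFT >= 0 && SHIFT < 31, "shift out of range");
    return v >> SHIFT;
}

// Hypothetical inner loop: sums a scaled linear gradient over a span.
template <int SHIFT>
int32_t sumSpan(int32_t start, int32_t step, int count) {
    int32_t acc = 0;
    for (int i = 0; i < count; i++) {
        acc += scaleTemplated<SHIFT>(start + i * step);
    }
    return acc;
}

// Choose the instantiation once per span instead of once per pixel.
int32_t sumSpanDispatch(int32_t start, int32_t step, int count, int shift) {
    switch (shift) {
        case 0: return sumSpan<0>(start, step, count);
        case 1: return sumSpan<1>(start, step, count);
        case 2: return sumSpan<2>(start, step, count);
        default:
            assert(false && "shift not covered by this sketch");
            return 0;
    }
}
```

Whether this is what the library change does internally is an assumption on my part; the profile above only shows that the Draw cost went back to roughly the old level.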
 
For a display, I'm using a 240x240 st7789v LCD. I'm able to hit 60 FPS by clocking SPI at 75MHz, with double-buffering and non-blocking DMA. There's a quirk in the Pico2 SDK where overclocking forces the SPI clock to 24MHz, but you can prevent this by defining PICO_CLOCK_ADJUST_PERI_CLOCK_WITH_SYS_CLOCK=1.

Neither TFT_eSPI nor LovyanGFX seemed to support my setup yet, so I modified code from the ThumbyColor here:

The rendering quality of TGX looks fantastic. Due to RGB565, I had obvious banding when using fillScreenVGradient() for the background clear. For my usage, it was worth the minor slowdown to add 4x4 ordered dither:

Code:
    const uint8_t dither[4][4] = {{0, 8, 2, 10}, {12, 4, 14, 6}, {3, 11, 1, 9}, {15, 7, 13, 5}};
    uint32_t* p = (uint32_t* )buffer;

    for (int j = 0; j < SCREEN_HEIGHT; j++) {

        // apply 4x4 ordered-dither to 1x4 pixels
        RGB64 c0(c64_a), c1(c64_a), c2(c64_a), c3(c64_a);
        c0.R = min(0xffff, c0.R + (dither[j & 0x3][0] << 8));
        c0.G = min(0xffff, c0.G + (dither[j & 0x3][0] << 7));
        c0.B = min(0xffff, c0.B + (dither[j & 0x3][0] << 8));
        c1.R = min(0xffff, c1.R + (dither[j & 0x3][1] << 8));
        c1.G = min(0xffff, c1.G + (dither[j & 0x3][1] << 7));
        c1.B = min(0xffff, c1.B + (dither[j & 0x3][1] << 8));
        c2.R = min(0xffff, c2.R + (dither[j & 0x3][2] << 8));
        c2.G = min(0xffff, c2.G + (dither[j & 0x3][2] << 7));
        c2.B = min(0xffff, c2.B + (dither[j & 0x3][2] << 8));
        c3.R = min(0xffff, c3.R + (dither[j & 0x3][3] << 8));
        c3.G = min(0xffff, c3.G + (dither[j & 0x3][3] << 7));
        c3.B = min(0xffff, c3.B + (dither[j & 0x3][3] << 8));

        // pack two RGB565 pixels per 32-bit word; widen to uint32_t before shifting
        uint32_t cc0 = (uint16_t)((RGB565)c0) | ((uint32_t)(uint16_t)((RGB565)c1) << 16);
        uint32_t cc1 = (uint16_t)((RGB565)c2) | ((uint32_t)(uint16_t)((RGB565)c3) << 16);

        static_assert(SCREEN_WIDTH % 16 == 0);
        for (int l = SCREEN_WIDTH / 16; l > 0; l--) {
            *(p++) = cc0; *(p++) = cc1;
            *(p++) = cc0; *(p++) = cc1;
            *(p++) = cc0; *(p++) = cc1;
            *(p++) = cc0; *(p++) = cc1;
        }

        c64_a.R += dr;
        c64_a.G += dg;
        c64_a.B += db;
        c64_a.A += da;
    }
}
 
Thanks for the confirmation, good to know it is working better again. BTW, it seems you are overclocking the Pico2 from 150MHz to 300MHz ? Is it safe/stable to do so ?

Even though TFT_eSPI doesn't officially support RP2350 yet, I just tested it with an ILI9341 screen and DMA uploads worked fine using the exact same setup as for RP2040. So the library is, at least partially, usable...

I agree that color gradients in RGB565 are quite ugly ! Dithering seems like a good solution to improve it ! Oh well, now it makes me want to implement it :) When I have some free time, I will experiment with it and maybe update or create new "high quality" vfill/hfill methods.
 
Ha, yes, I was experimenting with how high I could clock the RP2350. I have two: a SEEED XIAO which seems stable at 300MHz, and a ThumbyColor that only runs at 250MHz. Not sure if that is sample-to-sample variation, or different QSPI flash used by each board? I certainly wouldn't deploy in a real product (at least without validating properly) but they do seem highly overclockable.

Here is how the dithered RGB565 gradient looks on my hand-wired project. Sorry about the crap photos!
 

Attachments

  • sharktank3.jpg (260.2 KB)
  • sharktank4.jpg (98.4 KB)