
Thread: 3D Rendering on Teensy

  1. #1
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114

    3D Rendering on Teensy

    Had some fun porting my tiled software rasterizer to Teensy & ILI9341



    Cheers, Jarkko

  2. #2
    Senior Member
    Join Date
    Apr 2017
    Posts
    214
    That is really cool!

  3. #3
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    10,936
    Very smooth - go T_3.6

  4. #4
    That is goooood!

  5. #5
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    21,473
    Wow, looks amazing.

    Any chance we'll get to see this incredible code?

  6. #6
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    4,807
    Looks really good. Interesting that you are using 4k triangles; I didn't think there was enough space, so it would be interesting to see the code as well. I did manage to get Michael Rule's Arduino 3D code ported over, which you can see in this video: https://vimeo.com/150386845. It works even better on a Teensy 4.

  7. #7
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    This 3D model takes only 37 KB of memory and is stored in MCU flash, so it's not an issue at all. I have a tool which processes and compresses 3D objects with custom vertex formats so that they can be embedded in your program and rendered directly from program memory. The model is broken into meshlets and the vertices are processed in batches (max 64/batch) to reduce vertex transform cost. There's also a vertex cache to reduce retransforms across tiles, but I had it disabled for this demo.

    I have thought of releasing the 3D graphics lib at some point, but it needs some more work. For example, I'd like to implement DMA transfer of tiles to the ILI9341, add texture sampling and visibility cone culling, and do some further optimizations, cleanup and restructuring.

  8. #8
    Senior Member Projectitis's Avatar
    Join Date
    Feb 2018
    Location
    New Zealand
    Posts
    168
    Super impressive. I’ll be keeping an eye open for that lib when it’s released

  9. #9
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    4,807
    @JarkkoL
    Indeed impressive. Looking forward to seeing how you implemented it.

    Wait till you get yourself a T4.

  10. #10
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    Thanks for the nice feedback

  11. #11
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    Added DMA support, which gave a nice performance boost.


  12. #12
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    4,807
    Yep, DMA support makes a big difference, as does frame buffering, which most of the current display libraries support. This inspired me to keep working on merging two libraries: a highly modified OpenGL (v1) for the T4 and the Arduino3D rendering lib that I posted previously. While not as good as yours, I did manage to get the beginnings of shading incorporated, but I still need to work on back-face culling, etc. Here's a video I just posted showing a MeshLab teapot imported into the lib (only 256 faces for this test):


  13. #13
    Hello,

    To start I would like to say that I am impressed by the work you have provided.
    Do you think it would be possible to adapt it for the gameduino 3x and more generally for the FT81x and BT81x chips?

    They work in a rather particular way, but they have the advantage of handling large resolutions and offering very good 2D acceleration.
    That said, they do no (or almost no) 3D rendering.

  14. #14
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    @mjs513 Very cool! Nice to have another 3D graphics enthusiast working on Arduino as well. I got the lib working on T4 via synchronous SPI transfer, but I still have to port the DMA code to T4. It should be trivial with code references from KurtE's graphics lib.

    @Armadafg Thanks! It's definitely possible as long as the device supports sending pixel data to it. I designed this lib with portability in mind and abstracted the actual display device in the code, so you only need to implement the HW-specific tile submission to the display. You can also write the pixel shading to output either directly to the native pixel format, or to some intermediate format that is converted to the native format upon tile submission.

  15. #15
    Precisely, these graphics chips have a big drawback: the number of instructions they accept is very limited compared to the power of the chips (around 1000 to 2000 instructions).
    This is because the CPU writes the instructions to be executed into a well-defined region of the graphics chip's memory, and that region is very limited.

    However, the chip can stream images stored on the CPU side; RAM is the most suitable place for them.
    The image can be in BMP, DXT1, JPG or PNG format. Do you think it would be possible to generate an image in one of these formats from your 3D graphics engine?

  16. #16
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    The rasterizer renders the image in tiles (64x64px tiles in the video), so it doesn't require much memory for operation. For the ILI9341, after rendering each tile I submit the pixel data for that tile to the display and then use that same memory to render the next tile, so instead of requiring a 320x240px frame buffer, it only requires a 64x64px buffer. For asynchronous DMA transfer on the ILI9341, it copies the tile data to another buffer, which requires some extra memory, but how asynchronous you want the rendering to be is configurable. This is just how I implemented it for the ILI9341, though. If you wanted to, you could for example encode each tile to DXT and submit the compressed image to the display (if that is what your display supports), or copy the tile data to a larger image and encode it in one go.
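
    A rough sketch of the tile loop described in this post, not the actual library code. The names display_sink, counting_sink, render_tile and render_frame are made up for illustration, and 16-bit RGB565 pixels are assumed. It shows one 64x64 buffer being reused for every tile, and the memory difference against a full 320x240 frame buffer (153600 bytes vs. 8192 bytes at 2 bytes per pixel).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed setup: 320x240 display, 16-bit RGB565 pixels, 64x64 tiles.
constexpr int kScreenW = 320, kScreenH = 240, kTile = 64;

// Memory for a full frame buffer vs. one tile buffer.
constexpr std::size_t kFrameBytes = std::size_t(kScreenW) * kScreenH * sizeof(std::uint16_t);
constexpr std::size_t kTileBytes = std::size_t(kTile) * kTile * sizeof(std::uint16_t);

// Hypothetical display back end: receives one finished tile at a time.
// Rows in `pixels` are always kTile apart, even when w < kTile.
struct display_sink {
  virtual void submit_tile(int x, int y, int w, int h, const std::uint16_t* pixels) = 0;
  virtual ~display_sink() = default;
};

// Example sink that only counts pixels (stands in for the SPI/DMA code).
struct counting_sink : display_sink {
  long pixels = 0;
  void submit_tile(int, int, int w, int h, const std::uint16_t*) override {
    pixels += long(w) * h;
  }
};

// Placeholder for the real rasterizer: fills the tile with a position-based color.
inline void render_tile(int tx, int ty, std::uint16_t* buf) {
  const std::uint16_t color = std::uint16_t((tx << 11) | (ty << 5));
  for (int i = 0; i < kTile * kTile; ++i) buf[i] = color;
}

// Renders one frame through a single reusable tile buffer.
// Returns the number of tiles submitted.
inline int render_frame(display_sink& sink) {
  static std::uint16_t tile_buf[kTile * kTile]; // reused for every tile
  int count = 0;
  for (int y = 0; y < kScreenH; y += kTile) {
    for (int x = 0; x < kScreenW; x += kTile) {
      render_tile(x / kTile, y / kTile, tile_buf);
      // Clip edge tiles to the screen: 240 is not a multiple of 64.
      const int w = (x + kTile > kScreenW) ? kScreenW - x : kTile;
      const int h = (y + kTile > kScreenH) ? kScreenH - y : kTile;
      assert(w > 0 && h > 0);
      sink.submit_tile(x, y, w, h, tile_buf);
      ++count;
    }
  }
  return count;
}
```

    Calling render_frame() with a counting_sink walks 5x4 tiles. A real back end would set the display's address window and push the pixels over SPI inside submit_tile(), or encode the tile to another format first.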

  17. #17
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    Ported the 3D renderer to T4 (w/ DMA), runs much faster!


  18. #18
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    Some more fun optimizing geometry processing in the graphics lib. This video visualizes "cluster cone culling", which omits processing of back-facing and occluded geometry clusters and helps in rendering more complex models. The "Stanford dragon" model in the video has 23490 triangles and 11745 vertices split into 248 clusters, running on T4.
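
    For readers unfamiliar with the idea, here is a minimal sketch of the back-facing half of a cluster cone test. It is one common formulation, not necessarily what this lib does, and it does not cover the occlusion part. The cluster_cone struct and its fields are assumptions: each cluster stores an apex point, a unit axis (roughly the average triangle normal) and cutoff = sin(half-angle of the normal spread), all precomputed offline so that the test is conservative for every triangle in the cluster.

```cpp
#include <cassert>
#include <cmath>

struct vec3 { float x, y, z; };
inline vec3 sub(vec3 a, vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(vec3 a, vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical per-cluster culling data, precomputed by an offline tool.
struct cluster_cone {
  vec3 apex;    // point the cone is tested from (same space as the camera position)
  vec3 axis;    // unit-length direction, roughly the average triangle normal
  float cutoff; // sin(half-angle of the normal spread), in [0, 1]
};

// Returns true if every triangle in the cluster faces away from the camera,
// so the whole cluster can be skipped before any vertex is transformed.
inline bool is_backfacing(const cluster_cone& c, vec3 camera_pos) {
  assert(c.cutoff >= 0.0f && c.cutoff <= 1.0f);
  const vec3 v = sub(c.apex, camera_pos); // camera -> cluster
  const float len = std::sqrt(dot(v, v));
  // Equivalent to dot(axis, v / len) >= cutoff, rearranged to avoid the division.
  // A camera sitting exactly on the apex (len == 0) is treated as "not culled".
  return len > 0.0f && dot(c.axis, v) >= c.cutoff * len;
}
```

    The cutoff comes from requiring that even the normal furthest from the axis still points away from the viewer: angle(axis, view) + spread <= 90 degrees, which rearranges to dot(axis, view) >= sin(spread).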


  19. #19
    Senior Member Projectitis's Avatar
    Join Date
    Feb 2018
    Location
    New Zealand
    Posts
    168
    Incredible work. Very impressive!

  20. #20
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    4,807
    Agreed, impressive - all done on the T4

  21. #21
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    Thanks! Added texture support, so here's a happy cube. Textures are stored in program memory to save RAM. It's doing perspective-correct interpolation with point sampling, and supports different pixel formats. Sorry about the washed-out colors, it's tricky to record decent video of an LCD screen.



    I also fixed rasterization fill rules and implemented Hi-Z cluster occlusion culling which reduces geometry processing further.
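
    A small sketch of what perspective-correct interpolation with point sampling usually means, written from the textbook formulation rather than from this lib's source. All names (persp_vertex, interpolate_uv, sample_point) are made up, and a 16-bit texel format with clamp-to-edge addressing is assumed. The idea is to interpolate u/w, v/w and 1/w linearly in screen space, divide per pixel to recover u and v, then fetch the single nearest texel.

```cpp
#include <cassert>
#include <cstdint>

// Per-vertex values prepared for perspective-correct interpolation.
struct persp_vertex { float u_over_w, v_over_w, inv_w; };

// w is the clip-space w of the vertex (must be in front of the camera).
inline persp_vertex make_vertex(float u, float v, float w) {
  assert(w > 0.0f);
  return {u / w, v / w, 1.0f / w};
}

struct uv { float u, v; };

// b0, b1, b2 are screen-space barycentric weights of the pixel (summing to 1).
inline uv interpolate_uv(const persp_vertex& a, const persp_vertex& b,
                         const persp_vertex& c, float b0, float b1, float b2) {
  const float inv_w = b0 * a.inv_w + b1 * b.inv_w + b2 * c.inv_w;
  const float uw = b0 * a.u_over_w + b1 * b.u_over_w + b2 * c.u_over_w;
  const float vw = b0 * a.v_over_w + b1 * b.v_over_w + b2 * c.v_over_w;
  return {uw / inv_w, vw / inv_w}; // one divide per pixel undoes the projection
}

// Point sampling: take the single texel the coordinate falls into, clamped to the edge.
inline std::uint16_t sample_point(const std::uint16_t* texels, int w, int h, uv t) {
  int x = int(t.u * float(w));
  int y = int(t.v * float(h));
  x = x < 0 ? 0 : (x >= w ? w - 1 : x);
  y = y < 0 ? 0 : (y >= h ? h - 1 : y);
  return texels[y * w + x];
}
```

    As a worked case: along an edge from (u=0, w=1) to (u=1, w=3), the screen-space midpoint gives u = 0.25 rather than 0.5, because the far end of the edge is compressed on screen.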

  22. #22
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    4,807
    @JarkkoL
    Very cool indeed. Love the references, keep them coming. I haven't had much time to play with 3D stuff for a while; other things have me tied up for now.

    Are you using the ILI9341_t3n library?

  23. #23
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    I implemented my own ILI9341 lib using Kurt's lib as a reference. It's further optimized and the DMA transfer supports partial tiled updates, so only updated regions are transferred over SPI. It also doesn't require holding the entire frame buffer in memory, so it could run at higher resolutions without memory issues.
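
    One possible way to track "only updated regions" per tile, shown as a sketch. How this lib actually does it isn't stated in the thread, so the dirty_tiles class and the one-bit-per-tile mask are assumptions. Drawing code marks the pixel rectangles it touched, and at the end of the frame only the marked tiles are handed to the transfer code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed setup: 320x240 screen split into 64x64 tiles (5 x 4 = 20 tiles).
constexpr int kW = 320, kH = 240, kTile = 64;
constexpr int kTilesX = (kW + kTile - 1) / kTile;
constexpr int kTilesY = (kH + kTile - 1) / kTile;
static_assert(kTilesX * kTilesY <= 32, "one bit per tile must fit in 32 bits");

class dirty_tiles {
public:
  // Marks every tile touched by the pixel rectangle [x, x+w) x [y, y+h).
  void mark_rect(int x, int y, int w, int h) {
    if (w <= 0 || h <= 0 || x >= kW || y >= kH || x + w <= 0 || y + h <= 0) return;
    const int x0 = x < 0 ? 0 : x;
    const int y0 = y < 0 ? 0 : y;
    const int x1 = (x + w > kW ? kW : x + w) - 1; // inclusive
    const int y1 = (y + h > kH ? kH : y + h) - 1; // inclusive
    for (int ty = y0 / kTile; ty <= y1 / kTile; ++ty) {
      for (int tx = x0 / kTile; tx <= x1 / kTile; ++tx) {
        assert(tx < kTilesX && ty < kTilesY);
        m_bits |= std::uint32_t(1) << (ty * kTilesX + tx);
      }
    }
  }

  // Calls send(tx, ty) for each dirty tile, clears the mask, returns the count.
  template <class Send>
  int flush(Send&& send) {
    int sent = 0;
    for (int i = 0; i < kTilesX * kTilesY; ++i) {
      if (m_bits & (std::uint32_t(1) << i)) {
        send(i % kTilesX, i / kTilesX);
        ++sent;
      }
    }
    m_bits = 0;
    return sent;
  }

private:
  std::uint32_t m_bits = 0;
};
```

    For example, a 10x10 pixel change at (60, 60) straddles four tiles, so flush() would invoke the transfer callback four times instead of for all 20 tiles.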

  24. #24
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    4,807
    Cool - did you ever think about doing a PR back to @KurtE's library? I think Kurt was looking at implementing something similar.

  25. #25
    Senior Member JarkkoL's Avatar
    Join Date
    Jul 2013
    Posts
    114
    The design is quite a bit different and I think Kurt would have to reimplement all the draw functions. Effectively his lib does immediate mode rendering, while I defer rendering by recording draw commands to a command buffer and dispatching them to tiles when the tiles are being processed. Here's what the rendering code of a 3D model looks like:
    Code:
      // render mesh
      for(uint16_t seg_idx=0; seg_idx<s_mesh.num_segments(); ++seg_idx)
      {
        test_shader sh;
        sh.m_mesh=&s_mesh;
        sh.m_seg=&s_mesh.segment(seg_idx);
        sh.m_o2c=o2c; // object->camera matrix
        sh.m_o2p=o2p; // object->projection matrix
        s_gfx_dev.dispatch_shader(sh);
      }
      s_gfx_dev.commit(); // kick off tile rendering
    dispatch_shader() queues commands and commit() performs the actual rendering of all the tiles at the end of the frame.
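
    To illustrate the record-then-commit pattern in isolation, here is a toy version that queues filled rectangles instead of shaders. It is only a sketch of the general idea: deferred_device, fill_cmd, dispatch() and the submit callback are hypothetical and are not the API of this lib. A real tiled renderer would typically also bin commands per tile rather than test every command against every tile as done here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A recorded draw command: fill the half-open pixel rect [x0,x1) x [y0,y1).
struct fill_cmd { int x0, y0, x1, y1; std::uint16_t color; };

class deferred_device {
public:
  deferred_device(int width, int height, int tile)
      : m_w(width), m_h(height), m_tile(tile),
        m_buf(std::size_t(tile) * std::size_t(tile)) {
    assert(width > 0 && height > 0 && tile > 0);
  }

  // Immediate mode would draw here; deferred mode only records the command.
  void dispatch(const fill_cmd& cmd) { m_cmds.push_back(cmd); }

  // Renders tile by tile: each tile replays the commands that overlap it into
  // one shared tile buffer, then hands the buffer to submit(tile_x, tile_y, pixels).
  // Returns the number of (tile, command) pairs that were executed.
  template <class Submit>
  int commit(Submit&& submit) {
    int executed = 0;
    for (int ty = 0; ty < m_h; ty += m_tile) {
      for (int tx = 0; tx < m_w; tx += m_tile) {
        std::fill(m_buf.begin(), m_buf.end(), std::uint16_t(0)); // clear tile
        for (const fill_cmd& c : m_cmds) {
          // Intersect the command with this tile (and the screen edge).
          const int x0 = std::max(c.x0, tx);
          const int y0 = std::max(c.y0, ty);
          const int x1 = std::min(c.x1, std::min(tx + m_tile, m_w));
          const int y1 = std::min(c.y1, std::min(ty + m_tile, m_h));
          if (x0 >= x1 || y0 >= y1) continue; // command does not touch this tile
          for (int y = y0; y < y1; ++y)
            for (int x = x0; x < x1; ++x)
              m_buf[std::size_t(y - ty) * m_tile + (x - tx)] = c.color;
          ++executed;
        }
        submit(tx, ty, m_buf.data());
      }
    }
    m_cmds.clear(); // the command buffer is rebuilt every frame
    return executed;
  }

private:
  int m_w, m_h, m_tile;
  std::vector<std::uint16_t> m_buf;
  std::vector<fill_cmd> m_cmds;
};
```

    Because nothing is drawn until commit(), the device never needs more than one tile of pixel memory, and later commands correctly overwrite earlier ones within each tile.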
