Drawing the viewport with multiple calls as above uses less memory, so the buffers can be placed in a faster memory region. However, this approach also has a drawback: for each call, the renderer must iterate over all the triangles of the mesh to see which ones should be drawn and which ones are clipped/discarded. The speed-up gained from faster memory can therefore be lost to the increase in the number of triangles to iterate over. I think that splitting the viewport instead of using EXTMEM will be efficient when the screen is large but the mesh has relatively few triangles (maybe 5K, like 'bunny' in the texture example), but it may not work as well for a detailed mesh like 'buddha' (20K).
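To put rough numbers on this trade-off, here is a small back-of-the-envelope sketch. It assumes 16-bit pixels and a 16-bit z-buffer, a 480x800 viewport, and a mesh of about 20K triangles; the helper names are made up for illustration and are not part of the library.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for ONE 16-bit buffer (framebuffer or z-buffer) that covers
// a single chunk, assuming the height divides evenly by the chunk count.
constexpr std::size_t chunkBufferBytes(int width, int height, int chunks) {
    return static_cast<std::size_t>(width) * (height / chunks) * sizeof(std::uint16_t);
}

// Upper bound on the triangles visited per frame: the whole mesh is walked
// once for every chunk, even if most triangles end up discarded.
constexpr long trianglesVisitedPerFrame(long meshTriangles, int chunks) {
    return meshTriangles * chunks;
}

// Full-screen buffer versus a quarter-screen chunk.
static_assert(chunkBufferBytes(480, 800, 1) == 768000, "full-screen buffer");
static_assert(chunkBufferBytes(480, 800, 4) == 192000, "quarter-screen buffer");
```

So splitting into 4 chunks cuts each buffer to a quarter of its size, but a 20K-triangle mesh then costs up to 80K triangle visits per frame instead of 20K.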
A solution for large meshes would be to split them into many smaller sub-meshes. The renderer will then discard the sub-meshes whose bounding boxes do not intersect the image being drawn, without having to loop over their triangles. But splitting a mesh into sub-meshes like this is a tedious process...
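The idea behind the sub-mesh culling can be sketched in plain C++. This is only an illustration of the bounding-box test, not the renderer's actual code: `Box2`, `SubMesh`, and `trianglesToVisit` are hypothetical names, and the boxes are assumed to be already projected to screen pixels.

```cpp
#include <vector>

// Axis-aligned box in screen pixels, inclusive on both ends (hypothetical type).
struct Box2 { int minX, maxX, minY, maxY; };

// True when the two boxes share at least one pixel.
inline bool intersects(const Box2& a, const Box2& b) {
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}

// Hypothetical sub-mesh record: its screen-space bounds and triangle count.
struct SubMesh { Box2 screenBounds; int triangleCount; };

// Number of triangles that still have to be walked for one chunk:
// sub-meshes whose box misses the chunk are skipped in a single test.
long trianglesToVisit(const std::vector<SubMesh>& parts, const Box2& chunk) {
    long total = 0;
    for (const SubMesh& p : parts) {
        if (intersects(p.screenBounds, chunk)) total += p.triangleCount;
    }
    return total;
}
```

With a mesh cut into a few pieces, a chunk that only overlaps one piece pays for that piece's triangles alone, which is what would make the chunked approach viable for bigger meshes.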
Thanks for this! I went with this loop in the end, which works as long as the chunk count divides the screen height evenly:
Code:
static const int SLX = 480;
static const int SLY = 800;
static const int chunks = 4;
static const int chunkSize = SLY / chunks;
// framebuffer for one chunk of the screen
uint16_t fb[SLX * chunkSize];
// z-buffer with 16-bit precision, also sized for one chunk
DMAMEM uint16_t zbuf[SLX * chunkSize];
// image that encapsulates fb
Image<RGB565> im(fb, SLX, chunkSize);
...
for (uint16_t y = 0; y < SLY; y += chunkSize) {
    // draw chunk
    im.fillScreen(RGB565_Blue);              // we must erase it
    renderer.clearZbuffer();                 // and do not forget to erase the zbuffer also
    renderer.setOffset(0, y);                // set the offset inside the viewport for this chunk
    renderer.drawMesh(buddha_cached, false); // draw that part on the image
    // update the screen (window and buffer end are both inclusive)
    tft.setAddrWindow(0, y, SLX - 1, y + chunkSize - 1);
    tft.pushPixels16bit(fb, fb + (SLX * chunkSize) - 1);
}
Works like a charm! On an SSD1963 800*480 screen over 16-bit parallel GPIO6, the frame rate is a steady 13.57 fps; that includes the rendering calculations and the (blocking) screen transfer time. Thanks, this is fun!
(My setAddrWindow takes x1, y1, x2, y2 rather than x1, y1, w, h, hence the -1 in the call. Likewise, a block of N pixels spans <start> to <start> + N - 1, hence the -1 in the pushPixels16bit buffer end.)
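For anyone whose chunk height does not divide the screen height evenly, the inclusive window arithmetic can be worked out separately from the drawing code. The sketch below is plain C++ with made-up names (`RowWindow`, `chunkWindows`); it only computes the row ranges and does not touch the display or the renderer.

```cpp
#include <cassert>
#include <vector>

// One horizontal band of the screen: y1 and y2 are both inclusive.
struct RowWindow { int y1; int y2; int rows; };

// Split screenHeight rows into bands of at most chunkHeight rows.
// The last band is shortened when the height is not an exact multiple.
std::vector<RowWindow> chunkWindows(int screenHeight, int chunkHeight) {
    assert(screenHeight > 0 && chunkHeight > 0);
    std::vector<RowWindow> out;
    for (int y = 0; y < screenHeight; y += chunkHeight) {
        int remaining = screenHeight - y;
        int rows = (remaining < chunkHeight) ? remaining : chunkHeight;
        out.push_back({y, y + rows - 1, rows}); // inclusive end, hence the -1
    }
    return out;
}
```

With an 800-row screen and 300-row chunks this gives rows 0-299, 300-599, and a shorter final band 600-799. In the drawing loop, presumably only the first `rows` rows of the buffer would then be pushed for that last band.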