RA8876 Parallel Display Library Testing

What speed is the FlexIO clock set at? If it’s running at 480 MHz and the core at anything lower than 600 MHz, you might get erratic behavior with the timer triggers.
I discovered this when developing the parallel library initially: the clock speed needs to be half the PLL speed, so I divided by 1 to run at 240 MHz.

Also, the RD bus speed needs to be lower than the WR bus speed.
As I recall, it is around 8 MHz max for the ILI9488; not sure what it is on the RA8876.

So I'm just noting the above to make sure you have taken it into account.
 
Went from 30 to 240 without success...
 
Inquiring minds want to know :). Was playing a bit more as well this morning, including with the clock read speed, but no luck (was using @KurtE's version). So as Kurt said, go with what works.
Yeah, that is the version I am using as well and will finish with it. Then on to testing other boards...
 
Thanks, I think I have covered everything. Hopefully :)
 
@KurtE @mjs513 - Using @KurtE's repo and doing the mods to "combined_t4x_wip", everything is working :cool:
Now to wire up the MicroMod and...
:confused: Where is the current code base you are trying? My current repo, or yours with the branch name combined_t4x_wip? I can probably deduce it by looking on GitHub.

Have you run the Kurt... example? If so, does the copy of the bars show up correctly, and does the text color toward the top match correctly?
My latest version is closer, very much like @mjs513 had this morning:
1720034753987.png

edit closer up of the copy of color bands:
1720035666708.png

You can see it may be mostly correct, but there are patches/starts that are not right.

@mjs513 look familiar?
Note: right now I am using the FlexIO RD pin setup; I believe there are similar issues when using digitalWrite.

I could either post changes or update if you are interested to see what I have done.
The code is set up to make it easy to change how the RD pin is driven.

Two sets of functions:
Code:
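// Write mode: WR is driven by FlexIO, RD is parked high as a plain GPIO.
// Does nothing when READS_USE_DIGITAL_WRITE is defined.
FASTRUN void RA8876_t41_p::gpioWrite() {
#ifndef READS_USE_DIGITAL_WRITE
    pFlex->setIOPinToFlexMode(_wr_pin);
    pinMode(_rd_pin, OUTPUT);
    digitalWriteFast(_rd_pin, HIGH);
#endif
}

// Read mode: RD is driven by FlexIO, WR is parked high as a plain GPIO.
// Does nothing when READS_USE_DIGITAL_WRITE is defined.
FASTRUN void RA8876_t41_p::gpioRead() {
#ifndef READS_USE_DIGITAL_WRITE
    pFlex->setIOPinToFlexMode(_rd_pin);
    pinMode(_wr_pin, OUTPUT);
    digitalWriteFast(_wr_pin, HIGH);
#endif
}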

And:
Code:
// Manual RD strobe helpers: only active when READS_USE_DIGITAL_WRITE is defined.
inline void RA8876_t41_p::RDHigh() {
#ifdef READS_USE_DIGITAL_WRITE
    digitalWriteFast(_rd_pin, HIGH);
#endif
}

inline void RA8876_t41_p::RDLow() {
#ifdef READS_USE_DIGITAL_WRITE
    digitalWriteFast(_rd_pin, LOW);
#endif
}
The code calls both sets of functions; the #define at the top controls which set actually does anything.

I changed the RD clock speed:
Code:
#ifndef RA8876_CLOCK_READ
#define RA8876_CLOCK_READ 60 /* fallback divider; value assumed here, the header normally defines it */
#endif
    p->TIMCMP[0] =
        (((1 * 2) - 1) << 8) /* TIMCMP[15:8] = number of beats x 2 - 1 */
        | ((RA8876_CLOCK_READ / 2) - 1);    /* TIMCMP[7:0] = baud rate divider / 2 - 1 */

And in the header file I have:
Code:
//#define RA8876_CLOCK_READ 30   // equates to 8 MHz
#define RA8876_CLOCK_READ 60     // equates to 4 MHz
//#define RA8876_CLOCK_READ 120  // equates to 2 MHz
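The "equates to" figures above appear to come from dividing the FlexIO clock by the divider. A minimal sketch of that arithmetic, assuming the 240 MHz FlexIO clock mentioned earlier in the thread (the names `kFlexioClockMHz`, `busMHz` and `timcmpSingleBeat` are made up for illustration and are not part of the library):

```cpp
#include <cstdint>

// Assumption: the FlexIO clock runs at 240 MHz, as discussed earlier in the thread.
constexpr uint32_t kFlexioClockMHz = 240;

// Bus frequency in MHz for a given RA8876_CLOCK_READ-style divider.
constexpr uint32_t busMHz(uint32_t divider) {
    return kFlexioClockMHz / divider;
}

// TIMCMP value for a single-beat transfer, mirroring the expression
// used in the read setup snippet: beats*2-1 in [15:8], divider/2-1 in [7:0].
constexpr uint32_t timcmpSingleBeat(uint32_t divider) {
    return (((1 * 2) - 1) << 8) | ((divider / 2) - 1);
}

static_assert(busMHz(30) == 8, "divider 30 gives 8 MHz");
static_assert(busMHz(60) == 4, "divider 60 gives 4 MHz");
static_assert(busMHz(120) == 2, "divider 120 gives 2 MHz");
```

If the FlexIO clock is set to something other than 240 MHz, the same dividers give different bus speeds, so the comments in the header only hold for that clock setting.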


In the common code I commented out the dummy read:
Code:
ru16 RA8876_common::getPixel(ru16 x, ru16 y) {
    ru16 rdata = 0;
    ru16 dummy __attribute__((unused)) = 0;


    selectScreen(currentPage);
    graphicMode(true);
    setPixelCursor(x, y); // set memory address
    ramAccessPrepare();   // Setup SDRAM Access
    //dummy = lcdDataRead();
    rdata = (lcdDataRead() & 0xff); // read low byte
    rdata |= lcdDataRead() << 8;    // add high byte
    return rdata;
}

And it is closer to reading correctly.
One (of many) confusing things to me in the RA8876 manual (P144 in mine) is the definition of MRWDP:
Read Function : Memory Read Data
Data to read from memory corresponding to the setting of
REG[04h][1:0]. Continuous data read cycle can be accepted in
bulk data read case.
Note1: if you set this port address from different port address,
must issue a dummy read, the first data read cycle is dummy
read and data should be ignored. Graphic Cursor RAM & Color
palette RAM data are not support data read function.
Note2: read memory data is 4 bytes alignment no matter color
depth setting.
Note3: If user write data to SDRAM user must make sure write
FIFO is empty before he change register number or core task
busy status bit becomes idle.
So how do I detect whether I should do a dummy read?
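One possible reading of Note1 is that the dummy cycle is only needed on the first data read after the register pointer has been moved to MRWDP from some other register. A minimal sketch of tracking that in software follows. It is not the library's code: `DummyReadTracker` is a hypothetical helper, and the register number 0x04 for MRWDP is an assumption taken from the manual text quoted above. The experiments in this thread suggest the real chip behavior may not match this reading exactly.

```cpp
#include <cstdint>

// Assumption: MRWDP lives at register 0x04.
constexpr uint8_t kRegMRWDP = 0x04;

// Hypothetical helper: remembers the last register selected so the caller
// can tell whether the next MRWDP data read is the dummy cycle.
class DummyReadTracker {
public:
    // Call whenever a register address is written to the chip.
    void onRegisterSelect(uint8_t reg) {
        if (reg == kRegMRWDP) {
            // Only arm the dummy read when arriving from a different register.
            if (lastReg_ != kRegMRWDP) dummyPending_ = true;
        } else {
            dummyPending_ = false;
        }
        lastReg_ = reg;
    }

    // Call before each MRWDP data read; true means "discard this read".
    bool consumeDummy() {
        const bool pending = dummyPending_;
        dummyPending_ = false;
        return pending;
    }

private:
    int16_t lastReg_ = -1;      // -1: no register selected yet
    bool dummyPending_ = false;
};
```

A read routine would call `consumeDummy()` once before pulling data and throw away one read cycle when it returns true.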

Also, currently we are doing readRect by reading each pixel one at a time.
Is it easily possible to read it all in one operation or, if not, at least all of the same row?
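The manual text above says continuous read cycles are accepted for bulk reads, so a row read might look roughly like the sketch below: set the cursor once, optionally discard one dummy cycle, then pull two bytes per pixel, low byte first as in getPixel(). This is only a sketch under those assumptions. `readPixels16` and `FakeBus` are hypothetical names, the bus is abstracted as anything with a `uint8_t read()` method, and whether the RA8876 really streams a whole row this way is exactly the open question here.

```cpp
#include <cstddef>
#include <cstdint>

// Reads `count` 16-bit pixels from any bus object exposing `uint8_t read()`.
// Low byte arrives first, then high byte, matching getPixel() above.
template <typename Bus>
void readPixels16(Bus &bus, uint16_t *dst, size_t count, bool skipDummy) {
    if (skipDummy) {
        (void)bus.read();  // discard the dummy cycle
    }
    for (size_t i = 0; i < count; ++i) {
        const uint16_t lo = bus.read();
        const uint16_t hi = bus.read();
        dst[i] = static_cast<uint16_t>(lo | (hi << 8));
    }
}

// Stand-in bus for trying the logic on a host PC: replays a byte array.
struct FakeBus {
    explicit FakeBus(const uint8_t *bytes) : data(bytes), pos(0) {}
    uint8_t read() { return data[pos++]; }
    const uint8_t *data;
    size_t pos;
};
```

On the Teensy the `Bus` argument would be a thin wrapper around lcdDataRead(); the fake bus just makes the byte ordering easy to check off-target.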

Now back to seeing what to try next.
 
@mjs513 look familiar?
That's exactly what I was seeing this morning with my hacky experiments.

One (of many) confusing things to me in the RA8876 manual (P144 in mine) defining MRWDP
So how do I detect if I should do a dummy read?

Also, currently we are doing readRect by reading each pixel one at a time.
Is it easily possible to read it all in one operation or, if not, at least all of the same row?
In one of my hacked-up versions I removed the dummy read, which was when I was reading a single byte. But I had to add the dummy read back in when I tried to read both bytes at the same time. I also noticed, but did not verify, that in linear address mode it reads in a byte, but in block mode the output of the read is the pixel. I'm a little confused on that one: what is a pixel versus a byte?
 
@KurtE - Sorry, I had to go out for a while. I'll put it up on my GitHub now. I'll let you know when it's done...
Not a problem. Thought I would mention, I just moved the display from T41 to MMOD and:
1720046490111.png


It is using FlexIO pins (as defined in the one header file) and no changes to sketch.
Shows same issues with the Reads...

I still have not put in code to detect that we have DMA and use it. Will try that soon.
 
Awesome :D I had just disconnected the RA8876 from the T41 before I had to run.
Now to hookup the MicroMod and play...
 
I pretty much synced up my branch to yours. I left in the code that allows reads and writes to run at different speeds,
and also the code to experiment with how to set the RD pin... But that part is now defined to be functionally the same
as yours, and the read colors appear right on my MMOD running the parallel code.

However, the colors are off on the SPI screen. It looks like the bytes are swapped, and/or maybe it should skip a byte or not skip
a byte...
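To make the "bytes are swapped" symptom concrete, here is a small sketch of what a swapped 16-bit RGB565 value looks like. `swapBytes16` is a made-up helper for illustration, and nothing here confirms that a byte swap is the actual cause on the SPI side.

```cpp
#include <cstdint>

// Exchanges the high and low bytes of a 16-bit color value.
constexpr uint16_t swapBytes16(uint16_t c) {
    return static_cast<uint16_t>((c << 8) | (c >> 8));
}

// Pure red in RGB565 is 0xF800; with the bytes swapped it becomes 0x00F8,
// which no longer has any red bits set.
static_assert(swapBytes16(0xF800) == 0x00F8, "red with bytes swapped");
// Swapping twice restores the original value.
static_assert(swapBytes16(swapBytes16(0x1234)) == 0x1234, "round trip");
```

Reading back a known solid color such as 0xF800 and comparing against both the value and its swapped form is a quick way to tell a byte-order problem from an off-by-one-byte (dummy read) problem.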

More tomorrow.
 
Nice. My MicroMod is not working for some reason. Will have to check it out tomorrow...
 
Reconnected the T41 just to make sure the display was not damaged, and it still works. This is the second time the MicroMod has failed. Last time I had to adjust the MCU board in the socket and that fixed it. Hopefully it is just a wiring mistake on my part; otherwise I'll just wire up the Dev Board...
 
Nice. My MicroMod is not working for some reason. Will have to check it out tomorrow...
I marked out on a piece of paper which pin on the T41 moves to which pin on the MMOD. The default
pins and the like are in the one header file, but it boils down to:
T41 data pins (19,18,14,15,40,41,17,16), RD 37, WR 36
MMOD data pins (40,41,42,43,44,45,6,9), RD 8, WR 7
Sketch unchanged, so DC=13, CS=11, RST=12
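For anyone rewiring between the two boards, the same mapping can be written down as a small table in code. This is just a sketch of the list above: the `ParallelPins` struct and the constant names are invented for illustration (the library keeps its defaults in its own header), and the data pins are assumed to be listed in D0..D7 order.

```cpp
#include <cstdint>

// Hypothetical container for one board's parallel-bus wiring.
struct ParallelPins {
    uint8_t data[8];  // assumed order: D0..D7
    uint8_t rd;
    uint8_t wr;
};

// Values copied from the pin list above.
constexpr ParallelPins kT41Pins  = {{19, 18, 14, 15, 40, 41, 17, 16}, 37, 36};
constexpr ParallelPins kMMODPins = {{40, 41, 42, 43, 44, 45, 6, 9}, 8, 7};

// Control pins used by the sketch on both boards.
constexpr uint8_t kPinDC = 13;
constexpr uint8_t kPinCS = 11;
constexpr uint8_t kPinRST = 12;
```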

Next up: detect that the shifter supports DMA and add that code in, so we don't need all of those interrupts.

But a question for you (@wwatson @mjs513): what is the best way to get to a consistent code base? I created the combined_t4x_wip
branch out of the combined... to make the initial changes to support both boards, and when that was working, do a PR back to
what was at the time @mjs513's branch... Then delete the branch... Then I was assuming Mike's branch would get merged into @wwatson's branch...

But looks like we now have things mostly merged into:

Wondering: should we assume this is the new common branch, where I should maybe copy out (or stash) a couple of my later changed
files, reset my branch to @wwatson's fork of it, and restore my couple of changed files either to this branch or
to a new branch like combined_t4x_wip_dma... to do the next step?

Or did you want to potentially migrate your copy back into the RA8876_Combined branch... (or master?)...

This morning I will probably do the step of bringing my fork/branch into sync with yours, and restore the few files changed since then.
 
I usually do a periodic download of the branches when I am messing with changes. If we migrate to master should we start a new repo with a slight name change?
EDIT: Maybe something like Ra8876LiteTeensy2 or since it's not really lite anymore, Ra8876Teensy2 and then archive the original repo?
 
Still finding some issues when using setOrigin with read/drawPixels (one I think I got resolved) but... So I would recommend leaving it as a branch. It's your choice how you want to do this and what to name the branch. I synced my changes back into yours in my combined branch.

For instance, if you just go back and forth between frame buffers, Kurt's FB sketch works fine, but if you shift the origin you run into issues:
1720096893394.png

vs
1720096922373.png

The color bar from readPixels is missing (but I can fix that one), MONOBOLD is shifted too much, and TEST is off from above ADAFRUIT. Gradient fills are actually fine (I was playing with fixing things), but something is still off with some of the functions.
 
I usually do a periodic download of the branches when I am messing with changes. If we migrate to master should we start a new repo with a slight name change?
That is a possibility. That is more up to you. We may also want to play around with the combined layout. Currently I have created 3 symbolic links to this project for the different directories. It might be nice to have it simply as one logical library.

Like maybe one src directory for both of the top-level objects (SPI, FlexIO) and a sub-directory for the GFX code? Potentially both top-level objects
change names? _t3 means what? Although it is consistent with our other display drivers with _t3 or _t3n... maybe _t41_p goes to t4x_p.
Probably set the library attribute to be archive, so that objects that are not referenced don't end up in the sketch.

I have also been tempted to run something like clang-format on all of the code to clean up spaces/tabs... But that's probably just me.

EDIT: my WIP branch has been reset to your branch with git reset --hard, and the change pushed up to GitHub with git push --force...
I then pushed my changes that allow easy experimenting with whether the RD pin uses FlexIO or GPIO...
 
Question: Are there any examples or code here that use the multi-beat code?
I see one method:
FASTRUN void RA8876_t41_p::pushPixels16bitAsync(const uint16_t *pcolors, uint16_t x1, uint16_t y1, uint16_t x2, uint16_t y2) {

But I don't see anywhere that calls this.
 
I don't think so. I only have used it for testing with my test program:
Code:
//#include "images.h"
//#include "Teensy41_Cardlike.h"
//#include "flexio_teensy_mm.c"
#include "teensy41.c"

//#define use_spi
#if defined(use_spi)
#include <SPI.h>
#include <RA8876_t3.h>
#else
//#include <RA8876_t3.h>
#include <RA8876_t41_p.h>
#endif
#include <math.h>

#if defined(use_spi)
#define RA8876_CS 10
#define RA8876_RESET 9
#define BACKLITE 7 //External backlight control connected to this Arduino pin
RA8876_t3 lcd = RA8876_t3(RA8876_CS, RA8876_RESET); //Using standard SPI pins; named lcd so the rest of the sketch compiles in SPI mode
#else
uint8_t dc = 13;
uint8_t cs = 11;
uint8_t rst = 12;
#define BACKLITE 7 //External backlight control connected to this Arduino pin
RA8876_t41_p lcd = RA8876_t41_p(dc,cs,rst); //(dc, cs, rst)
//RA8876_t3 lcd = RA8876_t3(dc,cs,rst); //(dc, cs, rst)
#endif

uint32_t start = 0;
uint32_t end =  0;

uint8_t busSpeed = 12;

uint8_t rData = 0;
uint16_t  rslt = 0;

void setup() {
  while (!Serial && millis() < 3000) {} //wait for Serial Monitor
  Serial.printf("%c SDRAM Dev Board and RA8876 parallel 8080 mode testing (8/16)\n\n",12);
//  Serial.print(CrashReport);
//  pinMode(WINT, INPUT); // For XnWAIT signal if connected and used.

#if defined(use_spi)
  lcd.begin();
#else
  lcd.begin(busSpeed);// 20 is working in 8bit and 16bit mode on T41
#endif
//  if(!lcd.begin(busSpeed)) Serial.printf("lcd.begin(busSpeed) FAILED!!!\n");
  delay(100);

  Serial.print("Bus speed: ");
  Serial.print(busSpeed,DEC);
  Serial.println(" MHZ");
  Serial.print("Bus Width: ");
  Serial.print(BUS_WIDTH,DEC);
  Serial.println("-bits");

  lcd.graphicMode(true);
  lcd.fillScreen(0x0000);
  lcd.setRotation(0);
}

int i=0, j=0;

void loop() {
//  rData = lcd.lcdStatusRead();
//  Serial.printf("rData = 0x%2.2x\n",rData);

//  start = micros();
//  start = millis();

//  lcd.drawPixel(0x0000,0x0000,0xffff);
//  for(i = 0; i < 2; i++) {
//  rslt = lcd.getPixel(i,0);
//  Serial.printf("rslt = 0x%4.4X\n",rslt);
//  }
//  lcd.pushPixels16bitAsync(teensy41_Cardlike,10,10,575,424);
//  lcd.pushPixels16bitAsync(flexio_teensy_mm,0,0,480,320); // 480x320
  lcd.pushPixels16bitAsync(teensy41,0,0,480,320); // 480x320
/*
    for (i = 0; i < 240; i++){
      for (j = 0; j < 184; j++){
        lcd.drawPixel(i, j + 136, Dallas[i][j]);
      }
    }
    for (i = 0; i < 240; i++){
      for (j = 0; j < 184; j++){
        lcd.drawPixel(i+250, j + 136, Salt_Lake[i][j]);
      }
    }
    for (i = 0; i < 182; i++){
      for (j = 0; j < 185; j++){
        lcd.drawPixel(i + 500, j + 135, Jewish_style_building[i][j]);
      }
    }
    for (i = 0; i < 240; i++){
      for (j = 0; j < 236; j++){
        lcd.drawPixel(i+746, j + 85, Flower_pattern[i][j]);
      }
    }

    for (i = 0; i < 89; i++) {
      for (j = 0; j < 92; j++) {
        lcd.drawPixel(i + 31, j + 98 + 300, quarter_pattern[i][j]);
      }
    }
    for (i = 0; i < 89; i++) {
      for (j = 0; j < 92; j++) {
        lcd.drawPixel(i + 120, j + 98 + 300, quarter_pattern[88 - i][j]);
      }
    }

    for (i = 0; i < 89; i++) {
      for (j = 0; j < 92; j++) {
        lcd.drawPixel(i + 31, j + 190 + 300, quarter_pattern[i][91 - j]);
      }
    }
    for (i = 0; i < 89; i++) {
      for (j = 0; j < 92; j++) {
        lcd.drawPixel(i + 120, j + 190 + 300, quarter_pattern[88 - i][91 - j]);
      }
    }
*/
//  lcd.pushPixels16bitDMA(teensy41,1,1,480,320);    // FLASHMEM buffer
//  lcd.pushPixels16bitDMA(teensy41_Cardlike,1,1,575,424);    // FLASHMEM buffer
//  lcd.pushPixels16bitDMA(flexio_teensy_mm,530,260,480,320); // FLASHMEM buffer
//  lcd.pushPixels16bitDMA(frameBuffer,530,260,480,320);        // SDRAM buffer
//  lcd.pushPixels16bitDMA(frameBuffer1,1,1,575,424);           // SDRAM buffer

//  lcd.writeRect(10,10,575,424,teensy41_Cardlike);
//  lcd.writeRect(10,280,480,320,teensy41);
//  lcd.writeRect(530,0,480,320,flexio_teensy_mm);
//  end = micros() - start;
//  Serial.printf("Wrote %d bytes in %dus\n\n",(575*424)+(480*320), end);
//  end = millis() - start;
//  Serial.printf("Wrote %d bytes in %dms\n\n",(575*424)+(480*320), end);
waitforInput();
//  lcd.graphicMode(true);
//  lcd.clearScreen(0x0000);
}

void waitforInput()
{
  Serial.println("Press anykey to continue");
  while (Serial.read() == -1) ;
  while (Serial.read() != -1) ;
}
 
Looks like this is in _combined branch.

@KurtE - @wwatson

Fixed the issue with the origin offset shown in post #119. Pushed changes to my branch and issued a PR to your branch so we all stay in sync.
I issued a PR against the other branch: combined_t4x_wip
Nothing major:
Simple setup for the RD pin experiments.
Cut the number of source lines for the cursors by a factor of about 16.
The callback register function had no implementation.

I have included your example sketch here, plus an MMOD image.
I also had it do a callback when the async transfer completed, to get an idea of how long that takes...

I am noticing that the IRQ version is now a bit flaky. It usually shows up OK on the first drawing, but subsequent ones usually show some corruption.

Now I will start on the DMA stuff, although I'm not sure how much I will get done today.
 

Attachments

  • RA8876P_async_test-240704a.zip
    185.4 KB · Views: 487