Call to arms | Teensy + SDRAM = true

I’ll make a gen5 of this board; all I need from you is a complete list of pins. I know you’re very good with the pin stuff since you made those PDFs on Git.

The SD-Card uses the SDIO pins yes.
@mjs513 did the one for the devboard that I am looking at. Looks like most of the pins are there.
We are currently using AD_B1_04 through AD_B1_15

Mike does the below look correct to you as well?

These are the T4.1 defines we were using. The number at the start of each line is, I believe, the matching pin on this board; ## marks the ones that are missing.
Code:
## #define OV7670_PLK 40 // 40 // AD_B1_04 CSI_PIXCLK
## #define OV7670_XCLK 37  // 41 // AD_B1_05 CSI_MCLK
16 #define OV7670_HREF 16  // AD_B1_07 CSI_HSYNC
17 #define OV7670_VSYNC 17 // AD_B1_06 CSI_VSYNC
27 #define OV7670_D0 27 // AD_B1_15 CSI_D2
26 #define OV7670_D1 26 // AD_B1_14 CSI_D3
## #define OV7670_D2 39 // AD_B1_13 CSI_D4
## #define OV7670_D3 38 // AD_B1_12 CSI_D5
21 #define OV7670_D4 21 // AD_B1_11 CSI_D6
20 #define OV7670_D5 20 // AD_B1_10 CSI_D7
23 #define OV7670_D6 23 // AD_B1_09 CSI_D8
22 #define OV7670_D7 22 // AD_B1_08 CSI_D9
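
For anyone who wants the data-pin part of that list in a form code can query, here is a minimal sketch. The rows are copied from the D0-D7 defines above; the struct and function names (`CameraDataPin`, `findDataPin`, `countMissingDataPins`) are made up for illustration and are not part of any existing library. The `onBoard` flag simply mirrors the "##" markers.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// One row per OV7670 data line, copied from the T4.1 defines in the post.
struct CameraDataPin {
    uint8_t     bit;      // OV7670 data bit (D0..D7)
    uint8_t     t41Pin;   // Teensy 4.1 pin number from the define
    const char *pad;      // i.MX RT1062 pad name from the comment
    const char *csiFunc;  // CSI function listed for that pad
    bool        onBoard;  // false where the listing is marked "##"
};

static const CameraDataPin kDataPins[] = {
    {0, 27, "AD_B1_15", "CSI_D2", true},
    {1, 26, "AD_B1_14", "CSI_D3", true},
    {2, 39, "AD_B1_13", "CSI_D4", false},
    {3, 38, "AD_B1_12", "CSI_D5", false},
    {4, 21, "AD_B1_11", "CSI_D6", true},
    {5, 20, "AD_B1_10", "CSI_D7", true},
    {6, 23, "AD_B1_09", "CSI_D8", true},
    {7, 22, "AD_B1_08", "CSI_D9", true},
};
static const size_t kDataPinCount = sizeof(kDataPins) / sizeof(kDataPins[0]);

// Look up the row for an OV7670 data bit; returns nullptr if out of range.
const CameraDataPin *findDataPin(uint8_t bit) {
    for (size_t i = 0; i < kDataPinCount; ++i)
        if (kDataPins[i].bit == bit) return &kDataPins[i];
    return nullptr;
}

// Count the data pins the listing marks as missing ("##").
size_t countMissingDataPins() {
    size_t missing = 0;
    for (size_t i = 0; i < kDataPinCount; ++i)
        if (!kDataPins[i].onBoard) ++missing;
    return missing;
}

// Small self-check that the table matches the listing.
void checkDataPinTable() {
    assert(kDataPinCount == 8);
    assert(strcmp(findDataPin(7)->pad, "AD_B1_08") == 0);
}
```

The clock and sync lines are left out on purpose, since the thread is still sorting out which pads to use for those.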

Note the T4.1 also has two other pins that can be used for PLK(B1_12) and XCLK(B1_13):
34 | B1_13 | WDOG1_B | IOMUXC_LPUART5_RX_SELECT_INPUT=1 | CSI_VSYNC | ENET_1588_EVENT0_OUT | FLEXIO2_FLEXIO29 | GPIO2_IO29 | USDHC1_WP | SEMC_DQS4 | FLEXIO3_FLEXIO29
35 | B1_12 | IOMUXC_LPUART5_TX_SELECT_INPUT=1 | CSI_PIXCLK | ENET_1588_EVENT0_IN | FLEXIO2_FLEXIO28 | GPIO2_IO28 | USDHC1_CD_B | FLEXIO3_FLEXIO28

Also note: the D2-D9 input pins are good for 4-bit(?) and 8-bit cameras. CSI also supports 10-bit, 16-bit and 24-bit, which I have
never tried. All of these cameras do 4-bit and/or 8-bit communications.
1713642796182.png

So not sure if the HM01B0 Arduino one will work with this or not.

The question is: do you want to support 10 and 16 bits? If so, we would need to add more pins to the table.

You can find them in table 34-2. For 10 bits you need to add D0-D1:
CSD_DATA0 - GPIO_B1_10
CSD_DATA1 - GPIO_B1_11

For 16 bits you need to add D10-D15
CSD_DATA10 - GPIO_B1_09
CSD_DATA11 - GPIO_B1_08
CSD_DATA12 - GPIO_B1_07
CSD_DATA13 - GPIO_B1_06
CSD_DATA14 - GPIO_B1_05
CSD_DATA15 - GPIO_B1_04
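
To make the pin-count question concrete, here is a small sketch of which CSI data lines each bus width would need, going only by what is described in this post (8-bit on D2-D9, 10-bit adds D0-D1, 16-bit adds D10-D15). The names `CsiDataRange`, `csiDataLinesForWidth` and `csiExtraLinesNeeded` are hypothetical helpers, not an existing API, and 24-bit is left out because the post does not list pins for it.

```cpp
#include <assert.h>

// Inclusive range of CSI data lines used for a given bus width.
struct CsiDataRange {
    int  first;  // lowest data line index
    int  last;   // highest data line index
    bool valid;  // false for widths this sketch does not cover
};

// Mapping taken from the post: 8-bit uses D2-D9, 10-bit adds D0-D1,
// 16-bit adds D10-D15 on top of that.
CsiDataRange csiDataLinesForWidth(int widthBits) {
    switch (widthBits) {
        case 8:  return {2, 9, true};
        case 10: return {0, 9, true};
        case 16: return {0, 15, true};
        default: return {0, 0, false};
    }
}

// How many data lines beyond the existing 8-bit set (D2-D9) are needed.
int csiExtraLinesNeeded(int widthBits) {
    const CsiDataRange r = csiDataLinesForWidth(widthBits);
    assert(r.valid);
    return (r.last - r.first + 1) - 8;
}
```

So going from 8-bit to 10-bit costs two more pins on the board, and 16-bit costs eight more.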

Probably clear as mud!
 
@KurtE I’ll add all the pins you want. As that will help the community. So go ahead and together with @mjs513 or anyone else, compile a table of the missing pins.

Ideally you’d make a complete table of ALL pins, maybe some of the ones exposed now should be removed. That would be the most structured and correct way of doing it.
 
@mjs513 did the one for the devboard that I am looking at. Looks like most of the pins are there.
We are currently using AD_B1_04 through AD_B1_15

Mike does the below look correct to you as well?
Quick answer is yes.

In one of the conversation posts I listed them as well:
1713648000067.png


which is the same as your list.
 
You have B1_12 and B1_15 available for pixclk and mclk.
I would recommend using the pins shared with FlexIO2 wherever possible (for example hsync and vsync are on B1_14 and B1_13), since that's better at handling variable-length (compressed) transfers than CSI - using a shifter in compare mode to match the termination pattern.
 
You have B1_12 and B1_15 available for pixclk and mclk.
Yes you are right that B1_12-15 are already on the board.

We should probably make a new version of the Excel document which has all of these pins on it...

As for mixing CSI and FlexIO, I'm not sure how... I probably need to read those sections in the PDF a few more times
(like 10 or 20) to try to pick out all of the details. I keep getting distracted while trying to add the CSI code to the new
camera library. But then again, we are really only doing this for our own fun anyway.

For me, it would be nice if one could test the FlexIO camera code such that it was as compatible as possible with the MicroMod setup, and likewise for CSI, as close as possible to the T4.1. That way the majority of the code could be used with PJRC boards...

Side note: as I mentioned yesterday, I prefer to not modify my Teensy installs, to be specific for these boards, as I mainly build for standard boards like Micromod... Which is why I have a variant setup. It is not perfect by any means, as there are so many places in the code that are hard coded to say on MicroMOD these pins do X...

After setting up the variant, I then marked up my board:
1713706462334.png

It sure makes things a lot easier to use those pins...

Now back to figuring out CSI
 
My main question would be, what do you even need the Teensy/Arduino pin numbers for when writing code for modules like CSI and FlexIO? You can use IOMUXC_SW_MUX_CTL_PAD_GPIO_x and IOMUXC_SW_PAD_CTL_PAD_GPIO_x to initialize them (which works identically for any board), and from that point the pins either have dedicated usages (like with CSI) or are referred to using module-specific IDs like the FlexIO pins.
 
If you are building a specific board for a specific purpose, I say have at it! If you like hard coding it like:
Code:
IOMUXC_SW_MUX_CTL_PAD_GPIO_AD_B1_13 = 0x4U;
IOMUXC_CSI_DATA04_SELECT_INPUT = 0;
IOMUXC_SW_PAD_CTL_PAD_GPIO_AD_B1_13 = 0x0U;
go for it. This could be fine for CSI, where there are maybe only 2-3 possible pins for each function...
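
If you do go the register route for more than one pad, a table keeps it manageable. The sketch below is only an illustration: the three `mock...` variables are host-side stand-ins so it compiles off-target, and `PadSetup`, `applyPadSetup` and `configureCsiPads` are hypothetical names. On a Teensy you would point the table at the real `IOMUXC_*` registers instead. The values are the ones from the three lines above.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Host-side stand-ins for the three registers written in the snippet above.
// On real hardware these would be the IOMUXC_* register macros.
static volatile uint32_t mockMuxAdB1_13   = 0xFFFFFFFFu;
static volatile uint32_t mockSelectData04 = 0xFFFFFFFFu;
static volatile uint32_t mockPadAdB1_13   = 0xFFFFFFFFu;

// One row per pad: which registers to write and the values to write.
struct PadSetup {
    volatile uint32_t *mux;    uint32_t muxValue;     // ALT function select
    volatile uint32_t *select; uint32_t selectValue;  // input select (may be null)
    volatile uint32_t *pad;    uint32_t padValue;     // pad electrical config
};

// Performs the same three writes as the hard-coded version, in the same order.
void applyPadSetup(const PadSetup *rows, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        assert(rows[i].mux != nullptr && rows[i].pad != nullptr);
        *rows[i].mux = rows[i].muxValue;
        if (rows[i].select != nullptr) *rows[i].select = rows[i].selectValue;
        *rows[i].pad = rows[i].padValue;
    }
}

// Single-row table matching the AD_B1_13 example.
static const PadSetup kCsiPads[] = {
    {&mockMuxAdB1_13, 0x4u, &mockSelectData04, 0x0u, &mockPadAdB1_13, 0x0u},
};

void configureCsiPads() {
    applyPadSetup(kCsiPads, sizeof(kCsiPads) / sizeof(kCsiPads[0]));
}
```

Adding another CSI pad is then one more row rather than three more lines to keep in sync.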

As for FlexIO, that is not so clear cut, as there are lots of different uses for the FlexIO pins. For example, hooking up a camera: sometimes I hook it up using only 4 pins, other times 8. Which ones do I use? They need to be contiguous. Or maybe I am using it for serial ports, where I need only one or two FlexIO pins.
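
The "contiguous" requirement is easy to check in code before committing to a wiring. This is a minimal sketch that assumes the pins are given as FlexIO pin indices (not Arduino pin numbers) in data-bit order; `flexioPinsContiguous` is a hypothetical helper, not something from an existing library.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Returns true when the FlexIO pin indices form one unbroken ascending run,
// for example {4, 5, 6, 7}. Order matters: entry i is taken as data bit i.
bool flexioPinsContiguous(const uint8_t *pins, size_t count) {
    assert(pins != nullptr);
    if (count == 0) return false;
    for (size_t i = 1; i < count; ++i) {
        if (pins[i] != pins[i - 1] + 1) return false;
    }
    return true;
}
```

For example, a 4-bit camera on FlexIO pins 4-7 passes, while 4, 5, 7, 8 does not, which is exactly the case a pin table per board helps you spot.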

And for me one major reason for IO pins, is I have no specific purpose for these boards! Just as I don't have any specific
purpose for a Teensy 4 or 4.1 or Micromod. I just sort of look at it like it is sort of a bastardized version of a T4.1 or Micromod
which logically removed the usage of a lot of the standard IO pins and added some additional one back. And like the T4.1 it has
some additional memory, which hopefully is a whole lot faster.

And where you might look at the pins on the board labeled B0_00-B0_15 and B1_00-B1_05 as simply
FlexIO pins or the like, I look at it like there are 30 IO pins, some of which I might use for FlexIO, but I am very likely to use some of
them for other purposes, like SPI or Wire or...

Hope that makes sense
 
@KurtE can we get an original excel copy of this so that we can edit it?
Note: that is simply a print to pdf of one of the pages in the excel document:

Note: @mjs513 did a modified version of this that I think is in @defragster github project:
 
Update: the ER-TFTM101-1 10.1" LCD is working on the T4.1, MicroMod and SDRAM Dev Board V4.0. All except the MicroMod work with both 8-bit and 16-bit bus configurations. The T4.1 is by far the most stable of all three. The SDRAM board and MicroMod are unstable at high bus speeds. They will not work above 8 MHz, and that's pushing it, whereas the T4.1 works up to 24 MHz. The effective bus speed limit is 20 MHz, as the RA8876 internal 2D engine limits any faster speeds (wait states). The only difference between the T4.1 and the other two boards is the trace length between the MCU and its connectors. The SDRAM board and MicroMod are close in trace length and respond at about the same bus speeds. Again, I think bus buffers such as those used on the Arduino Due and Mega would help.

I split the Ra8876Lite library into four separate libraries for anybody interested in testing: SPI bus, T4.1 8080, SDRAM Dev board 8080, and MicroMod 8080. I have tested all of the examples and all seem to be working on all three boards, except for "grahicsCursor.ino" on the SDRAM Dev board, because a USB host port is needed for the mouse driver. Frame buffer reads do NOT work in 8-bit mode. I spent weeks trying to get a stable read without success. 16-bit frame buffer reads work reliably on the T4.1 and SDRAM board. The MicroMod board does not support a 16-bit bus AFAIK.
Links:
SPI Bus, T41, SDRAM DevBoard, MicroMod.

Now it's time to start learning about eLCDIF. @Rezo, what are you using for hardware to interface a TFT screen?
Still find FlexIO a little confusing :confused:
 
@wwatson
Really nice job and a lot of testing to find the issue!!!!

Nice that you got the three branches set up - at least for SPI probably keeps it cleaner. Still waiting for my lcd/camera shield to test with the SDRAM board - hopefully today. Yesterday we got the OV5640 working for quite a bit thanks to @KurtE and his FlexIO code.
 
Here is a slightly updated version of the spreadsheet that I currently have:
 

Attachments

  • DogBoneSDRAMv1.zip
As some of you know I made a 4.5 (facelift) that adds USB-PD (USB Power Delivery up to 12V). And it also has SDCARD.
Those are the only changes made (moved the boot button position as well), hence the word facelift.

The USB Host port has sort of two modes:
* USB-PD, you supply your own external power via screw terminal connector.
* USB 5V, the 5V from the input USB-C port is also present on the USB-C Host port.

Simply bridge two of the 2 pads to choose between the two.

Here are both images, if you want to make it look even nicer in the spreadsheet.

As soon as someone presents a complete spreadsheet for a gen5 board, I will make that happen!
I'm hoping that a few pins that are currently present can be removed, to give more space under that tight BGA.

Boards are blue in the images because the green color of the CAD software is not too kind on the eyes.
1713869240292.png

1713869207827.png
 
@wwatson echoing @mjs513 - great work getting the driver ported to the different boards!
FlexIO is a bit confusing at first, but the 8080 stuff is fairly simple to setup and debug compared to other protocols.
The place it gets messy and really confusing is the DMA transfers - to this day I need to consult with the RM and the forum to understand how to configure the DMA channel settings.

It's odd that you got bad results on bus speed with the Dev board and the MicroMod board - I tested FlexIO on both of them without issues. I developed the 8 bit 8080 with DMA on the MicroMod ATP with 15cm wires and an Amazon TFT

As soon as I get a bit more free time, I can take a look. The last year has been overwhelming with the kids, work, moving to a new place, and other things - so not much time lately to do any hobby stuff.

Regarding eLCDIF - you can use any RGB "dumb" display.
All you need is the four control signals: pixel clock, Vsync, Hsync and DE.
Then it's up to you to choose what color depth you want to support with the bus wiring and eLCDIF config.
I have a basic library set up here that I got working with SDRAM and LVGL v9.0 - https://github.com/david-res/eLCDIF_t4
 
As soon as someone presents a complete spreadsheet for a gen5 board, I will make that happen!
An obvious change would be to connect the "PD" USB port to USB2_DN and USB2_DP on the IMXRT1062 so it can be properly used as a host port, rather than using both ports identically as either device or host.

Although looking at the spacing of the current ports, they might be a bit too close together to have things plugged into both of them at the same time.
 
It's odd that you got bad results on bus speed with the Dev board and the MicroMod board - I tested FlexIO on both of them without issues. I developed the 8 bit 8080 with DMA on the MicroMod ATP with 15cm wires and an Amazon TFT
I think it might have more to do with the ER-TFTM101-1 display as I am using the same length wires as well. I do not have any other displays to test with...
 
As soon as someone presents a complete spreadsheet for a gen5 board, I will make that happen!
I'm hoping that a few pins that are currently present can be removed, to give more space under that tight BGA.
I slightly updated the copy of it that @mjs513 sent recently.

In that I filled in more information about the newer pins
1713990458500.png


I hid most of the columns, which shows:
1713990709528.png

The ones in Yellow are the ones we do not have. 3 are critical for CSI (D3-D5)

PIXCLK/MCLK: there are currently other pins on the board that can work. That is, pins 59-62 (B1_12 - B1_15) are alternatives to the pins we
use on the T4.1...

Hope that makes sense.

As for whether you can remove other pins, I don't know. For example, are there some of you who would like to read in an image using CSI and display it on a parallel display on FlexIO? You would probably need a lot of pins.
 

Attachments

  • DogBoneSDRAMv1.zip
Ok folks got the SDRAM adapter board setup and running with a OV5640 @VGA with a few caveats.

When I first hooked it up, the camera clock was set at 10 MHz, which works nicely with a regular MicroMod. However, with the SDRAM board and using SDRAM:
Image_20240424_152117.jpg


Without DMA it gets a bit better:
Image_20240424_152148.jpg


After some experimenting, I set the camera clock to 8 MHz and got an image using the non-DMA option with SDRAM:
Image_20240424_202032.jpg


The other thing I noticed is that you cannot go above about 15 MHz SPI clock for the ILI9488; otherwise the image is corrupt.

Ok, that's about it for now.
 
@mjs513, looks interesting. I wonder what the differences are between DMA and non-DMA, other than the obvious. When we are not using DMA, the new data is written to the cache, and only later, when necessary, is it flushed out to the real memory. But with DMA we are writing directly to the physical memory. So I'm wondering what that does to the timing...

Not sure if there is any way to monitor the different IO pins going to the SDRAM to see what the timing differences are.
Sorry, I don't know how to say that very clearly: how to see when data is received from the camera, versus when the bytes/words are written to memory, and potentially when cache operations happen... My older and slowest logic analyzer has 16 inputs; my faster ones have only 8...
 
Sounds a bit like the same thing I saw when I tried using SDRAM with FlexIO VGA output, using DMA to feed the shifter registers - DMA was too slow to keep up with the higher pixel clocks. Yet somehow the eLCDIF module (which has its own bus mastering) has no problem reading SDRAM at the same frequencies.
 
We finally have DMA working properly on the MicroMod and presumably the SDRAM Dev Board (not tested yet). My MicroMod quit while testing the 8080 mode for the ER-TFTM101-1 display. I had to reposition the MCU board in the connector. It's working fine now. Anyway, another forum user decided to use the Ra8876LiteTeensy 8080 library on his TFT0784 display. I was using the SDRAM Dev Board and he was using the MicroMod.
This is the conversation. Using his scope and LA he figured out why DMA was not working properly on the SDRAM and MicroMod boards. Basically there were ~ 12 clocks being sent at the beginning of the DMA transfer that were not part of the data buffer. This part of the code explains it better:
Code:
FASTRUN void RA8876_t3::MulBeatWR_nPrm_DMA(const void *value, uint32_t const length)
{
  while(WR_DMATransferDone == false) {}  //Wait for any DMA transfers to complete

    uint32_t BeatsPerMinLoop = SHIFTNUM * sizeof(uint32_t) / sizeof(uint8_t);   // Number of shifters * number of 8 bit values per shifter
    uint32_t majorLoopCount, minorLoopBytes;
    uint32_t destinationModulo = 31-(__builtin_clz(SHIFTNUM*sizeof(uint32_t))); // defines address range for circular DMA destination buffer

//    FlexIO_Config_SnglBeat();
    CSLow();
    DCHigh();

  if (length < 8){
//Serial.println ("In DMA but to Short to multibeat");
    const uint16_t * newValue = (uint16_t*)value;
    uint16_t buf;
    for(uint32_t i=0; i<length; i++)
      {
        buf = *newValue++;
          while(0 == (p->SHIFTSTAT & (1U << 0)))
          {
          }
          p->SHIFTBUF[0] = buf >> 8;
          while(0 == (p->SHIFTSTAT & (1U << 0)))
          {
          }
          p->SHIFTBUF[0] = buf & 0xFF;
      }       
      //Wait for transfer to be completed
      while(0 == (p->TIMSTAT & (1U << 0)))
      {
      }
    CSHigh();

  } else {
    //memcpy(framebuff, value, length);
    //arm_dcache_flush((void*)framebuff, sizeof(framebuff)); // always flush cache after writing to DMAMEM variable that will be accessed by DMA
    
    FlexIO_Config_MultiBeat();
    
    MulBeatCountRemain = length % BeatsPerMinLoop;
    MulBeatDataRemain = (uint16_t*)value + ((length - MulBeatCountRemain)); // pointer to the next unused byte (overflow if MulBeatCountRemain = 0)
    TotalSize = (length - MulBeatCountRemain)*2;               /* in bytes */
    minorLoopBytes = SHIFTNUM * sizeof(uint32_t);
    majorLoopCount = TotalSize/minorLoopBytes;
//Serial.printf("Length(16bit): %d, Count remain(16bit): %d, Data remain: %d, TotalSize(8bit): %d, majorLoopCount: %d \n",length, MulBeatCountRemain, MulBeatDataRemain, TotalSize, majorLoopCount );
    /* Configure FlexIO with multi-beat write configuration */
    flexDma.begin();

    /* Setup DMA transfer with on-the-fly swapping of MSB and LSB in 16-bit data:
     *  Within each minor loop, read 16-bit values from buf in reverse order, then write 32bit values to SHIFTBUFBYS[i] in reverse order.
     *  Result is that every pair of bytes are swapped, while half-words are unswapped.
     *  After each minor loop, advance source address using minor loop offset. */
    int destinationAddressOffset, destinationAddressLastOffset, sourceAddressOffset, sourceAddressLastOffset, minorLoopOffset;
    volatile void *destinationAddress, *sourceAddress;

    DMA_CR |= DMA_CR_EMLM; // enable minor loop mapping

/* The lines that cost me the most time to get a perfect image are here. I still don't fully understand why this is needed. But here is my clue:
 * The DMA setup further down uses the transfers in reverse mode. That's another thing that I don't understand, as this is the only way I get it working.
 * It seems that because of the reverse transfer logic of DMA, the first (in forward thinking) bytes, which equal the first 12 clocks, are zero.
 * This led to 6 black pixels for every new buffer transferred. Together with my logic analyzer and countless hours I found out that I can set the data
 * for the first 12 clocks if I set the first three SHIFTERS with the first 12 bytes of the image. This is done here.
 * Why 12 and not 16, and why SHIFTER 4 is not having this problem - I don't know.....
 * Somebody more clever than me can maybe explain this.
 */

p->SHIFTBUFHWS[0] = *(uint32_t *)value;
uint32_t *value32 = (uint32_t *)value;
value32++;
p->SHIFTBUFHWS[1] = *(uint32_t *)value32;
value32++;
p->SHIFTBUFHWS[2] = *(uint32_t *)value32;

/*
 * this is a regular "forward" way and high-level way of doing DMA transfers which should work in our use-case. But it doesn't.
 * for whatever reason, the buffer must be filled in reverse order. I don't know why. But it works. It could be that finer control
 * of the minor Loop behavior is needed which is not available through this DMA API therefore this is setup manually below.
          flexDma.begin();
          flexDma.sourceBuffer((volatile uint16_t*)value, majorLoopCount*2);
          flexDma.destinationCircular((volatile uint32_t*)&p->SHIFTBUF[SHIFTNUM-1], SHIFTNUM);
          flexDma.transferCount(majorLoopCount);
          flexDma.transferSize(minorLoopBytes*2);
          flexDma.triggerAtHardwareEvent(hw->shifters_dma_channel[SHIFTER_DMA_REQUEST]);
          flexDma.disableOnCompletion();
          flexDma.interruptAtCompletion();
          flexDma.clearComplete();
          flexDma.attachInterrupt(dmaISR);
          flexDma.enable();
          dmaCallback = this;
          return;
 */
    /* From now on, the SHIFTERS in MultiBeat mode are working correctly. Begin DMA transfer */
    sourceAddress = (uint16_t*)value + minorLoopBytes/sizeof(uint16_t) - 1; // last 16bit address within current minor loop
    sourceAddressOffset = -sizeof(uint16_t); // read values in reverse order
    minorLoopOffset = 2*minorLoopBytes; // source address offset at end of minor loop to advance to next minor loop
    sourceAddressLastOffset = minorLoopOffset - TotalSize; // source address offset at completion to reset to beginning
    // Use SHIFTBUFHWS instead of SHIFTBUF or SHIFTBUFBYS.
    destinationAddress = (void *)&p->SHIFTBUFHWS[SHIFTNUM - 1]; // last 32bit shifter address (with reverse byte order)
    destinationAddressOffset = -sizeof(uint32_t); // write words in reverse order
    destinationAddressLastOffset = 0;

    flexDma.TCD->SADDR = sourceAddress;
    flexDma.TCD->SOFF = sourceAddressOffset;
    flexDma.TCD->SLAST = sourceAddressLastOffset;
    flexDma.TCD->DADDR = destinationAddress;
    flexDma.TCD->DOFF = destinationAddressOffset;
    flexDma.TCD->DLASTSGA = destinationAddressLastOffset;
    flexDma.TCD->ATTR =
        DMA_TCD_ATTR_SMOD(0U)
      | DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_16BIT) // 16bit reads
      | DMA_TCD_ATTR_DMOD(destinationModulo)
      | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); // 32bit writes
    flexDma.TCD->NBYTES_MLOFFYES =
        DMA_TCD_NBYTES_SMLOE
      | DMA_TCD_NBYTES_MLOFFYES_MLOFF(minorLoopOffset)
      | DMA_TCD_NBYTES_MLOFFYES_NBYTES(minorLoopBytes);
    flexDma.TCD->CITER = majorLoopCount; // Current major iteration count
    flexDma.TCD->BITER = majorLoopCount; // Starting major iteration count

    flexDma.triggerAtHardwareEvent(hw->shifters_dma_channel[SHIFTER_DMA_REQUEST]);
    flexDma.disableOnCompletion();
    flexDma.interruptAtCompletion();
    flexDma.clearComplete();
    //Serial.println("Dma setup done");

    /* Start data transfer by using DMA */
    WR_DMATransferDone = false;
    flexDma.attachInterrupt(dmaISR);
    flexDma.enable();
    //Serial.println("Starting transfer");
    dmaCallback = this;
   }
}
This produces a very quick and clean image on the display (other than my camera work which is a little blurry):
DMA_FlexIO_MM.jpg

The image can be repeatedly displayed accurately at 24 MHz. I just need to wire up the SDRAM board and test it as well. I wonder if this is the same problem I am having with 8-bit reads. Once done, I'll update the GitHub libraries...
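
For anyone trying to follow the loop arithmetic in that function, here is a host-side sketch that reproduces just the size calculations (minor loop bytes, major loop count, remainder and the destination modulo) so you can print or check them without hardware. The formulas are copied from the listing; `MultiBeatPlan` and `planMultiBeat` are hypothetical names, and the shifter count is passed in as a parameter because the value of `SHIFTNUM` is not shown in the post. `__builtin_clz` is the same GCC/Clang builtin the listing uses.

```cpp
#include <assert.h>
#include <stdint.h>

// Mirrors the size calculations at the top of MulBeatWR_nPrm_DMA.
struct MultiBeatPlan {
    uint32_t beatsPerMinorLoop;  // shifters * 8-bit values per shifter
    uint32_t remainder;          // MulBeatCountRemain in the listing
    uint32_t totalBytes;         // TotalSize in the listing
    uint32_t minorLoopBytes;     // bytes moved per minor loop
    uint32_t majorLoopCount;     // number of minor loops in the transfer
    uint32_t destinationModulo;  // log2 of the circular destination window
};

// length is in 16-bit values, as in the listing; shiftNum stands in for SHIFTNUM.
MultiBeatPlan planMultiBeat(uint32_t length, uint32_t shiftNum) {
    assert(shiftNum != 0);
    MultiBeatPlan p;
    p.beatsPerMinorLoop = shiftNum * sizeof(uint32_t) / sizeof(uint8_t);
    p.remainder         = length % p.beatsPerMinorLoop;
    p.totalBytes        = (length - p.remainder) * 2;
    p.minorLoopBytes    = shiftNum * sizeof(uint32_t);
    p.majorLoopCount    = p.totalBytes / p.minorLoopBytes;
    p.destinationModulo =
        31 - __builtin_clz((unsigned int)(shiftNum * sizeof(uint32_t)));
    return p;
}
```

Assuming four shifters, a 1000-value buffer gives 16-byte minor loops, 124 major loops, 8 values left over for the remainder path, and a modulo of 4 (a 16-byte circular window covering the four 32-bit shifter buffers).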
 
I find it odd that SDRAM is not working well with FlexIO & DMA..

On the original version of the devboard, @Dogbone06 has just standard T4.1 PSRAM, and I was able to use a frame buffer in there with my Micromod 8080 library without any issues..

The SDRAM, eLCDIF and PXP all run on the same bus master, which allows them to communicate without holding up the program, but not sure how DMA fits in here.

I did come across this thread on the NXP community about data corruption with SDRAM and DMA:
 
That was an interesting forum post, especially if you read all the way through it. The fix they recommended was basically to change both BMCRx registers to 0x81.

Doing that
C++:
    // TODO: reference manual page 1364 says "Recommend to set BMCR0 with 0x0 for
    // applications that require restrict sequence of transactions", same on BMCR1
    //SEMC_BMCR0 = SEMC_BMCR0_WQOS(5) | SEMC_BMCR0_WAGE(8) |
    //    SEMC_BMCR0_WSH(0x40) | SEMC_BMCR0_WRWS(0x10);
    //SEMC_BMCR1 = SEMC_BMCR1_WQOS(5) | SEMC_BMCR1_WAGE(8) |
    //    SEMC_BMCR1_WPH(0x60) | SEMC_BMCR1_WRWS(0x24) | SEMC_BMCR1_WBR(0x40);
    SEMC_BMCR0 = 0x81;
    SEMC_BMCR1 = 0x81;

and still reducing the camera clock to 8 MHz (it has issues at 10 MHz) seems to have helped get SDRAM working. Here is a VGA image:

IMG_1160.jpg
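
To see what 0x81 actually changes compared with the commented-out settings, here is a small decode/encode sketch for BMCR0. The bit layout used (WQOS in bits 3:0, WAGE in 7:4, WSH in 15:8, WRWS in 23:16) is my reading of the register description and should be checked against the reference manual; `SemcBmcr0Fields`, `decodeBmcr0` and `encodeBmcr0` are hypothetical helper names.

```cpp
#include <assert.h>
#include <stdint.h>

// Assumed SEMC_BMCR0 field layout: WQOS[3:0], WAGE[7:4], WSH[15:8], WRWS[23:16].
struct SemcBmcr0Fields {
    uint32_t wqos;  // weight of QoS
    uint32_t wage;  // weight of aging
    uint32_t wsh;   // weight of slave hit
    uint32_t wrws;  // weight of read/write switch
};

// Split a raw BMCR0 value into its fields under the assumed layout.
SemcBmcr0Fields decodeBmcr0(uint32_t raw) {
    SemcBmcr0Fields f;
    f.wqos = raw & 0xFu;
    f.wage = (raw >> 4) & 0xFu;
    f.wsh  = (raw >> 8) & 0xFFu;
    f.wrws = (raw >> 16) & 0xFFu;
    return f;
}

// Pack fields back into a raw value; the inverse of decodeBmcr0.
uint32_t encodeBmcr0(const SemcBmcr0Fields &f) {
    assert(f.wqos <= 0xFu && f.wage <= 0xFu && f.wsh <= 0xFFu && f.wrws <= 0xFFu);
    return f.wqos | (f.wage << 4) | (f.wsh << 8) | (f.wrws << 16);
}
```

Under that layout, 0x81 keeps the aging weight at 8, drops the QoS weight from 5 to 1, and zeroes the slave-hit and read/write-switch weights that the original settings used.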


We finally have DMA working properly on the MicroMod and presumably the SDRAM Dev Board (not tested yet). My MicroMod quit while testing the 8080 mode for the ER-TFTM101-1 display. Had to reposition the MCU board in the connector. It's working fine now. Anyway, Another forum user decided to use Ra8876LiteTeensy 8080 library on his TFT0784 display. I was using the SDRAM Dev Board and he was using the MicroMod.
This is the conversation. Using his scope and LA he figured out why DMA was not working properly on the SDRAM and MicroMod boards.
The same code base we are using on the SDRAM board is working on the MicroMod. Only the SDRAM board was giving us issues. That does not mean that it can't be improved - we have to really look at that multibeat code.
 
The SDRAM, eLCDIF and PXP all run on the same bus master, which allows them to communicate without holding up the program, but not sure how DMA fits in here.
LCDIF, CSI, and PXP all have their own masters. Their priorities (against each other) are controlled by the SIM_MAIN NIC registers. The documentation hints that LCDIF has a bunch of cache memory tucked away inside of it, that isn't directly accessible.
The DMA controller and the CPU have their priorities controlled by the SIM_M7 NIC registers.

I have a feeling DMA is only going to work at an acceptable rate with SDRAM when configured to transfer data in "32-byte burst (4x 64-bit beats)" which is similar to how the CPU cache accesses it.

Changing the BMCRx registers would only help if the issues are caused by reads and writes happening to the same locations at the same time, and they were being reordered incorrectly with respect to each other. I still can't see how that would occur without it already being a race condition - if two peripherals are accessing the same memory without any sort of synchronization to control which one goes first, you can't predict which one will be served first.
 