Call to arms | Teensy + SDRAM = true

From memory, the PD chip is only meant to power the second USB-C port (not the board) and it only does that if you bridge the middle pad (which is connected to the second USB-C port's V+ lines) to the PD's output. If you want the USB-C port to not do PD / only supply 5V (from the board's 5V supply) then you need to bridge the middle pad to the other side.
 
Changing the BMCRx registers would only help if the issues are caused by reads and writes happening to the same locations at the same time, and they were being reordered incorrectly with respect to each other. I still can't see how that would occur without it already being a race condition - if two peripherals are accessing the same memory without any sort of synchronization to control which one goes first, you can't predict which one will be served first.
Replying to myself here: I have (very painfully) discovered how this reordering can be a problem!

In my USB host code, I use this very specific sequence to add new transfers to a queue head:
Code:
  usb_transfer* p = dummy.release(); // copy dummy ptr
  dummy.reset(head);                 // head becomes new dummy
  head->token.status = 0x40;         // set inactive+halted
  *p = *head;                        // copy over dummy (becomes new head)
  cache_flush(p);                    // ensure new qTD data is updated _before_ activating
  cache_flush(head);
  cache_sync();                      // ensure cache flush has completed
  p->token.status = 0x80;            // set old dummy/new head active
  cache_flush(p);                    // flush (triggers overlay if QH is idle)

The tricky thing that may not be obvious here is that the USB Host Controller is constantly checking the memory pointed to by p to see if the status is set to active. That's why cache_flush() is called on p twice; once when the contents are initially copied from head (followed by a dsb instruction to ensure it completes) then again after token.status is set to active, to ensure that the status is the final modification made to the physical memory. But from what I was seeing, the host controller was reading the status as active when the other fields were still set to incorrect values, as if they had not been updated before the status.

The fix, as has been mentioned earlier in the thread, was to set the SEMC_BMCR0 and SEMC_BMCR1 registers to 0x81 to prevent reads/writes to the SDRAM from being reordered by the SEMC controller. This ensures the cache line evictions occur in the correct order.
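
For anyone wanting to try the same thing, here is a minimal sketch of that register write. The helper name and the constant name are made up for illustration; only the register names and the 0x81 value come from this thread, and the individual bits are not decoded here. The registers are passed in by reference so the helper can also be exercised on a PC with ordinary variables standing in for the hardware.

```cpp
#include <cassert>   // only needed for host-side checks
#include <cstdint>

// Value quoted in this thread for both registers (bit meanings not decoded here).
constexpr uint32_t kBmcrNoReorder = 0x81;

// Hypothetical helper: writes the thread's value into both SEMC bus-master
// control registers. Taking references keeps it testable off-target.
inline void semcDisableReordering(volatile uint32_t &bmcr0, volatile uint32_t &bmcr1)
{
  bmcr0 = kBmcrNoReorder;
  bmcr1 = kBmcrNoReorder;
}
```

On the board this would presumably be called once after SDRAM initialisation as semcDisableReordering(SEMC_BMCR0, SEMC_BMCR1), assuming the core headers expose those names as writable registers.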
 
Dog bones boards arrived today.

I've soldered the bootloader chip on and on pressing the button I'm getting 3 red blinking lights.

According to the PJRC store page, that means it's not recognising the flash chip. I've realised I used a W25Q128JVPQ instead of Dogbone's recommended W25Q128JVPIM. Is this chip incompatible?
 
Yeah, that flash chip won't work. See the "supported chips" section here: https://www.pjrc.com/store/ic_mkl02_t4.html
 
I've got Dogbone's board up and running with the replacement flash chip. It works great. I've been running some 24-bit LCD screen tests.

I've been bashing my head against the wall trying to output using the LCDIF without much success. In theory it should all be working. I'm trying to use VSYNC mode with a set transfer size that will output my buffer and then stop. This is using 24-bit. I'm not worried about VSYNC timings or anything like that; I don't even have a screen connected. At this point I'm just trying to generate the clock signal and output the buffer once.

On the scope I've got a 25 MHz signal on the clock pin B0_00. However, I cannot get it to output anything on the data pins. I'm stuck in the while loop waiting for RUN to clear. Doing some register printouts, I'm seeing that the LCD FIFOs are all empty, yet the APB clock is running.

I've just created a buffer in RAM1 to narrow down testing. The code here should run standalone (it should even run on a Teensy 4.1; I just need it to get past the while loop). I'm not sure why the transfers are not starting.

I've been following the order from the data sheet (page 1877):
35.5.1 Write Modes
The following initialization steps are common to all eLCDIF write modes of operation before entering any particular mode.
Initialization steps:
1. Configure the external I/Os to correctly interface the external display, when required.
2. Start the Display Clock (pix_clk) clock and set the appropriate frequency by programming the registers in CCM.
3. Start the Bus Clock (apb_clk) and set the appropriate frequency by programming the registers in CCM.
4. Bring the eLCDIF out of soft reset and disable the clock gate bit.
5. Set the transfer mode of operation to bus master. The LCDIF_CTRL[MASTER] bit determines the transfer mode selected. To select bus master mode, set LCDIF_CTRL[MASTER] = 1 (to select APBDMA, set LCDIF_CTRL[MASTER] = 0).
6. Set the LCDIF_CTRL[INPUT_DATA_SWIZZLE] according to the endianness of the eLCDIF controller. Also, set the LCDIF_CTRL[DATA_SHIFT_DIR] and LCDIF_CTRL[SHIFT_NUM_BITS] if it is required to shift the data left or right before it is output.
7. Set the LCDIF_CTRL[WORD_LENGTH] field appropriately: 0 = 16-bit input, 1 = 8-bit input, 2 = 18-bit input, 3 = 24/32-bit input. Also, select the correct 16/18/24 bit data format with the corresponding fields in LCDIF_CTRL register.
8. Set the LCDIF_CTRL1[BYTE_PACKING_FORMAT] field according to the input frame.
9. Set the LCDIF_CTRL[LCD_DATABUS_WIDTH] appropriately: 0 = 16-bit output, 1 = 8-bit output, 2 = 18-bit output, 3 = 24/32-bit output.
10. Enable the necessary IRQs.

Code:
#include <Arduino.h>
const uint32_t frameWidth = 100;
const uint32_t frameHeight = 100;
const uint32_t frameSize = frameWidth * frameHeight;

uint32_t bufferTest[frameSize];

void setup()
{
  FillBuffer();
  SetLCDMUX();
  SetLCDCLK25();
  InitLCD();
  OutputBuffer();
}

void loop()
{
 
}

void FillBuffer()
{
  for (uint32_t i = 0; i < frameSize; i++)
  {
    if (i & 1)
    {
      bufferTest[i] = 0x55555555;
    }
    else
    {
      bufferTest[i] = 0xAAAAAAAA;
    }
  }
}

void SetLCDMUX()
{
  delay(1);
  uint32_t pinSettings = 0x10B0;

  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_00 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_00 & ~0x7) | 0x0;  // ALT0 = LCD_CLK
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_00 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_01 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_01 & ~0x7) | 0x0;  // ALT0 = LCD_ENABLE
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_01 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_02 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_02 & ~0x7) | 0x0;  // ALT0 = LCD_HSYNC
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_02 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_03 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_03 & ~0x7) | 0x0;  // ALT0 = LCD_VSYNC
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_03 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_04 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_04 & ~0x7) | 0x0;  // ALT0 = LCD_DATA00
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_04 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_05 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_05 & ~0x7) | 0x0;  // ALT0 = LCD_DATA01
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_05 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_06 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_06 & ~0x7) | 0x0;  // ALT0 = LCD_DATA02
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_06 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_07 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_07 & ~0x7) | 0x0;  // ALT0 = LCD_DATA03
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_07 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_08 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_08 & ~0x7) | 0x0;  // ALT0 = LCD_DATA04
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_08 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_09 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_09 & ~0x7) | 0x0;  // ALT0 = LCD_DATA05
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_09 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_10 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_10 & ~0x7) | 0x0;  // ALT0 = LCD_DATA06
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_10 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_11 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_11 & ~0x7) | 0x0;  // ALT0 = LCD_DATA07
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_11 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_12 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_12 & ~0x7) | 0x0;  // ALT0 = LCD_DATA08
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_12 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_13 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_13 & ~0x7) | 0x0;  // ALT0 = LCD_DATA09
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_13 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_14 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_14 & ~0x7) | 0x0;  // ALT0 = LCD_DATA10
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_14 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_15 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B0_15 & ~0x7) | 0x0;  // ALT0 = LCD_DATA11
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B0_15 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_00 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_00 & ~0x7) | 0x0;  // ALT0 = LCD_DATA12
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_00 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_01 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_01 & ~0x7) | 0x0;  // ALT0 = LCD_DATA13
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_01 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_02 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_02 & ~0x7) | 0x0;  // ALT0 = LCD_DATA14
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_02 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_03 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_03 & ~0x7) | 0x0;  // ALT0 = LCD_DATA15
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_03 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_04 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_04 & ~0x7) | 0x0;  // ALT0 = LCD_DATA16
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_04 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_05 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_05 & ~0x7) | 0x0;  // ALT0 = LCD_DATA17
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_05 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_06 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_06 & ~0x7) | 0x0;  // ALT0 = LCD_DATA18
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_06 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_07 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_07 & ~0x7) | 0x0;  // ALT0 = LCD_DATA19
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_07 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_08 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_08 & ~0x7) | 0x0;  // ALT0 = LCD_DATA20
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_08 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_09 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_09 & ~0x7) | 0x0;  // ALT0 = LCD_DATA21
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_09 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_10 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_10 & ~0x7) | 0x0;  // ALT0 = LCD_DATA22
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_10 = pinSettings;
  IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_11 = (IOMUXC_SW_MUX_CTL_PAD_GPIO_B1_11 & ~0x7) | 0x0;  // ALT0 = LCD_DATA23
  IOMUXC_SW_PAD_CTL_PAD_GPIO_B1_11 = pinSettings;
}

void SetLCDCLK25()
{
  uint32_t divSelect = 33;
  uint32_t numerator = 8;
  uint32_t denominater = 24;
  uint32_t finalDivide = 3;  //25mhz
  //uint32_t finalDivide = 2;  //33.3mhz
  //uint32_t finalDivide = 1;  //50mhz
  //uint32_t finalDivide = 0;  //100mhz

  // Switch the video PLL to bypass, enable it, and set the divider.
  CCM_ANALOG_PLL_VIDEO = CCM_ANALOG_PLL_VIDEO_BYPASS | CCM_ANALOG_PLL_VIDEO_ENABLE | CCM_ANALOG_PLL_VIDEO_DIV_SELECT(divSelect);
 
  //set POST_DIV_SELECT
  CCM_ANALOG_PLL_VIDEO_CLR = CCM_ANALOG_PLL_VIDEO_POST_DIV_SELECT(3); // clear bits
  CCM_ANALOG_PLL_VIDEO_SET = CCM_ANALOG_PLL_VIDEO_POST_DIV_SELECT(0); // e.g. "1" => /2
 
  // Clear the misc2 post-divider bits.
  CCM_ANALOG_MISC2_CLR = CCM_ANALOG_MISC2_VIDEO_DIV(3);
  CCM_ANALOG_MISC2_SET = CCM_ANALOG_MISC2_VIDEO_DIV(2);

  // Set the PLL numerator and denominator.
  CCM_ANALOG_PLL_VIDEO_NUM = numerator;
  CCM_ANALOG_PLL_VIDEO_DENOM = denominater;

  // Ensure PLL is powered.
  CCM_ANALOG_PLL_VIDEO_CLR = CCM_ANALOG_PLL_VIDEO_POWERDOWN;
 
  // Wait for PLL lock.
  while (!(CCM_ANALOG_PLL_VIDEO & CCM_ANALOG_PLL_VIDEO_LOCK)) { }
 
  // Deactivate bypass.
  CCM_ANALOG_PLL_VIDEO_CLR = CCM_ANALOG_PLL_VIDEO_BYPASS;

  // Gate off LCD clocks.
  CCM_CCGR2 &= ~CCM_CCGR2_LCD(CCM_CCGR_ON);
  CCM_CCGR3 &= ~CCM_CCGR3_LCDIF_PIX(CCM_CCGR_ON);

  // Set LCDIF clock source
  uint32_t r = CCM_CSCDR2;
  r &= ~(CCM_CSCDR2_LCDIF_PRE_CLK_SEL(7) | CCM_CSCDR2_LCDIF_PRED(7));
  r |= CCM_CSCDR2_LCDIF_PRE_CLK_SEL(2) | CCM_CSCDR2_LCDIF_PRED(finalDivide);
  CCM_CSCDR2 = r;

  // Set LCDIF post-divider to 1.
  CCM_CBCMR &= ~CCM_CBCMR_LCDIF_PODF(7);

  // Gate on clocks
  CCM_CCGR2 |= CCM_CCGR2_LCD(CCM_CCGR_ON);
  CCM_CCGR3 |= CCM_CCGR3_LCDIF_PIX(CCM_CCGR_ON);
}

void InitLCD()
{
//Step 4 Soft Reset
  // Ungate the clock.
  LCDIF_CTRL_CLR = LCDIF_CTRL_CLKGATE;
  while (LCDIF_CTRL & LCDIF_CTRL_CLKGATE) {}  // Wait for clock gating to clear.

  // Soft Reset.
  LCDIF_CTRL_SET = LCDIF_CTRL_SFTRST;
  while (!(LCDIF_CTRL & LCDIF_CTRL_SFTRST)) {}  // Wait for reset flag.

  // Clear the reset and ungate the clock.
  LCDIF_CTRL_CLR = LCDIF_CTRL_SFTRST | LCDIF_CTRL_CLKGATE;

//Step 5 Bus Master using LCD DMA (Not APBDMA)
  LCDIF_CTRL |= LCDIF_CTRL_MASTER;

//Step 6 Set Swizzle.  Clear and set all to 0.  Bits 15-14 and 13-12
  LCDIF_CTRL &= ~((0x3U << 14) | (0x3U << 12));

//Step 7 Set Word Length to 3 (24-bit mode) Bits 9–8
  LCDIF_CTRL &= ~(0x3U << 8);
  LCDIF_CTRL |= (3U << 8);

//Step 8 Byte packing format 24/32bit Bits 19-16 to 0x7
  LCDIF_CTRL1 &= ~(0xFU << 16);
  LCDIF_CTRL1 |= (0x7U << 16);

//Step 9 Databus Width 24 bit Bits 11–10
  LCDIF_CTRL &= ~(0x3U << 10);
  LCDIF_CTRL |= (3U << 10);

//Disable dotclock mode. Use Vsync mode.  Bit 17
  LCDIF_CTRL &= ~(1U << 17);

//Bypass Count set to 0.  Set Run bit to zero after transfer
  LCDIF_CTRL &= ~(1U << 19);

//Enable recover on underflow Bit 24
  LCDIF_CTRL1 |= (1U << 24);

//Clear FIFO Set Bit 21
  LCDIF_CTRL1 |= (1U << 21);

//Set the transfer size
  LCDIF_TRANSFER_COUNT = LCDIF_TRANSFER_COUNT_V_COUNT(frameHeight) | LCDIF_TRANSFER_COUNT_H_COUNT(frameWidth);

//Set Timing Generators
  LCDIF_VDCTRL0 = LCDIF_VDCTRL0_ENABLE_PRESENT | LCDIF_VDCTRL0_VSYNC_PERIOD_UNIT |  // Count VSYNC period in lines
                  LCDIF_VDCTRL0_VSYNC_PULSE_WIDTH_UNIT |                            // Count VSYNC pulse width in lines
                  LCDIF_VDCTRL0_VSYNC_PULSE_WIDTH(1);                               // 1 line VSYNC pulse

  LCDIF_VDCTRL1 = frameHeight;  //Frameheight.  No blanking

  LCDIF_VDCTRL2 = LCDIF_VDCTRL2_HSYNC_PULSE_WIDTH(1) | LCDIF_VDCTRL2_HSYNC_PERIOD(frameWidth);

  // VDCTRL3: Low Wait Counts.  We are only using Vsync
  LCDIF_VDCTRL3 = LCDIF_VDCTRL3_MUX_SYNC_SIGNALS | LCDIF_VDCTRL3_VSYNC_ONLY | LCDIF_VDCTRL3_HORIZONTAL_WAIT_CNT(0) | LCDIF_VDCTRL3_VERTICAL_WAIT_CNT(0);

  LCDIF_VDCTRL4 = LCDIF_VDCTRL4_SYNC_SIGNALS_ON | LCDIF_VDCTRL4_DOTCLK_H_VALID_DATA_CNT(frameWidth);
}

bool hasPrintedDebug = false;
void OutputBuffer()
{
  // Update the buffer pointers
  LCDIF_CUR_BUF = (uint32_t)bufferTest;
  LCDIF_NEXT_BUF = (uint32_t)bufferTest;

  // Start the LCDIF to output the frame.
  LCDIF_CTRL |= LCDIF_CTRL_RUN;



  //Poll until the RUN bit is cleared - frameSize has been set
  //Run bit should automatically be cleared afterwards
  while (LCDIF_CTRL & LCDIF_CTRL_RUN)
  {
    if(!hasPrintedDebug)
    {
      hasPrintedDebug = true;
      PrintCTRL();
      PrintClockStatus();
      PrintLCDIFStatus();
      PrintBusMasterError();
    }
  }
  Serial.println("Frame output complete.");
}


//Debug Register Printing
void PrintCTRL()  //Page 1882
{
  Serial.println("      DUMPING LCDIF_CTRL");
  uint32_t ctrl = LCDIF_CTRL;
  Serial.print("LCDIF_CTRL = 0x");
  Serial.println(ctrl, HEX);

  // Bit 0: RUN
  Serial.print("RUN (bit 0): ");
  Serial.println((ctrl & (1U << 0)) ? "SET" : "CLEAR");

  // Bit 1: Data_FORMAT_24_BIT
  Serial.print("Data_FORMAT_24_BIT (bit 1): ");
  Serial.println((ctrl & (1U << 1)) ? "DROP_UPPER_2_BITS_PER_BYTE" : "ALL_24_BITS_VALID");

  // Bit 5: MASTER
  Serial.print("MASTER (bit 5): ");
  Serial.println((ctrl & (1U << 5)) ? "ENABLED" : "DISABLED");

  // Bit 6: ENABLE_PXP_HANDSHAKE
  Serial.print("ENABLE_PXP_HANDSHAKE (bit 6): ");
  Serial.println((ctrl & (1U << 6)) ? "ENABLED" : "DISABLED");

  // Bits 9-8: WORD_LENGTH
  uint8_t word_length = (ctrl >> 8) & 0x3;
  Serial.print("WORD_LENGTH (bits 9-8): ");
  switch (word_length)
  {
    case 0: Serial.println("16_BIT"); break;
    case 1: Serial.println("8_BIT"); break;
    case 2: Serial.println("18_BIT"); break;
    case 3: Serial.println("24_BIT"); break;
    default: Serial.println("Unknown");
  }

  // Bits 11-10: LCD_DATABUS_WIDTH
  uint8_t databus_width = (ctrl >> 10) & 0x3;
  Serial.print("LCD_DATABUS_WIDTH (bits 11-10): ");
  switch (databus_width)
  {
    case 0: Serial.println("16_BIT"); break;
    case 1: Serial.println("8_BIT"); break;
    case 2: Serial.println("18_BIT"); break;
    case 3: Serial.println("24_BIT"); break;
    default: Serial.println("Unknown");
  }

  // Bits 13-12: CSC_DATA_SWIZZLE
  uint8_t csc_data_swizzle = (ctrl >> 12) & 0x3;
  Serial.print("CSC_DATA_SWIZZLE (bits 13-12): ");
  switch (csc_data_swizzle)
  {
    case 0: Serial.println("NO_SWAP"); break;
    case 1: Serial.println("BIG_ENDIAN_SWAP (or SWAP_ALL_BYTES)"); break;
    case 2: Serial.println("HWD_SWAP"); break;
    case 3: Serial.println("HWD_BYTE_SWAP"); break;
    default: Serial.println("Unknown");
  }

  // Bits 15-14: INPUT_DATA_SWIZZLE
  uint8_t input_data_swizzle = (ctrl >> 14) & 0x3;
  Serial.print("INPUT_DATA_SWIZZLE (bits 15-14): ");
  switch (input_data_swizzle)
  {
    case 0: Serial.println("NO_SWAP"); break;
    case 1: Serial.println("BIG_ENDIAN_SWAP (or SWAP_ALL_BYTES)"); break;
    case 2: Serial.println("HWD_SWAP"); break;
    case 3: Serial.println("HWD_BYTE_SWAP"); break;
    default: Serial.println("Unknown");
  }

  // Bit 17: DOTCLK_MODE
  Serial.print("DOTCLK_MODE (bit 17): ");
  Serial.println((ctrl & (1U << 17)) ? "ENABLED" : "DISABLED (VSYNC mode)");

  // Bit 19: BYPASS_COUNT
  Serial.print("BYPASS_COUNT (bit 19): ");
  Serial.println((ctrl & (1U << 19)) ? "SET (indefinite operation)" : "CLEAR (stop after transfer)");

  // Bits 25-21: SHIFT_NUM_BITS (5 bits)
  uint8_t shift_num_bits = (ctrl >> 21) & 0x1F;
  Serial.print("SHIFT_NUM_BITS (bits 25-21): ");
  Serial.println(shift_num_bits);

  // Bit 26: DATA_SHIFT_DIR
  Serial.print("DATA_SHIFT_DIR (bit 26): ");
  Serial.println((ctrl & (1U << 26)) ? "TXDATA_SHIFT_RIGHT" : "TXDATA_SHIFT_LEFT");

  // Bit 30: CLKGATE
  Serial.print("CLKGATE (bit 30): ");
  Serial.println((ctrl & (1U << 30)) ? "GATED" : "UNGATED");

  // Bit 31: SFTRST
  Serial.print("SFTRST (bit 31): ");
  Serial.println((ctrl & (1U << 31)) ? "ASSERTED (in reset)" : "DEASSERTED (normal operation)");
}

void PrintClockStatus()   //Page 1080 & 1082
{
  Serial.println("      DUMPING CLOCK STATUS");
  uint32_t ccgr2 = CCM_CCGR2;
  uint32_t ccgr3 = CCM_CCGR3;
 
  //CCGR2 Bits 28-29
  uint32_t apbField = (ccgr2 >> 28) & 0x3;
  Serial.print("APB clock (CCGR2, CG14): ");
  if (apbField == 0x3)
  {
    Serial.println("ENABLED");
  }
  else {
    Serial.print("DISABLED/Partially enabled (0x");
    Serial.println(apbField, HEX);
  }
 
  ////CCGR3 Bits 10-11
  uint32_t pixField = (ccgr3 >> 10) & 0x3;
  Serial.print("Pixel clock (CCGR3, CG5): ");
  if (pixField == 0x3)
  {
    Serial.println("ENABLED");
  }
  else
  {
    Serial.print("DISABLED/Partially enabled (0x");
    Serial.println(pixField, HEX);
  }
}

void PrintLCDIFStatus() //Page 1896
{
  Serial.println("      DUMPING LCDIF STATUS");
  uint32_t stat = (*(volatile uint32_t *)(0x402B81B0));
  Serial.print("LCDIF_STAT = 0x");
  Serial.println(stat, HEX);

  // Bit 31: PRESENT
  Serial.print("PRESENT (bit 31): ");
  Serial.println((stat & (1U << 31)) ? "Yes (LCDIF present)" : "No (LCDIF not present)");

  // Bit 30: DMA_REQ
  Serial.print("DMA_REQ (bit 30): ");
  Serial.println((stat & (1U << 30)) ? "Active" : "Inactive");

  // Bit 29: LFIFO_FULL
  Serial.print("LFIFO_FULL (bit 29): ");
  Serial.println((stat & (1U << 29)) ? "Yes" : "No");

  // Bit 28: LFIFO_EMPTY
  Serial.print("LFIFO_EMPTY (bit 28): ");
  Serial.println((stat & (1U << 28)) ? "Yes" : "No");

  // Bit 27: TXFIFO_FULL
  Serial.print("TXFIFO_FULL (bit 27): ");
  Serial.println((stat & (1U << 27)) ? "Yes" : "No");

  // Bit 26: TXFIFO_EMPTY
  Serial.print("TXFIFO_EMPTY (bit 26): ");
  Serial.println((stat & (1U << 26)) ? "Yes" : "No");

  // LFIFO_COUNT: Bits 8–0 (9 bits)
  uint32_t lfifo_count = stat & 0x1FF;
  Serial.print("LFIFO_COUNT (bits 8-0): ");
  Serial.println(lfifo_count);
}

void PrintBusMasterError()   //Page 1895
{
  Serial.println("      DUMPING BUS MASTER ERROR STATUS");
  uint32_t errorAddr = LCDIF_BM_ERROR_STAT;
 
  Serial.print("LCDIF BM Error Status Register = 0x");
  Serial.println(errorAddr, HEX);
 
  if (errorAddr != 0)
  {
    Serial.print("Bus master error occurred at virtual address: 0x");
    Serial.println(errorAddr, HEX);
  } else
  {
    Serial.println("No bus master error detected.");
  }
}
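
As a cross-check on the numbers in SetLCDCLK25(), here is a small host-side sketch of the clock arithmetic. It assumes the usual fractional-PLL relation (24 MHz reference times DIV_SELECT + NUM/DENOM) and takes the 100 MHz figure at the LCDIF pre-divider input straight from the finalDivide comments in the code; the post-dividers between the PLL and that point are not modelled. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reference oscillator feeding the video PLL, in MHz (assumed 24 MHz crystal).
constexpr double kOscMHz = 24.0;

// Fractional PLL frequency: Fosc * (DIV_SELECT + NUM / DENOM).
inline double videoPllMHz(uint32_t divSelect, uint32_t numerator, uint32_t denominator)
{
  assert(denominator != 0);
  return kOscMHz * (divSelect + static_cast<double>(numerator) / denominator);
}

// Pixel clock as implied by the finalDivide comments: the pre-divider
// divides its input by (finalDivide + 1).
inline double pixelClockMHz(double preDividerInputMHz, uint32_t finalDivide)
{
  return preDividerInputMHz / (finalDivide + 1);
}
```

With divSelect = 33, numerator = 8 and denominator = 24 the PLL term works out to 800 MHz, and pixelClockMHz(100.0, 3) gives the 25 MHz seen on the scope.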

This is the debugging output I've got on the main registers:
DUMPING LCDIF_CTRL
LCDIF_CTRL = 0xF21
RUN (bit 0): SET
Data_FORMAT_24_BIT (bit 1): ALL_24_BITS_VALID
MASTER (bit 5): ENABLED
ENABLE_PXP_HANDSHAKE (bit 6): DISABLED
WORD_LENGTH (bits 9-8): 24_BIT
LCD_DATABUS_WIDTH (bits 11-10): 24_BIT
CSC_DATA_SWIZZLE (bits 13-12): NO_SWAP
INPUT_DATA_SWIZZLE (bits 15-14): NO_SWAP
DOTCLK_MODE (bit 17): DISABLED (VSYNC mode)
BYPASS_COUNT (bit 19): CLEAR (stop after transfer)
SHIFT_NUM_BITS (bits 25-21): 0
DATA_SHIFT_DIR (bit 26): TXDATA_SHIFT_LEFT
CLKGATE (bit 30): UNGATED
SFTRST (bit 31): DEASSERTED (normal operation)
DUMPING CLOCK STATUS
APB clock (CCGR2, CG14): ENABLED
Pixel clock (CCGR3, CG5): ENABLED
DUMPING LCDIF STATUS
LCDIF_STAT = 0x94000000
PRESENT (bit 31): Yes (LCDIF present)
DMA_REQ (bit 30): Inactive
LFIFO_FULL (bit 29): No
LFIFO_EMPTY (bit 28): Yes
TXFIFO_FULL (bit 27): No
TXFIFO_EMPTY (bit 26): Yes
LFIFO_COUNT (bits 8-0): 0
DUMPING BUS MASTER ERROR STATUS
LCDIF BM Error Status Register = 0x0
No bus master error detected.
 
Oh man, if only I knew that was there 3 days ago. I'll have a good look at that, thanks.

Is it using the LCD module to output?

Currently I'm going to use it on a smart(er) display, an SSD1963 (which has back-buffer memory), but eventually on a dumb display.

Using the LCDIF module for output, you can set DMA to output just the buffer size then stop. You can also do burst fetches into the LCD FIFO buffer.

Because I'll be using a smart display with the pixel clock running on the WR pin, I was hoping to set up the output, mux B0_00 to the pixel clock, start the transfer, and then, on an interrupt when it finishes, mux B0_00 back to GPIO.

Then just ignore any VSYNC and HSYNC lines.
 
You can also do burst fetches into the lcd FIFO buffer
I don't think you can. You can't trust the manual here - the "APBDMA only" mode that it talks about practically doesn't exist because the FIFOs have no external entry point. The only way to get data into them is using the dotclock mode, pointing LCDIF_CUR_BUF to an existing framebuffer in memory. Then the eLCDIF acts as a bus master and pulls in data as it needs it.

Frankly if you don't need the VGA signals (VSYNC/HSYNC/etc) you're better off using FlexIO anyway.
 
So I'm still bashing my head against a wall here, but I think you're right. And even if we could get the APBDMA mode working, it only supports 16-bit.

Is 16 the maximum burst length for SDRAM reads?

I'm intent on trying to get data out without using FlexIO. My theory is that there would be fewer collisions on the SDRAM bus: if we are using the LCD FIFOs (76 x 256 LFIFO and 38 x 16 TXFIFO), we could do burst reads to fill them up. If I ran the output data at a low enough frequency (say 25 MHz), that would leave time between these burst reads to write to the back buffer.

Having a front and a back buffer in SDRAM is good; however, you're fighting contention on it.
 

I've tried mixing and matching many different register settings and could not get MPU or VSYNC-only mode to work.

Have you got any libraries or code for FlexIO and DMA?
 
Thanks Jmarsh.

There's another 2 weeks of trying to learn FlexIO and DMA. The setup looks complicated.

I might just get a bare display from BuyDisplay and try to drive it with LCDIF pixel clock mode, seeing as I've wrapped my head around that side of things. I will need to set up a constant-current source for the LED backlight.
 
If you're using a smart display with a built-in controller then take a look at my ILI9488 FlexIO/DMA driver for some inspiration.

I've recently purchased some 7” displays from BuyDisplay that are controlled by an RA8889, and I am writing a driver to do basic frame buffer/image transfers between a DB5 and the RA8889.
My only concern here is that at higher bus speeds the RA8889 seems to assert a wait pin when busy, so I might have to build in a mechanism to stop the DMA transfer and restart it from where it stopped.
 

That's awesome! Thanks for that, it's definitely some inspiration and help.
 
I think it's been mentioned before but the "C" variants of the IMXRT1062 (different from the "D" variants used on the Teensys produced by PJRC) are only nominally rated for 500MHz rather than 600MHz. This means they not only require a higher voltage setting for 600MHz (and it's technically overclocking which risks shortening the lifespan) but also for 528MHz. Using the regular voltage setting from Teensyduino can trigger some very strange crashes due to the CPU misbehaving.

Code to configure the voltage setting which takes the MCU model/speed rating into account (by reading one of the fuses) is available here.
 

My board, built from Dogbone's schematic, uses the D version. I had to change a lot of parts to ones JLCPCB had in stock, as well as change a lot of the components to standard parts to avoid getting stung with extra setup fees.

I do wish there were some ground pins along the B0 and B1 rails. If I recall correctly, the closest ground pin to these pins is along the power rail on the right-hand side of the board.
 
Has anyone tested using eDMA to copy from SDRAM to SDRAM to see how much quicker it is than a memcpy?

For example, copying an image from SDRAM into a back buffer in SDRAM.
 
I haven't done that, I purposefully avoid having SDRAM as both the source and destination for large copies (whether it's done using the CPU or DMA).
 
I had no luck getting SDRAM-to-SDRAM copies at a decent speed with DMA.

The fastest I could get transfers, copying from one buffer in SDRAM to another in SDRAM with the SDRAM overclocked to 221 MHz, was with an assembly routine: load registers 3 to 10 sequentially from SDRAM buffer 1, then store them sequentially to SDRAM buffer 2.

A buffer of 384000 int16_t took 6200 microseconds to read and copy.

A buffer of 384000 int32_t took 12440 microseconds to read and copy.

I think throughput is about 120 megabytes a second.
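
For reference, a quick sketch of the arithmetic behind that figure. The function name is made up for illustration, and it counts each copied byte once (payload moved), so the combined read-plus-write traffic on the SDRAM bus would be twice this. Bytes per microsecond is numerically the same as megabytes (10^6 bytes) per second.

```cpp
#include <cassert>
#include <cstddef>

// Copy throughput in MB/s (10^6 bytes per second), counting each copied
// byte once. Bytes per microsecond equals megabytes per second.
inline double copyThroughputMBps(std::size_t elements, std::size_t bytesPerElement, double micros)
{
  assert(micros > 0.0);
  return static_cast<double>(elements * bytesPerElement) / micros;
}
```

Plugging in the two measurements, copyThroughputMBps(384000, 2, 6200.0) and copyThroughputMBps(384000, 4, 12440.0) both come out a little under 124 MB/s, which agrees with the rough 120 MB/s estimate.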

I tried various other methods: burst reading into a temporary array before writing out, assembly reading into the FPU registers (there are a lot more of them), unrolled loops, etc.


It's the reading from SDRAM that slows it down.
 
IMG_3936.jpeg


Seems in line with the application notes.
 