
Thread: T4 Pixel Pipeline Library

  1. #1
    Senior Member vjmuzik's Avatar
    Join Date
    Apr 2017
    Posts
    732

    T4 Pixel Pipeline Library

    I started working on adding support for the Pixel Pipeline built in to the T4.
    So far I have the overlay and output stages working correctly, rotate/flip in the output stage also works.
    I have tested color key and alpha for the overlay stage so I can confirm that it's working.

    As for the input stage I can't figure out exactly why it's not working as it should be.
    It draws it in the correct place with the correct background color, but it's only drawing black where the input buffer should be.
    I haven't tried any color space conversions yet since the input stage isn't fully working.

    You will need to update your Teensy core from here: https://github.com/PaulStoffregen/cores
    The PXP library can be found here: https://github.com/vjmuzik/T4_PXP
    An example for a ST7789 display is included in this library here: https://github.com/vjmuzik/Adafruit_GFX_Buffer

    If you use the example above you can see a black square inside of a blue square.
    The black square should be a copy of the green square that is next to it.

    I'd like to get a second set of eyes to look at this to find where the issue occurs.

  2. #2
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    I just started putting together some pixel pipeline code today--aimed at comparing its rotate function with a software version in terms of speed and complexity. I haven't even started on the overlay stuff, so you're way ahead of me. I'll look over your code and see what I can do to help. I started with the Kinetis SDK examples, but that code goes so far toward hardware abstraction that it makes my head spin to read it. I've been working close to the metal for so many years that multiple layers of #defines and other abstractions that force me to drill through layers of header files to find the bits corresponding to the register bits defined in the reference manual gets discouraging. (WOW! was that ever a run-on-sentence!)

  3. #3
    Member
    Join Date
    Oct 2016
    Location
    Paris
    Posts
    32
    That is very cool! I did not know about the PXP pipeline. How efficient is the flip/rotation of a buffer?

  4. #4
    Senior Member vjmuzik's Avatar
    Join Date
    Apr 2017
    Posts
    732
    I haven’t done too big of a benchmark so I can’t say for certain. I can say that adding a rotation or flip did not increase the number of cycles it took for the PXP to process. So if you are already using the PXP it’s really efficient. I’m going to be using it for its overlay alpha blending and for drawing bitmaps to a frame buffer. Of course this all depends on getting the input stage working, otherwise the overlay is kind of useless.
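
    For anyone new to the overlay stage, here is a minimal software sketch of what a global alpha blend between two RGB565 pixels computes. It is only a reference model for checking results on a PC, not the PXP's actual data path and not part of the T4_PXP library; the name blend565 is made up for this example.

```cpp
#include <cstdint>

// Hypothetical helper (not part of T4_PXP): blend one RGB565 overlay pixel
// over a background pixel. alpha = 255 keeps the overlay, alpha = 0 keeps
// the background.
constexpr uint16_t blend565(uint16_t overlay, uint16_t background, uint8_t alpha) {
  const uint32_t a = alpha;
  const uint32_t inv = 255u - a;
  // Weighted average of each 5-6-5 channel, rounded to nearest.
  const uint32_t r = (((overlay >> 11) & 0x1Fu) * a + ((background >> 11) & 0x1Fu) * inv + 127u) / 255u;
  const uint32_t g = (((overlay >> 5) & 0x3Fu) * a + ((background >> 5) & 0x3Fu) * inv + 127u) / 255u;
  const uint32_t b = ((overlay & 0x1Fu) * a + (background & 0x1Fu) * inv + 127u) / 255u;
  return (uint16_t)((r << 11) | (g << 5) | b);
}
```

    Comparing a few pixels of the PXP output buffer against a model like this is a cheap way to confirm that an alpha setting does what you expect.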

  5. #5
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    I did a simple test to measure the time needed to rotate a QVGA image by 90 degrees using either a programmed rotation or a PXP rotation. Here are the results:
    Code:
    Rotation times as a function of memory source and destination
    
    Src --> dest        PGM (uSec)    PXP (uSec)
    --------------------------------------------
    DTCM   --> DTCM            387           578
    DTCM   --> DMAMEM          721           562
    DTCM   --> EXTMEM         5997          6487
    DMAMEM --> DTCM            547           561
    DMAMEM --> DMAMEM          701           583
    DMAMEM --> EXTMEM         8681          6041
    EXTMEM --> DTCM           5567         12745
    EXTMEM --> DMAMEM         5589         12751
    EXTMEM --> EXTMEM        28630         19204
    The results seem to be mixed--sometimes the programmatic rotate is faster, sometimes the PXP is faster. The major difference is that the CPU is free for other tasks while the PXP does the rotation in the background. Some other things that are interesting:

    * The PXP is slower than a programmed rotation for EXTMEM to DMAMEM and EXTMEM to DTCM. Apparently the PXP has issues with cache misses when EXTMEM is the source.
    * When either source or destination is in EXTMEM, the rotate times are about 8 times higher for both rotation methods--a reflection of the overall slower access to EXTMEM.

    The program has a lot of issues that I need to address before posting it. Among the most annoying is that my optimized algorithm for minimizing cache misses when EXTMEM is the destination results in an inverted display image. I used vjmuzik's library as an example, but simplified my code to use direct calls as opposed to his library functions. My main issue with the library is the degree to which the PXP_Next code is entwined with the other functions. I haven't yet wrapped my head around the advantages and requirements of the PXP_Next functionality, so I opted to go with a simpler implementation.
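
    Since the test program itself was not posted, here is a minimal sketch of what a tiled "programmed" 90-degree rotation could look like. It is not the code that produced the timings above; the function name, the clockwise direction and the 16-pixel tile size are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a w x h image of 16-bit pixels by 90 degrees clockwise.
// dst must hold w*h pixels; the output image is h pixels wide and w high.
// Working in small square tiles keeps the reads and the scattered writes
// close together in memory, which is the usual way to reduce cache misses.
void rotate90cw(const uint16_t *src, uint16_t *dst,
                uint32_t w, uint32_t h, uint32_t tile = 16) {
  assert(tile > 0);
  for (uint32_t y0 = 0; y0 < h; y0 += tile) {
    const uint32_t y1 = (y0 + tile < h) ? y0 + tile : h;
    for (uint32_t x0 = 0; x0 < w; x0 += tile) {
      const uint32_t x1 = (x0 + tile < w) ? x0 + tile : w;
      for (uint32_t y = y0; y < y1; y++) {
        for (uint32_t x = x0; x < x1; x++) {
          // Source pixel (x, y) lands at column (h - 1 - y), row x.
          dst[x * h + (h - 1 - y)] = src[y * w + x];
        }
      }
    }
  }
}
```

    For a QVGA frame the call would be rotate90cw(src, dst, 320, 240). Getting the index expression wrong by one term is an easy way to end up with a mirrored or inverted image, so it is worth testing on a tiny buffer first.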

  6. #6
    Senior Member vjmuzik's Avatar
    Join Date
    Apr 2017
    Posts
    732
    Based on the manual, the only advantage of using PXP_NEXT is that you can queue up another command while one is already running, so as soon as the previous one finishes it'll start the next one. If there isn't one already running, it just runs it immediately. It works by copying the information from the address you write to it into all of the PXP registers, besides a couple that are noted in the manual as not being copied over. Aside from the queue functionality it's theoretically the same as directly writing the registers, just without being able to use the SET, TOGGLE, and CLEAR registers.

  7. #7
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    There seems to be a bunch of mental overhead in the updating of the next_PXP structure for what I see to be a slight gain. I see it as necessary only if you are going to do two different things in succession that require major changes to the PXP setup. Then you could set up two (or more) next_PXP structures and switch between the image processing functions by alternating the structure pointers. If you're just doing one thing, it hardly seems worth the trouble. One thing that might make it worthwhile is to have a number of next_PXP structures saved in flash or on an SD card. You could then read them into memory and have a very short initialization routine that would only have to set up the clock and point to the saved structure.

    It also seems that there may be a simpler way to set up a next_PXP structure:
    Code:
    //    WARNING!!!!   Untested code!!!!
    // function to set next_PXP structure from PXP registers after they are set up
    void Set_next_PXP(struct NEXT_PXP_t *nextptr) {
        volatile uint32_t *srcptr = &PXP_CTRL;    // first PXP register
        uint32_t *dstptr = (uint32_t *)nextptr;   // treat the structure as an array of words
        uint16_t i;
    
        for (i = 0; i < 48; i++) {
           *dstptr++ = *srcptr;
           srcptr += 4;  // PXP saved registers are 16 bytes apart--I hope!
        }
    }
    Once you have a tested PXP setup, you could call the function to fill the next_PXP structure. You could then save it to memory or call a function that would print out the structure contents in a form that could easily be added to source code.

  8. #8
    Senior Member vjmuzik's Avatar
    Join Date
    Apr 2017
    Posts
    732
    For my use case I’m going to be making multiple calls in succession, so I preemptively added support for it. Most of the registers are 16 bytes apart except for the last 3. To me it also doesn’t make sense to manually copy them over in a for loop when the PXP hardware will copy them by itself when you write to the NEXT register. Though I do see the appeal of having NEXT_PXP structures already precomputed as opposed to setting them up on the fly. But again, for my use case everything is going to be done dynamically, so my thought process was more focused on that when I wrote it. It’s hard to explain the idea I have in my head of how I think I want it to work. It’s also necessary to have the cache flushes for your buffers if they are anywhere besides RAM1 before running a PXP operation, since it is DMA.
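
    The cache rule can be reduced to a simple address test. The sketch below assumes that RAM2 (DMAMEM) starts at 0x20200000 on the Teensy 4.x and that everything a buffer might live in above that address (RAM2, EXTMEM) is cached, which matches the address check used in the code later in this thread. The constant and function names are made up for the example.

```cpp
#include <cstdint>

// Assumed start of RAM2/DMAMEM on the Teensy 4.x. RAM1 (DTCM) sits below
// this address and is not cached; EXTMEM sits far above it.
constexpr uint32_t kRam2Start = 0x20200000u;

// True when a buffer at this address lives in cached memory and therefore
// needs cache maintenance around a PXP (DMA) transfer.
constexpr bool needsCacheMaintenance(uint32_t addr) {
  return addr >= kRam2Start;
}
```

    On the Teensy you would then call arm_dcache_flush() on a source buffer before the PXP reads it, and arm_dcache_delete() on a destination buffer before the CPU reads what the PXP wrote, but only when this test returns true.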

  9. #9
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    I also have a couple of projects that will benefit from dynamic changes in the PXP operation with the NEXT_PXP capabilities. For example:

    1. Set up the OV7670 camera to produce a VGA image in the YUV422 format.
    2. Option 1: Pass the YUV422 buffered image to a function to convert it to JPEG for compression.
    Option 2: Pass the YUV422 buffered image to the PXP and convert it to RGB888 for storage in a PC-compatible bitmap file
    3. After step 2 set up the PXP to scale the YUV or VGA RGB888 image to a QVGA RGB565 image for a TFT display.

    I think that step 2 should help eliminate some of the banding I see in gradients in RGB565 images on the PC, as it should give better color resolution---although with lower spatial resolution in the colors.

    I think this type of problem can be approached by having two different PXP setups and saving each to a separate NEXT_PXP structure. You then simply store the appropriate structure address and there is no need to update any of the PXP registers.
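
    As a software reference for the YUV to RGB888 step, a per-pixel conversion could look like the sketch below. It uses the textbook YUV weights scaled by 256 and assumes U and V are centred on 128; the coefficients a given camera mode actually needs may differ, and the names are hypothetical.

```cpp
#include <cstdint>

struct Rgb888 { uint8_t r, g, b; };

// Clamp an intermediate result to the 0..255 range of one colour channel.
constexpr uint8_t clamp8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Convert one YUV sample to RGB888 with integer math:
//   R = Y + 1.140 V,  G = Y - 0.394 U - 0.581 V,  B = Y + 2.032 U
// where U and V are offsets from 128 and the weights are scaled by 256.
constexpr Rgb888 yuvToRgb888(uint8_t y, uint8_t u, uint8_t v) {
  const int du = (int)u - 128;
  const int dv = (int)v - 128;
  const int r = y + (292 * dv) / 256;
  const int g = y - (101 * du + 149 * dv) / 256;
  const int b = y + (520 * du) / 256;
  return Rgb888{clamp8(r), clamp8(g), clamp8(b)};
}
```

    A neutral sample (U = V = 128) comes out as a grey with R = G = B = Y, which makes a quick sanity check for any conversion setup.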

    Yesterday I spent some time thinking about ways to save the NEXT_PXP structure in nonvolatile memory (EEPROM) or on SD. (Or print it out as an initialized array of uint32_t that you can copy and paste into other source code). There seems to be some utility in that, but there is the sticking point that any changes in the program are likely to change the addresses of buffers defined as standard variables. I see two ways to get around that issue:

    1. After you load a PXP setup from nonvolatile memory, you call the functions necessary to set the pointers to the buffers you have defined in your program.

    2. If you are working with VGA images, you are constrained to using EXTMEM, so you could set fixed memory addresses in EXTMEM by declaring buffer pointers as constants--and remember to avoid declaring other variables in EXTMEM that might conflict. You could set four 1MB buffers in locations at the 4, 5, 6, and 7MB boundaries. Each could hold a VGA RGB888 image (about 921KB) or one of the smaller formats. If you are manipulating images in one of the smaller formats, you could do the same thing in a portion of DMAMEM for speed, but you will have to make sure you don't conflict with things like USB buffers or any heap variables.
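
    The arithmetic behind the fixed-slot idea can be sketched as follows. This assumes EXTMEM is mapped at 0x70000000 (the EXTMEM buffer addresses printed later in the thread start there) and that at least 8MB of PSRAM is fitted; the helper names are made up.

```cpp
#include <cstdint>

constexpr uint32_t kExtmemBase = 0x70000000u;  // assumed start of EXTMEM (PSRAM)
constexpr uint32_t kMiB = 1024u * 1024u;

// Bytes needed for a packed image.
constexpr uint32_t imageBytes(uint32_t w, uint32_t h, uint32_t bytesPerPixel) {
  return w * h * bytesPerPixel;
}

// Address of the 1MB buffer slot that starts at the given megabyte boundary.
constexpr uint32_t extmemSlot(uint32_t megabyte) {
  return kExtmemBase + megabyte * kMiB;
}

// The largest format discussed here must fit inside one slot.
static_assert(imageBytes(640, 480, 3) <= kMiB, "VGA RGB888 must fit in 1MB");
```

    On the Teensy a buffer pointer would then be declared along the lines of uint8_t *const vgaBuf = (uint8_t *)extmemSlot(4); and nothing else should be declared EXTMEM in a way that could overlap those slots.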

    PXP setup then simplifies to:
    * Call function to Setup Clock and reset PXP
    * Retrieve a saved PXP_NEXT Structure and store pointer to it in the NEXT register
    * (Optional: update buffer addresses for variables declared in program)

    The advantage of this process is that a new user can write programs to use the stored PXP setups without having to learn the PXP internals. If the canned setups are not exactly what they want, they can then modify the PXP as needed and as their familiarity with the hardware improves.

    I'm hoping that something like conversion from RGB565 to RGB888 would need only about 60 lines of source---including the pre-defined PXP_NEXT structure and about 30 lines of code and function calls. I guess I'll see if I can make that happen with a rotate function before I move on to color space conversions. (Even though rotation is something that is done as well with a 10-line function--unless you have a way to use the foreground time while the PXP does the work.)
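
    For comparison with whatever the PXP produces, a software RGB565 to RGB888 expansion is only a few lines. The sketch below replicates the high bits into the low bits so that full scale maps to 255; whether the PXP fills the low bits the same way is something I have not confirmed, so treat small differences in the low bits as expected. The names are hypothetical.

```cpp
#include <cstdint>

struct Rgb888 { uint8_t r, g, b; };

// Expand one RGB565 pixel to 8 bits per channel. The top bits of each
// channel are copied into the new low bits so 31 (or 63) becomes 255.
constexpr Rgb888 rgb565ToRgb888(uint16_t p) {
  const uint32_t r5 = (p >> 11) & 0x1Fu;
  const uint32_t g6 = (p >> 5) & 0x3Fu;
  const uint32_t b5 = p & 0x1Fu;
  return Rgb888{(uint8_t)((r5 << 3) | (r5 >> 2)),
                (uint8_t)((g6 << 2) | (g6 >> 4)),
                (uint8_t)((b5 << 3) | (b5 >> 2))};
}
```

    Note that expanding RGB565 cannot add colour resolution that was never captured; the banding improvement comes from converting to RGB888 directly from the YUV data.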



    I think I've got enough of a handle on the PXP vs Program rotation stuff that I will next move on to trying some of the color-space conversions.

  10. #10
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    Here are two demo programs that set up the pixel pipeline to do rotations.

    The first program does the actual PXP setup, with OV7670 Camera output rotated and displayed on an ILI9341 TFT module. The PXP setup is also converted to an array of 32 uint32_t entries that can be used with the PXP_NEXT functionality to do a setup of the PXP in other programs by cutting and pasting the array initialization code.

    Code:
    /****************************************************
      Collect camera data into two different sets of
      buffers and set up PXP rotations then print PXP_Next
      data for cut and paste into another program
      M Borgerson 12/5/2020
     ******************************************************** */
    
    #include <OV7670.h>
    #include <ILI9341_t3n.h>
    
    // need to include PXP definitions if not using the
    // latest imxrt.h from GitHub (as of 12/4/2020).
    #ifndef PXP_CTRL_SET
    #include "PXP_Defs.h"
    #endif
    
    //Specify the pins used for Non-SPI functions of display
    #define TFT_CS   10  // AD_B0_02
    #define TFT_DC   9  // AD_B0_03
    #define TFT_RST  8
    
    ILI9341_t3n tft = ILI9341_t3n(TFT_CS, TFT_DC, TFT_RST);
    
    #define OUTBUFIDX 3
    #define PSBUFIDX 12
    const uint16_t  bwidth = 320;
    const uint16_t  bheight = 240;
    
    
    // PXP_Next struct is 32 register settings, but we just save it as an array
    uint32_t PXP_Next_0[32];
    
    
    // define memory buffers in different locations for speed testing
    uint16_t srcdma[320l * 240l]__attribute__ ((aligned (64))) DMAMEM;
    uint16_t dstdma[320l * 240l]__attribute__ ((aligned (64))) DMAMEM;
    
    uint16_t srcext[320l * 240l]__attribute__ ((aligned (64))) EXTMEM;
    uint16_t dstext[320l * 240l]__attribute__ ((aligned (64))) EXTMEM;
    
    // CSI frame buffer 2 isn't used by PXP rotations
    uint16_t fb2[320l * 240l];
    
    uint16_t *srcptr = (uint16_t *)&srcdma;
    uint16_t *dstptr = (uint16_t *)&dstdma;
    
    const char compileTime [] = " Compiled on " __DATE__ " " __TIME__;
    
    const int pinCamReset = 14;
    
    void setup() {
      Serial.begin(9600);
      delay(200);
      Wire.begin();
    
      pinMode(pinCamReset, OUTPUT);
    
    
      digitalWriteFast(pinCamReset, LOW);
      delay(10);
      digitalWriteFast(pinCamReset, HIGH);  // subsequent resets via SCCB
    
    
      if (OV7670.begin(QVGA, (uint8_t *)srcptr, (uint8_t *)&fb2)) {
        Serial.println("OV7670 camera initialized.");
    
      } else {
        Serial.println("Error initializing OV7670");
      }
      // 12 MHz gives 15FPS.  16MHz will do 20FPS, but leaves little time
      // for anything but video display.
      OV7670.SetCamClock(12);
    
      // Start ILI9341
      tft.begin();
      tft.setRotation(0);  // testing external rotation
    
      CMSI();
    
      Serial.println("Initializing PXP");
      delay(50);
      PXP_Init(srcptr, dstptr);
      delay(10);
      Serial.println("Ready");
    }
    
    void loop() {
      // Only 3 choices:  's' System Info  'f' capture single frame 't' run rotate tests
      char ch;
      if (Serial.available()) {
        ch = Serial.read();
        if (ch == 's') CMSI();
        if (ch == 'f') CMGF();
        if (ch == 't') TestSpeeds();
      }
    
    }
    
    elapsedMicros rutime;
    
    void CMSI(void) {
      Serial.printf("\n\nOV7670 Camera and ILI9341  QVGA Test 3 %s\n", compileTime);
      OV7670.ShowCamConfig();
    }
    
    void ShowPXP(void) {
      Serial.printf("PXP_OUT_BUF:%08X \n", PXP_OUT_BUF);
      Serial.printf("PXP_PS_BUF:%08X \n", PXP_PS_BUF);
    }
    
    // save the PXP registers to the PXP_Next array
    void SavePXPNext(uint32_t pxnptr[]) {
      uint16_t i;
      volatile uint32_t *pxptr = &PXP_CTRL;  // set first address
      uint32_t *nxptr = &pxnptr[0];
      for (i = 0; i < 29; i++) { // first 29 are at 16-byte intervals
        *nxptr++ = *pxptr;
        pxptr += 4; // skips ahead 16 bytes at input
      }
      // the last three entries are oddly spaced
      *nxptr++ = PXP_POWER;
      *nxptr++ = PXP_NEXT;
      *nxptr = PXP_PORTER_DUFF_CTRL;
    }
    
    // print out a PXP_Next array in a format that can be pasted into
    // source code to get the same PXP behavior
    void PrintPXPNext(const char *arrayname, uint32_t pxnptr[]) {
      uint16_t i;
      Serial.printf("\nuint32_t %s[32] = {", arrayname);
      for (i = 0; i < 31; i++) {
        if ((i % 6) == 0) Serial.println();
        Serial.printf("0x%08X, ", *pxnptr++);
      }
      // last one can't have a comma and needs bracket and semicolon
      Serial.printf("0x%08X };", *pxnptr);
      Serial.println();
    }
    
    bool PXP_Done(void) {
      return PXP_STAT & PXP_STAT_IRQ;
    }
    
    // updated with PXP definitions from new imxrt.h
    // and using constants bwidth = 320, bheight = 240 for QVGA
    void PXP_Init(uint16_t *inbuff, uint16_t *outbuff) {
      // turn on the PXP Clock
      CCM_CCGR2 |= CCM_CCGR2_PXP(CCM_CCGR_ON);
    
      PXP_CTRL_SET = PXP_CTRL_SFTRST; //Reset the PXP
      PXP_CTRL_CLR = PXP_CTRL_SFTRST | PXP_CTRL_CLKGATE; //Clear reset and gate
      delay(10);
    
      PXP_CTRL_SET = PXP_CTRL_ROTATE(3) | PXP_CTRL_BLOCK_SIZE;  // Set Rotation 3 block size 16x16
    
      PXP_CSC1_COEF0 |= PXP_COEF0_BYPASS; 
    
      PXP_OUT_CTRL_SET = 0x0E;  // specify RGB565 output
      PXP_OUT_BUF = (volatile void *)outbuff;
      PXP_OUT_PITCH = bheight * 2; // output is 240 pixels by 2 bytes after rotation
      PXP_OUT_LRC = 0;
      PXP_OUT_LRC = ((bwidth) << 16) | (bheight);
    
      PXP_OUT_AS_ULC = 0xFFFFFFFF;  // not using the alpha surface
      PXP_OUT_AS_LRC = 0;
    
      PXP_OUT_PS_ULC = 0;  // start processing at upper left 0,0
      PXP_OUT_PS_LRC = ((bwidth) << 16) | (bheight); // same as output
    
      PXP_PS_CTRL_SET = 0x0E;  // PS buffer format is RGB565
      PXP_PS_BUF = (volatile void *)inbuff;
      PXP_PS_UBUF = 0;  // not using YUV
      PXP_PS_VBUF = 0;  // not using YUV
      PXP_PS_PITCH = 640; // input is 320 pixels by 2 bytes wide before rotation
      PXP_PS_SCALE = 0x10001000; // 1:1 scaling (0x1.000)
      PXP_PS_CLRKEYLOW_0 = 0xFFFFFF;  // this disables color keying
      PXP_PS_CLRKEYHIGH_0 = 0x0;  //  this disables color keying
    
      PXP_CTRL_SET = PXP_CTRL_IRQ_ENABLE;
      // we don't actually use the interrupt but need to enable the bits
      // in the PXP_STAT register
    }
    
    void PXP_Rotate(void) {
      uint32_t etime;
      SavePXPNext((uint32_t*)&PXP_Next_0);
      PXP_STAT_CLR = PXP_STAT;  // clears all flags
      PXP_CTRL_SET =  PXP_CTRL_ENABLE;  // start the PXP
      rutime = 0;
      // wait until rotation finished
      while (!PXP_Done()) {};
      PXP_CTRL_CLR =  PXP_CTRL_ENABLE;  // stop the PXP
      etime = rutime;
      Serial.printf("PXP Rotation took %lu microseconds\n", etime);
    }
    
    // Capture, rotate and display a single frame from OV7670
    void CMGF(void) {
      uint16_t readyframe;
      uint32_t imagesize;
      imagesize = OV7670.ImageSize();
      OV7670.begin(QVGA, (uint8_t *)PXP_PS_BUF, (uint8_t *)&fb2);
      OV7670.ClearFrameReady();
      do {
        readyframe = OV7670.FrameReady();
      } while (readyframe != 1 ); // wait until  frame 1 just completed
    
      // Buffers at or above 0x20200000 (DMAMEM, EXTMEM) are cached
      if ((uint32_t)PXP_PS_BUF >= 0x20200000) { // makes camera dma data visible
        arm_dcache_delete((void *)PXP_PS_BUF, imagesize);
      }
      if ((uint32_t)PXP_OUT_BUF >= 0x20200000) { // needed when doing DMA into memory
        arm_dcache_delete((void *)PXP_OUT_BUF, imagesize);
      }
      PXP_Rotate();
      Serial.printf("Output buffer at %p\n", PXP_OUT_BUF);
      if ((uint32_t)PXP_OUT_BUF >= 0x20200000) {
        arm_dcache_flush((void *)PXP_OUT_BUF, imagesize); // needed when doing DMA out of memory
      }
      tft.writeRect(0, 0, tft.width(), tft.height(), (uint16_t *)PXP_OUT_BUF);
    
    }
    
    void TestFrame(uint16_t *psrc, uint16_t *pdst) {
      // set up frame 1 to store in psrc, frame 2 to fb2
      uint32_t imagesize;
      imagesize = OV7670.ImageSize();
      PXP_PS_BUF = (void *)psrc;
      PXP_OUT_BUF = (void *)pdst; // set the PXP OUT buffer pointer
      // >= so that a buffer starting exactly at 0x20200000 is also flushed
      if ((uint32_t)psrc >= 0x20200000) arm_dcache_flush((void*)psrc, imagesize);
      if ((uint32_t)pdst >= 0x20200000) arm_dcache_flush((void*)pdst, imagesize);
    
      CMGF();
    }
    
    // try various combinations of source and destination memory
    // to compare the rotation speeds
    void TestSpeeds(void) {
      Serial.println("\nDMAMEM to DMAMEM");
      TestFrame((uint16_t *)&srcdma, (uint16_t *)&dstdma);
      PrintPXPNext("Rot_DMA_DMA ", (uint32_t*)&PXP_Next_0);
    
      Serial.println("\nDMAMEM to EXTMEM");
      TestFrame((uint16_t *)&srcdma, (uint16_t *)&dstext);
      PrintPXPNext("Rot_DMA_EXT ", (uint32_t*)&PXP_Next_0);
    
      Serial.println("\nEXTMEM to DMAMEM");
      TestFrame((uint16_t *)&srcext, (uint16_t *)&dstdma);
      PrintPXPNext("Rot_EXT_DMA ", (uint32_t*)&PXP_Next_0);
    
      Serial.println("\nEXTMEM to EXTMEM");
      TestFrame((uint16_t *)&srcext, (uint16_t *)&dstext);
      PrintPXPNext("Rot_EXT_EXT ", (uint32_t*)&PXP_Next_0);
    }
    The second program allows you to use the same PXP setup without going through setup code or modifying any registers to switch between two setups. The program does require that you specify the input and output buffers for the rotation--in case the new program should have the arrays in different places than the original program.

    Code:
    /****************************************************
      Collect camera data and rotate using restored PXP
      setups copied from output of SavePXP program
      m. borgerson   12/5/2020
     ******************************************************** */
    #include <OV7670.h>
    #include <ILI9341_t3n.h>
    
    // need to include PXP definitions if not using the
    // latest imxrt.h from GitHub (as of 12/4/2020).
    #ifndef PXP_CTRL_SET
    #include "PXP_Defs.h"
    #endif
    
    //Specify the pins used for Non-SPI functions of display
    #define TFT_CS   10  // AD_B0_02
    #define TFT_DC   9  // AD_B0_03
    #define TFT_RST  8
    
    ILI9341_t3n tft = ILI9341_t3n(TFT_CS, TFT_DC, TFT_RST);
    
    // we are using QVGA settings for camera
    const uint16_t  bwidth = 320;
    const uint16_t  bheight = 240;
    
    // define memory buffers in different locations for speed testing
    uint16_t srcdma[320l * 240l]__attribute__ ((aligned (64))) DMAMEM;
    uint16_t dstdma[320l * 240l]__attribute__ ((aligned (64))) DMAMEM;
    
    uint16_t srcext[320l * 240l]__attribute__ ((aligned (64))) EXTMEM;
    uint16_t dstext[320l * 240l]__attribute__ ((aligned (64))) EXTMEM;
    
    #define OUTBUFIDX 3
    #define PSBUFIDX 12
    // PXP_Next struct is 32 register settings, but we just save it as an array
    
    uint32_t Rot_EXT_EXT [32] = {
      0x00800302, 0x00000000, 0x0000000E, 0x70000000, 0x00000000, 0x000001E0,
      0x014000F0, 0x00000000, 0x014000F0, 0x3FFF3FFF, 0x00000000, 0x0000000E,
      0x70025800, 0x00000000, 0x00000000, 0x00000280, 0x00000000, 0x10001000,
      0x00000000, 0x00FFFFFF, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
      0x00FFFFFF, 0x00000000, 0x44000000, 0x01230208, 0x079B076C, 0x00000000,
      0x00000000, 0x00000000
    };
    
    uint32_t Rot_DMA_DMA [32] = {
      0x00800302, 0x00000000, 0x0000000E, 0x20200000, 0x00000000, 0x000001E0,
      0x014000F0, 0x00000000, 0x014000F0, 0x3FFF3FFF, 0x00000000, 0x0000000E,
      0x20225800, 0x00000000, 0x00000000, 0x00000280, 0x00000000, 0x10001000,
      0x00000000, 0x00FFFFFF, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
      0x00FFFFFF, 0x00000000, 0x44000000, 0x01230208, 0x079B076C, 0x00000000,
      0x00000000, 0x00000000
    };
    
    // CSI frame buffer 2 isn't used by PXP rotations
    uint16_t fb2[320l * 240l];
    
    const char compileTime [] = " Compiled on " __DATE__ " " __TIME__;
    
    const int pinCamReset = 14;
    
    void setup() {
      // use this program's buffer; the saved array still holds the old program's address at this point
      uint8_t *srcptr = (uint8_t *)srcdma;
      Serial.begin(9600);
      delay(200);
      Wire.begin();
    
      pinMode(pinCamReset, OUTPUT);
      digitalWriteFast(pinCamReset, LOW);
      delay(10);
      digitalWriteFast(pinCamReset, HIGH);  // subsequent resets via SCCB
    
      if (OV7670.begin(QVGA, (uint8_t *)srcptr, (uint8_t *)&fb2)) {
        Serial.println("OV7670 camera initialized.");
    
      } else {
        Serial.println("Error initializing OV7670");
      }
      // 12 MHz gives 15FPS.  16MHz will do 20FPS, but leaves little time
      // for anything but video display.
      OV7670.SetCamClock(12);
      // Start ILI9341
      tft.begin();
      tft.setRotation(0);  // testing external rotation
    
      CMSI();
      Serial.println("Adjusting PXP buffer addresses.");
      delay(10);
      SetNextBuffers(Rot_EXT_EXT, srcext, dstext);
      SetNextBuffers(Rot_DMA_DMA, srcdma, dstdma);
    
      Serial.println("Initializing PXP");
      delay(50);
      PXP_Start((uint32_t)&Rot_DMA_DMA);
      delay(10);
      Serial.println("Ready");
    }
    
    void loop() {
      // Only 3 choices:  's' System Info  'f' capture single frame 't' run rotate tests
      char ch;
      if (Serial.available()) {
        ch = Serial.read();
        if (ch == 's') CMSI();
        if (ch == 'f') CMGF();
        if (ch == 't') TestSpeeds();
      }
    }
    
    // adjust the source and destination buffers in the saved PXP settings to match
    // the variables declared in this program
    void SetNextBuffers(uint32_t pxpnxt[], uint16_t src[], uint16_t dst[]) {
      pxpnxt[PSBUFIDX] = (uint32_t)src;
      pxpnxt[OUTBUFIDX] = (uint32_t)dst;
    }
    
    void CMSI(void) {
      Serial.printf("\n\nOV7670 Camera and ILI9341  QVGA Test 3 %s\n", compileTime);
      OV7670.ShowCamConfig();
    }
    
    // Capture, rotate and display a single frame from OV7670
    void CMGF(void) {
      uint16_t readyframe;
      uint32_t imagesize;
      imagesize = OV7670.ImageSize();
      OV7670.begin(QVGA, (uint8_t *)PXP_PS_BUF, (uint8_t *)&fb2);
      OV7670.ClearFrameReady();
      do {
        readyframe = OV7670.FrameReady();
      } while (readyframe != 1 ); // wait until  frame 1 just completed
    
      // Buffers at or above 0x20200000 (DMAMEM, EXTMEM) are cached
      if ((uint32_t)PXP_PS_BUF >= 0x20200000) { // makes camera dma data visible
        arm_dcache_delete((void *)PXP_PS_BUF, imagesize);
      }
      if ((uint32_t)PXP_OUT_BUF >= 0x20200000) { // needed when doing DMA into memory
        arm_dcache_delete((void *)PXP_OUT_BUF, imagesize);
      }
      PXP_Rotate();
      Serial.printf("Output buffer at %p\n", PXP_OUT_BUF);
      if ((uint32_t)PXP_OUT_BUF >= 0x20200000) {
        arm_dcache_flush((void *)PXP_OUT_BUF, imagesize); // needed when doing DMA out of memory
      }
      tft.writeRect(0, 0, tft.width(), tft.height(), (uint16_t *)PXP_OUT_BUF);
    
    }
    
    // Use two different PXP_NEXT settings to compare the rotation speeds
    void TestSpeeds(void) {
      Serial.println("\nDMAMEM to DMAMEM");
      PXP_NEXT = (uint32_t)&Rot_DMA_DMA;
      PXP_CTRL_CLR = PXP_CTRL_ENABLE;  // stop automatic execution on PXP_NEXT write
      CMGF();// get, rotate and display a frame
      delay(1000);
      Serial.println("\nEXTMEM to EXTMEM");
      PXP_NEXT = (uint32_t)&Rot_EXT_EXT;
      PXP_CTRL_CLR = PXP_CTRL_ENABLE;  // stop automatic execution on PXP_NEXT write
      CMGF();  // get, rotate and display a frame
    }
    
    bool PXP_Done(void) {
      return PXP_STAT & PXP_STAT_IRQ;
    }
    
    // Restart PXP with settings from a PXP_Next array
    void PXP_Start(uint32_t pxnptr) {
      // turn on clock to PXP
      CCM_CCGR2 |= CCM_CCGR2_PXP(CCM_CCGR_ON);
    
      PXP_CTRL_SET = PXP_CTRL_SFTRST; //Reset
      PXP_CTRL_CLR = PXP_CTRL_SFTRST | PXP_CTRL_CLKGATE; //Clear reset and gate
      delay(10);
      // storing pointer in PXP_NEXT causes PXP to restore settings
      PXP_NEXT = pxnptr;
    }
    
    elapsedMicros rutime;
    void PXP_Rotate(void) {
      uint32_t etime;
      PXP_STAT_CLR = PXP_STAT;  // clears all flags
      PXP_CTRL_SET =  PXP_CTRL_ENABLE;  // start the PXP
      rutime = 0;  // reset the timing counter
      // wait until rotation finished
      while (!PXP_Done()) { };
    
      PXP_CTRL_CLR =  PXP_CTRL_ENABLE;  // stop the PXP
      etime = rutime;
      Serial.printf("PXP Rotation took %lu microseconds\n", etime);
    }
    The primary advantage of using the PXP_Next arrays for setup is that you can switch from one PXP setup to another with minimal code. These rotation examples are excessively simple in that the PXP rotation can be bypassed by just setting the TFT display rotation to 3 instead of zero. In my case, this example code was a first step toward simplifying setup and restore for more complex operations like scaling and color space conversions.

  11. #11
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,762
    Very cool signs of utility and progress. Using another hardware capability for 'background' processing keeping the loop() free to loop()!

  12. #12
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    8,230
    Would be good to have the needed pins for the rest of the video hardware.. let's hope the MM has them instead of 8 serial..
    Last edited by Frank B; 12-05-2020 at 11:20 PM.

  13. #13
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    I've made good progress on one of my goals: having the PXP convert incoming YUV422 image buffers from the OV7670 camera to QVGA RGB565 buffers for display on an ILI9341 board. I have a test program that accepts YUV422 QQVGA, QVGA, and VGA images, scales them as necessary to display at QVGA size, then converts from YUV to RGB565 for the ILI9341. I'll post some demo code when I clean up the debugging cruft and simplify the user interface.
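
    The scale setting for each input size can be computed rather than looked up. The sketch below assumes PXP_PS_SCALE holds source size divided by output size as fixed point with 12 fractional bits, Y in the upper half-word and X in the lower, which is consistent with the 0x10001000 (1:1) value used earlier in this thread and the 20002000 shown in the register dump for a VGA source. The function names are made up, and any extra decimation settings the PXP may need for large reductions are not covered.

```cpp
#include <cstdint>

// Scale factor = source size / output size with 12 fractional bits
// (0x1000 means 1:1). dstSize must be non-zero.
constexpr uint32_t scaleField(uint32_t srcSize, uint32_t dstSize) {
  return (srcSize * 4096u) / dstSize;
}

// Pack the Y scale into the upper half-word and the X scale into the lower.
constexpr uint32_t psScaleValue(uint32_t srcW, uint32_t srcH,
                                uint32_t dstW, uint32_t dstH) {
  return (scaleField(srcH, dstH) << 16) | scaleField(srcW, dstW);
}
```

    With these assumptions a QQVGA source shown at QVGA size gives 0x08000800, i.e. a factor of 0.5 in each direction.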

    One of the key tools in the development was getting a readable display of the PXP setup. I ended up with this:
    Code:
    CTRL:         00800002       STAT:         00200001
    OUT_CTRL:     0000000E       OUT_BUF:      70000000    OUT_BUF2: 00000000
    OUT_PITCH:         640       OUT_LRC:       320,240
    OUT_PS_ULC:      0,  0       OUT_PS_LRC:    320,240
    OUT_AS_ULC:   16383,16383    OUT_AS_LRC:      0,  0
    
    PS_CTRL:      00000032       PS_BUF:       70300000
    PS_UBUF:      00000000       PS_VBUF:      00000000
    PS_PITCH:         1280       PS_BKGND:     00000080
    PS_SCALE:     20002000       PS_OFFSET:    00000000
    PS_CLRKEYLOW: 00FFFFFF       PS_CLRKEYLHI: 00000000
    
    AS_CTRL:      00000000       AS_BUF:       00000000    AS_PITCH:      0
    AS_CLRKEYLOW: 00FFFFFF       AS_CLRKEYLHI: 00000000
    
    CSC1_COEF0:   84030000       CSC1_COEF1:   01230208    CSC1_COEF2: 076B079C
    
    POWER:        00000000       NEXT:         00000000
    PORTER_DUFF:  00000000
    Old Fart Digression: Why is the default font for code display not a monospaced font? If I want to have my columns line up nicely, I have to change the font to Courier New.

    Here is the code for the display, which I hope will help out others working with the Pixel Pipeline:

    Code:
    
    // Dump the main PXP registers to the serial monitor in aligned columns.
    // Corner registers pack X in the upper 16 bits and Y in the lower 16 bits.
    void ShowPXP(void) {
      // Control/status and output buffer setup
      Serial.printf("CTRL:         %08X       STAT:         %08X\n", PXP_CTRL, PXP_STAT);
      Serial.printf("OUT_CTRL:     %08X       OUT_BUF:      %08X    OUT_BUF2: %08X\n", PXP_OUT_CTRL,PXP_OUT_BUF,PXP_OUT_BUF2);
      Serial.printf("OUT_PITCH:    %8lu       OUT_LRC:       %3u,%3u\n", PXP_OUT_PITCH, PXP_OUT_LRC>>16, PXP_OUT_LRC&0xFFFF);
    
      // Where the PS (input) and AS (overlay) windows land in the output frame
      Serial.printf("OUT_PS_ULC:    %3u,%3u       OUT_PS_LRC:    %3u,%3u\n", PXP_OUT_PS_ULC>>16, PXP_OUT_PS_ULC&0xFFFF,
                                                                   PXP_OUT_PS_LRC>>16, PXP_OUT_PS_LRC&0xFFFF);
      Serial.printf("OUT_AS_ULC:   %3u,%3u    OUT_AS_LRC:    %3u,%3u\n", PXP_OUT_AS_ULC>>16, PXP_OUT_AS_ULC&0xFFFF,
                                                                   PXP_OUT_AS_LRC>>16, PXP_OUT_AS_LRC&0xFFFF);
      Serial.println();
      // PS (input) stage
      Serial.printf("PS_CTRL:      %08X       PS_BUF:       %08X\n", PXP_PS_CTRL,PXP_PS_BUF);
      Serial.printf("PS_UBUF:      %08X       PS_VBUF:      %08X\n", PXP_PS_UBUF, PXP_PS_VBUF);
      Serial.printf("PS_PITCH:     %8lu       PS_BKGND:     %08X\n", PXP_PS_PITCH, PXP_PS_BACKGROUND_0);
      Serial.printf("PS_SCALE:     %08X       PS_OFFSET:    %08X\n", PXP_PS_SCALE,PXP_PS_OFFSET);
      Serial.printf("PS_CLRKEYLOW: %08X       PS_CLRKEYLHI: %08X\n", PXP_PS_CLRKEYLOW_0,PXP_PS_CLRKEYHIGH_0);
      Serial.println();
      // AS (overlay) stage
      Serial.printf("AS_CTRL:      %08X       AS_BUF:       %08X    AS_PITCH: %6u\n", PXP_AS_CTRL,PXP_AS_BUF, PXP_AS_PITCH & 0xFFFF);
      Serial.printf("AS_CLRKEYLOW: %08X       AS_CLRKEYLHI: %08X\n", PXP_AS_CLRKEYLOW_0,PXP_AS_CLRKEYHIGH_0);
      Serial.println();
      // Color space conversion coefficients
      Serial.printf("CSC1_COEF0:   %08X       CSC1_COEF1:   %08X    CSC1_COEF2: %08X\n", 
                                                                    PXP_CSC1_COEF0,PXP_CSC1_COEF1,PXP_CSC1_COEF2);
      Serial.println();
      Serial.printf("POWER:        %08X       NEXT:         %08X\n", PXP_POWER,PXP_NEXT);
      Serial.printf("PORTER_DUFF:  %08X\n", PXP_PORTER_DUFF_CTRL);
    }
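
For anyone reading the dump by hand, here is a minimal host-side sketch of the two decodings involved. `unpackCorner` mirrors the `>>16` / `&0xFFFF` split that ShowPXP() uses. `scaleX` and `scaleY` rest on an assumption that is not stated in this thread: that each PS_SCALE field is fixed point with 12 fractional bits, so 0x1000 means 1.0. The helper names are made up for illustration and are not part of the Teensy core or the T4_PXP library.

```cpp
#include <cstdint>

// Hypothetical helper type, used only for this illustration.
struct Coord {
  uint16_t x;
  uint16_t y;
};

// Split a packed corner register the same way ShowPXP() does:
// X in the upper 16 bits, Y in the lower 16 bits.
inline Coord unpackCorner(uint32_t reg) {
  Coord c;
  c.x = static_cast<uint16_t>(reg >> 16);
  c.y = static_cast<uint16_t>(reg & 0xFFFF);
  return c;
}

// ASSUMPTION: each scale field is 15 bits wide with 12 fractional bits
// (0x1000 == 1.0), X in the low half-word and Y in the high half-word.
inline float scaleX(uint32_t psScale) {
  return static_cast<float>(psScale & 0x7FFF) / 4096.0f;
}

inline float scaleY(uint32_t psScale) {
  return static_cast<float>((psScale >> 16) & 0x7FFF) / 4096.0f;
}
```

With the values from the dump, `unpackCorner(0x3FFF3FFF)` gives 16383,16383 (the OUT_AS_ULC line) and, under the stated assumption, `scaleX(0x20002000)` gives 2.0, which would match the 2:1 ratio between PS_PITCH (1280) and OUT_PITCH (640).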
    
    

  14. #14
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    A follow-up note with some PXP conversion timings:

    VGA YUV422 to QVGA RGB565: 70.43 ms
    QVGA YUV422 to QVGA RGB565: 22.15 ms
    QQVGA YUV422 to QVGA RGB565: 10.13 ms
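
One way to compare these timings is input pixels handled per millisecond. The sketch below is plain arithmetic on the numbers above, assuming the usual frame sizes of 640x480 (VGA), 320x240 (QVGA), and 160x120 (QQVGA). The function name is made up for illustration.

```cpp
#include <cstdint>

// Input pixels consumed per millisecond for one conversion.
// width and height describe the source frame, ms is the measured time.
constexpr double inputPixelsPerMs(uint32_t width, uint32_t height, double ms) {
  return static_cast<double>(width) * static_cast<double>(height) / ms;
}
```

Plugging in the three measurements gives roughly 4360, 3470, and 1900 input pixels per millisecond, so the time does not scale linearly with the size of the source frame.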

    Another anomaly that annoys me: When I switch the OV7670 from full-frame VGA mode to QVGA, the downsampling process seems to switch the output from UYVY to VYUY, and I have to change the YUV output bit order settings to get the colors right.

    There are other issues with setting the camera and PXP YUV-to-RGB conversion coefficients. I can understand how the guys who write the camera and display drivers for smartphones have to spend hundreds or thousands of hours tweaking registers to get the photo and display quality we expect from our phones.

  15. #15
    Senior Member vjmuzik's Avatar
    Join Date
    Apr 2017
    Posts
    732
    Now that my test boards are in, I loaded this up again. Thanks to your ShowPXP function, I was able to determine that the manual has a slight error in the PXP_NEXT description, and that error caused the issues for me. Specifically, this line:
    Code:
    All registers will be reloaded with the exception of the following: STAT, CSCCOEFn, NEXT.
    As it turns out, CSCCOEFn is in fact reloaded. Since I never set it in my next structure, the bypass bit wasn't staying turned on like I thought it would be, and I would never have thought to check for it either.
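
A minimal sketch of that fix, under two assumptions that are not confirmed in this thread: that bit 30 of CSC1_COEF0 is the bypass bit, and that the next-frame block can be represented by the shortened, hypothetical struct below. The real block covers many more registers, in the order the manual gives. The point is only that the bypass bit has to be written into the next-frame data explicitly, because a zero-initialized coefficient word turns the conversion back on when it is reloaded.

```cpp
#include <cstdint>

// ASSUMPTION: bit 30 of CSC1_COEF0 is the CSC1 bypass bit.
constexpr uint32_t kCsc1Bypass = 1u << 30;

// Hypothetical, shortened stand-in for a next-frame register block.
struct NextFrameSketch {
  uint32_t ctrl;
  uint32_t psCtrl;
  uint32_t csc1Coef0;
  uint32_t csc1Coef1;
  uint32_t csc1Coef2;
};

// Build a next-frame block for a job that should skip color space conversion.
// The coefficient words are copied from the live values, and the bypass bit is
// set in the block itself rather than relying on the live register keeping it.
inline NextFrameSketch makeBypassFrame(uint32_t ctrl, uint32_t psCtrl,
                                       uint32_t liveCoef0, uint32_t liveCoef1,
                                       uint32_t liveCoef2) {
  NextFrameSketch n = {};
  n.ctrl = ctrl;
  n.psCtrl = psCtrl;
  n.csc1Coef0 = liveCoef0 | kCsc1Bypass;
  n.csc1Coef1 = liveCoef1;
  n.csc1Coef2 = liveCoef2;
  return n;
}
```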

    Since everything is now working correctly on my end, I was finally able to test some dynamic color space conversions with alpha overlays, and I'm happy to report that it is working wonderfully for my application.

  16. #16
    Senior Member
    Join Date
    Feb 2018
    Location
    Corvallis, OR
    Posts
    331
    Nice to see you have things working for your application. I didn't run into the glitch with CSCCOEFn restoration, as my scheme for saving and restoring a Next_PXP array pulls all the data from the PXP and stores it in the array with a simple for() loop and three transfers for the items at the end of the array. I haven't yet tackled alpha overlays, but I can see a possible usage for time stamping captured camera frames by writing the time stamp into a small bitmap and using overlays to show it in the captured frame. Another possible project is to display a video or camera frame capture in a window on my ILI9341 while putting some touch-screen start and stop buttons in the foreground.
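
The for() loop part of a save scheme like that might look something like the sketch below. It is not the code from this post. It assumes the PXP register values sit 16 bytes apart (each value word followed by its SET, CLR, and TOG words), and it takes the register block as a pointer so it can be tried on a host against a plain array. The function and constant names are made up, and the three extra transfers are left out because the post does not say which registers they cover.

```cpp
#include <cstddef>
#include <cstdint>

// ASSUMPTION: each register occupies four 32-bit words (value, SET, CLR, TOG),
// so consecutive register values are 16 bytes apart.
constexpr size_t kWordsPerRegister = 4;

// Copy `count` register values from a strided register block into a packed array.
inline void snapshotRegisters(const volatile uint32_t* base, uint32_t* out,
                              size_t count) {
  for (size_t i = 0; i < count; ++i) {
    out[i] = base[i * kWordsPerRegister];
  }
}
```

Because every register is captured from the hardware, a snapshot like this carries the current CSC coefficient words along with everything else, which would explain why the bypass setting survives a reload.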

    Right now, my PXP operations are so tightly tied to my OV7670 camera setup that I need to make sure that the camera code is stable before posting more PXP code.

    I've started cleaning up the camera code and have posted it on GitHub at https://github.com/mjborgerson/OV7670

    I need to go through all the examples in the library folder to make sure that they work with the latest OV7670 object, and there is a lot of work to be done on the PDF documentation file.

  17. #17
    Senior Member vjmuzik's Avatar
    Join Date
    Apr 2017
    Posts
    732
    I’m using it almost like a texture map for drawing “realistic” controls on screen and drawing the color separately underneath it. With the color space conversion it makes it easy to use transparent bitmaps in ARGB8888 format to accomplish that. This way I don’t have to store a separate bitmap for every color I want to use since that will waste memory. Though it can push the limits of what the PXP can handle if I draw too many controls too quickly.
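
As a software illustration of the idea (not the PXP's exact arithmetic, and not code from the T4_PXP library), the sketch below composites one ARGB8888 texture pixel over an opaque base color. The function names are made up. It shows why a single transparent bitmap can be reused over any base color: where alpha is 0 the base color shows through unchanged, and where alpha is 255 the texture wins.

```cpp
#include <cstdint>

// Blend one 8-bit channel: alpha 255 keeps the texture value,
// alpha 0 keeps the base value. The +127 rounds to nearest.
inline uint8_t blendChannel(uint8_t tex, uint8_t base, uint8_t alpha) {
  return static_cast<uint8_t>((tex * alpha + base * (255 - alpha) + 127) / 255);
}

// Composite an ARGB8888 texture pixel over an opaque base color (0x00RRGGBB).
// The result is returned as 0x00RRGGBB.
inline uint32_t blendOver(uint32_t argb, uint32_t baseRgb) {
  const uint8_t a = static_cast<uint8_t>(argb >> 24);
  const uint8_t r = blendChannel((argb >> 16) & 0xFF, (baseRgb >> 16) & 0xFF, a);
  const uint8_t g = blendChannel((argb >> 8) & 0xFF, (baseRgb >> 8) & 0xFF, a);
  const uint8_t b = blendChannel(argb & 0xFF, baseRgb & 0xFF, a);
  return (static_cast<uint32_t>(r) << 16) | (static_cast<uint32_t>(g) << 8) | b;
}
```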
