DMA > 32K Bytes

Status
Not open for further replies.

mborgerson

I am working on a project to generate gray-scale NTSC video using a 3-bit R2R ladder DAC on GPIOD bits 0 to 2. My plan is to use a large buffer of uint8_t to hold all the sync, porch, black and video levels and send them to GPIOD_PDOR one byte at a time.

I have two problems:

1. The buffer is 110,040 bytes long. It looks like the DMAChannel object limits me to 32KB or less, even when there is no minor loop and the transfer size is 1 byte. My first and second readings of the Kinetis manual and kinetis.h seem to show that much larger transfers are possible when the minor loop is disabled:
#define DMA_TCD_NBYTES_MLOFFNO_NBYTES(n) ((uint32_t)((n) & 0x3FFFFFFF)) // NBytes transfer count when minor loop disabled

With the t3.6 having 256K for my video frame buffer, it would be nice if dmachannel functions would support larger transfers when there is no minor loop. I will try direct operations on the TCD fields to see if they will work.
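
To make the two limits concrete, here is a small host-side sketch (plain C++, not Teensy code). It assumes the 32K limit comes from the 15-bit CITER/BITER major-loop count and that the single-minor-loop limit is the 30-bit mask in the #define quoted above. The helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed field widths: 15-bit CITER/BITER count (no channel linking),
// 30-bit NBYTES as in the DMA_TCD_NBYTES_MLOFFNO_NBYTES mask quoted above.
constexpr uint32_t kMaxMajorIterations = 0x7FFF;      // 32,767
constexpr uint32_t kMaxNbytesMloffno   = 0x3FFFFFFF;  // 1,073,741,823

// Same masking as DMA_TCD_NBYTES_MLOFFNO_NBYTES(n).
constexpr uint32_t nbytesMloffno(uint32_t n) { return n & kMaxNbytesMloffno; }

// True if the buffer can go out as one major loop of 1-byte minor loops.
constexpr bool fitsAsMajorLoop(uint32_t bytes) {
  return bytes <= kMaxMajorIterations;
}

// True if the buffer can go out as a single minor loop (BITER = CITER = 1).
constexpr bool fitsAsSingleMinorLoop(uint32_t bytes) {
  return nbytesMloffno(bytes) == bytes;
}
```

Under those assumptions the 110,040-byte frame fails the first check and passes the second, which matches the reading of the manual above.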

2. Even with smaller test transfers, I get no output when I try to use the PDB to time transfers to my desired 6MHz clock rate.
Does using the PDB for periodic transfers require setting the TRIG bit (bit 6) of the channel configuration register? The triggerAtHardwareEvent method seems to set only the enable bit (bit 7).
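
For reference, the arithmetic behind the 6MHz rate can be sketched on the host as below. This assumes the PDB is clocked from a 60MHz bus clock with prescaler and multiplier both 1, and that the counter runs from 0 to MOD so one period is MOD + 1 clocks. The helper names are hypothetical, and this says nothing about whether the TRIG bit is needed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: modulus value for a target trigger rate.
// Assumes one PDB period is (MOD + 1) input clocks.
inline uint32_t pdbModulus(uint32_t pdbClockHz, uint32_t targetHz) {
  assert(targetHz > 0 && targetHz <= pdbClockHz);
  return pdbClockHz / targetHz - 1;  // integer division truncates
}

// Rate actually produced by a given modulus under the same assumption.
inline uint32_t pdbRateHz(uint32_t pdbClockHz, uint32_t mod) {
  return pdbClockHz / (mod + 1);
}
```

With a 60MHz clock the division is exact, so the modulus and the resulting rate round-trip. For clocks that are not a multiple of the target, the truncation makes the real rate slightly higher than requested.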


I have had some success with this approach sending the output to DAC0, but the frame buffer is twice as large because the DAC needs 16-bit inputs. In addition, the DAC isn't really fast enough to respond at 6MHz, so there is a lot of horizontal blur in the output.
 
I am not sure about 2.

But you might take a look at using DMASettings and chaining them.

For example, an ILI9341 display frame is 320*240*2=153600 bytes, and in the ili9341_t3n library I have the option to use DMA to output it to SPI...
Note: there are a few other things going on, but here is some of my DMA init code for the T3.6...
Code:
#if defined(__MK66FX1M0__) 
	// T3.6

	// BUGBUG:: check for -1 as wont work on SPI2 on T3.5
//	uint16_t *fbtft_start_dma_addr = _pfbtft;

	
	//Serial.printf("CWW: %d %d %d\n", CBALLOC, SCREEN_DMA_NUM_SETTINGS, count_words_write);
	// Now lets setup DMA access to this memory... 
	_dmasettings[0].sourceBuffer(&_pfbtft[1], (COUNT_WORDS_WRITE-1)*2);
	_dmasettings[0].destination(_pkinetisk_spi->PUSHR);

	// Hack to reset the destination to only output 2 bytes.
	_dmasettings[0].TCD->ATTR_DST = 1;
	_dmasettings[0].replaceSettingsOnCompletion(_dmasettings[1]);

	_dmasettings[1].sourceBuffer(&_pfbtft[COUNT_WORDS_WRITE], COUNT_WORDS_WRITE*2);
	_dmasettings[1].destination(_pkinetisk_spi->PUSHR);
	_dmasettings[1].TCD->ATTR_DST = 1;
	_dmasettings[1].replaceSettingsOnCompletion(_dmasettings[2]);

	_dmasettings[2].sourceBuffer(&_pfbtft[COUNT_WORDS_WRITE*2], COUNT_WORDS_WRITE*2);
	_dmasettings[2].destination(_pkinetisk_spi->PUSHR);
	_dmasettings[2].TCD->ATTR_DST = 1;
	_dmasettings[2].replaceSettingsOnCompletion(_dmasettings[3]);

	// Sort of hack - but wrap around to output the first word again. 
	_dmasettings[3].sourceBuffer(_pfbtft, 2);
	_dmasettings[3].destination(_pkinetisk_spi->PUSHR);
	_dmasettings[3].TCD->ATTR_DST = 1;
	_dmasettings[3].replaceSettingsOnCompletion(_dmasettings[0]);

	// Setup DMA main object
	//Serial.println("Setup _dmatx");
	_dmatx.begin(true);
	_dmatx.triggerAtHardwareEvent(dmaTXevent);
	_dmatx = _dmasettings[0];
	_dmatx.attachInterrupt(dmaInterrupt);
There is sort of a hack involved in this one, in that I use PUSHR with the first word of the buffer to prime it and to get the upper word correct...
But the interesting part is setting up several settings objects, each handling a portion of the buffer, and then using replaceSettingsOnCompletion
to say: when you complete this transfer, continue with the information in the settings object I am linked to...
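
To apply the same idea to the 110,040-byte video buffer, the split into chained pieces could be worked out like this. It is a host-side sketch only; the struct and function names are made up, and the 32,767 per-setting limit is an assumption based on the 32K limit discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One piece of the buffer, to be handled by one settings object.
struct Chunk {
  size_t offset;  // start index into the source buffer
  size_t length;  // bytes in this piece
};

// Split totalBytes into pieces no longer than maxPerSetting.
std::vector<Chunk> splitForChaining(size_t totalBytes,
                                    size_t maxPerSetting = 32767) {
  assert(maxPerSetting > 0);
  std::vector<Chunk> chunks;
  for (size_t offset = 0; offset < totalBytes; offset += maxPerSetting) {
    size_t remaining = totalBytes - offset;
    size_t length = remaining < maxPerSetting ? remaining : maxPerSetting;
    chunks.push_back({offset, length});
  }
  return chunks;
}
```

Each chunk would then feed one sourceBuffer(&buf[offset], length) call, with replaceSettingsOnCompletion linking it to the next settings object and the last one linking back to the first.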
 
I have written a sample sketch that shows how to transfer more than 32KB in a single DMA transfer. To do this you have to bypass the functions of the dmaChannel object, which does not allow transfers larger than 32KB. In the following code I set up a transfer of 200KBytes to the lower byte of PortD, as I have done for other video frame transfers.

There are two issues with this code:
1. I hard-coded the transfer to work only with DMA channel zero when I set up the DMA control register.
2. The transfer happens at full bus speed because it all occurs as a single minor-loop transfer. Since it is a single transfer, I cannot time the output using the PDB to get transfers at a desired PDB clock rate. On a Teensy 3.6 at 180MHz, the bytes transfer at about 36MB/second.
Code:



/************************************************************************
 * 
 *  This sketch shows how to transfer a large memory buffer to the
 *  low byte of GPIO Port D.
 *  The Teensy DMAChannel object won't handle buffers larger than 32KB,
 *  so you have to directly interact with the TCD and DMA control
 *  register.
 *   M. Borgerson     9/5/19
 *
 *************************************************************************/


#include <DMAChannel.h>
#include "kinetis.h"

DMAChannel dma0;                          // DMA channel used for the GPIOD transfer


#define HSYNCPIN  35
#define VSYNCPIN  33

// Pin 33 is used to generate a pulse for oscilloscope trigger
#define VSLOW digitalWriteFast(33, LOW);
#define VSHIGH digitalWriteFast(33,HIGH);


uint8_t  dmabuff[200*1024];
uint32_t buffsize = sizeof(dmabuff);



void dumpTCD(void){
  // Print the address of the TCD itself, not of the pointer that holds it
  Serial.printf("DMA TCD at: 0x%08lX\n", (uint32_t)dma0.TCD);
  Serial.printf("SADDR:      0x%08lX\n", (int)dma0.TCD->SADDR);
  Serial.printf("SOFF:       %d\n", dma0.TCD->SOFF);
  Serial.printf("ATTR:       0x%08lX\n", dma0.TCD->ATTR);
  Serial.printf("NBYTES:     %ld\n", dma0.TCD->NBYTES_MLOFFNO);
  Serial.printf("SLAST:      0x%08lX\n", dma0.TCD->SLAST);
  Serial.printf("DADDR:      0x%08lX\n", (int)dma0.TCD->DADDR);
  Serial.printf("DOFF:       %d\n", dma0.TCD->DOFF);
  Serial.printf("CITER:      0x%08lX\n", dma0.TCD->CITER);
  Serial.printf("DLASTSGA:   0x%08lX\n", dma0.TCD->DLASTSGA);
  Serial.printf("CSR:        0x%08lX\n", (uint32_t)dma0.TCD->CSR);
  Serial.printf("BITER:      0x%08lX\n", dma0.TCD->BITER);
  Serial.println("--------------");
}

// initialize the dma buffer to a ramp waveform
void RampInitDmaBuff(void){
uint32_t i;
  for(i= 0; i<buffsize; i++) dmabuff[i] = i & 0x07;
}

// initialize the dma buffer to a square wave for easier measurement of frequency
// use half of full amplitude for output
void SquareInitDmaBuff(void){
uint32_t i;
  for(i= 0; i<buffsize; i++) dmabuff[i] = (i&1) * 4;
}


// The definition for the PDOR in kinetis.h is for a 32-bit port, but we want
// the DMA setup to recognize the port as an 8-bit port
#define GPIOD_PDORBL    (*(volatile uint8_t *)0x400FF0C0) // Port Data Output Register low byte

//  Set up DMA to transfer dmabuff to low byte of GPIOD
void InitDMA(void){
  uint8_t *lptr;

  dma0.disable();
  lptr =  &dmabuff[0];

  dma0.TCD->DADDR = (&GPIOD_PDORBL);
  dma0.TCD->SADDR = (uint8_t *)lptr;         // source data buffer
  dma0.TCD->SOFF = 1;                 // advance by 1 byte (8 bits) per read
  dma0.TCD->DOFF = 0;                 // Do not change destination after write
  dma0.TCD->ATTR_SRC = 0;               
  dma0.TCD->NBYTES = sizeof(dmabuff);
  dma0.TCD->SLAST = -sizeof(dmabuff);
  dma0.TCD->BITER = 1;   // for one large minor loop BITER and CITER = 1
  dma0.TCD->CITER = 1;

  dma0.disableOnCompletion();  // Just do one transfer at a time
  // To make sure that the large transfer works, you have to set the DMA_CR_EMLM
  // bit to zero in the DMA control register.  The following command sets the
  // other bits, but not the DMA_CR_EMLM bit.
  DMA_CR = DMA_CR_GRP1PRI| DMA_CR_EDBG;
  // Start the transfer with a software trigger (no PDB timing in this sketch)
  dma0.triggerManual();

  dma0.enable();
  
}
  

void setup() {
  //Start serial
  Serial.begin(9600);
  delay(200);
  delay(100);
  SquareInitDmaBuff();  // initialize buffer to a square wave
  pinMode(HSYNCPIN, OUTPUT);
  pinMode(VSYNCPIN, OUTPUT);
  Serial.println("PDB DMA Test. Starting...");
  // these are the lower 3 bits of GPIOD
  // used for the 3-bit R2R ladder DAC
  pinMode(2, OUTPUT);
  pinMode(14, OUTPUT);
  pinMode(7,  OUTPUT);

   
}


#define LOOPDELAY 100

void loop(){
char ch;
  while (Serial.available() ) {
    ch = Serial.read();
    Serial.printf("Command = <%c> \n", ch);
    if(ch == 'r'){
      RampInitDmaBuff();
    }
    if(ch == 's'){
      SquareInitDmaBuff();
    }
    if(ch == 'd'){
      dumpTCD();       
    }
  }
  VSHIGH;
  InitDMA();  // Initialize dma0 for transfer to GPIOD

  delay(LOOPDELAY); 
  VSLOW;
}
 