T3.6 DMA and DAC

Status
Not open for further replies.

csreades

Active member
I have been using the T3.6 in a project I created that runs a piezo device that responds to analog voltage. I have the Teensy connected to a board that has amplification and shift registers to link the 2 DAC channels to 16 outputs. Until now the Teensy has had only one time-critical activity: creating the voltage signals, which I will call waveforms. I want the Teensy to start handling more time-critical activities, and I think the DMA will be perfect for sending the waveform from memory to the DAC. My previous experience of all this is basically just Arduino. I studied mechanical engineering, but it didn't really cover much beyond what basic assembler is.

Firstly, where can I go to understand the DMA, the DAC and the PDB (possibly a useful module)? Is there a hardware sheet for the Teensy processor that covers these?

Secondly I found a post on hackaday blog and have built an example in teensyduino that is close to what I want.
Code:
// Blocking.pde
// -*- mode: C++ -*-
//
// Shows how to use the blocking call runToNewPosition
// Which sets a new target position and then waits until the stepper has
// achieved it.
//
// Copyright (C) 2009 Mike McCauley
// $Id: Blocking.pde,v 1.1 2011/01/05 01:51:01 mikem Exp mikem $

#include <AccelStepper.h>

#define steps_mm 134
#define BUFFER_SIZE 1024

// Define a stepper and the pins it will use
AccelStepper stepper(1, 33, 34) ; // Defaults to AccelStepper::FULL4WIRE (4 pins) on 2, 3, 4, 5
bool ran = false;
bool homed = false;
int k = 0;
static volatile uint16_t waveform_table[BUFFER_SIZE];


void setup() {
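  // fill up the waveform table: start with every sample at full scale (12-bit, 4095)
  for (int i = 1; i < BUFFER_SIZE; i++) {
    waveform_table[i] = 2 * 8 * 256 - 1;
  }

  // zero the first 25 samples of every 50 to make a square pulse train
  for (int i = 0; i < BUFFER_SIZE; i = i + 50) {
    // stop at the end of the table: the last block would otherwise write index 1024
    for (int idx = 0; idx < 25 && i + idx < BUFFER_SIZE; idx ++) {
      waveform_table[i + idx] = 0;
    }
  }

  // zero the tail of the table
  for( int i = 800; i < BUFFER_SIZE; i++) { waveform_table[i] = 0;}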


  // initialise the DAC
  SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock
  DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference

  // initialise the DMA
  // first, we need to init the dma and dma mux
  // to do this, we enable the clock to DMA and DMA MUX using the system timing registers
  SIM_SCGC6 |= SIM_SCGC6_DMAMUX; // enable DMA MUX clock
  SIM_SCGC7 |= SIM_SCGC7_DMA;    // enable DMA clock
  // next up, the channel in the DMA MUX needs to be configured
  DMAMUX0_CHCFG0 |= DMAMUX_SOURCE_DAC0; //Select DAC as request source
  DMAMUX0_CHCFG0 |= DMAMUX_ENABLE;      //Enable DMA channel 0
  // then, we enable requests on our channel
  DMA_ERQ = DMA_ERQ_ERQ0; // Enable requests on DMA channel 0
  // Here we choose where our data is coming from, and where it is going
  DMA_TCD0_SADDR = waveform_table;   // set the address of the first byte in our LUT as the source address
  DMA_TCD0_DADDR = &DAC0_DAT0L; // set the first data register as the destination address
  // now we need to set the read and write offsets - kind of boring
  DMA_TCD0_SOFF = 4; // advance 32 bits, or 4 bytes per read
  DMA_TCD0_DOFF = 4; // advance 32 bits, or 4 bytes per write
  // this is the fun part! Now we get to set the data transfer size...
  DMA_TCD0_ATTR  = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT);
  DMA_TCD0_ATTR |= DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DMOD(31 - __builtin_clz(32)); // set the data transfer size to 32 bit for both the source and the destination
  // ...and the number of bytes to be transferred per request (or 'minor loop')...
  DMA_TCD0_NBYTES_MLNO = 16; // we want to fill half of the DAC buffer, which is 16 words in total, so we need 8 words - or 16 bytes - per transfer
  // set the number of minor loops (requests) in a major loop
  // the circularity of the buffer is handled by the modulus functionality in the TCD attributes
  DMA_TCD0_CITER_ELINKNO = DMA_TCD_CITER_ELINKYES_CITER(BUFFER_SIZE * 2 / 16);
  DMA_TCD0_BITER_ELINKNO = DMA_TCD_BITER_ELINKYES_BITER(BUFFER_SIZE * 2 / 16);
  // the address is adjusted by these values when a major loop completes
  // we don't need this for the destination, because the circularity of the buffer is already handled
  DMA_TCD0_SLAST    = -BUFFER_SIZE * 2;
  DMA_TCD0_DLASTSGA = 0;
  // do the final init of the channel
  DMA_TCD0_CSR = 0;
  // enable DAC DMA
  DAC0_C0 |= DAC_C0_DACBBIEN | DAC_C0_DACBWIEN; // enable read pointer bottom and watermark interrupt
  DAC0_C1 |= DAC_C1_DMAEN | DAC_C1_DACBFEN | DAC_C1_DACBFWM(3); // enable dma and buffer
  DAC0_C2 |= DAC_C2_DACBFRP(0);
  // init the PDB for DAC interval generation
  SIM_SCGC6 |= SIM_SCGC6_PDB; // turn on the PDB clock
  PDB0_SC |= PDB_SC_PDBEN; // enable the PDB
  PDB0_SC |= PDB_SC_TRGSEL(15); // trigger the PDB on software start (SWTRIG)
  //PDB0_SC |= PDB_SC_CONT; // run in continuous mode
  PDB0_MOD = 40; // modulus time for the PDB
  PDB0_DACINT0 = (uint16_t)(1); // we won't subdivide the clock...
  PDB0_DACINTC0 |= 0x01; // enable the DAC interval trigger
  PDB0_SC |= PDB_SC_LDOK; // update pdb registers
  PDB0_SC |= PDB_SC_SWTRIG; // ...and start the PDB


  pinMode(35, OUTPUT);
  pinMode(36, INPUT);
  pinMode(37, INPUT);
  digitalWrite(35, LOW);
  stepper.setMaxSpeed    (750 * steps_mm);
  stepper.setAcceleration(5000 * steps_mm);
  stepper.setCurrentPosition(0);
  //stepper.setSpeed(500*steps_mm);
  Serial.begin(115200);
  Serial.println("fin settuping");
  stepper.setSpeed    (-150 * steps_mm);

  while (homed == false) {


    stepper.runSpeed();


    if (digitalRead(37) == LOW) {
      stepper.stop();
      if (ran == false) {
        ran = true;
        stepper.setCurrentPosition(0);
        stepper.runToNewPosition(1 * steps_mm);
        stepper.setSpeed(-1 * steps_mm);
      }
      else {
        stepper.stop();
        stepper.setCurrentPosition(4);
        stepper.runToNewPosition(5 * steps_mm);
        homed = true;
      }
    }
  }

}

void loop()
{
  PDB0_SC |= PDB_SC_SWTRIG; // ...and start the PDB



  /*
    Serial.println("run..");
    if (!stepper.isRunning()) {
    if (stepper.currentPosition() > 250 * steps_mm) {
      stepper.moveTo(50 * steps_mm);
    }
    else {
      stepper.moveTo(450 * steps_mm);
    }

    }
    stepper.run();

    if (Serial.available() > 0)
    {
    while (Serial.available() > 0) {
      char t = Serial.read();
    }
    stepper.runToNewPosition(5 * steps_mm);
    stepper.setSpeed(-1 * steps_mm);
    while (digitalRead(37) == HIGH) {
      stepper.runSpeed();
    }
    Serial.println(stepper.currentPosition());
    delay(1500);
    }
  */
}

You will see that this code sets up the PDB and the DMA/DAC and at the same time controls a stepper motor (the other time sensitive application). Currently this script just endlessly repeats the waveform but I want to be able to have two different modes.

1. Repeating at a set interval, e.g. 2 kHz, 4 kHz
2. Single shot every time I command

I tried playing with the PDB, setting it to non-continuous mode and then software-triggering it at a set interval. Instead of sending the waveform as a burst and then waiting, it just seemed to change the total period of the signal.

Code:
  //PDB0_SC |= PDB_SC_CONT; // run in continuous mode
...

void loop()
{
  PDB0_SC |= PDB_SC_SWTRIG; // ...and start the PDB
}

This is not the behavior I want, and I guess it points to me not understanding what this code is actually doing. That is why I wanted help from people, or a reference where I can look up what the lines above do and what other options there may be.
 
Firstly where can I go to understand the DMA, the DAC and the PDB (possible a useful module)?

The audio library uses DMA and the PDB timer to stream audio samples to the DAC. That code is the best example to read.

https://github.com/PaulStoffregen/Audio/blob/master/output_dac.cpp


Is there a hardware sheet for the Teensy processor that covers these?

All the peripherals are documented in the reference manual.

https://www.pjrc.com/teensy/datasheets.html
 
Thank you very much. I will look into those manuals and that example. I knew I had seen a more extensive manual before but couldn't find it.
 
GCC is indeed smart enough to avoid a div (and even if it did not, the integer divide is quite fast):

Code:
        lsrs    r1, r4, #4 // div 16
        adds    r4, r4, #1
        movs    r0, #12
        bl      analogWrite(int, int)
        cmp     r4, #4096
        bne     .L2



For fun: on an 8-bit AVR it looks like this:
Code:
.L2:
        mov r23,r29
        mov r22,r28
        ldi r24,4
        1:
        lsr r23
        ror r22
        dec r24
        brne 1b //Loop for div 16
        ldi r24,lo8(12)
        ldi r25,0
        rcall aw(int, int)
        adiw r28,1
        rjmp .L2

 
Thank you frank... but is that related to my question? If so I understand even less than I expected.
 
So I am a little defeated by all this DMA PDB DAC stuff. I wanted to give it one more try before I give up and introduce new hardware. The chunk of code I was looking to replace with DMA is this:

Code:
void jetNozzle()
{
  //jets with the currently selected nozzle
  //should be called by all scripts
  for (int idx = 0; idx < points; idx++)
  {
    DAC1_DAT0L = (volt_l[idx]);
    DAC1_DATH = (volt_h[idx]);
    //analogWrite(jet_act, volt[idx]);
    //delayMicroseconds(tim[idx]);
  }
}

Hopefully you can see that all it does is take the "volt" arrays and write their values out sequentially until it reaches the end. This runs as fast as possible (around 150 ns per iteration in my application), and points is typically 100-200 values long. It therefore hangs the processor for around 15-30 µs, which is now causing timing issues.

In theory it should be possible to do this with the DMA right?
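
As a rough check of the numbers above, here is a small host-side C++ sketch (it runs on a PC, not on the Teensy). The names `SplitSample`, `splitSample` and `blockingTimeNs` are hypothetical, and it assumes that volt_l holds the low 8 bits and volt_h the upper 4 bits of each 12-bit DAC value, which the post does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the two bytes assumed to be stored in volt_l / volt_h.
struct SplitSample {
  uint8_t low;   // bits 7..0, written to the DAC data-low register
  uint8_t high;  // bits 11..8, written to the DAC data-high register
};

// Split one 12-bit DAC value into its low and high bytes.
SplitSample splitSample(uint16_t value12) {
  assert(value12 <= 0x0FFF);  // the DAC takes 12-bit values
  return SplitSample{static_cast<uint8_t>(value12 & 0xFF),
                     static_cast<uint8_t>(value12 >> 8)};
}

// Time the CPU spends inside the copy loop, in nanoseconds.
constexpr uint32_t blockingTimeNs(uint32_t points, uint32_t nsPerIteration) {
  return points * nsPerIteration;
}
```

With 150 ns per iteration, `blockingTimeNs(100, 150)` is 15000 ns and `blockingTimeNs(200, 150)` is 30000 ns, which matches the 15-30 µs quoted above.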
 
In theory it should be possible to do this with the DMA right?

Yes.

In fact, here's a quick test created by just copying the audio library code. The only change is the DREQ flag, so the DMA stops when it gets to the end of the buffer rather than infinitely looping.

Code:
#include "DMAChannel.h"

DMAChannel dma;
#define NSAMPLE 120
uint16_t dac_buffer[NSAMPLE];

#define PDB_CONFIG (PDB_SC_TRGSEL(15) | PDB_SC_PDBEN | PDB_SC_CONT | PDB_SC_PDBIE | PDB_SC_DMAEN)
#define PDB_PERIOD (1360-1)

void setup() {

  while (!Serial && millis() < 800) ; // wait
  Serial.println("DAC DMA");

  // put some data into dac_buffer array
  for (unsigned int i=0; i < NSAMPLE; i++) {
    float f = (float)i / (float)NSAMPLE;
    dac_buffer[i] = sinf(f * 2.0 * 3.14159) * 2000.0 + 2000;
  }
  
  // turn on DAC hardware
  SIM_SCGC2 |= SIM_SCGC2_DAC0;
  DAC0_C0 = DAC_C0_DACEN;                   // DACRFS clear: 1.2V reference (setting DACRFS selects VDDA, DACREF_2)

  // set the programmable delay block to trigger DMA requests
  SIM_SCGC6 |= SIM_SCGC6_PDB;
  PDB0_IDLY = 1;
  PDB0_MOD = PDB_PERIOD;
  PDB0_SC = PDB_CONFIG | PDB_SC_LDOK;
  PDB0_SC = PDB_CONFIG | PDB_SC_SWTRIG;
  PDB0_CH0C1 = 0x0101;

  // DMA will copy dac_buffer to the DAC on each PDB trigger
  dma.begin();
  dma.TCD->SADDR = dac_buffer;
  dma.TCD->SOFF = 2;
  dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(1) | DMA_TCD_ATTR_DSIZE(1);
  dma.TCD->NBYTES_MLNO = 2;
  dma.TCD->SLAST = -sizeof(dac_buffer);
  dma.TCD->DADDR = &DAC0_DAT0L;
  dma.TCD->DOFF = 0;
  dma.TCD->CITER_ELINKNO = sizeof(dac_buffer) / 2;
  dma.TCD->DLASTSGA = 0;
  dma.TCD->BITER_ELINKNO = sizeof(dac_buffer) / 2;
  dma.TCD->CSR = DMA_TCD_CSR_DREQ; // DREQ flag causes DMA to stop when done
  //dma.TCD->CSR = 0; // no flags will loop the DMA forever
  dma.triggerAtHardwareEvent(DMAMUX_SOURCE_PDB);
}

void loop() {
  dma.enable();
  delay(5);
}

The setup code just turns on the DAC, configures the PDB to create timing, fills a buffer with 1 cycle of a sine wave, then sets up a DMA transfer to send the buffer to the DAC triggered by the PDB.

Then the loop just turns on the DMA transfer every 5ms.
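
To make the effect of the DREQ flag easier to picture, here is a simplified host-side C++ model of one DMA channel. It is only a sketch of the behaviour described above, not the real hardware or the DMAChannel library: `DmaModel`, `onTrigger` and `samplesSent` are hypothetical names, and each call to `onTrigger` stands in for one PDB trigger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model of one DMA channel (hypothetical, for illustration only).
struct DmaModel {
  const uint16_t* saddr;            // current source address (TCD SADDR)
  std::size_t biter;                // minor loops per major loop (TCD BITER)
  std::size_t citer;                // minor loops remaining (TCD CITER)
  bool dreq;                        // TCD CSR DREQ flag
  bool requestsEnabled;             // the channel's request-enable bit, set by dma.enable()
  std::vector<uint16_t> dacWrites;  // stands in for writes to the DAC data register

  // One trigger: copy one 16-bit sample if requests are enabled.
  void onTrigger() {
    if (!requestsEnabled) return;
    dacWrites.push_back(*saddr);
    ++saddr;                        // SOFF = 2 bytes = one uint16_t
    if (--citer == 0) {             // major loop complete
      saddr -= biter;               // SLAST = -sizeof(buffer) rewinds the source
      citer = biter;                // CITER reloads from BITER
      if (dreq) requestsEnabled = false;  // DREQ: requests are switched off
    }
  }
};

// Count how many samples reach the DAC for a given number of triggers.
std::size_t samplesSent(const uint16_t* buffer, std::size_t n, bool dreq,
                        std::size_t triggers) {
  assert(n > 0);
  DmaModel dma{buffer, n, n, dreq, true, {}};
  for (std::size_t i = 0; i < triggers; ++i) dma.onTrigger();
  return dma.dacWrites.size();
}
```

In this model a 4-sample buffer with DREQ set sends exactly 4 samples no matter how many triggers arrive, while with DREQ clear it keeps wrapping around and sends one sample per trigger.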

Here's the resulting waveform on the Teensy 3.6's DAC0 pin.

[Image: file.png]
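
The timing of that waveform can be estimated with a little arithmetic, sketched here as host-side C++. It assumes a 60 MHz bus clock (the usual F_BUS for a Teensy 3.6 running at 180 MHz) and a PDB prescaler of 1; both are assumptions, so check them against your own build. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bus clock: 60 MHz (Teensy 3.6 at its default 180 MHz CPU speed).
constexpr uint32_t kAssumedBusHz = 60000000;

// The PDB counts 0..MOD and then wraps, so one period is MOD + 1 bus clocks.
constexpr double sampleRateHz(uint32_t pdbMod, uint32_t busHz = kAssumedBusHz) {
  return static_cast<double>(busHz) / (pdbMod + 1);
}

// Duration of one pass through the buffer, in microseconds.
constexpr double burstMicros(uint32_t samples, uint32_t pdbMod,
                             uint32_t busHz = kAssumedBusHz) {
  return 1e6 * samples * (pdbMod + 1) / busHz;
}

// MOD value for a requested sample rate, rounded down to whole bus clocks.
uint32_t pdbModFor(uint32_t wantedHz, uint32_t busHz = kAssumedBusHz) {
  assert(wantedHz > 0 && wantedHz <= busHz);
  return busHz / wantedHz - 1;
}
```

Under those assumptions, PDB_PERIOD (1360-1) gives about 44.1 kHz, so the 120-sample buffer takes roughly 2.72 ms to play, and the delay(5) in loop() leaves a gap before the next burst. A shorter burst would need a smaller MOD value, for example `pdbModFor(1000000)` for one sample per microsecond.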
 
Thank you Paul, I just tested it out on my setup and it works exactly how I hoped it would! I admit I am disappointed I could not work it out for myself, but I am very grateful you took the time to help me! It really solves my timing issues, so I can put off using an FPGA for another few years!
 
Are functions or statements like 'PDB0_MOD' part of the Arduino language?

Or is that Teensyduino?

How does the compiler know which register PDB0_MOD refers to on a particular MCU chip?
 
I believe (and I am sure I will be corrected if not) that they are the registers of the processor, related not to Teensyduino or Arduino but to the chip's architecture. You can find details of them in the hardware reference manual: https://www.pjrc.com/teensy/K66P144M180SF5RMV2.pdf

The code would likely only work on microcontrollers covered by "K66P144M180SF5RMV2" or similar chips. Trading some compute speed for ease of coding is why Arduino exists; wanting the speed is why I originally moved away from analogWrite() in favour of writing the registers directly.
 
Thanks.

I found "PDB0_MOD" section 44.4, on page 1111 (no less :eek: )

So I guess that somewhere, the name "PDB0_MOD" must be translated to the address hex 4003_6004.

Is this address specific to this MCU?
What "software layer" (if that is the correct term) takes care of translating "PDB0_MOD" to the address?
 
Symbolic names for hardware registers and the like are found in the Teensy core include files, which are installed on your PC and are also on GitHub. For Teensy 3, the symbols are defined in kinetis.h.
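
As an illustration of how such a name can become an address, here is a minimal host-side C++ sketch. The real address 0x40036004 does not exist on a PC, so the sketch points the same kind of macro at an ordinary array instead; `FAKE_PDB0_MOD`, `fakePeripheral` and `setFakePdbMod` are hypothetical names, and the commented-out define only shows the general style used in kinetis.h.

```cpp
#include <cassert>
#include <cstdint>

// On the Teensy 3.x core a register name is a macro that dereferences a
// fixed address, in the style of:
//   #define PDB0_MOD (*(volatile uint32_t *)0x40036004)
// Here an ordinary array stands in for the peripheral's memory.
static uint32_t fakePeripheral[2];  // [0] stands in for PDB0_SC, [1] for PDB0_MOD
#define FAKE_PDB0_MOD (*(volatile uint32_t *)&fakePeripheral[1])

// Writing through the macro is a plain store to the location it names.
void setFakePdbMod(uint32_t value) {
  assert(value <= 0xFFFF);  // the PDB modulus field is 16 bits wide
  FAKE_PDB0_MOD = value;
}
```

The compiler needs no special knowledge of the chip: the preprocessor replaces the name with the pointer expression, and the assignment compiles to a normal memory write. Which address goes with which name is decided by the header file written for that particular MCU.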
 
Thanks, found it at Line 3071 (omg :eek:)

So part of Teensyduino is making these names available for us to use, instead of having to look up the register address in the MCU manual?
 
So part of Teensyduino is making these names available for us to use, instead of having to look up the register address in the MCU manual?

Yes, exactly. Likewise all the code in that folder is written for Teensy's specific hardware, so all the Arduino functions which access hardware actually work properly on Teensy.

To answer your earlier questions about Arduino, here's a link to Arduino's documentation about how this stuff works.

https://arduino.github.io/arduino-cli/platform-specification/
 