SPI DAC for audio library

Status
Not open for further replies.

ubiubu

Member
Hi,

I'd like to use a quad-channel SPI DAC with the Audio library, but can't quite figure out how to make this work. The device in question has a 32-bit-wide register, so the basic idea, I guess, would be to interleave the DAC command and audio data bytes with the upper two bytes of SPI0_PUSHR, which contain the chip-select bits. Thus, in the DMA isr, I'm filling the SPI_tx_buffer like so (in this case I'm using CS = 9, so pcsbits = 0x02 << 16, as per SPIFIFO.h; CHANNEL_X are the DAC command bytes):

Code:
for (int i=0; i < AUDIO_BLOCK_SAMPLES / 2; i++) {

	        // channel A data:
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_A;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | *src1++;
		// channel B data
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_B;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | *src2++;
		// channel C data
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_C;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | *src3++;
		// channel D data
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_D;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | *src4++;
	}
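
The interleaving described above can be sketched on the host like this. It is only an illustration: the bit positions (CONT in bit 31, CTAS in bits 30-28, PCS in bits 21-16) are my understanding of what the SPI_PUSHR_* macros in Teensy's kinetis.h encode, and `PushrPair` and `split_frame` are made-up names, not part of any library.

```cpp
#include <cstdint>

// Assumed SPI0_PUSHR layout (believed to match the SPI_PUSHR_* macros in kinetis.h).
constexpr uint32_t PUSHR_CONT = 0x80000000u;  // keep CS asserted after this word
constexpr uint32_t pushr_ctas(uint32_t n) { return (n & 7u) << 28; }
constexpr uint32_t pushr_pcs(uint32_t n)  { return (n & 0x3Fu) << 16; }

// Hypothetical helper type: the two PUSHR words that carry one 32-bit DAC frame.
struct PushrPair {
    uint32_t first;   // upper 16 bits of the frame, CS held (CONT set)
    uint32_t second;  // lower 16 bits of the frame, CS released afterwards
};

// Split a 32-bit DAC frame into two 16-bit SPI transfers with command bits on top.
PushrPair split_frame(uint32_t frame, uint32_t pcs) {
    const uint32_t common = pushr_pcs(pcs) | pushr_ctas(1);
    return { PUSHR_CONT | common | (frame >> 16),
             common | (frame & 0xFFFFu) };
}
```

Each DAC frame therefore costs two 32-bit words in the transmit buffer, which is where the "8 bytes per sample" buffer sizing comes from.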


My DMA set-up looks like this (I've tried to adapt the output_i2s_quad object, mainly by making it write to SPI0_PUSHR):

Code:
void AudioOutputSPIQuad::begin(void)
{
#if 1
	config_SPI();
	dma.begin(true);

	block_ch1_1st = NULL;
	block_ch2_1st = NULL;
	block_ch3_1st = NULL;
	block_ch4_1st = NULL;

	dma.TCD->SADDR = SPI_tx_buffer;
	dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction = 4 bytes
	dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
	dma.TCD->NBYTES_MLNO = ELEMENT_SIZE; // bytes/transaction = 4 bytes (~ SPI0_PUSHR)
	dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
	dma.TCD->DADDR = (volatile uint32_t*)&SPI0_PUSHR;
	dma.TCD->DOFF = 0; // destination offset
	dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE; // major loop
	dma.TCD->DLASTSGA = 0;
	dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE;
	dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
	dma.triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_TX);
	update_responsibility = update_setup();
	dma.enable();
        // SPI -- 
	SPI0_SR = 0xFF0F0000;
	SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
	dma.attachInterrupt(isr);
#endif
}


However, this doesn't work. In fact, it doesn't seem to transmit any data and prevents the Audio library from updating. Any DMA experts who can point me to what I'm doing wrong? (config_SPI() basically just invokes SPIFIFO.begin() and sets up the DAC reference and so on, this bit works)
 
So many things might be wrong, but there are two broad categories to at least check first.

If you try to access hardware before it's enabled, you'll cause a fault exception and nothing will run at all. Usually this happens if you do stuff in C++ constructors which depend on initialization you do in other code that runs later.

The other broad category is DMA errors. In this case your program runs. You can read and print the DMA error registers to get a little info.

The device in question has a 32 bit wide register

Is your project such a secret that you can't even tell us the DAC part number?

Normally the way things work here involves posting complete code, not just small fragments, and often also showing how you've actually connected the wires.

But DMA is pretty much always difficult stuff to troubleshoot. Even if you share everything, this is usually the sort of thing where you just need to do quite a lot of trial & error. As long as you're not getting stuck in the fault exceptions, troubleshooting DMA usually involves adding code to print memory regions and trying simpler stuff until you get *something* to happen, and a lot of incremental steps and head scratching along the way.
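
As a rough illustration of "read and print the DMA error registers": the sketch below decodes the low byte of an eDMA error-status word on the host. The bit assignments (VLD in bit 31, SAE/SOE/DAE/DOE/NCE/SGE/SBE/DBE in bits 7..0) are my reading of the Kinetis reference manual and should be checked against the manual for your chip; `describe_dma_es` is a hypothetical helper.

```cpp
#include <cstdint>
#include <string>

// Decode the cause bits of an eDMA error status value (assumed layout, see above).
// SAE/SOE: source address/offset, DAE/DOE: destination address/offset,
// NCE: NBYTES/CITER config, SGE: scatter/gather config, SBE/DBE: bus errors.
std::string describe_dma_es(uint32_t es) {
    static const char* const names[8] = {
        "DBE", "SBE", "SGE", "NCE", "DOE", "DAE", "SOE", "SAE"
    };
    if (!(es & 0x80000000u)) return "no error latched";  // VLD bit clear
    std::string out;
    for (int bit = 7; bit >= 0; --bit) {
        if (es & (1u << bit)) {
            if (!out.empty()) out += ", ";
            out += names[bit];
        }
    }
    return out.empty() ? "error latched, no cause bit in low byte" : out;
}
```

On a Teensy 3.x the input would be the error status register (which I believe kinetis.h exposes as `DMA_ES`), printed with something like `Serial.println(describe_dma_es(DMA_ES).c_str());`.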
 
The other broad category is DMA errors. In this case your program runs. You can read and print the DMA error registers to get a little info.

Yes, it runs, so it'll probably be that. The isr never runs, however, not even once, hence my question re the DMA set-up.

Is your project such a secret that you can't even tell us the DAC part number?

No, it's not a secret. At this point my problem is really about getting DMA / SPI to work though, not the DAC.

Normally the way things work here involves posting complete code, not just small fragments, and often also showing how you've actually connected the wires.

As I said, I've adapted the output_i2s_quad object, so what I've posted is the difference. The hardware works fine, otherwise, so I can exclude hardware errors.

But DMA is pretty much always difficult stuff to troubleshoot. Even if you share everything, this is usually the sort of thing where you just need to do quite a lot of trial & error. As long as you're not getting stuck in the fault exceptions, troubleshooting DMA usually involves adding code to print memory regions and trying simpler stuff until you get *something* to happen, and a lot of incremental steps and head scratching along the way.

OK, I'll keep trying. Thanks for the pointer re printing the registers.
 
Seriously, you need to post complete code.

You haven't even included the definitions for 'dest' and 'src'. If they are 'int16_t', your code obviously can't work. If only 'src' is 'int16_t', it gets promoted to a 'uint32_t' or 'int32_t' and sign-extended in your 'or' statements, so you are writing junk into the SPI PUSHR command register.

The DMA controller updates the source address in real time, so you can use 'dma.sourceAddress()' to check whether it gets updated and whether the transfer is really running.
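
To make the sign-extension point concrete, here is a small host-side sketch. `CMD_BITS` is just an example value for the upper half-word, and `pack_naive` / `pack_masked` are illustrative names, not anything from the Audio library.

```cpp
#include <cstdint>

// Example command bits for the upper 16 bits of a PUSHR word (illustrative value).
constexpr uint32_t CMD_BITS = 0x10020000u;

// Naive version: the int16_t is converted to 32 bits with sign extension,
// so a negative sample overwrites the command bits with ones.
uint32_t pack_naive(int16_t sample) {
    return CMD_BITS | sample;
}

// Masked version: only the low 16 bits of the sample reach the word.
uint32_t pack_masked(int16_t sample) {
    return CMD_BITS | static_cast<uint16_t>(sample);
}
```

For a sample of -1 the naive version yields 0xFFFFFFFF, while the masked version yields 0x1002FFFF and leaves the command bits alone.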
 
Seriously, you need to post complete code.

You haven't even included the definitions for 'dest' and 'src'. If they are 'int16_t', your code obviously can't work. If only 'src' is 'int16_t', it gets promoted to a 'uint32_t' or 'int32_t' and sign-extended in your 'or' statements, so you are writing junk into the SPI PUSHR command register.

Thanks, yeah, the "or" stuff wouldn't have worked. But I'm not at the point yet where I write junk to SPI0_PUSHR, I'm afraid. I'm pretty sure there are more details to deal with down the line, but at this point I'd mainly be interested in having the DMA shove at least _something_ into SPI0_PUSHR. The rest of the code is basically the same as output_i2s_quad, as mentioned:


Code:
#include "SPI_output.h"
#include "memcpy_audio.h"
#include <spififo.h> 

const uint16_t NUM_CHANNELS = 0x4;
const uint16_t ELEMENT_SIZE = 0x4; // = 32 bit

const uint32_t CHANNEL_A = 0x0;    // = channel A control bits: 0000 0000 | 0000 0000
const uint32_t CHANNEL_B = 0x1;    // = channel B control bits: 0000 0000 | 0000 0001
const uint32_t CHANNEL_C = 0x2;    // = channel C control bits: 0000 0000 | 0000 0010
const uint32_t CHANNEL_D = 0x3;    // = channel D control bits: 0000 0000 | 0000 0011

const uint32_t pcsbits = 0x02 << 16; // pcs << 16; see SPIFIFO.h
const uint8_t DAC_CS = 9;

#define INTREF_ENABLE 0x8000001
#define SPICLOCK_30MHz (SPI_CTAR_PBR(0) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) 

#if defined(__MK20DX256__) || defined(__MK64FX512__) || defined(__MK66FX1M0__)

audio_block_t * AudioOutputSPIQuad::block_ch1_1st = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch2_1st = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch3_1st = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch4_1st = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch1_2nd = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch2_2nd = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch3_2nd = NULL;
audio_block_t * AudioOutputSPIQuad::block_ch4_2nd = NULL;
uint32_t  AudioOutputSPIQuad::ch1_offset = 0;
uint32_t  AudioOutputSPIQuad::ch2_offset = 0;
uint32_t  AudioOutputSPIQuad::ch3_offset = 0;
uint32_t  AudioOutputSPIQuad::ch4_offset = 0;
bool AudioOutputSPIQuad::update_responsibility = false;

// buffer size: each sample = 8 bytes : SPI/CS + DAC control + SPI/CS + audioblock data
DMAMEM static uint32_t SPI_tx_buffer[AUDIO_BLOCK_SAMPLES * NUM_CHANNELS * 2];
DMAChannel AudioOutputSPIQuad::dma(false);
static const uint32_t zerodata[AUDIO_BLOCK_SAMPLES / NUM_CHANNELS] = {0};

void AudioOutputSPIQuad::begin(void)
{
#if 1

	config_SPI();
	dma.begin(true);

	block_ch1_1st = NULL;
	block_ch2_1st = NULL;
	block_ch3_1st = NULL;
	block_ch4_1st = NULL;

	dma.TCD->SADDR = SPI_tx_buffer;
	dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction
	dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
	dma.TCD->NBYTES_MLNO = ELEMENT_SIZE; // bytes/transaction = 4 bytes (~ SPI0_PUSHR)
	dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
	dma.TCD->DADDR = (volatile uint32_t*)&SPI0_PUSHR;
	dma.TCD->DOFF = 0; // destination offset
	dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE; // major loop
	dma.TCD->DLASTSGA = 0;
	dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE;
	dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
	dma.triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_TX);
	update_responsibility = update_setup();
	dma.enable();
    // SPI -- 
	SPI0_SR = 0xFF0F0000;
	SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
	dma.attachInterrupt(isr);
	//digitalWriteFast(0x0, 0x1);
#endif
}

void AudioOutputSPIQuad::isr(void)
{
	uint32_t saddr;
	const int16_t *src1, *src2, *src3, *src4;
	const int16_t *zeros = (const int16_t *)zerodata;
	int32_t *dest;

	//digitalWriteFast(0x0, 0x0);

	saddr = (uint32_t)(dma.TCD->SADDR);
	dma.clearInterrupt();
	if (saddr < (uint32_t)SPI_tx_buffer + sizeof(SPI_tx_buffer) / 2) {
		// DMA is transmitting the first half of the buffer
		// so we must fill the second half
		dest = (int32_t *)&SPI_tx_buffer[AUDIO_BLOCK_SAMPLES * NUM_CHANNELS];
		if (update_responsibility) update_all();
	} else {
		dest = (int32_t *)SPI_tx_buffer;
	}

	src1 = (block_ch1_1st) ? block_ch1_1st->data + ch1_offset : zeros;
	src2 = (block_ch2_1st) ? block_ch2_1st->data + ch2_offset : zeros;
	src3 = (block_ch3_1st) ? block_ch3_1st->data + ch3_offset : zeros;
	src4 = (block_ch4_1st) ? block_ch4_1st->data + ch4_offset : zeros;

	for (int i=0; i < AUDIO_BLOCK_SAMPLES / 2; i++) {

	    // channel A data: 8 bytes ... 
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_A;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | (*src1++ & 0xFFFF);
		// channel B data
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_B;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | (*src2++ & 0xFFFF);
		// channel C data
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_C;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | (*src3++ & 0xFFFF);
		// channel D data
		*dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | CHANNEL_D;
		*dest++ = pcsbits | SPI_PUSHR_CTAS(1) | (*src4++ & 0xFFFF);
	}

	if (block_ch1_1st) {
		if (ch1_offset == 0) {
			ch1_offset = AUDIO_BLOCK_SAMPLES/2;
		} else {
			ch1_offset = 0;
			release(block_ch1_1st);
			block_ch1_1st = block_ch1_2nd;
			block_ch1_2nd = NULL;
		}
	}
	if (block_ch2_1st) {
		if (ch2_offset == 0) {
			ch2_offset = AUDIO_BLOCK_SAMPLES/2;
		} else {
			ch2_offset = 0;
			release(block_ch2_1st);
			block_ch2_1st = block_ch2_2nd;
			block_ch2_2nd = NULL;
		}
	}
	if (block_ch3_1st) {
		if (ch3_offset == 0) {
			ch3_offset = AUDIO_BLOCK_SAMPLES/2;
		} else {
			ch3_offset = 0;
			release(block_ch3_1st);
			block_ch3_1st = block_ch3_2nd;
			block_ch3_2nd = NULL;
		}
	}
	if (block_ch4_1st) {
		if (ch4_offset == 0) {
			ch4_offset = AUDIO_BLOCK_SAMPLES/2;
		} else {
			ch4_offset = 0;
			release(block_ch4_1st);
			block_ch4_1st = block_ch4_2nd;
			block_ch4_2nd = NULL;
		}
	}
}

void AudioOutputSPIQuad::update(void)
{
	audio_block_t *block, *tmp;

	block = receiveReadOnly(0); // channel 1
	if (block) {
		__disable_irq();
		if (block_ch1_1st == NULL) {
			block_ch1_1st = block;
			ch1_offset = 0;
			__enable_irq();
		} else if (block_ch1_2nd == NULL) {
			block_ch1_2nd = block;
			__enable_irq();
		} else {
			tmp = block_ch1_1st;
			block_ch1_1st = block_ch1_2nd;
			block_ch1_2nd = block;
			ch1_offset = 0;
			__enable_irq();
			release(tmp);
		}
	}
	block = receiveReadOnly(1); // channel 2
	if (block) {
		__disable_irq();
		if (block_ch2_1st == NULL) {
			block_ch2_1st = block;
			ch2_offset = 0;
			__enable_irq();
		} else if (block_ch2_2nd == NULL) {
			block_ch2_2nd = block;
			__enable_irq();
		} else {
			tmp = block_ch2_1st;
			block_ch2_1st = block_ch2_2nd;
			block_ch2_2nd = block;
			ch2_offset = 0;
			__enable_irq();
			release(tmp);
		}
	}
	block = receiveReadOnly(2); // channel 3
	if (block) {
		__disable_irq();
		if (block_ch3_1st == NULL) {
			block_ch3_1st = block;
			ch3_offset = 0;
			__enable_irq();
		} else if (block_ch3_2nd == NULL) {
			block_ch3_2nd = block;
			__enable_irq();
		} else {
			tmp = block_ch3_1st;
			block_ch3_1st = block_ch3_2nd;
			block_ch3_2nd = block;
			ch3_offset = 0;
			__enable_irq();
			release(tmp);
		}
	}
	block = receiveReadOnly(3); // channel 4
	if (block) {
		__disable_irq();
		if (block_ch4_1st == NULL) {
			block_ch4_1st = block;
			ch4_offset = 0;
			__enable_irq();
		} else if (block_ch4_2nd == NULL) {
			block_ch4_2nd = block;
			__enable_irq();
		} else {
			tmp = block_ch4_1st;
			block_ch4_1st = block_ch4_2nd;
			block_ch4_2nd = block;
			ch4_offset = 0;
			__enable_irq();
			release(tmp);
		}
	}
}

void AudioOutputSPIQuad::config_SPI(void)
{

  SPIFIFO.begin(DAC_CS, SPICLOCK_30MHz, SPI_MODE0);
  // enable internal reference:
  uint32_t _data = INTREF_ENABLE;
  SPIFIFO.write16(_data >> 16, SPI_CONTINUE);  
  SPIFIFO.write16(_data);
  SPIFIFO.read();
  SPIFIFO.read();
}

#endif
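
The half-buffer (ping-pong) decision at the top of the isr can be checked on the host in isolation. This is a sketch only; `refill_start` is a hypothetical helper that mirrors the `saddr` comparison above and returns the index of the first word to refill.

```cpp
#include <cstddef>
#include <cstdint>

// Given the address the DMA is currently reading from, return the index of the
// first buffer word the isr should refill: the half the DMA is NOT reading.
std::size_t refill_start(std::uintptr_t saddr,
                         std::uintptr_t buf_base,
                         std::size_t buf_words) {
    const std::uintptr_t midpoint =
        buf_base + (buf_words / 2) * sizeof(uint32_t);
    // DMA in first half -> refill second half, and vice versa.
    return (saddr < midpoint) ? buf_words / 2 : 0;
}
```

With AUDIO_BLOCK_SAMPLES = 128 and four channels the buffer holds 1024 words, so the two possible answers are 512 and 0, matching the two `dest` assignments in the isr.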
 
The DMA setup looks correct and it does kick off a transfer:

Code:
#include <DmaChannel.h>
#include <spififo.h>
#include <array>

const uint16_t NUM_CHANNELS = 0x4;
const uint16_t ELEMENT_SIZE = 0x4; // = 32 bit
const size_t AUDIO_BLOCK_SAMPLES = 64;
DMAMEM static auto SPI_tx_buffer = [](){
    std::array<uint32_t, AUDIO_BLOCK_SAMPLES * NUM_CHANNELS * 2> res;
    for(auto& elem : res) elem = SPI_PUSHR_CONT | SPI_PUSHR_CTAS(1);
    return res;
}();

DMAChannel dma;
const uint8_t DAC_CS = 9;
#define SPICLOCK_30MHz (SPI_CTAR_PBR(0) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) 
#define INTREF_ENABLE 0x8000001

void spiSetup() {
    SPIFIFO.begin(DAC_CS, SPICLOCK_30MHz, SPI_MODE0);
    // enable internal reference:
    uint32_t _data = INTREF_ENABLE;
    SPIFIFO.write16(_data >> 16, SPI_CONTINUE);  
    SPIFIFO.write16(_data);
    SPIFIFO.read();
    SPIFIFO.read();
}

void setup() {
    Serial.begin(115200);
    delay(2000);
    Serial.printf("SPI_tx_buffer: %x   %u\n", (uint32_t) SPI_tx_buffer.data(), SPI_tx_buffer.size());

    spiSetup();
    
    dma.TCD->SADDR = SPI_tx_buffer.data();
    dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction
    dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
    dma.TCD->NBYTES_MLNO = ELEMENT_SIZE; // bytes/transaction = 4 bytes (~ SPI0_PUSHR)
    dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
    dma.TCD->DADDR = &SPI0_PUSHR;
    dma.TCD->DOFF = 0; // destination offset
    dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE; // major loop
    dma.TCD->DLASTSGA = 0;
    dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE;
    dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
    dma.triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_TX);
    dma.enable();

    SPI0_SR = 0xFF0F0000;
    SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
}

uint32_t dma_addr = 0;

void loop() {
    uint32_t new_dma_addr = (uint32_t) dma.sourceAddress();
    if(dma_addr != new_dma_addr) {
        Serial.printf("DMA src: %x    dest: %x\n", new_dma_addr, (uint32_t) dma.destinationAddress());
        dma_addr = new_dma_addr;
        delay(100);
    }
}
 
The DMA setup looks correct and it does kick off a transfer

Thanks. It does seem to work ok: Bypassing the audio library, the code below writes a square wave to channels A-D. It doesn't seem to like AUDIO_BLOCK_SAMPLES = 128 (in which case it tends to work only occasionally), but it works reliably with AUDIO_BLOCK_SAMPLES = 64. So far, so good.

Before I waste any more time with this, I'm beginning to wonder whether it's possible to achieve what I'm trying to do. My understanding is that to integrate this with the audio library, one would actually have to time the transfers a little differently (rather than write as fast as possible). AudioOutputAnalog ties the transfers to the Programmable Delay Block (DMAMUX_SOURCE_PDB), the i2s objects to DMAMUX_SOURCE_I2S0_TX. My ultimate goal would be to use this in conjunction with AudioInputUSB on OSX, so the PDB seems to be out? (I haven't entirely kept up to date with the state of the art, but IIRC there were a couple of threads about how the deviation of AUDIO_SAMPLE_RATE_EXACT from standard sample rates caused issues?)


Code:
#include <DmaChannel.h>
#include <spififo.h>
#include <array>

const uint16_t NUM_CHANNELS = 0x4;
const uint16_t ELEMENT_SIZE = 0x4; // = 32 bit
const size_t AUDIO_BLOCK_SAMPLES = 64;
DMAMEM static auto SPI_tx_buffer = [](){
    std::array<uint32_t, AUDIO_BLOCK_SAMPLES * NUM_CHANNELS * 2> res;
    for(auto& elem : res) elem = SPI_PUSHR_CONT | SPI_PUSHR_CTAS(1);
    return res;
}();

DMAChannel dma;

#define SPICLOCK_30MHz (SPI_CTAR_PBR(0) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) 
#define INTREF_ENABLE 0x8000001
#define DAC_CHANNEL_OFFSET 20
#define DAC_CS 15
#define LDAC 14

const uint32_t DAC_CH_A = ((uint32_t) 0x0 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_B = ((uint32_t) 0x1 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_C = ((uint32_t) 0x2 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_D = ((uint32_t) 0x3 << DAC_CHANNEL_OFFSET);
const uint32_t pcsbits = 0x10 << 16; // DAC_CS = 15; see SPIFIFO.h

void spiSetup() {

    pinMode(LDAC, OUTPUT);
    digitalWrite(LDAC, LOW);
    CORE_PIN7_CONFIG = PORT_PCR_DSE | PORT_PCR_MUX(2);
    CORE_PIN13_CONFIG = PORT_PCR_DSE | PORT_PCR_MUX(2);
    SPIFIFO.begin(DAC_CS, SPICLOCK_30MHz, SPI_MODE0);
    // enable internal reference:
    uint32_t _data = INTREF_ENABLE;
    SPIFIFO.write16(_data >> 16, SPI_CONTINUE);  
    SPIFIFO.write16(_data);
    SPIFIFO.read();
    SPIFIFO.read();
}

uint32_t data = 0xFFFF;

void isr() {

  uint32_t saddr;
  int32_t *dest;

  saddr = (uint32_t)(dma.TCD->SADDR);
  dma.clearInterrupt();
  if (saddr < (uint32_t)SPI_tx_buffer.data() + sizeof(SPI_tx_buffer) / 2) {
    // DMA is transmitting the first half of the buffer
    // so we must fill the second half
    dest = (int32_t *)&SPI_tx_buffer[AUDIO_BLOCK_SAMPLES * NUM_CHANNELS];
    data = 0x0;
  } else {
    dest = (int32_t *)SPI_tx_buffer.data();
    data = 0xFFFF;
  }

  for (int i=0; i < AUDIO_BLOCK_SAMPLES / 2; i+=8) {

   // channel A data      
    uint32_t _data = DAC_CH_A | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel B data
    _data = DAC_CH_B | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel C data
    _data = DAC_CH_C | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel D data
    _data = DAC_CH_D | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
  } 
}

void setup() {
    Serial.begin(115200);
    delay(2000);
    Serial.printf("SPI_tx_buffer: %x   %u\n", (uint32_t) SPI_tx_buffer.data(), SPI_tx_buffer.size());

    spiSetup();
    
    dma.TCD->SADDR = SPI_tx_buffer.data();
    dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction
    dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
    dma.TCD->NBYTES_MLNO = ELEMENT_SIZE; // bytes/transaction = 4 bytes (~ SPI0_PUSHR)
    dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
    dma.TCD->DADDR = &SPI0_PUSHR;
    dma.TCD->DOFF = 0; // destination offset
    dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE; // major loop
    dma.TCD->DLASTSGA = 0;
    dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE;
    dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
    dma.triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_TX);
    dma.enable();

    SPI0_SR = 0xFF0F0000;
    SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
    dma.attachInterrupt(isr);
}

uint32_t dma_addr = 0;

void loop() {
    uint32_t new_dma_addr = (uint32_t) dma.sourceAddress();
    if(dma_addr != new_dma_addr) {
        Serial.printf("DMA src: %x    dest: %x\n", new_dma_addr, (uint32_t) dma.destinationAddress());
        dma_addr = new_dma_addr;
        delay(100);
    }
}
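
On the sample-rate question raised above, a quick host-side check of how far a PDB-derived rate sits from a nominal rate. This assumes the PDB fires once every MOD + 1 bus clocks (which is how the "N-1" period constants in the onboard DAC code read to me) and uses 48 MHz with a period of 1088 counts as the example; the function names are illustrative.

```cpp
#include <cstdint>

// Trigger rate of a free-running PDB, assuming one trigger every (mod + 1) bus clocks.
double pdb_rate_hz(uint32_t f_bus_hz, uint32_t pdb_mod) {
    return static_cast<double>(f_bus_hz) / (pdb_mod + 1u);
}

// Deviation of an actual rate from a nominal rate, in parts per million.
double deviation_ppm(double actual_hz, double nominal_hz) {
    return (actual_hz - nominal_hz) / nominal_hz * 1e6;
}
```

For 48 MHz and MOD = 1087 this gives roughly 44117.6 Hz, about 400 ppm above 44100 Hz. Whether that offset matters for USB audio is a separate question that this arithmetic does not answer.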
 
Here is a follow-up. To make this work properly, I guess one would really need to write 32 bytes per request (rather than 4 bytes), i.e. dma.TCD->NBYTES_MLNO = 8 * NUM_CHANNELS; (= 8 bytes of data per channel * 4 channels).

I've dropped in the code for the PDB from the onboard DAC object, and this works just fine. However, as soon as I make NBYTES_MLNO > 16 (and adjust CITER_ELINKNO/BITER_ELINKNO to sizeof(SPI_tx_buffer) / bytes_per_mlno), the transmitted data is corrupted. The channels (or some of them) still update, so things can't be entirely off the map, but it clearly doesn't work properly.

So this works fine:
Code:
    uint16_t nbytes_mlno = ELEMENT_SIZE * 4; // = 16 bytes
    dma.TCD->SADDR = SPI_tx_buffer.data();
    dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction
    dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
    dma.TCD->NBYTES_MLNO = nbytes_mlno; // bytes/transaction
    dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
    dma.TCD->DADDR = &SPI0_PUSHR;
    dma.TCD->DOFF = 0; // destination offset
    dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / nbytes_mlno; // bytes per major loop
    dma.TCD->DLASTSGA = 0;
    dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / nbytes_mlno;
    dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
    dma.triggerAtHardwareEvent(DMAMUX_SOURCE_PDB);
    dma.enable();

    SPI0_SR = 0xFF0F0000;
    SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
    dma.attachInterrupt(isr);

But this doesn't (it'll transfer stuff, but the data is only partially intact):

Code:
    uint16_t nbytes_mlno = ELEMENT_SIZE * 8; // = 32 bytes
    dma.TCD->SADDR = SPI_tx_buffer.data();
    dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction
    dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
    dma.TCD->NBYTES_MLNO = nbytes_mlno; // bytes/transaction
    dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
    dma.TCD->DADDR = &SPI0_PUSHR;
    dma.TCD->DOFF = 0; // destination offset
    dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / nbytes_mlno; // bytes per major loop
    dma.TCD->DLASTSGA = 0;
    dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / nbytes_mlno;
    dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
    dma.triggerAtHardwareEvent(DMAMUX_SOURCE_PDB);
    dma.enable();

    SPI0_SR = 0xFF0F0000;
    SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
    dma.attachInterrupt(isr);


Is it possible I'm hitting some limit here? (I've tried with both a T3.2 and a T3.6, and they behave the same way.) The gist of the matter seems to be that I can only transmit 16 bytes or so per minor loop to SPI0_PUSHR (bytes_per_mlno = 4 * 4) before the data gets corrupted. Is there any way I can print the DMA transfers?
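
One possible explanation, offered as a guess rather than a diagnosis: if the SPI0 transmit FIFO is four entries deep (my understanding for SPI0 on these chips, to be checked against the reference manual), then a PDB-triggered minor loop has no TFFF handshake and pushes all of its words back to back, far faster than the SPI can shift out 16-bit frames. Words arriving while the FIFO is full would then be lost. The toy model below ignores any draining during the burst; `burst_accepted` and the depth constant are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed TX FIFO depth of SPI0 (verify against the reference manual).
constexpr uint32_t kAssumedTxFifoDepth = 4;

// Number of 32-bit PUSHR writes from one minor loop that fit into the FIFO,
// assuming nothing is shifted out while the burst is being written.
uint32_t burst_accepted(uint32_t burst_bytes,
                        uint32_t fifo_free_entries = kAssumedTxFifoDepth) {
    const uint32_t words = burst_bytes / 4u;  // one PUSHR write per 32-bit word
    return std::min(words, fifo_free_entries);
}
```

In this model a 16-byte burst fits exactly, while a 32-byte burst loses half of its words, which would be consistent with some channels still updating. If that is what is happening, keeping each request at or below the FIFO depth, as in the 16-byte case, would be the conservative choice.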

Edit: I've checked whether it makes a difference if, instead of dma.TCD->NBYTES_MLNO = bytes_per_mlno;, I use

Code:
dma.TCD->NBYTES_MLOFFYES = DMA_TCD_NBYTES_MLOFFYES_NBYTES(bytes_per_mlno) | DMA_TCD_NBYTES_SMLOE | DMA_TCD_NBYTES_MLOFFYES_MLOFF(mloff);

where (as above) bytes_per_mlno = 8 * NUM_CHANNELS; // 2 x 4 bytes per channel

When mloff = 0, it seems to work much the same way as above, i.e. as with the minor-loop offset disabled. For any other value of mloff things get stuck after 2-3 requests, however. Also, conceptually, I'm not entirely sure I get the interaction between SOFF and NBYTES_MLOFFYES_MLOFF. The former needs to be 0x4 in this case, but the latter? Do both get added to the source address after the minor loop is completed?
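
As far as I understand the eDMA, SOFF is added to the source address after every individual read inside the minor loop, and MLOFF (when SMLOE is set) is added once more when the minor loop completes. A small host-side model of that, with `saddr_after` as a made-up helper name:

```cpp
#include <cstdint>

// Source address after `loops` completed minor loops, assuming:
//  - SOFF is applied after each read of ssize_bytes,
//  - MLOFF is applied once at the end of each minor loop (SMLOE set).
uint32_t saddr_after(uint32_t start, uint32_t loops, uint32_t nbytes,
                     uint32_t ssize_bytes, int32_t soff, int32_t mloff) {
    const uint32_t reads_per_loop = nbytes / ssize_bytes;
    const int64_t step = static_cast<int64_t>(reads_per_loop) * soff + mloff;
    return static_cast<uint32_t>(start + static_cast<int64_t>(loops) * step);
}
```

For a contiguous buffer read with SOFF = 4, MLOFF = 0 is what keeps the source address marching through the data. Any non-zero MLOFF skips that many extra bytes per request, so the address leaves the buffer before the major loop ends unless CITER and SLAST are changed to match, which might be why things get stuck.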

For fun, I've tried with an 8-channel DAC, and this still works with NBYTES_MLNO = 4 (or 8 or 16) if I speed up the PDB instead (PDB0_MOD = PDB_PERIOD / NUM_CHANNELS):


Code:
#include <DmaChannel.h>
#include <spififo.h>
#include <array>

const uint16_t NUM_CHANNELS = 0x8;
const uint16_t ELEMENT_SIZE = 0x4; // = 32 bit
const size_t AUDIO_BLOCK_SAMPLES = 256;
DMAMEM static auto SPI_tx_buffer = [](){
    std::array<uint32_t, AUDIO_BLOCK_SAMPLES * NUM_CHANNELS * 2> res;
    for(auto& elem : res) elem = SPI_PUSHR_CONT | SPI_PUSHR_CTAS(1);
    return res;
}();

DMAChannel dma;

#define SPICLOCK_30MHz (SPI_CTAR_PBR(0) | SPI_CTAR_BR(0) | SPI_CTAR_DBR) 
#define INTREF_ENABLE 0x8000001
#define DAC_CHANNEL_OFFSET 20
#define DAC_CS 15
#define LDAC 14

// DAC control bits: 

const uint32_t DAC_CH_A = ((uint32_t) 0x0 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_B = ((uint32_t) 0x1 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_C = ((uint32_t) 0x2 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_D = ((uint32_t) 0x3 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_E = ((uint32_t) 0x4 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_F = ((uint32_t) 0x5 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_G = ((uint32_t) 0x6 << DAC_CHANNEL_OFFSET);
const uint32_t DAC_CH_H = ((uint32_t) 0x7 << DAC_CHANNEL_OFFSET);
const uint32_t pcsbits = 0x10 << 16; // DAC_CS = 15; see SPIFIFO.h

void spiSetup() {

    pinMode(LDAC, OUTPUT);
    digitalWrite(LDAC, LOW);
    CORE_PIN7_CONFIG = PORT_PCR_DSE | PORT_PCR_MUX(2);
    CORE_PIN13_CONFIG = PORT_PCR_DSE | PORT_PCR_MUX(2);
    SPIFIFO.begin(DAC_CS, SPICLOCK_30MHz, SPI_MODE0);
    // enable internal reference:
    uint32_t _data = INTREF_ENABLE;
    SPIFIFO.write16(_data >> 16, SPI_CONTINUE);  
    SPIFIFO.write16(_data);
    SPIFIFO.read();
    SPIFIFO.read();
}

uint32_t data = 0xFFFF;

void isr() {

  uint32_t saddr;
  int32_t *dest;

  saddr = (uint32_t)(dma.TCD->SADDR);
  dma.clearInterrupt();
  if (saddr < (uint32_t)SPI_tx_buffer.data() + sizeof(SPI_tx_buffer) / 2) {
    // DMA is transmitting the first half of the buffer
    // so we must fill the second half
    dest = (int32_t *)&SPI_tx_buffer[AUDIO_BLOCK_SAMPLES * NUM_CHANNELS];
    data = 0xFFFF;
  } else {
    dest = (int32_t *)SPI_tx_buffer.data();
    data = 0x0000;
  }

  for (uint16_t i = 0; i < AUDIO_BLOCK_SAMPLES / 2; i += NUM_CHANNELS * 2) {

   // interleave channel A data with SPI_PUSHR upper two bytes (CS):
    uint32_t _data = DAC_CH_A | (data << 4); // = DAC command (32 bit)
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16; // first 4 bytes
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF); // second 4 bytes
    // channel B data
    _data = DAC_CH_B | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel C data
    _data = DAC_CH_C | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel D data
    _data = DAC_CH_D | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel E data
    _data = DAC_CH_E | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel F data
    _data = DAC_CH_F | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel G data
    _data = DAC_CH_G | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
    // channel H data
    _data = DAC_CH_H | (data << 4);
    *dest++ = SPI_PUSHR_CONT | pcsbits | SPI_PUSHR_CTAS(1) | _data >> 16;
    *dest++ = pcsbits | SPI_PUSHR_CTAS(1) |  (_data & 0xFFFF);
  } 
}

#define PDB_CONFIG (PDB_SC_TRGSEL(15) | PDB_SC_PDBEN | PDB_SC_CONT | PDB_SC_PDBIE | PDB_SC_DMAEN)

#if F_BUS == 120000000
  #define PDB_PERIOD (2720-1)
#elif F_BUS == 108000000
  #define PDB_PERIOD (2448-1)
#elif F_BUS == 96000000
  #define PDB_PERIOD (2176-1)
#elif F_BUS == 90000000
  #define PDB_PERIOD (2040-1)
#elif F_BUS == 80000000
  #define PDB_PERIOD (1813-1)  // small ?? error
#elif F_BUS == 72000000
  #define PDB_PERIOD (1632-1)
#elif F_BUS == 64000000
  #define PDB_PERIOD (1451-1)  // small ?? error
#elif F_BUS == 60000000
  #define PDB_PERIOD (1360-1)
#elif F_BUS == 56000000
  #define PDB_PERIOD (1269-1)  // 0.026% error
#elif F_BUS == 54000000
  #define PDB_PERIOD (1224-1)
#elif F_BUS == 48000000
  #define PDB_PERIOD (1088-1)
#elif F_BUS == 40000000
  #define PDB_PERIOD (907-1)  // small ?? error
#elif F_BUS == 36000000
  #define PDB_PERIOD (816-1)
#elif F_BUS == 24000000
  #define PDB_PERIOD (544-1)
#elif F_BUS == 16000000
  #define PDB_PERIOD (363-1)  // 0.092% error
#else
  #error "Unsupported F_BUS speed"
#endif

void setup() {
    Serial.begin(115200);
    delay(2000);
    Serial.printf("SPI_tx_buffer: %x   %u\n", (uint32_t) SPI_tx_buffer.data(), SPI_tx_buffer.size());

    spiSetup();

    if (!(SIM_SCGC6 & SIM_SCGC6_PDB)
      || (PDB0_SC & PDB_CONFIG) != PDB_CONFIG
      || PDB0_MOD != PDB_PERIOD
      || PDB0_IDLY != 1
      || PDB0_CH0C1 != 0x0101) {
        SIM_SCGC6 |= SIM_SCGC6_PDB;
        PDB0_IDLY = 1;
        // hack ahead... 
        PDB0_MOD = PDB_PERIOD / NUM_CHANNELS;
        PDB0_SC = PDB_CONFIG | PDB_SC_LDOK;
        PDB0_SC = PDB_CONFIG | PDB_SC_SWTRIG;
        PDB0_CH0C1 = 0x0101;
    }

    dma.TCD->SADDR = SPI_tx_buffer.data();
    dma.TCD->SOFF = ELEMENT_SIZE; // source offset per transaction
    dma.TCD->ATTR = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT); 
    dma.TCD->NBYTES_MLNO = ELEMENT_SIZE; // bytes/transaction = 4 bytes (~ SPI0_PUSHR)
    dma.TCD->SLAST = -sizeof(SPI_tx_buffer);
    dma.TCD->DADDR = &SPI0_PUSHR;
    dma.TCD->DOFF = 0; // destination offset
    dma.TCD->CITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE; // major loop
    dma.TCD->DLASTSGA = 0;
    dma.TCD->BITER_ELINKNO = sizeof(SPI_tx_buffer) / ELEMENT_SIZE;
    dma.TCD->CSR = DMA_TCD_CSR_INTHALF | DMA_TCD_CSR_INTMAJOR;
    //dma.triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_TX);
    dma.triggerAtHardwareEvent(DMAMUX_SOURCE_PDB);
    dma.enable();

    SPI0_SR = 0xFF0F0000;
    SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS; 
    dma.attachInterrupt(isr);
}

uint32_t dma_addr = 0;

void loop() {
    uint32_t new_dma_addr = (uint32_t) dma.sourceAddress();
    if(dma_addr != new_dma_addr) {
        Serial.printf("DMA src: %x    dest: %x\n", new_dma_addr, (uint32_t) dma.destinationAddress());
        dma_addr = new_dma_addr;
        delay(100);
    }
}
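
A rough way to sanity-check the "speed up the PDB" hack is to work out the resulting per-channel update rate. The sketch assumes one DMA request per PDB trigger, a trigger every MOD + 1 bus clocks, and two 32-bit PUSHR words per channel update, as in the isr above; `channel_rate_hz` is an illustrative name.

```cpp
#include <cstdint>

// Per-channel update rate when each PDB-triggered DMA request moves `nbytes`
// and every channel update consumes two 32-bit PUSHR words.
double channel_rate_hz(uint32_t f_bus_hz, uint32_t pdb_mod,
                       uint32_t nbytes, uint32_t channels) {
    const double requests_per_sec =
        static_cast<double>(f_bus_hz) / (pdb_mod + 1u);
    const double words_per_sec = requests_per_sec * (nbytes / 4u);
    return words_per_sec / (2.0 * channels);
}
```

With F_BUS = 48 MHz, PDB_PERIOD = 1088-1 and eight channels, the integer division gives MOD = 135, i.e. 136 counts. Under these assumptions NBYTES_MLNO = 4 updates each channel at about 22.06 kHz, and NBYTES_MLNO = 8 brings it back to about 44.12 kHz.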
 