
Thread: DMA multiplexing, teensy 3.6

  1. #1
    Junior Member
    Join Date
    May 2017
    Posts
    9

    I want to sample a 4-channel analog-to-digital converter at 20 kHz over SPI (I'm sampling at 32-bit resolution, so I can't use the onboard ADC). There is a single chip select but the channels are muxed by GPIOs. I would like to sample the ADC using DMA; is there a way to synchronize the DMA timer to toggle the GPIO pins to switch every 32 SPI clocks? In addition, can I set the SPI word size to 32? If this is possible, how complicated is the implementation? The alternative we are considering is driving the SPI reads manually. I currently have DMA working with 8-bit SPI words using https://github.com/crteensy/DmaSpi.
    Last edited by ozeng; 05-18-2017 at 06:04 AM.

  2. #2
    Senior Member
    Join Date
    Jan 2013
    Posts
    687
    Quote Originally Posted by ozeng View Post
    I want to sample a 4-channel analog-to-digital converter at 20 kHz over SPI (I'm sampling at 32-bit resolution, so I can't use the onboard ADC).
    What is the converter?

    There is a single chip select but the channels are muxed by GPIOs.
    How? 4 pins? 2 pins?

    I would like to sample the ADC using DMA; is there a way to synchronize the DMA timer to toggle the GPIO pins to switch every 32 SPI clocks?
    Maybe. What's the exact timing? Do you have restrictions on pin usage? Are the FTM timers used? Can you use SPI0?

    In addition, can I set the SPI word size to 32?
    Not exactly. Why? Byte reordering is possible with the DMA controller.

    What is the memory layout of the transferred data supposed to be?

  3. #3
    Junior Member
    Join Date
    May 2017
    Posts
    9
    ADC: LTC2500 http://cds.linear.com/docs/en/datasheet/250032f.pdf
    mux: http://www.analog.com/media/en/techn...ADG608_609.pdf

    I think the mux uses two GPIO pins as a two-bit channel address.

    I don't have many restrictions on pin usage; many GPIOs are open, and we have access to SPI0. We are sampling at 5 kHz, where each sample has 4 channels and each channel is 32 bits. So, the GPIOs should be changing at a rate of 20 kHz. I believe we're not using the FTM timers (I'm not sure what they are). I'm guessing it would be a good idea for the GPIO and DMA/SPI timings to come from a single source? Otherwise the GPIO and SPI might drift out of sync over time.

    The SPI word size is relevant because I think the chip select is cycled after every SPI word, but we want chip select held low for each full 32-bit sample. If we are using GPIOs to drive the mux, however, we might also drive chip select from a GPIO.

    Ideally, the four channels of a sample would be contiguous in memory, and the buffer would simply hold one sample after the other. A circular buffer of samples would be nice to have, since we would like continuous streaming of ADC reads.
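    The layout described above can be pictured with a small host-side sketch. Everything here is hypothetical and for illustration only: `Frame`, `kFrames`, `byte_offset` and `next_frame` are made-up names, and the ring capacity of 256 frames is an arbitrary assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; none of this comes from a Teensy library.
constexpr std::size_t kChannels = 4;    // four muxed inputs per sample
constexpr std::size_t kFrames   = 256;  // ring capacity in frames (assumed value)

// One sample: the four 32-bit channel readings sit next to each other in memory.
struct Frame {
    std::uint32_t ch[kChannels];
};
static_assert(sizeof(Frame) == kChannels * sizeof(std::uint32_t),
              "Frame must be tightly packed");

// Byte offset of one channel reading inside a ring of kFrames frames.
// The frame index wraps, so a free-running counter can be passed in.
constexpr std::size_t byte_offset(std::size_t frame, std::size_t channel) {
    return (frame % kFrames) * sizeof(Frame) + channel * sizeof(std::uint32_t);
}

// Index of the frame that follows `frame`, wrapping at the end of the ring.
constexpr std::size_t next_frame(std::size_t frame) {
    return (frame + 1) % kFrames;
}
```

    With these assumed sizes the whole ring is kFrames * sizeof(Frame) = 4096 bytes, and a writer that fills it in channel order only ever needs to wrap once per pass through the buffer.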

  4. #4
    Senior Member
    Join Date
    Jan 2013
    Posts
    687
    I would use 2 FTM timers. FTM0, running at 5 kHz, is responsible for the GPIO switching. Two 50% PWM signals offset by a 90 degree phase difference generate a 2-bit Gray code (00 10 11 01) that you can use to switch the MUX. The FTM timers allow you to pair channels, giving full control over the start and end of the PWM signal. The first channel of the pair will switch the output pin high, the second channel will switch it low (e.g. channel pair 1 runs from 0 - 50%, channel pair 2 runs from 25% - 75%). Look at the K66 manual, '45.5.8 Combine mode'.
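    To see which mux address each quarter of the 5 kHz period selects, here is a small host-side sketch of the two phase-shifted PWM signals. It is only a model, not FTM register code. The function names are made up, and treating channel pair 1 as the high address bit is an assumption that depends on how the mux is wired.

```cpp
#include <cstdint>

// Phase is given in quarters of the 5 kHz period (0, 1, 2, 3, then repeating).
// Pair 1 is high from 0% to 50%, pair 2 from 25% to 75%, as described above.
constexpr bool pwm_a(unsigned quarter) { return (quarter % 4) < 2; }
constexpr bool pwm_b(unsigned quarter) {
    return (quarter % 4) == 1 || (quarter % 4) == 2;
}

// Two-bit mux address with pwm_a as the high bit.
// ASSUMPTION: which PWM output drives which address pin depends on the wiring.
constexpr unsigned mux_code(unsigned quarter) {
    return (pwm_a(quarter) ? 2u : 0u) | (pwm_b(quarter) ? 1u : 0u);
}

// Number of address bits that differ between two codes.
constexpr unsigned bits_changed(unsigned x, unsigned y) {
    return ((x ^ y) & 1u) + (((x ^ y) >> 1) & 1u);
}
```

    Under this assumed wiring the addresses come out as 10, 11, 01, 00 over one period, which is the same cycle as 00 10 11 01 with a different starting point. Only one address bit changes per step, but the channels are visited in the order 2, 3, 1, 0 rather than 0, 1, 2, 3, so the receive buffer has to be interpreted accordingly.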

    Use a second FTM module (e.g. FTM1) to generate the 20 kHz ADC conversion start pulse. You can kick off both FTM modules at the same time using the global time base feature (manual, '45.5.28 Global time base'). If the second FTM module counts at an exact multiple of the first one's rate, they will stay in sync.
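    The "exact multiple" condition can be checked with a little arithmetic. The sketch below assumes both FTM modules are clocked from a 60 MHz bus clock with a prescaler of 1 (60 MHz is what I would expect for F_BUS on a Teensy 3.6 at 180 MHz, but check your build). The function names are hypothetical, and this is plain host-side C++, not register setup code.

```cpp
#include <cstdint>

// ASSUMPTION: FTM input clock of 60 MHz, prescaler 1.
constexpr std::uint32_t kFtmClockHz = 60000000;

// Timer counts in one period of the requested output frequency.
constexpr std::uint32_t counts_per_period(std::uint32_t freq_hz) {
    return kFtmClockHz / freq_hz;
}

// True if the requested frequency divides the timer clock without remainder.
constexpr bool divides_exactly(std::uint32_t freq_hz) {
    return kFtmClockHz % freq_hz == 0;
}

// Modulo value for an up-counter that starts at 0: the period is MOD + 1 counts.
constexpr std::uint32_t ftm_mod(std::uint32_t freq_hz) {
    return counts_per_period(freq_hz) - 1;
}
```

    With these assumptions the 5 kHz timer counts 12000 ticks per period and the 20 kHz timer 3000, so exactly four conversion pulses fit into one mux cycle, and both modulo values fit comfortably in the 16-bit counter.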

    You can have the ADC busy signal falling edge (conversion complete) trigger an interrupt or DMA transfer. The GPIO pins can be configured for either. However, there is only a single DMA trigger source for each GPIO port. Another option is to simply use fixed timing and configure a second channel from FTM1 for Output Compare mode and have it trigger either an interrupt or DMA.

    If you are using an interrupt and SPI0 (which has 4-word RX and TX queues), you can simply set up the SPI transfer for 16-bit words and write two 32-bit words to PUSHR (manual, '57.4.7 PUSH TX FIFO Register In Master Mode'). The lower half of each word is the dummy data; the upper half carries the command bits ('Continuous Peripheral Chip Select Enable' in the first word, 'End Of Queue' in the second, and PCS in both). The ADC seems to be plenty fast, so if you are running at 30 MHz SPI speed you could simply wait the roughly 1 us it takes to transfer 32 bits.
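    As a rough illustration of the two PUSHR words and of the timing estimate, here is a host-side sketch. The constants are local stand-ins, not the real register macros: their bit positions are my assumption of the SPIx_PUSHR layout (CONT in bit 31, CTAS in bits 30-28, EOQ in bit 27, PCS starting at bit 16, data in bits 15-0) and should be checked against the K66 manual or kinetis.h. All function names are made up.

```cpp
#include <cstdint>

// ASSUMED bit positions of the PUSHR command fields; verify before use.
constexpr std::uint32_t kCont = 1u << 31;  // keep chip select asserted after this word
constexpr std::uint32_t kEoq  = 1u << 27;  // marks the last word of the queue
constexpr std::uint32_t ctas(std::uint32_t n) { return (n & 7u) << 28; }
constexpr std::uint32_t pcs(std::uint32_t mask) { return (mask & 0x3Fu) << 16; }

// First of the two 16-bit frames: chip select stays low afterwards.
constexpr std::uint32_t first_word(std::uint32_t cs_mask, std::uint16_t data) {
    return kCont | ctas(1) | pcs(cs_mask) | data;
}

// Second frame: ends the queue, chip select is released afterwards.
constexpr std::uint32_t last_word(std::uint32_t cs_mask, std::uint16_t data) {
    return kEoq | ctas(1) | pcs(cs_mask) | data;
}

// Time on the wire for `bits` clock cycles at `sck_hz`, in whole nanoseconds.
constexpr std::uint32_t transfer_ns(std::uint32_t bits, std::uint32_t sck_hz) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(bits) * 1000000000ull) / sck_hz);
}

// Length of one conversion period at the given sample rate, in nanoseconds.
constexpr std::uint32_t period_ns(std::uint32_t rate_hz) {
    return 1000000000u / rate_hz;
}
```

    At 30 MHz the 32 clock cycles take a little over 1 us, against a 50 us conversion period at 20 kHz, which is why simply waiting for the transfer is a reasonable option.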

    Alternatively, set an interrupt for the 'End Of Queue' flag (manual, '57.4.6 DMA/Interrupt Request Select and Enable Register (SPIx_RSER)'). You can then pick up 2 words (2 * 16 bits) from POPR (manual, '57.4.9 POP RX FIFO Register (SPIx_POPR)'). If you are using interrupts, just rely on the FIFO and don't use DMA. 40'000 interrupts/s (or 20'000) is not a lot for Teensy 3.6...
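    The two 16-bit words still have to be joined into one 32-bit reading. A minimal sketch, assuming the SPI transfer is MSB-first so that the first word popped from the RX FIFO is the upper half; the function names are hypothetical.

```cpp
#include <cstdint>

// Join two 16-bit SPI frames into one 32-bit sample.
// ASSUMPTION: MSB-first transfer, so `first` (popped first) is the upper half.
constexpr std::uint32_t join_halves(std::uint16_t first, std::uint16_t second) {
    return (static_cast<std::uint32_t>(first) << 16) | second;
}

// If the two frames were instead stored as consecutive 16-bit values and the
// pair is later read as one 32-bit word on a little-endian CPU, the halves
// appear swapped. This puts them back in order.
constexpr std::uint32_t swap_halves(std::uint32_t v) {
    return (v << 16) | (v >> 16);
}
```

    The second helper matters when 16-bit writes fill a buffer that is later read as 32-bit values: on a little-endian core such as the Cortex-M4 the first frame lands in the lower half of each 32-bit word.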

    You can also do everything with DMA. However, that's a lot more difficult to debug and more difficult to recover from errors.

  5. #5
    Junior Member
    Join Date
    May 2017
    Posts
    9
    Thanks so much! You're a real hero.

    I'll look into the docs you sent. Fingers crossed that I can figure out how these modules work.

  6. #6
    Senior Member
    Join Date
    Jan 2013
    Posts
    687
    The DMA transfer isn't too bad actually. The tricky part is that the SPI module doesn't want to run continuously without intervention. So the first DMA transfer clears out the SPI status flags.

    This code transfers data from SPI0 into a ring buffer, triggered by pin 32 (connected to 31 for testing). SPI0 pin 11 / 12 are connected as loopback for testing.

    This relies on the 4-word SPI FIFOs. It wouldn't work on SPI1/SPI2.

    Code:
    #include <DMAChannel.h>
    #include <array>
    #include <SPI.h>
    
    const uint8_t spi_cs_pin = 15;   // pin 15 SPI0 chip select
    const uint8_t trigger_pin = 32;  // PTB11
    
    // CTAS(1) is configured for 16-bit SPI word transfer
    std::array<uint32_t, 2> spi_tx_src;
    
    const size_t buffer_size = 8;
    std::array<volatile uint32_t, buffer_size> spi_rx_dest;
    
    DMAChannel dma_start_spi;
    DMAChannel dma_rx;
    DMAChannel dma_tx;
    
    
    uint32_t dummy;
    uint32_t start_spi_sr = 0xFF0F0000;
    
    auto& serial = Serial;
    auto& spi = SPI;
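    // The first argument is the requested SPI clock in Hz. 10 Hz is far below what the
    // hardware can produce, so this presumably ends up at the slowest available clock;
    // raise it for a real ADC.
    SPISettings spi_settings(10, MSBFIRST, SPI_MODE0);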
    
    void setup() {
        serial.begin(115200);
        delay(2000);
        serial.println("Starting.");
    
        spi.begin();
    
        pinMode(spi_cs_pin, OUTPUT);
        digitalWriteFast(spi_cs_pin, HIGH);
        auto kinetis_spi_cs = spi.setCS(spi_cs_pin);
        spi_tx_src = { 
            SPI_PUSHR_PCS(kinetis_spi_cs) | SPI_PUSHR_CONT | SPI_PUSHR_CTAS(1) | 0x4242,
            SPI_PUSHR_PCS(kinetis_spi_cs) | SPI_PUSHR_EOQ | SPI_PUSHR_CTAS(1) | 0x4343u,
        };
        
        spi.beginTransaction(spi_settings);
    
        // for testing, pin 31 connected to 32, send t in serial monitor to initiate SPI transfer
        const uint8_t trigger_pin_out = 31;
        pinMode(trigger_pin_out, OUTPUT);
        digitalWriteFast(trigger_pin_out, LOW);
    
        pinMode(trigger_pin, INPUT_PULLUP);
        volatile uint32_t *pin_config = portConfigRegister(trigger_pin);
        *pin_config |= PORT_PCR_IRQC(0b0010); // DMA on falling edge
    
        dma_start_spi.sourceBuffer(&start_spi_sr, sizeof(start_spi_sr));
        dma_start_spi.destination(KINETISK_SPI0.SR);
        // triggered by pin 32, port B
        dma_start_spi.triggerAtHardwareEvent(DMAMUX_SOURCE_PORTB);
        
        dma_tx.TCD->SADDR = spi_tx_src.data();
        dma_tx.TCD->ATTR_SRC = 2; // 32-bit read from source
        dma_tx.TCD->SOFF = 4;
        // transfer both 32-bit entries in one minor loop
        dma_tx.TCD->NBYTES = 8;
        dma_tx.TCD->SLAST = -sizeof(spi_tx_src); // go back to beginning of buffer
        dma_tx.TCD->DADDR = &KINETISK_SPI0.PUSHR;
        dma_tx.TCD->DOFF = 0;
        dma_tx.TCD->ATTR_DST = 2; // 32-bit write to dest
        // one major loop iteration
        dma_tx.TCD->BITER = 1;
        dma_tx.TCD->CITER = 1;
        dma_tx.triggerAtCompletionOf(dma_start_spi);
    
        dma_rx.source((uint16_t&) KINETISK_SPI0.POPR);
        dma_rx.destinationBuffer((uint16_t*) spi_rx_dest.data(), sizeof(spi_rx_dest));
        dma_rx.triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_RX);
    
        SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS; // DMA on receive FIFO drain flag
        SPI0_SR = 0xFF0F0000;
    
        dma_rx.enable();
        dma_tx.enable();
        dma_start_spi.enable();
    
        uint32_t dma_rx_pos = uint32_t(dma_rx.sourceAddress());
        while(true) {
            if(serial.available()) {
                char c = serial.read();
                if(c == 't') {
                    digitalWriteFast(trigger_pin_out, HIGH);
                    delay(1);
                    digitalWriteFast(trigger_pin_out, LOW);
                }
            }
            if(uint32_t(dma_rx.destinationAddress()) != dma_rx_pos) {
                dma_rx_pos = (uint32_t) dma_rx.destinationAddress();
                if(dma_rx_pos % 4 == 0) { // only print finished transfer
                    serial.printf("rx buf: %x   dma ptr: %x   delta: %u\n",
                        uint32_t(spi_rx_dest.data()), dma_rx_pos, 
                        dma_rx_pos - uint32_t(spi_rx_dest.data()));
                    for(size_t i = 0; i < spi_rx_dest.size(); i++) serial.printf("%8x ", spi_rx_dest[i]);
                    serial.println();
                }
            }
        }
    }
    
    void loop() {}
    Last edited by tni; 05-19-2017 at 04:04 PM. Reason: fixed SPI chip select

  7. #7
    Junior Member
    Join Date
    May 2017
    Posts
    9
    Wow, that's awesome! This stuff is legendary.

    One question:
    This code is interrupt-dependent, correct (at least, it failed when I did noInterrupts())? If we have a competing interrupt, could that cause us to drop a byte? E.g. if SPI0_POPR is full and the interrupt doesn't fire immediately, would we drop bits? An offset error would be pretty bad for us. Or, maybe if multiple interrupts are queued, it would try to read SPI0_POPR twice in a row, where the second read would hang because the register only carries 32 bits? Essentially, how careful do we have to be about the offsets and interrupts?

  8. #8
    Senior Member
    Join Date
    Jan 2013
    Posts
    687
    Quote Originally Posted by ozeng View Post
    This code is interrupt-dependent, correct (at least, it failed when I did noInterrupts())?
    No, it's not. What do you mean by "it failed"? The SPI transfer to the buffer will still occur, but various other things won't work (like USB Serial I/O), so you can't press 't' to trigger a transfer. A push-button wired to pin 32 will trigger the transfer with interrupts disabled (I have tried it).

    The code doesn't use interrupts at all. The DMA chaining is there to avoid having to use them. The DMA controller runs independently, so it doesn't matter whether interrupts are disabled.

  9. #9
    Junior Member
    Join Date
    May 2017
    Posts
    9
    Sweet, you really thought that through. Thanks for the help!

  10. #10
    Senior Member Theremingenieur
    Join Date
    Feb 2014
    Location
    Colmar, France
    Posts
    743
    Quote Originally Posted by ozeng View Post
    Wow, that's awesome! This stuff is legendary.
    Each time @tni posts some code, I feel like a nobody, given his deep knowledge, understanding and coding skills. That guy is just brilliant!

  11. #11
    Senior Member defragster
    Join Date
    Feb 2015
    Posts
    4,262
    Quote Originally Posted by Theremingenieur View Post
    Each time @tni posts some code, I feel that I'm a nobody, seen his deep knowledge, understanding and coding faculties. That guy is just brilliant!
    Indeed - each posted piece is a lesson ( or three ). I coded in the dark ages where that wasn't available - or was 'C' only - unless ASM was called for. ... and then there is understanding of the hardware detail . . .
