DMASPI library needs some (probably breaking) changes to really support multiple SPIs

Status
Not open for further replies.
christoph, a short question:

Does the lib use the SPI FIFO for TX? If not, any idea how to enable it? I'm asking because with my own code I'm getting slower speeds than theoretically possible.
 
Are you aware of any potential conflicts using the ADC library DMA functionality?
No, that shouldn't be a problem unless you run out of DMA channels. DMASPI needs two of them for each SPI. If you only use one SPI, it will only use two channels for that one.

That is what I was thinking would be the case. So a potential solution could be a separate function that pushes out a 16x16 buffer for section writes, that is copied from the main 128x64 buffer using memcpy?
If your display supports that, this would be a viable solution for the display part of the whole thing. However, you still need your display code to work together with the DAC part: It sounds like regular DAC writes would be preferable. Then you can do the following:
  • Set up a callback that is triggered after each DAC write
  • In that callback, check if new data needs to be written to the display. If so, register the DMASPI transfer and return immediately.
  • AFAIK DMASPI transfers can be registered within ISR contexts, so interrupt-driven DAC is fine for that.
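
For illustration, the steps above can be sketched as a small host-side model. This is not DmaSpi code: `DisplayLink`, `queueTransfer`, `transferDone` and `onDacWriteDone` are hypothetical names, and on the Teensy the body of `queueTransfer` would be replaced by registering a real DMASPI transfer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for "register a DMASPI transfer".
// It only records the request and returns immediately.
struct DisplayLink {
    volatile bool dirty = false;     // set by drawing code when the frame buffer changed
    volatile bool inFlight = false;  // true while a transfer is queued or running
    unsigned queued = 0;             // number of transfers handed to the queue

    void queueTransfer(const uint8_t* data, size_t len) {
        (void)data;                  // a real version would hand these to DMASPI
        (void)len;
        inFlight = true;
        ++queued;
    }

    // Would be called once the transfer has finished.
    void transferDone() { inFlight = false; }
};

// Called at the end of every DAC write (for example from the timer ISR).
// It must stay short: check, queue, return.
inline void onDacWriteDone(DisplayLink& link, const uint8_t* frame, size_t len) {
    if (link.dirty && !link.inFlight) {
        link.dirty = false;
        link.queueTransfer(frame, len);
    }
}
```

The `inFlight` flag keeps the callback from queueing the same frame twice. How it gets cleared is left open here; polling the transfer's busy state from the main loop would be one option.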
 
christoph, a short question:

Does the lib use the SPI FIFO for TX? If not, any idea how to enable it? I'm asking because with my own code I'm getting slower speeds than theoretically possible.
The only transmit DMA trigger is the TFFF (Transmit FIFO Fill Flag), which is set "while the transmit FIFO is not full", so the words should go out back to back, without significant gaps in between. Here's the relevant code for KINETIS_K:

DMA channel setup, only once when the DMASPI engine is initialized (https://github.com/crteensy/DmaSpi/blob/master/DmaSpi.h#L532):
Code:
static void begin_setup_txChannel_impl()
{
  txChannel_()->disable();
  txChannel_()->destination((volatile uint8_t&)SPI0_PUSHR);
  txChannel_()->disableOnCompletion();
  txChannel_()->triggerAtHardwareEvent(DMAMUX_SOURCE_SPI0_TX);
}

TX DMA channel setup for each transfer (https://github.com/crteensy/DmaSpi/blob/master/DmaSpi.h#L469):
Code:
      if (m_pCurrentTransfer->m_pSource != nullptr)
      {
        // real data source
        DMASPI_PRINT(("  real source\n"));
        txChannel_()->sourceBuffer(m_pCurrentTransfer->m_pSource,
                                   m_pCurrentTransfer->m_transferCount);
      }
      else
      {
        // dummy data source
        DMASPI_PRINT(("  dummy source\n"));
        txChannel_()->source(m_pCurrentTransfer->m_fill);
        txChannel_()->transferCount(m_pCurrentTransfer->m_transferCount);
      }

SPI part, before selecting a chip (https://github.com/crteensy/DmaSpi/blob/master/DmaSpi.h#L550):
Code:
static void pre_cs_impl()
{
  SPI0_SR = 0xFF0F0000;
  SPI0_RSER = SPI_RSER_RFDF_RE | SPI_RSER_RFDF_DIRS | SPI_RSER_TFFF_RE | SPI_RSER_TFFF_DIRS;
}

Finally, after selecting a chip, when the transfer is set up and we're ready to send/receive (https://github.com/crteensy/DmaSpi/blob/master/DmaSpi.h#L556):
Code:
static void post_cs_impl()
{
  rxChannel_()->enable();
  txChannel_()->enable();
}

The completion ISR doesn't do anything special to the TX DMA channel; it just clears the SPI flags and the RX channel interrupt flag.

Possible attack vectors:
  • Check your SPI settings (bitrate) - do you see gaps on a scope (if you have one)?
  • If other DMA activity might be getting in the way: check channel arbitration (no real experience with that here), maybe something is simply getting in the way before the FIFO runs empty

Also, what does "your own code" do? Is it a different go at SPI with DMA?

Regards

Christoph
 
If your display supports that, this would be a viable solution for the display part of the whole thing. However, you still need your display code to work together with the DAC part: It sounds like regular DAC writes would be preferable. Then you can do the following:

Yes; so far this has been the only tricky bit I am still hung up on. I've been trying to find the proper/clean way of setting up the trx.busy to be able to belay any other potential SPI communications or else "^&$^(%&)‚fl°‡··‡fl·°‡fl·°flfi››—€....."


  • Set up a callback that is triggered after each DAC write

Are you aware of any good resources on setting this up? Specifically, should I look into callbacks triggered by an ISR? Or would I need to use a DmaSpi method to trigger it from the end of the DAC ISR?

The DAC ISR (currently on a 50 us IntervalTimer) does the following for each of the 4 ADC/DAC channel pairs:
  • read the ADC from the 2 channel ports to optimize ADC usage (4 input channels feed the 2 internal ADC units, which flip between the busiest),
  • find the phase for the wavetable driver depending on the mode (switch case for tempo/trigger/hertz),
  • write the DAC channel.


Somehow it works fine with the 4 channels in the ISR, at an average loop time of ~30-40 us.
A full-screen write of the screen buffer using SPI transactions takes ~2600 us.
 
By the way, sending intermittent chaos to an old screen can cause some very interesting glitches. One in particular has me very curious. I am not aware of anyone having made this display work in multiple levels of greyscale, but I've gotten smooth greyscale gradients from the glitches at times, and also some really interesting bright-as-hell slow-scan effects that look like running a copy machine with the lid open (a bright streaming light bar).

I'm going to see if I can get the DAC communication ISR to regulate the timing of the screen.
 
Other parts of your code that use the SPI must be made aware of the fact that SPI is being used asynchronously by the DMASPI library. You can create a mutex-like object that is used to lock the SPI for a specific code module, and unlocks it when that module is done with SPI. There are many ways to do this, but the bottom line is that these parts must cooperate in some way.
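
As one possible shape for such a mutex-like object, here is a minimal sketch. `SpiBusLock` and `SpiBusGuard` are hypothetical names, not part of DMASPI or the SPI library, and the sketch assumes `std::atomic_flag` is usable on the target toolchain.

```cpp
#include <atomic>

// Hypothetical cooperative lock for one SPI bus.
class SpiBusLock {
public:
    // Returns true if the caller now owns the bus, false if it is busy.
    bool tryLock() { return !busy_.test_and_set(std::memory_order_acquire); }
    // Releases the bus; only the current owner should call this.
    void unlock() { busy_.clear(std::memory_order_release); }
private:
    std::atomic_flag busy_ = ATOMIC_FLAG_INIT;
};

// Scope guard for blocking (non-DMA) users: the bus is released
// automatically when the guard goes out of scope.
class SpiBusGuard {
public:
    explicit SpiBusGuard(SpiBusLock& lock) : lock_(lock), owns_(lock.tryLock()) {}
    ~SpiBusGuard() { if (owns_) lock_.unlock(); }
    SpiBusGuard(const SpiBusGuard&) = delete;
    SpiBusGuard& operator=(const SpiBusGuard&) = delete;
    bool owns() const { return owns_; }
private:
    SpiBusLock& lock_;
    bool owns_;
};
```

A module that finds `owns()` false simply skips its SPI access and retries later. For an asynchronous DMASPI transfer the scope guard does not fit, because the bus stays in use after the registering function returns; there the module would call `tryLock()` before registering and `unlock()` once the transfer is no longer busy.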

The glitches and slow scan effects can have many reasons, which we can't investigate without the code.
 
Hey there, this question may not be totally related to the topic, but I don't feel like opening a new one. Is it possible to use hardware chip select on Teensy 3.1/3.2?
 
While playing around with an async version of the SPI transfer method, I have run into a few different issues with DMA. Some of them affect this library as well, so I thought I would mention them here. More info is up on a couple of other threads, including: https://forum.pjrc.com/threads/43048-How-best-to-manage-multiple-SPI-busses and https://forum.pjrc.com/threads/43585-Teensy-3-5-SPI-DMA

a) Teensy LC - Does not work at all with DMA on the current release. The system would at times do an 8-bit or 16-bit transfer to a 32-bit register, which must be accessed as 32 bits, and it faults. TNI put a PR into the core project. I also put in a PR (https://github.com/PaulStoffregen/cores/pull/242) with this fix and fixes for a few other DMA issues.

b) Once I got a) working, I ran into issues on SPI1: when you disable SPI1 (before the CS call) and then re-enable it, there appears to be a clock pulse that corrupts the data. I have a version of your test program that I hacked up for SPI1. I set a jumper on 0-1, and the buffers don't match.
Code:
Hi!
Buffers are prepared
Time for non-DMA transfer: 276us
src and dest match

Press a key to continue

DmaSpi::begin() : DmaSpi::start() : state_ = eStopped
DmaSpi::beginNextTransfer: no pending transfer
Transfer @ 0x20001774
Testing src -> dest, single transfer
--------------------------------------------------
Transfer @ 0x200017ac
DmaSpi::registerTransfer(0x20001774)
  DmaSpi::addTransferToQueue() : queueing transfer
  starting transfer
DmaSpi::beginNextTransfer: starting transfer @ 0x20001774
  this was the last in the queue
  real sink
  real source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffe74 64 a01a0080 
TX: 1fffff3c 40077006 64 20520080 
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001774
  state = eRunning
DmaSpi::beginNextTransfer: no pending transfer
Finished DMA transfer
src and dest don't match
 src: 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 0x63 
dest: 0x00 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 
==================================================


Testing src -> discard, single transfer
--------------------------------------------------
Transfer @ 0x200017ac
DmaSpi::registerTransfer(0x20001774)
  DmaSpi::addTransferToQueue() : queueing transfer
  starting transfer
DmaSpi::beginNextTransfer: starting transfer @ 0x20001774
  this was the last in the queue
  dummy sink
  real source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffe50 64 a0120080 
TX: 1fffff3c 40077006 64 20520080 
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001774
  state = eRunning
DmaSpi::beginNextTransfer: no pending transfer
Finished DMA transfer
last discarded value is 0x61
That appears to be wrong, it should be src[DMASIZE-1] which is 0x63
==================================================


Testing 0xFF dummy data -> dest, single transfer
--------------------------------------------------
Transfer @ 0x200017ac
DmaSpi::registerTransfer(0x20001774)
  DmaSpi::addTransferToQueue() : queueing transfer
  starting transfer
DmaSpi::beginNextTransfer: starting transfer @ 0x20001774
  this was the last in the queue
  real sink
  dummy source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffe74 64 a01a0080 
TX: 20001784 40077006 64 20120080 
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001774
  state = eRunning
DmaSpi::beginNextTransfer: no pending transfer
Finished DMA transfer
src and dest don't match
 src: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 
dest: 0x62 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 
==================================================


Testing multiple queued transfers
--------------------------------------------------
Transfer @ 0x200017ac
Transfer @ 0x20001790
DmaSpi::registerTransfer(0x20001774)
  DmaSpi::addTransferToQueue() : queueing transfer
  starting transfer
DmaSpi::beginNextTransfer: starting transfer @ 0x20001774
  this was the last in the queue
  real sink
  real source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffe74 64 a01a0080 
TX: 1fffff3c 40077006 64 20520080 
DmaSpi::registerTransfer(0x20001790)
  DmaSpi::addTransferToQueue() : queueing transfer
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001774
  state = eRunning
DmaSpi::beginNextTransfer: starting transfer @ 0x20001790
  this was the last in the queue
  real sink
  real source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffed8 64 a01a0080 
TX: 1fffff3c 40077006 64 20520080 
Finished DMA transfer
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001790
  state = eRunning
DmaSpi::beginNextTransfer: no pending transfer
Finished DMA transfer1
src and dest don't match
 src: 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 0x63 
dest: 0xff 0x00 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 
src and dest don't match
 src: 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 0x63 
dest: 0x62 0x00 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 
==================================================


Testing pause and restart
--------------------------------------------------
DmaSpi::registerTransfer(0x20001774)
  DmaSpi::addTransferToQueue() : queueing transfer
  starting transfer
DmaSpi::beginNextTransfer: starting transfer @ 0x20001774
  this was the last in the queue
  real sink
  real source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffe74 64 a01a0080 
TX: 1fffff3c 40077006 64 20520080 
DmaSpi::registerTransfer(0x20001790)
  DmaSpi::addTransferToQueue() : queueing transfer
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001774
  state = eStopping
Time until stopped: 249 us
Finished DMA transfer
DMA SPI appears to have stopped (this is good)
restarting
DmaSpi::start() : state_ = eStopped
DmaSpi::beginNextTransfer: starting transfer @ 0x20001790
  this was the last in the queue
  real sink
  real source
post_cs S C1 C2: 20 50 24
RX: 40077006 1ffffed8 64 a01a0080 
TX: 1fffff3c 40077006 64 20520080 
DmaSpi::rxIsr_()
  finishCurrentTransfer() @ 0x20001790
  state = eRunning
DmaSpi::beginNextTransfer: no pending transfer
Finished DMA transfer1
src and dest don't match
 src: 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 0x63 
dest: 0x62 0x00 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 
src and dest don't match
 src: 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 0x62 0x63 
dest: 0x62 0x00 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d 0x5e 0x5f 0x60 0x61 
==================================================


Testing src -> dest, with chip select object
--------------------------------------------------
Transfer @ 0x200017ac
DmaSpi::registerTransfer(0x20001774)
  DmaSpi::addTransferToQueue() : queueing transfer
  starting transfer
DmaSpi::beginNextTransfer: starting transfer @ 0x20001774
  this was the last in the queue
  real sink
  real source

c) On Teensy 3.x boards, if you do a 16-bit transfer just before a DMA transfer, the DMA transfer will be corrupted. For example, add a call like this to your DMA example program:
Code:
  SPI.transfer16(0xffff);
  DmaSpi::Transfer trx(nullptr, 0, nullptr);
That is because the high word of PUSHR will still be set to 16-bit mode, so each of your one-byte writes to PUSHR will output two bytes on the bus. To resolve this in my SPI version, my library code does its own PUSHR write of the user's first data byte, with CTAR0 selected and the CONT bit set, so the transfer then actually runs faster. (The code is up in my SPI fork/branch.)

d) Teensy 3.5 with SPI1/2 - I was interested in getting async operations to work here as well. But SPI1 and SPI2 each have only one DMA source, which can be used for TX or RX but not both at the same time. My current version does a couple of different things depending on whether it is TX only, RX only, or a full transfer. Currently I use the DMA channel for TX in all cases, although that may change later.

1) TX-only operation - I set up the DMA to do writes only and to interrupt when it completes, and I then toss the data from the RX queue. However, this had the problem that it would return before all data was sent, so if the user changed CS at that point there was a problem. So the code now does not let the DMA send the last byte. Instead, the DMA interrupt handler issues the PUSHR with the EOQ flag, and an SPI interrupt handler receives the EOQ interrupt and does the appropriate end-of-transfer work.

2) RX and Transfer - I have a simple SPI interrupt handler that processes the interrupt for data in the RX FIFO. I may change this later to have TX interrupt-driven and RX on the DMA channel: if the interrupts are not processed fast enough, some RX data may be lost, but if we are slow to process TX, it should hopefully just slow the operation down.

That is all for now.
 
c) on Teensy 3.x boards, if you do a 16 bit transfer just before you do a DMA transfer, your transfer will screw up. That is if in your dma example program you add a call: like:
Code:
  SPI.transfer16(0xffff);
 DmaSpi::Transfer trx(nullptr, 0, nullptr);
That is because the high word will be still set to 16 bit mode and all of your one byte writes to PUSHR will output two bytes on the bus...
On Teensy 3.6.

Teensy 3.2 / 3.5 zero out the command word (at least if DMA without scatter-gather is used), so they end up with CTAS 0.

Code:
  DmaSpi::Transfer trx(nullptr, 0, nullptr);
  Serial.printf("SPI0 PUSHR (1): %x\n", SPI0_PUSHR);
  SPI.transfer16(0xffff);
  Serial.printf("SPI0 PUSHR (2): %x\n", SPI0_PUSHR);
  trx = DmaSpi::Transfer(src, DMASIZE, dest);
  dmaspi.registerTransfer(trx);
  while(trx.busy()) {}
  Serial.printf("SPI0 PUSHR (3): %x\n", SPI0_PUSHR);

Teensy 3.2:
SPI0 PUSHR (1): 63
SPI0 PUSHR (2): 1000ffff
SPI0 PUSHR (3): 63


Teensy 3.6:
SPI0 PUSHR (1): 63
SPI0 PUSHR (2): 1000ffff
SPI0 PUSHR (3): 10000063
 
eeew...that sounds like a lot of trouble. I'll try to wrap my head around it and see if I can find time to add fixes. Regarding your bullets up there:
a) Teensy LC doesn't work at all? Which teensyduino version is this? Did you just run the example that comes with the current DMASPI code?
b) I'll skip that for now since you pointed out that a) should be working first
c) the library can probably clear the command word, which sounds like the correct thing to do here
d) Some time ago I started to think about ways to use the single DMA source in T3.5, but never came to a satisfying result.
 
a) Teensyduino 1.36. Yes, I tried your DMA test, which also faults. Again, there is a simple fix: patch DMABaseClass in DMAChannel.h.
Code:
#elif defined(KINETISL)


class DMABaseClass {
public:
	typedef struct __attribute__((packed, aligned(4))) {
		volatile const void * volatile SAR;
		volatile void * volatile       DAR;
		volatile uint32_t              DSR_BCR;
		volatile uint32_t              DCR;
	} CFG_t;
	CFG_t *CFG;
...
The change was adding aligned(4) to the attribute. (Look for TNI's pull request for the core...)
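
To show what the attribute changes, here is a small host-side check, assuming a GCC or Clang toolchain. `CfgPacked` and `CfgAligned` are illustrative copies of the register layout, with `uint32_t` fields in place of the pointers so the result is the same on a 64-bit host; they are not the core's actual types.

```cpp
#include <cstdint>

// Attribute without the patch: packed only.
struct __attribute__((packed)) CfgPacked {
    volatile uint32_t SAR;
    volatile uint32_t DAR;
    volatile uint32_t DSR_BCR;
    volatile uint32_t DCR;
};

// Attribute with the patch: packed, but aligned to 4 bytes.
struct __attribute__((packed, aligned(4))) CfgAligned {
    volatile uint32_t SAR;
    volatile uint32_t DAR;
    volatile uint32_t DSR_BCR;
    volatile uint32_t DCR;
};

// The layout is identical in both cases.
static_assert(sizeof(CfgPacked) == 16 && sizeof(CfgAligned) == 16, "same size");
// packed alone drops the assumed alignment to 1 byte.
static_assert(alignof(CfgPacked) == 1, "packed implies byte alignment");
// aligned(4) restores word alignment.
static_assert(alignof(CfgAligned) == 4, "aligned(4) gives word alignment");
```

With an assumed alignment of 1, the compiler may split a 32-bit access through a pointer to the struct into narrower accesses, which would be consistent with the 8-bit and 16-bit register accesses described in a).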

b) My guess is that you would have the same issue with earlier versions of Teensyduino as well.
c) Probably several workarounds will work.
d) Yep, a pain...
 
a) Teensy LC doesn't work at all?
It does occasionally work, when the compiler has full code visibility and can be sure about alignment of the DMA register struct. But if there is a possibility that the DMA register pointer was changed, the register access will get mis-compiled and fail.

c) the library can probably clear the command word, which sounds like the correct thing to do here
The chip designers went out of their way to make life difficult. You can't write the command word by itself. You might want to do a SPI.transfer(...) for the first byte if CTAS 1 is selected (read PUSHR to find out).
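
A minimal sketch of the "read PUSHR to find out" check, assuming the usual Kinetis DSPI layout in which CTAS occupies bits 30..28 of PUSHR. `ctasOf` and `needsManualFirstByte` are hypothetical helper names; they take the register value as a plain argument, so this compiles on a host.

```cpp
#include <cstdint>

// Extract the 3-bit CTAS field (bits 30..28) from a PUSHR readback value.
constexpr uint32_t ctasOf(uint32_t pushr) { return (pushr >> 28) & 0x7u; }

// True if the command word still selects a CTAR other than 0, in which case
// the first byte would be sent with the CPU before starting DMA.
constexpr bool needsManualFirstByte(uint32_t pushr) { return ctasOf(pushr) != 0; }
```

Applied to the readback values posted above: 0x63 gives CTAS 0, while 0x1000ffff and 0x10000063 both give CTAS 1. On the Teensy the argument would be the value read from SPI0_PUSHR.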
 
@KurtE, regarding option 2) for T3.5 up there: for combined TX/RX, isn't there some kind of linking that would trigger one DMA channel when another has completed a transfer (minor loop linking)? Then it might be possible to do the following:
  • Send the first of n bytes manually
  • Let the RX DMA channel wait for the corresponding incoming byte (n transfers)
  • Let the RX channel trigger the TX channel (n-1 transfers)
So the RX channel would be driven by the SPI's DMA request line, and the TX channel by the RX channel. It would surely not be as smooth as a true two-channel DMA SPI using the FIFO (which is what we have now, more or less), but it's probably better than nothing.
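
The data flow of that idea can be modelled on a host, purely to check the counting (one manual push, n RX transfers, n-1 linked TX transfers). This is a simulation with a looped-back, one-word "SPI"; `LoopbackSpi`, `LinkedResult` and `runLinkedTransfer` are made-up names, and nothing here touches real DMA registers or says whether the hardware supports this linking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of an SPI with MOSI wired to MISO and a one-word receive register.
struct LoopbackSpi {
    bool rxReady = false;
    uint8_t rxData = 0;
    void push(uint8_t b) { rxData = b; rxReady = true; }  // byte goes out and comes back
    uint8_t pop() { rxReady = false; return rxData; }
};

struct LinkedResult {
    std::vector<uint8_t> dest;     // bytes stored by the "RX channel"
    size_t manualPushes = 0;       // bytes sent by the CPU
    size_t linkedTxTransfers = 0;  // bytes sent by the "TX channel" via the link
};

inline LinkedResult runLinkedTransfer(const std::vector<uint8_t>& src) {
    LinkedResult r;
    if (src.empty()) return r;
    LoopbackSpi spi;
    size_t next = 0;
    spi.push(src[next++]);             // step 1: CPU sends the first byte
    r.manualPushes = 1;
    while (spi.rxReady) {              // SPI request drives the RX channel
        r.dest.push_back(spi.pop());   // step 2: RX channel stores the incoming byte
        if (next < src.size()) {       // step 3: RX completion triggers the TX channel
            spi.push(src[next++]);
            ++r.linkedTxTransfers;
        }
    }
    return r;
}
```

In this model every TX byte waits for the previous RX byte, so the words go out one at a time with no FIFO queueing.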

Edit: Oh I see in the other thread that you already tried to implement that approach. I'll try to fully understand what you wrote there...
 
It does occasionally work, when the compiler has full code visibility and can be sure about alignment of the DMA register struct. But if there is a possibility that the DMA register pointer was changed, the register access will get mis-compiled and fail.


The chip designers went out of their way to make life difficult. You can't write the command word by itself. You might want to do a SPI.transfer(...) for the first byte if CTAS 1 is selected (read PUSHR to find out).

What I would typically find is it would crash when it was trying to set either the source or destination pointer in the DMASettings object.

And as you mentioned I did not see any way to clear out the upper word, so that is why I ended up pushing the first byte of the transfer...


@KurtE, regarding option 2) for T3.5 up there: for combined TX/RX, isn't there some kind of linking that would trigger one DMA channel when another has completed a transfer (minor loop linking)? Then it might be possible to do the following:
  • Send the first of n bytes manually
  • Let the RX DMA channel wait for the corresponding incoming byte (n transfers)
  • Let the RX channel trigger the TX channel (n-1 transfers)
So the RX channel would be driven by the SPI's DMA request line, and the TX channel by the RX channel. It would surely not be as smooth as a true two-channel DMA SPI using the FIFO (which is what we have now, more or less), but it's probably better than nothing.
Good question. I am not sure, and you obviously have played more with DMA than I have... I just kept trying things until I found something that worked.

Example: for my c), my first attempt was to set up a DMASettings chain in which the first setting output the first byte, but as a 32-bit write (the full PUSHR, including the command settings). It was then linked to one that wrote to PUSHR with size 1 for the rest of the bytes. This worked great on T3.6 but failed completely on both 3.5 and 3.1. If it had worked, I also had a third item set up in the list for the 3.5 that would again write 4 bytes to PUSHR for the last data byte, with the EOQ flag.

But if you find a clean way that works on the 3.5, that would be great. As I mentioned, for SPI1/2 a write operation is not too bad. You get two interrupts. The first comes at the end of the DMA channel's transfer, when the SPI system says it can handle more data and you have none, but the SPI is still active. There I push the last byte with EOQ, and then you get the SPI interrupt for EOQ at the end of its transfer.
 
you obviously have played more with DMA than I have...
Actually I doubt that. When I wrote DMASPI I had a clear goal and pursued it until I got it to work. I pretty much ignored any DMA feature that didn't sound promising, and from your posts I have the impression that you have tried more than I did. In that sense, you're the expert. I'll have a look at Paul's DmaChannel class to see if it supports minor loop linking.

Indeed! There it is: https://github.com/PaulStoffregen/cores/blob/master/teensy3/DMAChannel.h#L450
 
FYI - I did a quick search for KL26 DMA SPI issues and do see a few things like:
http://www.thepositiverail.com/blog...ce-driver-for-the-kinetis-kl26-spi-peripheral

It talks about the issue I was seeing when I tried to remove the disable/enable of SPI earlier today. Excerpt:
2) First byte. It turns out that the DMA system is not quite fast enough to catch the first DMA transfer request - the symptom of this is the first byte in the transmission is duplicated. Again, a little tricky to catch, if you're not paying attention. The solution is to push the first word out with the CPU.

With these two twists in mind, the procedure for transmitting using DMA then is:

  • logic low on chip select (enable)
  • push first byte into DL register (this is actually a read of the status register S, followed by write to the DL - otherwise the DL write is ignored)
  • enable transmit DMA request (TXDMAE)
  • wait for the transmit DMA to complete (I use an interrupt handler to set an indicator to 1 and poll it; can also be done using an RTOS semaphore)
  • poll the SPI Receive Full (status register S, SRF) flag until it is set, indicating that the last word was read from the pin
  • logic high on chip select (disable)
  • reset DMA and interrupts - if you don't, you will also get some garbled first bytes (the direct push out of the register will be ignored, and you'll see a double byte)
 
Maybe? It depends on how the compiler is feeling... Or maybe compiler optimize options or...
And I think it started showing up with the new compiler that is in the current release

It may also depend on how the different fields get set. As I sort of mentioned, I made three changes to the core (kinetis.h and DMAChannel.h):
1) TNI's fix again
2) DMA channel name corrections for T3.5
3) The code that sets CITER/BITER was wrong when you set the length to something >= 512 and then back to something < 512...
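
One way such a bug can arise is sketched below as a host-side model of a single 16-bit eDMA CITER/BITER count register. This is a guess at the failure mode based only on the description in 3), not a copy of the core code or of the pull request; `setCountBuggy` and `setCountFixed` are hypothetical names. The model assumes the usual eDMA encoding: with channel linking enabled (bit 15) only the low 9 bits hold the count, otherwise the low 15 bits do.

```cpp
#include <cstdint>

constexpr uint16_t kElink = 0x8000;  // channel-linking enable bit (assumed layout)

// Hypothetical buggy variant: short lengths always preserve the upper bits,
// even when those bits are leftovers from an earlier long count.
inline void setCountBuggy(uint16_t& reg, unsigned len) {
    if (len > 32767) return;
    if (len >= 512) {
        reg = static_cast<uint16_t>(len);
    } else {
        reg = static_cast<uint16_t>((reg & 0xFE00) | len);
    }
}

// Hypothetical fix: only preserve the upper bits when linking is enabled.
inline void setCountFixed(uint16_t& reg, unsigned len) {
    if (len > 32767) return;
    if ((reg & kElink) && len < 512) {
        reg = static_cast<uint16_t>((reg & 0xFE00) | len);
    } else {
        reg = static_cast<uint16_t>(len);
    }
}
```

In this model, setting 600 and then 100 with the buggy variant leaves 612 in the register (0x200 left over from 600, plus 100), while the fixed variant leaves 100.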
 
Sounds good. I made progress with T-LC running on both SPI and SPI1. I am now doing what the link I mentioned above describes: a manual push of the first byte to SPI before turning on DMA. I was also hitting some random timing issues, so I currently disable interrupts around the push and re-enable them after I have turned on both channels. More on my other thread...
 
Teensyduino 1.39 release, further development

Now that teensyduino 1.39 is released and KurtE pull-requested me towards the fact that there's apparently a common base to all SPIs, some changes to the code were needed.

What I've done now:
  • updated the code to work with a common SPI base. This will go by unnoticed since it's an internal detail
  • accepted KurtE's second example for SPI1
  • added an AbstractChipSelect1 class

This also opens the door to a simplification in the chip select classes: Until now, AbstractChipSelect was hard-coded to use a specific SPI instance (SPI, SPI1 or SPI2). Now it's possible to implement (not done yet) one of the following:
  • let the DMASPI library pass the SPI instance it's using to a chip select object
  • change the AbstractChipSelect constructor to include a reference to the SPI it's supposed to use
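
The second option could look roughly like the sketch below. It is only an illustration of the design: `FakeSpi`, `ChipSelectBase` and `ActiveLowCs` are stand-in names, the pin is modelled as a `bool`, and the real chip select classes in DMASPI may differ.

```cpp
// Hypothetical stand-in for an SPI instance; it only counts open transactions.
struct FakeSpi {
    int open = 0;
    void beginTransaction() { ++open; }
    void endTransaction() { --open; }
};

// Interface the DMA engine would call around each transfer.
class ChipSelectBase {
public:
    virtual ~ChipSelectBase() = default;
    virtual void select() = 0;
    virtual void deselect() = 0;
};

// The constructor receives the SPI instance the chip select belongs to,
// so nothing in the class is hard-coded to SPI, SPI1 or SPI2.
class ActiveLowCs : public ChipSelectBase {
public:
    ActiveLowCs(FakeSpi& spi, bool& pinLevel) : spi_(spi), pin_(pinLevel) {
        pin_ = true;  // idle high
    }
    void select() override {
        spi_.beginTransaction();
        pin_ = false;
    }
    void deselect() override {
        pin_ = true;
        spi_.endTransaction();
    }
private:
    FakeSpi& spi_;
    bool& pin_;
};
```

With the first option, `select()` and `deselect()` would instead take the SPI instance as an argument from the library, which avoids storing a reference but changes the interface seen by existing chip select classes.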

So far I've tested the current state of DMASPI on a Teensy 3.6, SPI0 and SPI1. I don't see a reason why the others shouldn't work, but I've marked them as "experimental" in the readme.
 
Hi christoph,

Have you looked at the issues associated with SPI1 and SPI2 on the T3.5 yet?

It took me a while to get it to work for the async SPI transfers that are now part of the build. Mine are rather limited in flexibility, but they work.
 