DmaSpi for teensyduino 1.20 RC2

Status
Not open for further replies.
I have been following your dmaspi development and tested the 10/7 zipped update. For me the sketch hangs at the first while(trx.busy()) regardless of the number of bytes to be transferred. If you are looking for an additional tester, I'll volunteer. Any library that can reduce the FPU load is worth the effort.
 
I have been following your dmaspi development and tested the 10/7 zipped update. For me the sketch hangs at the first while(trx.busy()) regardless of the number of bytes to be transferred. If you are looking for an additional tester, I'll volunteer. Any library that can reduce the FPU load is worth the effort.

By "The 10/7 zipped update" you mean the zip in Post #21 (http://forum.pjrc.com/threads/26479-DmaSpi-for-teensyduino-1-20-RC2?p=55837&viewfull=1#post55837)? It should not hang.

  • Did you use the example included in the zip?
  • Where exactly does it hang?
  • It needs terminal input (any key) to start, and to proceed after initialization (I probably should make that clearer...sorry if you simply missed that)

Regards

Christoph
 
Last edited:
I re-downloaded the zip from #21. first digit/enter gives the 234us. Second digit/enter stops at the point listed below. I'm using a Teensy 3.1. pins 11 & 12 are connected for the loopback (non-DMA works). I was not sure this zip was also updated to your github folder. I have not used DMA on the Teensy, but will try a few things with trx.busy() and enter my comments on github.

Hi!
Buffers are prepared
Time for non-DMA transfer: 234us
src and dest match

Press a key to continue

Testing src -> dest, single transfer
--------------------------------------------------
 
Looks like the receive channel's ISR is not called for some reason. I've had that problem before and I'm having a look at it now. If I could only remember what the reason was back then...
 
I'm not sure what exactly the bug was, but here are my findings and my suggestion (it fixes the issue)
  • With teensyduino 1.20 RC 2: The RX ISR is not called
    • There is a bug in teensyduino 1.20 RC 2 that misconfigures the DMA channel in some cases
    • I'm not sure if that applies to the first lockup in the DmaSpi example
    • I had opened an issue back then, and Paul fixed the code, but only in the repo and in later release candidates. I should have opened an RC 3 branch for DmaSpi then, but I didn't. Big mistake.
  • With teensyduino 1.20 (released october 8th, 2014; no RC any more): no lockup in the DmaSpi example, it runs as intended. So download the new teensyduino 1.20 release, please. Download link: https://www.pjrc.com/teensy/td_download.html

The code I successfully ran today with the new teensyduino 1.20 release is in my repo: https://github.com/crteensy/DmaSpi/tree/teensyduino_1.20_RC2

Regards

Christoph
 
Last edited:
I downloaded teensyduino 1.20 and the latest github DmaSpi library.
I have not been able to get cs working. It remains LOW at all times. I paired down your example transfers to this:

waitForKeyPress();
DMASPI0.begin();
DMASPI0.start();
Serial.println("Testing src -> dest, single transfer");
ActiveLowChipSelect cs(0, SPISettings());
DmaSpi0::Transfer trx(src, DMASIZE, dest); //locks up if "0, &cs); included
trx = DmaSpi0::Transfer(src, DMASIZE, nullptr, 0b10101010, &cs); //fixed pattern
DMASPI0.registerTransfer(trx);
while(trx.busy());
delay(100); //separate loop control of clk pin
Serial.println("Finished DMA transfer");

The problem is that rxIsr_() is still not accessed.

Also, This may be how you designed it but Transfer trx(src, DMASIZE, dest); MUST be called once to define trx. An actual Transfer is not needed. But the form cannot be: DmaSpi0::Transfer trx(src, DMASIZE, dest, 0, &cs); This locks up.
 
So
Code:
DmaSpi0::Transfer trx(src, DMASIZE, dest); //locks up if "0, &cs); included
DMASPI0.registerTransfer(trx);
while(trx.busy());
works, but
Code:
DmaSpi0::Transfer trx(src, DMASIZE, dest, 0, &cs);
DMASPI0.registerTransfer(trx);
while(trx.busy());
locks? And
Code:
DmaSpi0::Transfer trx;
trx = DmaSpi0::Transfer(src, DMASIZE, nullptr, 0b10101010, &cs);
DMASPI0.registerTransfer(trx);
while(trx.busy());
does not lock, but cs is not working?
 
I think I found it. In DmaSpi::beginPendingTransfer():

If a transfer has a dummy sink or source (nullptr), the transfer size is not set correctly. In my example sketch this didn't have any effect because the DMA channels' transfer count was set by a transfer that was performed previously (the first transfer sets both channels' transfer counts to DMASIZE).

In your case, if you start with a dummy transfer, it doesn't set the transfer count and locks.

I've pushed to github, try the current state.
 
If your problems with ActiveLowChipSelect persist after you tried the transfer size fix: What happens when you use a DebugChipSelect? Does it output "Debug CS: select()"?

I was able to reproduce a locked down CS only with incorrectly set transfer sizes, so my latest commit might solve both of the problems you are seeing.
 
Last edited:
I worked through the ActiveLowChipSelect function.

When I call in the .ino: ActiveLowChipSelect cs(value, SPISettings());

With the value = 10 the cs pin works properly. In chipSelect.h at line 55: pin_(pin); both pin_ and pin equal 10. With value = 0 both equal 0, and cs remains low.

P.S. When I added DebugChipSelect cs; (and commented out ActiveLow...) it returned:
Debug CS: select()
Debug CS: deselect()
repeatedly.
 
Last edited:
So apparently something is keeping the ActiveLowChipSelect constructor from setting the pin to output, or from setting it to high level. Do you have any call Serial1.begin(...) in your code that might be reconfiguring pin 0 for UART usage?
 
I tried a few more things and have one version that works with cs(0, SPISettings()). I left the //s in the working version to make it easier to see what changed.
Code:
Problem code:
 DMASPI0.begin();
  DMASPI0.start();
   Serial.println("Testing src -> dest, with CHIP SELECT object");
  Serial.println("--------------------------------------------------");
  ActiveLowChipSelect cs(0, SPISettings());
  //DebugChipSelect cs;
  DmaSpi0::Transfer trx(src, 1, dummy);
  DmaSpi0::registerTransfer(trx);
  while(trx.busy());
  Serial.println("==================================================\n\n"); 
 
  long loopcntoutter = 5;
  long loopcntinner = 10; 

  waitForKeyPress();
  for (long j = loopcntoutter; j > 0; j--) {
   for (long i = loopcntinner; i > 0 ; i--){
    trx = DmaSpi0::Transfer(src, DMASIZE, dest, 0, &cs);
    DMASPI0.registerTransfer(trx);
    while(trx.busy());
    delay(30);    //between each DMASIZE
   }
  delay(70);
 }

Working code:
 DMASPI0.begin();
  DMASPI0.start();
  Serial.println("Testing src -> dest, with CHIP SELECT object");
  Serial.println("--------------------------------------------------");
  //ActiveLowChipSelect cs(0, SPISettings());
  //DebugChipSelect cs;
  //DmaSpi0::Transfer trx(src, 1, dummy);
  //DmaSpi0::registerTransfer(trx);
  //while(trx.busy());
  Serial.println("==================================================\n\n"); 
 
  long loopcntoutter = 5;
  long loopcntinner = 10; 

  waitForKeyPress();
  ActiveLowChipSelect cs(0, SPISettings());
  DmaSpi0::Transfer trx(src, 1, dummy);
  trx = DmaSpi0::Transfer(src, DMASIZE, dest, 0, &cs);
  for (long j = loopcntoutter; j > 0; j--) {
   for (long i = loopcntinner; i > 0 ; i--){
    //trx = DmaSpi0::Transfer(src, DMASIZE, dest, 0, &cs);
    DMASPI0.registerTransfer(trx);
    while(trx.busy());
    delay(30);    //between each DMASIZE
   }
  delay(70);
 }
 
the problem code:
Code:
 DMASPI0.begin();
  DMASPI0.start();
   Serial.println("Testing src -> dest, with CHIP SELECT object");
  Serial.println("--------------------------------------------------");
  ActiveLowChipSelect cs(0, SPISettings());
  //DebugChipSelect cs;
  DmaSpi0::Transfer trx(src, 1, dummy); // THIS LOOKS SUSPICIOUS!
  DmaSpi0::registerTransfer(trx);
  while(trx.busy());
  Serial.println("==================================================\n\n"); 
 
  long loopcntoutter = 5;
  long loopcntinner = 10; 

  waitForKeyPress();
  for (long j = loopcntoutter; j > 0; j--) {
   for (long i = loopcntinner; i > 0 ; i--){
    trx = DmaSpi0::Transfer(src, DMASIZE, dest, 0, &cs);
    DMASPI0.registerTransfer(trx);
    while(trx.busy());
    delay(30);    //between each DMASIZE
   }
  delay(70);
 }
Please give me the definition of dummy. It's in the "dest" argument place, and should be pointer to a location writable by DmaSpi. Is that the case? If not, it might cause <insert random problem here> depending on its value. What is your intention here? Do you want to discard incoming data or do you want to store incoming data in dummy?

The working code
Code:
 DMASPI0.begin();
  DMASPI0.start();
  Serial.println("Testing src -> dest, with CHIP SELECT object");
  Serial.println("--------------------------------------------------");
  //ActiveLowChipSelect cs(0, SPISettings());
  //DebugChipSelect cs;
  //DmaSpi0::Transfer trx(src, 1, dummy);
  //DmaSpi0::registerTransfer(trx);
  //while(trx.busy());
  Serial.println("==================================================\n\n"); 
 
  long loopcntoutter = 5;
  long loopcntinner = 10; 

  waitForKeyPress();
  ActiveLowChipSelect cs(0, SPISettings());
  DmaSpi0::Transfer trx(src, 1, dummy);
  trx = DmaSpi0::Transfer(src, DMASIZE, dest, 0, &cs);
  for (long j = loopcntoutter; j > 0; j--) {
   for (long i = loopcntinner; i > 0 ; i--){
    //trx = DmaSpi0::Transfer(src, DMASIZE, dest, 0, &cs);
    DMASPI0.registerTransfer(trx);
    while(trx.busy());
    delay(30);    //between each DMASIZE
   }
  delay(70);
 }
is a bit different in how the destination for data received from the slave is defined. Is dest defined like this:
Code:
volatile uint8_t dest[DMASIZE];
or similar, as in the example?
 
I added the dummy call to instantiate trx when the long &cs version locked up. Your new version fixed that problem to that it is not needed now.
Code:
working code:
  long loopcntoutter = 5;
  long loopcntinner = 10; 

  waitForKeyPress();
  ActiveLowChipSelect cs(0, SPISettings());
  DmaSpi0::Transfer trx(src, DMASIZE, dest, 0, &cs);
  for (long j = loopcntoutter; j > 0; j--) {
   for (long i = loopcntinner; i > 0 ; i--){
    DMASPI0.registerTransfer(trx);
    while(trx.busy());
    delay(30);    //between each DMASIZE
   }
  delay(70);
 }

Also, in ActiveLowChipSelect() a potential error with cs = 0 is not handled and the cs pin is left LOW. I added the following patch to trap cs =0 and instead use the board's default cs pin. You may have a better way to handle this.
Code:
#ifndef BOARD_SPI_DEFAULT_SS
#define BOARD_SPI_DEFAULT_SS (10)
#endif

ActiveLowChipSelect(const unsigned int& pin, const SPISettings& settings)
  : pin_(pin), 
  settings_(settings)
{
  pin_ = (pin) ? pin : BOARD_SPI_DEFAULT_SS;  //ppd
  pinMode(pin_, OUTPUT);
  digitalWriteFast(pin_, 1);
}
The code is working now so I will put it to use in my project. One suggestion: The library should support multiple cs pins for SPI devices connected in parallel. A LED matrix is a typical example. If you want more testing, we can continue this thread, otherwise , I will add any futures issues via github. Thx for all the effort.
 
That sounds good, thanks for the feedback!

Regarding specific ChipSelect classes: I introduced the abstract base class (AbstractChipSelect) for exactly this purpose: You can create your own, custom chip select, derive it from AbstractChipSelect, and implement whatever you want.

As nobody is currently reporting broken code, I will now do some cleaning up in the repo and merge the current state into the master branch.
 
Also, in ActiveLowChipSelect() a potential error with cs = 0 is not handled and the cs pin is left LOW. I added the following patch to trap cs =0 and instead use the board's default cs pin. You may have a better way to handle this.
Code:
#ifndef BOARD_SPI_DEFAULT_SS
#define BOARD_SPI_DEFAULT_SS (10)
#endif

ActiveLowChipSelect(const unsigned int& pin, const SPISettings& settings)
  : pin_(pin), 
  settings_(settings)
{
  pin_ = (pin) ? pin : BOARD_SPI_DEFAULT_SS;  //ppd
  pinMode(pin_, OUTPUT);
  digitalWriteFast(pin_, 1);
}

I don't quite understand why pin=0 should be trapped as a potential error. The pin exists, and it can be used as an output, can't it? https://www.pjrc.com/teensy/pinout.html shows pin 0 as RX1. If Serial1 is not used, the pin can be used for GPIO.

Also, NEVER EVER implicitly use a pin (like SS, pin 10 in your suggestion) as an output instead of another. That can seriously fry the mk20 or other parts, if it's not expected. The better solution in this case is to use an assertion, I'd even say static_assert, because the hardware configuration is known at compile time:
Code:
static_assert(pin > 0, "You are trying to use an unsupported pin for chip select (pin 0 is now allowed)");
Again, I'm pretty confident that pin0 is allowed to be used as GPIO.
 
That's great, thank you for testing and using it.

During the previous discussions I had the impression that the transfer constructor might benefit from a few overloads or a better grouping of parameters. Is that correct? It could for example accept parameter objects that specify source and destination in a more concise manner, without the need for defaults or filling in unused parameters. Something like
Code:
DmaSpi0::Transfer(dummySource(0xFF), number_of_words, destination(myBuffer), &cs);
or
Code:
DmaSpi0::Transfer(source(mySource), number_of_words, dummyDestination());
This might be easier to set up and the compiler could optimize them just fine, resulting in no overhead.

Any other suggestions or requests?
 
I'm having issues using the library to repeatedly send multiple transfers with an actual SPI device

This code works fine but is very precarious.
Code:
#include <WProgram.h>

#include <SPI.h>
#include <DmaSpi.h>

#define DMASIZE 255

#define cs_pin  10   //Chip select pin


uint8_t src[DMASIZE];
volatile uint8_t dest[DMASIZE];
volatile uint8_t dest0[DMASIZE];
volatile uint8_t dest1[DMASIZE];


SPISettings spiSettings;

byte read_cmd0[DMASIZE]={
  0x03,0,0,0};
byte read_cmd1[DMASIZE]={
  0x03,0,0,0x40};


byte sent[8]={
};
byte req[8]={
  1,1,1,1,1,1,1,1};

uint32_t us1,us2,us3,us4,us0;

long prev1,prev2;
byte busy0;
byte printq[8]={
};
byte received_sample[8]={
};


///////////////////////////////////////////////////////////////////////////

void setSrc()
{
  for (size_t i = 0; i < DMASIZE; i++)
  {
    // int r = random(0,255);
    src[i] = i;
    Serial.print(src[i],HEX);
    Serial.print(" ");

  }
  Serial.println();
  Serial.println("src'd");


}

void clrDest(uint8_t* dest_)
{
  memset((void*)dest_, 0x00, DMASIZE);
}


//////////////////////////////s25fl commands

byte stat(){  
  SPI.beginTransaction(spiSettings);  
  digitalWrite (cs_pin, LOW);
  SPI.transfer(0x05);
  byte stat = SPI.transfer(0);
  digitalWrite (cs_pin, HIGH);
  SPI.endTransaction();
  return stat;
}

void waitforit (){
  while ((stat() & B0000001)==B00000001)
  {

  }
  Serial.println("flash ready");
}

void write_enable(){

  SPI.beginTransaction(spiSettings);   
  digitalWriteFast(cs_pin,LOW);
  SPI.transfer(0x06);
  digitalWriteFast(cs_pin,HIGH);
  SPI.endTransaction();

  waitforit();
}


///////////////////////////////////////////////////////////////////////////////

void setup()
{
  randomSeed(A0);
  while (!Serial) {
  }

  delay(100);
  //     Serial.println("jhey");


  pinMode(cs_pin, OUTPUT);
  pinMode(LED_BUILTIN, OUTPUT);
  pinMode(3, OUTPUT);
  pinMode(4, OUTPUT);
  pinMode(5, OUTPUT);

  delay(1000);
  randomSeed(A1);

  /** Prepare source and destination **/
  setSrc();
  clrDest((uint8_t*)dest);
  Serial.flush();

  /** set up SPI **/
  SPI.begin();

  write_enable();
  SPI.beginTransaction(spiSettings);   //errase first 64k
  digitalWrite (cs_pin, LOW);
  SPI.transfer(0xD8);
  SPI.transfer(0);
  SPI.transfer(0);
  SPI.transfer(0);
  digitalWrite (cs_pin, HIGH);
  SPI.endTransaction();
  waitforit ();
  // Serial.println("errased");
  write_enable();

  SPI.beginTransaction(spiSettings);   //write
  digitalWrite (cs_pin, LOW);
  SPI.transfer(0x02);
  SPI.transfer(0);
  SPI.transfer(0);
  SPI.transfer(0);
  SPI.transfer(src, DMASIZE);
  digitalWrite (cs_pin, HIGH);
  SPI.endTransaction();
  waitforit ();
  //  Serial.println("written");


  SPI.beginTransaction(spiSettings);   //read
  digitalWrite (cs_pin, LOW);
  SPI.transfer(0x03);
  SPI.transfer(0);
  SPI.transfer(0);
  SPI.transfer(0);
  for (int i = 0; i < DMASIZE; ++i)
  {
    dest[i]=SPI.transfer(0);

  }
  digitalWrite (cs_pin, HIGH);
  SPI.endTransaction();    
  waitforit ();

  //   Serial.println("read");


  Serial.flush();

  DMASPI0.begin();
  DMASPI0.start();


  ActiveLowChipSelect cs(10, SPISettings()); //cs seems to work
  DmaSpi0::Transfer trans0(src, 1, dest);
  trans0 = DmaSpi0::Transfer(read_cmd0, DMASIZE, dest, 0, &cs);

  DMASPI0.registerTransfer(trans0);

  while (trans0.done()==0){
  };

  for (int i = 4; i < 127+4; ++i)
  {
    Serial.print(dest[i]);
    Serial.print(" ");

  }

}

//////////////////////////////////////////////////////////////////////////

ActiveLowChipSelect cs(10, SPISettings()); 

DmaSpi0::Transfer trans0(read_cmd0, DMASIZE, dest0, 0, &cs);
DmaSpi0::Transfer trans1(read_cmd1, DMASIZE, dest1, 0, &cs);



void loop()
{

  ///Serial.println("start");

  //Serial.print(trans0.busy());  Serial.println(" 0busy");
  // Serial.print(trans1.busy());  Serial.println(" 1busy");

  ActiveLowChipSelect cs(10, SPISettings());


  if (trans0.busy()==0 && req[0]==1){
    // Serial.println("Q 0 ");
    us0=micros();
    trans0 = DmaSpi0::Transfer(read_cmd0, DMASIZE-4, dest0, 0, &cs);
    DMASPI0.registerTransfer(trans0);
    sent[0]=1;
    req[0]=0;

    //Serial.print(micros()-us1);Serial.println(" 0 us");

  }

  if (trans1.busy()==0 && req[1]==1){
    //Serial.println("Q 1 ");
    us1=micros();
    //ActiveLowChipSelect cs(10, SPISettings()); 
    trans1 = DmaSpi0::Transfer(read_cmd1, DMASIZE, dest1, 0, &cs);
    DMASPI0.registerTransfer(trans1);
    sent[1]=1;
    req[1]=0;

    // Serial.print(micros()-us1);Serial.println(" 1 us");

  }




  if (trans0.busy()==0 && sent[0]==1){

    uint32_t t2 = micros()-us0;

    if ((millis()-prev2)>1000){
      prev2=millis();

      Serial.println("dest0: ");

      for (int i = 4; i < 127+4; ++i)
      {
        Serial.print(dest0[i]);
        Serial.print(" ");

      }

      Serial.println();

    }

    //Serial.print(t2);Serial.println(" us0");
    sent[0]=0;
    req[0]=1;
    clrDest((uint8_t*)dest0);

    //delay(1000);


  }

  if (trans1.busy()==0 && sent[1]==1){

    uint32_t t3 = micros()-us1;


    if ((millis()-prev1)>1000){
      prev1=millis();

      Serial.println("dest1: ");

      for (int i = 4; i < 127+4; ++i)
      {
        Serial.print(dest1[i]);
        Serial.print(" ");

      }

      Serial.println();

    }

    //Serial.print(t3);Serial.println(" us1");
    sent[1]=0;
    req[1]=1;
    clrDest((uint8_t*)dest1);

    //delay(1000);


  }


  // waste time 
  for (int i = 0; i < 100; ++i)
  {
    byte r,b,g;
    r=(micros()>>24) & 127;
    g=(micros()>>16) & 127;
    b=(micros()>>8) & 127;
    analogWrite(3,g);
    analogWrite(4,b);
    analogWrite(5,r);
  }

}

If I remove the time wasting analogWrite or replace them with delays or basically move anything around it crashes.
Shouldn't busy and done keep it for tripping over itself?

The final code needs to queue up 8 transfers and have them continuously operating as fast as possible while lots of other things are happening.

I obviously have some basic misunderstandings about how to use it.
What is the difference between this

Code:
  trx = DmaSpi0::Transfer(src, DMASIZE, dest);
and
Code:
  DmaSpi0::Transfer trx(src, DMASIZE, dest);
?

Thanks for the great library!
 
Last edited:
Status
Not open for further replies.
Back
Top