SPI speedup question

@joepasquariello Perhaps you are right, I should have stopped at the first line. Let me rephrase.

My sincere feeling is that in an embedded environment the default API should be close to the hardware. Something like shared access, which inevitably costs extra cycles and serves only a subset of the possible use cases, should be a convenience layer on top.

Everything is a tradeoff... There used to be code in place that would remember whether the last output was 8 bit or 16 bit, and only change the setting if the next one was different. In isolation that worked great.

However, then you run into issues. Someone comes along who believes they can do it better, writes their own SPI transfer function, and changes the registers out from under you. Now that flag is out of sync and your code starts to show some subtle errors. These issues often take a long time to isolate and track down, and several cases of this happened. Also, a lot of the code out there just assumed that transfers were 8 bits, so again we ran into libraries and sketches that no longer worked correctly.
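To make the failure mode concrete, here is a minimal sketch of a cached frame-size flag going stale. It is not the actual SPI library code: `FakeSpiRegs`, `CachedSpi` and `cacheGoesStale` are hypothetical names, and the "register" is just a struct member standing in for the hardware.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the hardware frame-size setting (hypothetical, illustration only).
struct FakeSpiRegs {
    uint32_t frame_bits = 8;
    unsigned writes = 0;  // number of register writes made through the driver
};

// Hypothetical driver that remembers the last frame size it programmed.
class CachedSpi {
public:
    explicit CachedSpi(FakeSpiRegs &regs) : regs_(regs) {}

    // Touch the register only when the request differs from the cached value.
    void setFrameBits(uint32_t bits) {
        assert(bits > 0);
        if (bits != cached_bits_) {
            regs_.frame_bits = bits;
            regs_.writes++;
            cached_bits_ = bits;
        }
    }

    uint32_t cachedBits() const { return cached_bits_; }

private:
    FakeSpiRegs &regs_;
    uint32_t cached_bits_ = 8;
};

// Returns true if the cache and the register disagree after outside code
// has written the register directly.
bool cacheGoesStale() {
    FakeSpiRegs regs;
    CachedSpi spi(regs);
    spi.setFrameBits(16);  // cache = 16, register = 16
    regs.frame_bits = 8;   // other code rewrites the register behind the driver's back
    spi.setFrameBits(16);  // cache still says 16, so the write is skipped
    return regs.frame_bits != spi.cachedBits();
}
```

In this sketch the cache does save register writes when the driver is the only user, but after the direct write the register holds 8 while the driver believes it holds 16, which is the kind of subtle mismatch described above.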

So it often comes down to what is more important: compatibility, or the fastest possible result on some metric for some specific use case? And with standard libraries like SPI, Wire, ... compatibility is usually considered reasonably important.
 
Well that's neat, the code. Thank you.

I am very grateful for the help that has been offered, that you know the part and the library so well is a huge help, and I appreciate the reasons for supporting the less experienced users. I sincerely apologize if I offended.

Thank you again
 
@kurtE, I would maintain that the low level should be save/set the number of bits, and transfer, and above that a call that does an 8/16/32/block transfer with all of the dress-up. But then maybe it should be re-entrant as well. Now you need something atomic or you have to save and restore interrupt control. So it is quite a kettle of fish to treat shared access for inexpert users as a general case. It is bound to cost cycles and trade functionality of one sort for another. I like Joe's helper functions, why not make those available in the library and have the best of both worlds?
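One way to picture the two layers being proposed here is the sketch below. It assumes nothing about the real SPI library: `FakeSpiPort`, `FrameSizeGuard`, `rawTransfer` and `transfer16Shared` are hypothetical names, the port is a plain struct rather than a peripheral, and the interrupt masking a real re-entrant version would need is only marked in a comment.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the SPI frame-size setting (hypothetical; real hardware
// would be a peripheral register).
struct FakeSpiPort {
    uint32_t frame_bits = 8;
};

// Saves the current frame size, applies a new one, and restores the
// saved value when the guard goes out of scope.
class FrameSizeGuard {
public:
    FrameSizeGuard(FakeSpiPort &port, uint32_t bits)
        : port_(port), saved_bits_(port.frame_bits) {
        assert(bits > 0);
        // A real re-entrant version would also need to mask interrupts here.
        port_.frame_bits = bits;
    }
    ~FrameSizeGuard() { port_.frame_bits = saved_bits_; }

    FrameSizeGuard(const FrameSizeGuard &) = delete;
    FrameSizeGuard &operator=(const FrameSizeGuard &) = delete;

private:
    FakeSpiPort &port_;
    uint32_t saved_bits_;
};

// Low level: assumes the caller has already set the frame size.
// Here it only reports the width that a transfer would use.
uint32_t rawTransfer(const FakeSpiPort &port) { return port.frame_bits; }

// Convenience layer: pays for save/set/restore on every call.
uint32_t transfer16Shared(FakeSpiPort &port) {
    FrameSizeGuard guard(port, 16);
    return rawTransfer(port);
}
```

A caller that owns the bus could set the width once and call the low-level function in a loop, while shared code would use the guarded call and accept the extra work per transfer.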
 

Quick note: I personally try not to make any assumptions about the experience levels of different members up here, unless of course they mention it. To me it is often more a matter of what your current project is, and often it is about where you want to spend your time. That is, you are talking about one specific case where maybe yours is the only hardware using the SPI bus. Others may have an Audio board with both SD and memory, plus a display and maybe touch screen support, and simply need everything to work right. In some cases they may, for example, really need display updates as fast as possible. In other cases they may need SD to work fast...

As to adding more stuff to SPI, that is totally up to Paul. When I added the methods transfer(buffer, retbuffer, count) and likewise the DMA version, he rejected the transfer16 versions of them that I had as well. Later someone allowed something like that, but only in the private section. Still not sure what use that has, but... Note: one of the big purposes of the SPI library is to be compatible with the SPI documentation:
https://www.arduino.cc/reference/en/language/functions/communication/spi/

So you can always try to see if Paul would take a PR on this.

Myself I might simply see if I could find a "Good Enough" solution to this: For example something like:
Code:
    SPI.setTransferWriteFill(0xff);
    for (i=0; i<NREADOUT; i++){ 

      while ( READPINA ) {}   // wait while high
      while ( !READPINA ) {}  // wait while low

      SETPINB;
      delayNanoseconds( 710 );
      CLEARPINB;
      uint8_t temp_buf[2];
      SPI.transfer(nullptr, temp_buf, 2);
      // depending on MSB or LSB, you may be able to simply use the buffer directly or
      // parentheses are required: '+' binds tighter than '<<'
      *p16++ = (temp_buf[0] << 8) | temp_buf[1];  // may need to swap 0 and 1 here.
    }
So simply put, just transfer 2 bytes and, if necessary, swap the two bytes when you add them to your output buffer...
setTransferWriteFill says what char to send if the user passes in nullptr for the send buffer...
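For the byte assembly itself, a small host-side sketch may help. The helper names `combineMsbFirst`, `combineLsbFirst` and `unpackFrames` are made up for illustration and are not part of the SPI library. Note that in C++ `+` binds tighter than `<<`, so the shift has to be parenthesized (or combined with `|`) to get the intended 16-bit value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First received byte is the high byte.
inline uint16_t combineMsbFirst(const uint8_t *buf) {
    return static_cast<uint16_t>((buf[0] << 8) | buf[1]);
}

// First received byte is the low byte (the "swap 0 and 1" case).
inline uint16_t combineLsbFirst(const uint8_t *buf) {
    return static_cast<uint16_t>((buf[1] << 8) | buf[0]);
}

// Converts 'frames' two-byte frames from a raw byte buffer into 16-bit
// words, assuming the high byte arrives first.
void unpackFrames(const uint8_t *raw, uint16_t *out, size_t frames) {
    assert(raw != nullptr && out != nullptr);
    for (size_t i = 0; i < frames; ++i) {
        out[i] = combineMsbFirst(raw + 2 * i);
    }
}
```

Which of the two orderings applies depends on the device and the bit order configured for the bus, so it is worth checking against a known value.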
 

Hi Kurt. I tried this earlier today. With no change to SPI source, it's about the same as SPI.transfer16() with the set/restore of framesize. If you set the delay fields of TCR to 0, then you get about 2/3 of the improvement of avoiding the set/restore of framesize in transfer16(). I also tried the CONT and CONTC flags, but didn't see any further improvement.
 