Why does SPI.beginTransaction have settings as an argument?

Status
Not open for further replies.

Bill Greiman

Well-known member
An RTOS normally uses a mutex for sharing. I would like to keep SdFat and my new generic FAT library device- and RTOS-independent.

SPI.beginTransaction just doesn't fit well. It mixes locks with device setup. Why is there not a simple SPI.beginTransaction() that doesn't muck with SPI settings?

For SdFat I plan to use two weak functions for sharing the SPI bus, sdSpiLock(), and sdSpiUnlock().

For ChibiOS you would replace the weak functions with something like this in your sketch:
Code:
Mutex spiMutex;
inline void sdSpiLock() {chMtxLock(&spiMutex);}
inline void sdSpiUnlock() {chMtxUnlock();}
Note that chMtxUnlock() does not have an argument since it unlocks the next owned mutex in reverse lock order. A program like SdFat needs several locks to protect global data and device access. They must be released in reverse order to prevent deadlocks.
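A minimal sketch of what the library side of this could look like, assuming a GCC or Clang toolchain where `__attribute__((weak))` is available. The counter `lockDepth`, the guard class `SdSpiGuard`, and `transferBlockSketch()` are hypothetical names, used only so the behavior can be observed on a host build; a real default would more likely be an empty function. A strong definition of sdSpiLock() and sdSpiUnlock() in another source file would replace these defaults at link time.

```cpp
#include <cassert>

// Hypothetical counter so the sketch can be observed; not part of SdFat.
static int lockDepth = 0;

// Weak defaults. A strong definition elsewhere (for example one that
// takes an RTOS mutex) would override these at link time.
__attribute__((weak)) void sdSpiLock() { ++lockDepth; }
__attribute__((weak)) void sdSpiUnlock() { --lockDepth; }

// Hypothetical scoped guard: it unlocks when it goes out of scope, so
// nested guards are released in the reverse of the order they were taken.
class SdSpiGuard {
 public:
  SdSpiGuard() { sdSpiLock(); }
  ~SdSpiGuard() { sdSpiUnlock(); }
  SdSpiGuard(const SdSpiGuard&) = delete;
  SdSpiGuard& operator=(const SdSpiGuard&) = delete;
};

// Stand-in for a block transfer: reports how many locks are held
// while the bus is in use.
int transferBlockSketch() {
  SdSpiGuard guard;
  assert(lockDepth > 0);
  return lockDepth;
}
```

The guard is one way to get the reverse-order release described above without relying on every code path remembering to call the unlock function.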

What would be the best solution for sdSpiLock() with the new SPI transactions?

I assume sdSpiUnlock() could be:
Code:
inline void sdSpiUnlock() {SPI.endTransaction();}

Edit: setting SPI speed should not be restricted to beginTransaction. SD cards must be initialized at low speed then shifted to high speed.
Code:
	// This function is deprecated.	 New applications should use
	// beginTransaction() to configure SPI settings.
	inline static void setClockDivider(uint8_t clockDiv) {
		SPCR = (SPCR & ~SPI_CLOCK_MASK) | (clockDiv & SPI_CLOCK_MASK);
		SPSR = (SPSR & ~SPI_2XCLOCK_MASK) | ((clockDiv >> 2) & SPI_2XCLOCK_MASK);
	}
 
If SD cards and the like hog an SPI port for many tens or hundreds of milliseconds, this can't work in the general case, where peripherals such as wireless and data acquisition have real-time latency requirements. In wireless communications of all sorts, it means lost or retransmitted messages. In some data acquisition use cases, it means loss of perishable data.

I feel that this single SPI port sharing/timing has no solution other than use of additional soft or hardware ports. It can work for a brief transaction.

The speed-shifting for the SD card startup could be done as two transactions, each with different speed params.
So far, the SD card and ethernet interface are common examples of things that take a lot, or too much, time owning the SPI port. I think both could break up their transfers into smaller/briefer transactions, but this turns into something too impractical.

What we really need is an expansion daughtercard adding more real SPI ports.
 
The speed-shifting for the SD card startup could be done as two transactions, each with different speed params.

Yes this fixes the simple Arduino problem but I don't want to simulate beginTransaction() when I am using an RTOS. beginTransaction is not thread safe, so you can't use it with an RTOS preemptive scheduler to share SPI.

I am trying to write libraries that can be used with and without the Arduino IDE and with a number of RTOSs.

It is possible to improve the situation with the SD by releasing the SPI bus when doing a busy wait, but not for the actual block transfer. Trying to break up DMA or FIFO based transfers is a nightmare. I can get the max time for an SPI transfer down to about 200 usec on Teensy 3.x.

Even a few usec is too long for many applications. If you use an SPI sensor to digitize fast signals, you need time jitter at the microsecond level or your signal-to-noise ratio will suffer. An easy, non-rigorous way to understand this is that if you want to plot the signal, you need the time jitter between points to be as good as the error in the sensor value. If you have a 10-bit sensor and you read it at 1000 Hz, you need 1 usec jitter in the time interval.
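One way to read the rule of thumb above, as a rough sketch rather than a rigorous analysis: allow the timing error to be the same fraction of the sample period that one count is of the sensor range. The function name `jitterBudgetUsec` is made up for illustration.

```cpp
// Rule-of-thumb jitter budget in microseconds: the sample period
// divided by the number of ADC codes (2^bits).
constexpr double jitterBudgetUsec(unsigned bits, double sampleRateHz) {
  return (1.0e6 / sampleRateHz) / static_cast<double>(1ULL << bits);
}

// 10-bit sensor at 1000 Hz: 1000 usec / 1024, just under 1 usec.
static_assert(jitterBudgetUsec(10, 1000.0) < 1.0, "about 1 usec");
static_assert(jitterBudgetUsec(10, 1000.0) > 0.97, "about 1 usec");
```

Under the same assumption, a 12-bit sensor at the same rate would leave a budget of roughly a quarter of a microsecond.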

I feel that this single SPI port sharing/timing has no solution other than use of additional soft or hardware ports. It can work for a brief transaction.

This has been clear in the embedded world for many years. A really good soft SPI package would help a lot on Teensy 3.x. I wrote a fast soft SPI for AVR so people could use it with SdFat instead of hardware SPI.

Here is the SdFat config define.
Code:
/**
 * Set USE_SOFTWARE_SPI nonzero to always use software SPI on AVR.
 */
#define USE_SOFTWARE_SPI 0

Here is hardware versus software SPI for small transfers like print. Software SPI is fast enough for most data logging applications that print data. You can decide how to use software SPI. I often use software SPI in the interrupt to transfer a few bytes from a sensor.

Hardware SPI on Uno.
Buffer size 10 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
141.32,108596,40,64

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
145.79,2512,36,61

Software SPI on Uno.
Buffer size 10 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
74.27,130736,40,128

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
90.88,6628,36,103

The real answer is multiple hardware SPI controllers. I am amazed that Freescale only allows one on a 64 pin package. STM32F4 has three available on the 64 pin parts. Other Cortex M4 chips also have two or three with 64 pin parts.

The AVR software SPI is here https://github.com/greiman/DigitalIO. It also has software I2C.
 
Yes this fixes the simple Arduino problem but I don't want to simulate beginTransaction() when I am using an RTOS.

Why would you ever want to gain exclusive access to the SPI bus, using a mutex or semaphore or other mechanism, and then use the SPI hardware at whatever unknown clock speed and data mode happens to be configured on the hardware from previous usage?
 
Sure, you might need more than one setting, but you could easily accomplish that by simply beginning different transactions at the slow and fast clock speeds. Or you could even reconfigure the hardware while you have exclusive access.
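A rough sketch of the two-transaction idea, written against host-side stand-ins so it can run without hardware. `FakeSpiSettings` and `FakeSpi` are hypothetical types that only mimic the shape of the Arduino transaction calls, and the 400 kHz and 8 MHz values are assumed example speeds, not values taken from SdFat.

```cpp
#include <cstdint>
#include <vector>

// Host-side stand-ins shaped like the Arduino SPI transaction API.
// They only record what was requested; no hardware is touched.
struct FakeSpiSettings {
  uint32_t clockHz;
};

struct FakeSpi {
  std::vector<uint32_t> clocksUsed;  // one entry per transaction
  bool inTransaction = false;

  void beginTransaction(const FakeSpiSettings& s) {
    inTransaction = true;
    clocksUsed.push_back(s.clockHz);
  }
  uint8_t transfer(uint8_t) { return 0xFF; }  // idle bus reads as 0xFF
  void endTransaction() { inTransaction = false; }
};

// Assumed example speeds: 400 kHz for card initialization, 8 MHz after.
const FakeSpiSettings kInitSettings{400000};
const FakeSpiSettings kFastSettings{8000000};

// Initialize at low speed in one transaction, then move data at high
// speed in a second transaction. The bus is free between the two.
void sdStartupSketch(FakeSpi& spi) {
  spi.beginTransaction(kInitSettings);
  spi.transfer(0xFF);  // placeholder for the card init sequence
  spi.endTransaction();

  spi.beginTransaction(kFastSettings);
  spi.transfer(0xFF);  // placeholder for a data transfer
  spi.endTransaction();
}
```

Because each transaction carries its own settings, the second one does not depend on whatever speed the previous bus user left behind.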
 
A mutex doesn't help an ISR that must read at least the status via SPI. The current concept is that if the port is in use by one peripheral, then all the other peripherals must have their interrupts masked/blocked until the owning software (thread, task, function) releases it.
This latency just won't work in too many situations, and the solution is more SPI ports, soft or hardware.
 
Sure, you might need more than one setting, but you could easily accomplish that by simply beginning different transactions at the slow and fast clock speeds. Or you could even reconfigure the hardware while you have exclusive access.

I am mainly concerned about the sharing. beginTransaction is worthless for an RTOS so the new SPI is worthless for my libraries. It's not thread safe so I just want to do the minimum for those users that are trying to use it in simple single threaded Arduino code. I just want to replace a weak function that does the mutex lock operation in an RTOS with the mutex-like part of beginTransaction.

I guess I could just call beginTransaction with some constant settings and ignore the fact that it manipulates SPI.

Arduino is just one of many systems I am targeting. The new SPI just doesn't fit the structure of a typical RTOS so I need a workaround. In a typical RTOS, you can have several SPI drivers in the HAL while sharing is in the RTOS kernel. That's the structure I need to simulate for Arduino.

My comment about speeds is about the intent to remove setClockDivider from Arduino SPI.
Code:
	// This function is deprecated.	 New applications should use
	// beginTransaction() to configure SPI settings.
	inline static void setClockDivider(uint8_t clockDiv) {

Edit: In an RTOS, a mutex is for sharing between threads and sharing between a thread and interrupt is forbidden. In Arduino, beginTransaction is for sharing between a single thread and an interrupt. Not the same thing!
 
Not ignoring the real ISR problem in SPI sharing, can you please amplify on why beginTransaction isn't thread safe? As I understand it, and ISR issue aside, it seems that it would be, since while owning the SPI port, it would use atomic-section code to provide exclusion to ownership flags and data.

PS: RTOSes I've used have "tasks" not threads, and cooperative or preemptive, each task has its own stack. I think some OSes have threads that can or often share the same stack. I/O drivers I've done for FreeRTOS use message queues in ISRs, where FreeRTOS has ISR-safe APIs for get/put with a queue. The queue can contain data blocks and/or flags and notifications.
 
A thread or task must have its own stack in a preemptive RTOS. Don't expect clarity on these terms. Task, thread, and process are used ambiguously in the RTOS world.

As I understand it, and ISR issue aside, it seems that it would be, since while owning the SPI port, it would use atomic-section code to provide exclusion to ownership flags and data.

There is nothing in the SPI transaction stuff that implements a critical section for threads in a given RTOS. If two threads want to share SPI in an RTOS you must use a mutex provided by the RTOS.

A mutex has the concept of owner where owner is a thread or task. A mutex deals with priority inversion. Only the owner can unlock a mutex. The SPI transaction stuff is not close.
 
Bill, you speak as if I/we don't know RTOS concepts. Not so.

Sorry, I didn't mean to imply that.

Edit: I tend to over explain RTOS concepts since each system uses different terminology in an ambiguous way http://computing.unn.ac.uk/staff/cgmb3/teaching/threads/TasksThreadsProcs.pdf. FreeRTOS tasks are ChibiOS threads.

I was trying to explain what is in the new SPI library. The new SPI library contains nothing related to tasks/threads.

I think some OSes have threads that can or often share the same stack.
This is even a problem for coop schedulers since a thread's context will be buried on the stack so a mutex can't be implemented in a general way. You can implement a shared stack if you limit some OS function calls to the main function or to loop in the case of Arduino. In this case there are clever ways to avoid the buried context problem.

I want to allow for the general preemptive case. The SPI transactions have no way to save context so it can't work with general preemptive threads. SPI transactions can't block while a thread waits.
 
... so I just want to do the minimum for those users that are trying to use it in simple single threaded Arduino code.

Well, perhaps something like this?

Code:
#if defined(FreeRTOS)
    xSemaphoreTake(spiMutex, portMAX_DELAY);
    SPI.setDataMode(SPI_MODE0);
    SPI.setClockDivider(mySpeed);
    SPI.setBitOrder(MSBFIRST);
#elif defined(SPI_HAS_TRANSACTION)
    SPI.beginTransaction(mySettings);  // create mySettings as a variable ahead of time, if not hard-coding speed
#else
    // just start using the SPI port and hope for the best....
    SPI.setDataMode(SPI_MODE0);
    SPI.setClockDivider(mySpeed);
    SPI.setBitOrder(MSBFIRST);
#endif

I just want to replace a weak function that does the mutex lock operation in an RTOS with the mutex-like part of beginTransaction.

Usually preprocessor macros are more reliable. Weak functions risk conflict with any other code which happens to use the same name. The most likely case is when people copy some or all of your code into their program or library.

Inline static functions defined in .h files with #ifdef to compile the appropriate version are usually a more reliable choice, because you control what the compiler will use. With weak functions, all other unrelated code can interfere, which can be really useful for some projects, but probably not what you want in a filesystem library.

Edit: In an RTOS, a mutex is for sharing between threads and sharing between a thread and interrupt is forbidden. In Arduino, beginTransaction is for sharing between a single thread and an interrupt. Not the same thing!

Indeed, not the same thing. Like all libraries targeting a lot of very different platforms, #ifdef checks for conditional compilation seem to be unavoidable.

While messy, it's pretty simple to use static inline functions and bury all that mess into header files, because even though the underlying system is dramatically different, fundamentally you need to do the same things: gain access to the shared SPI bus and use it with known settings.
 
I don't like weak functions very much but preprocessor macros don't work very well.

I don't want to include RTOS .h files in SdFat so this won't work.
Code:
#if defined(FreeRTOS)

I want the library to work correctly when a user includes one of several RTOS libraries or no RTOS is included.

It's easy to do this if the function that replaces the weak function is in the RTOS library. I have lots of experience doing this.

Code:
#include <SdFat.h>  // has weak function for SPI sharing with no RTOS.
#include <RTOS.h>  // has weak replacement for this RTOS so SdFat doesn't need to be changed for a new RTOS.

I wouldn't use the weak function if I didn't want to support the Arduino SPI transactions.

Like all libraries targeting a lot of very different platforms, #ifdef checks for conditional compilation seem to be unavoidable.

It is not only possible but done routinely in the embedded world. In fact, use of #ifdef for this purpose is forbidden in the coding standards for most critical systems, such as avionics.

Edit: my SdFat replacement uses a virtual SPI base class so I can have a different or even several SPI libraries for each RTOS, and I can have two or more SD cards with different SPI libraries for each in one program. This is beyond where #ifdef is a good idea.

The SPI library object is an argument to the SD file system creator. Think of polled SPI like AVR, DMA SPI, software SPI, or multiple SPI controllers. This will finally solve the problem, Arduino with transactions will be one of the choices.
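A minimal sketch of the shape this design could take, under the assumption that locking and clock configuration are separate virtual calls. All names here (`SdSpiBaseSketch`, `CountingSpi`, `SdCardSketch`) are hypothetical and are not the actual classes in the SdFat replacement; `CountingSpi` is a host-side stand-in that counts calls instead of driving pins.

```cpp
#include <cstdint>

// Hypothetical abstract SPI driver. Locking is separate from
// configuration, so each driver can use whatever sharing mechanism
// its environment provides.
class SdSpiBaseSketch {
 public:
  virtual ~SdSpiBaseSketch() = default;
  virtual void lock() = 0;
  virtual void unlock() = 0;
  virtual void setClockHz(uint32_t hz) = 0;
  virtual uint8_t transfer(uint8_t out) = 0;
};

// Host-side stand-in driver that counts calls instead of driving pins.
class CountingSpi : public SdSpiBaseSketch {
 public:
  int locks = 0;
  int bytes = 0;
  uint32_t clockHz = 0;
  void lock() override { ++locks; }
  void unlock() override { --locks; }
  void setClockHz(uint32_t hz) override { clockHz = hz; }
  uint8_t transfer(uint8_t) override { ++bytes; return 0xFF; }
};

// Hypothetical card object: the SPI driver is a constructor argument,
// so two cards in one program can use two different drivers.
class SdCardSketch {
 public:
  explicit SdCardSketch(SdSpiBaseSketch& spi) : spi_(spi) {}
  uint8_t readStatusByte() {
    spi_.lock();
    uint8_t b = spi_.transfer(0xFF);
    spi_.unlock();
    return b;
  }
 private:
  SdSpiBaseSketch& spi_;
};
```

A driver built on Arduino SPI transactions, a DMA driver, and a software SPI driver would each be another subclass, with no conditional compilation inside the card code.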
 
I don't want to include RTOS .h files in SdFat so this won't work.

Sorry, that was not clear, I meant #ifdef won't work if I don't include RTOS .h files.

I want to copy the way most RTOSs use FatFS for SD cards. They don't modify the basic FatFS code and don't put RTOS includes in FatFS. FatFS is a generic FAT library that supports any block device, not just SD on SPI. It is used by most RTOSs, so it is not practical to use #ifdefs in a library like FatFS.

I have most of this working but the SPI transactions didn't fit very well. I guess it won't be too bad if I modify the virtual SPI base class a little.

I want to be able to use the SPI drivers that are included with the RTOS. Also the new FAT library will be used without the Arduino IDE.
 
Seems we're on a tangent from the root problem: unworkable latency for SPI devices that interrupt to get the CPU to quickly use SPI to deal with very time-sensitive events, such as in communications and data acquisition.

The mechanism for task/thread exclusion for resources like SPI is quite easy by comparison and has many solutions.
 
Seems we're on a tangent from the root problem: unworkable latency for SPI devices that interrupt to get the CPU to quickly use SPI to deal with very time-sensitive events, such as in communications and data acquisition.

The mechanism for task/thread exclusion for resources like SPI is quite easy by comparison and has many solutions.

I didn't start this topic to discuss interrupt latency. I wanted to understand if I could split the beginTransaction call into two parts, an SPI lock part and an SPI configure part. Clearly there is no plan for that. After more thought, I have a clean way to only use beginTransaction in SdFat when an RTOS is not being used and I don't need to include RTOS .h files in SdFat.

Yes, SPI sharing in an RTOS has an easy solution. It's at the start of this topic.
Code:
Mutex spiMutex;
inline void sdSpiLock() {chMtxLock(&spiMutex);}
inline void sdSpiUnlock() {chMtxUnlock();}

Now back to "the root problem".

I will use SPI transactions when an RTOS is not used and this should allow the SPI interrupt latency on Teensy 3.x to be reduced to about 200 usec max. This is still terrible but I don't want to break up the actual 512 byte data transfer since some SD cards are flaky. These SD cards go to sleep during a data transfer when they see clock pulses with chip select high. It was a shock when I learned that SD cards react to SPI clock when chip select is high. If you look at SdFat, you will see that I transfer a byte after setting chip select high to ensure cards will go high-Z on the SPI bus and sleep in low power mode.

From Chan, the author of FatFS http://elm-chan.org/docs/mmc/mmc_e.html.
In principle of the SPI mode, the CS signal must be kept asserted during a transaction. However there is an exception to this rule. When the card is busy, the host controller can deassert CS to release SPI bus for any other SPI devices.
Therefore to make MMC/SDC release DO signal, the master device must send a byte after CS signal is deasserted.
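The release step described in the quote can be sketched as follows. `BusLog` is a hypothetical host-side stand-in that records bus activity as text, and `sdReleaseBusSketch()` is an illustrative name, not the function SdFat uses.

```cpp
#include <cstdint>
#include <string>

// Host-side stand-in that logs bus activity as text.
struct BusLog {
  std::string events;
  void chipSelect(bool high) { events += high ? "CS=1;" : "CS=0;"; }
  uint8_t transfer(uint8_t out) {
    events += (out == 0xFF) ? "FF;" : "xx;";
    return 0xFF;
  }
};

// Deassert chip select, then clock out one more byte so the card
// releases its data-out line.
void sdReleaseBusSketch(BusLog& bus) {
  bus.chipSelect(true);
  bus.transfer(0xFF);
}
```

The order matters in this sketch: the extra byte goes out only after chip select is high, which matches the sequence in the quoted text.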

I may generalize the AVR software SPI option for SdFat. It could be fairly fast on Teensy if constant pin numbers were used and Paul optimized digitalRead and digitalWrite. I had one user who did his own mods to software SPI so he could use hardware SPI in interrupts on Due.

The new SdFat does have a partial solution for logging data from a SPI sensor at high speed. The LowLatencyLogger example can log data records from an SPI sensor at 2,500 Hz on Teensy 3.x with sample time jitter of one or two microseconds.

Edit: here is another strange chip select high behavior for SD cards.
After supply voltage reached 2.2 volts, wait for one millisecond at least. Set SPI clock rate between 100 kHz and 400 kHz. Set DI and CS high and apply 74 or more clock pulses to SCLK. The card will enter its native operating mode and go ready to accept native command.
So you must send a bunch of 0XFF bytes with chip select high!
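As a small worked example of that arithmetic, assuming eight clock pulses per transferred byte, 74 clocks round up to ten bytes of 0XFF. The helper name `bytesForClocks` is made up for illustration.

```cpp
// Bytes of 0xFF needed to supply at least `clocks` SCLK pulses,
// at eight clocks per byte.
constexpr unsigned bytesForClocks(unsigned clocks) {
  return (clocks + 7) / 8;
}

static_assert(bytesForClocks(74) == 10, "74 clocks round up to 10 bytes");
```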

Yet another possible SPI interaction for an SD card on a shared bus.
In addition the internal write process is initiated a byte after the data response, this means eight clocks are required to initiate internal write operation. The state of CS signal during the eight clocks can be either low or high, so that it can be done by bus release process described below.
 
Here is a test of SdFat using software SPI on Teensy 3.1.

I tried to use digitalReadFast() and digitalWriteFast() with constant arguments but wasn't able to get the timing right. The compiler seemed to fight me with optimizations.

I finally put a wrapper around these functions to kill the extreme optimization for constant arguments, and the timing was easy to fix with a scope, but it probably slowed write by a factor of two and read even more. It is still a lot faster than digitalRead and digitalWrite.

I got the following results, write about 147 KB/sec and read about 190 KB/sec. This might be useful to free up hardware SPI for use in interrupts.

File size 5 MB
Buffer size 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
146.90,62803,3112,3483

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
190.56,5351,2666,2686

Here is the result for hardware SPI.

File size 5 MB
Buffer size 512 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
387.96,77559,942,1318

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1333.25,1263,364,383
Hardware SPI writes are over twice as fast, and reads are about seven times faster.
 