How best to manage multiple SPI busses?

Status
Not open for further replies.
FYI - I fixed an issue in my graphics library above: it did not work when I did an async screen update, because I forgot to put in the begin/end transaction. That is now in, and the zip file above was updated...

Side Notes:
Test program above can test this, by: entering in debug terminal: d<cr>
Which toggles on DMA testing mode. So when you hit just <CR> for a line it toggles between drawing with or without frame buffer. Background is red in FB mode. In DMA mode when it draws RED it is using the ASYNC update...

Also tests continuous screen updates: r<cr>
Which checks frame counts and when it does 10 frame updates it changes colors... Which also worked.

However: this code knows that the T3.6 DMA can only output 32K at a time, so it manually did it in three chunks.

I would now like to have that done internally in the SPI.transfer(buf, retbuf, cnt, CallBack) function, so the program does not have to know these constraints, especially since on T3.5 SPI1/2 the current constraint is 511 bytes (only 9 bits are left for the count, as the other bits are used for the channel link)...
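A rough sketch of the arithmetic such an internal split implies, written as plain C++ so it can be tried off-target. The two limits are assumptions based on the figures in this thread (a 15-bit count without channel linking, a 9-bit count with it), and chunksNeeded/nextChunk are hypothetical helper names, not part of the SPI library.

```cpp
#include <cstddef>

// Assumed per-request limits (hypothetical constants, not taken from SPI.h):
// a 15-bit count without channel linking, a 9-bit count with it.
constexpr std::size_t kMaxCountNoLink = 32767;
constexpr std::size_t kMaxCountLinked = 511;

// Number of DMA requests needed to move `total` elements.
constexpr std::size_t chunksNeeded(std::size_t total, std::size_t maxCount) {
    return (total + maxCount - 1) / maxCount;  // ceiling division
}

// Size of the next request when `remaining` elements are still to be sent.
constexpr std::size_t nextChunk(std::size_t remaining, std::size_t maxCount) {
    return remaining < maxCount ? remaining : maxCount;
}
```

With these assumed limits, a 512-element write on a linked channel needs two requests, while the same write on an unlinked channel fits in one.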

Question to myself and others who have played with DMA...

Currently I am using the DMAChannel.h functions like: _dmaTX->sourceBuffer(buf, cnt);

These work. When a transfer completes, the channel resets SADDR back to where it started, because they set SLAST = -LEN...
Likewise for destinationBuffer...

So suppose I wish for the transfer to output the full 76800 uint16_t values, which would take 3 writes (2.34..). Should I, in the DMA RX ISR:

a) Keep the counts and pointers separate in my SPI structure and do all the math to figure out the next starting points and bytes/words left...

b) In the ISR simply update the SADDR and DADDR values by adding SLAST to these values....

c) Or would it work to manually set the SLAST=0, so when each DMA is completed in theory it is already pointing to the start of the next transfer?

It feels like b) or c) would be the cleanest. c) would probably be slightly faster for turnaround, but b) would be very, very slightly faster for transfers that can complete in one chunk (the majority)...
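To make the difference between the options concrete, here is a small host-side model of the source half of a channel. It is only a sketch: MockSource is a hypothetical struct whose field names echo the TCD registers discussed above, not the real register layout, and addresses are modelled as byte offsets.

```cpp
#include <cstdint>

// Simplified host-side model of the source half of a DMA channel.
// Field names echo the TCD registers discussed above, but this is
// not the real register layout.
struct MockSource {
    std::int64_t saddr;  // current source address (a byte offset in this model)
    std::int32_t soff;   // bytes added after each element is read
    std::int32_t slast;  // adjustment applied when the whole request completes
};

// Model one complete request of `count` elements.
inline void runRequest(MockSource &s, std::uint32_t count) {
    s.saddr += static_cast<std::int64_t>(s.soff) * count;  // walk the buffer
    s.saddr += s.slast;                                     // end-of-request adjustment
}

// Setup as described above: SLAST = -length in bytes, so the channel rewinds.
inline MockSource makeRewinding(std::uint32_t count, std::int32_t elemSize) {
    return MockSource{0, elemSize, -static_cast<std::int32_t>(count) * elemSize};
}

// Option c): SLAST = 0, so the address is left at the start of the next chunk.
inline MockSource makeContinuing(std::int32_t elemSize) {
    return MockSource{0, elemSize, 0};
}
```

In this model a rewinding channel ends a 32767-word request with saddr back at 0, so the ISR has to move it forward by the chunk length itself, while a continuing channel ends at offset 65534 and can simply be restarted with the next count.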

Sounds like time to experiment.
 
b) In the ISR simply update the SADDR and DADDR values by adding SLAST to these values....
You mean subtract, SLAST is negative ('-len').

c) Or would it work to manually set the SLAST=0, so when each DMA is completed in theory it is already pointing to the start of the next transfer?

It feels like b) or c) would be the cleanest. c) would probably be slightly faster for turnaround, but b) would be very, very slightly faster for transfers that can complete in one chunk (the majority)...
I vote for c).

I don't see why b) would be faster. You can always access the TCD directly for stuff that's not supported.
 
b) would be faster as I would not have to set SLAST=0.... Of course I could roll my own sourceBuffer/destBuffer functions which did that, in which case speed is a wash...

I am leaning toward c) as well.

Right now busy doing OK weather outside tasks.
 
I just updated the SPI library to use c). So far I have only tested it with the display driver above, and it is working: I now simply issue the full transfer of all of the data: _spi->transfer16(_pfbtft, NULL, _width*_height, asyncInterrupt);

And the transfer takes care of doing the splitting. Again so far only tested on SPI on T3.6.
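A minimal sketch of how that splitting might be tracked, runnable on a PC with the DMA hardware replaced by plain function calls. AsyncTransfer, startTransfer, issueNext and onDmaComplete are hypothetical names for illustration and do not claim to match the code in the SPI branch.

```cpp
#include <cstddef>

// Hypothetical bookkeeping for a transfer that is larger than one DMA request.
struct AsyncTransfer {
    std::size_t remaining = 0;       // elements not yet handed to the DMA
    std::size_t maxPerRequest = 0;   // per-request hardware limit (assumed)
    std::size_t requestsIssued = 0;  // how many requests were started
    void (*callback)() = nullptr;    // user completion callback, may be null
    bool done = true;
};

// Hand the next chunk to the (imaginary) DMA engine.
inline void issueNext(AsyncTransfer &t) {
    std::size_t n = t.remaining < t.maxPerRequest ? t.remaining : t.maxPerRequest;
    t.remaining -= n;
    ++t.requestsIssued;
    // On real hardware the count would be written and the channel enabled here.
}

// Begin a transfer; an empty transfer is treated as already done.
inline void startTransfer(AsyncTransfer &t, std::size_t count,
                          std::size_t maxPerRequest, void (*cb)()) {
    t.remaining = count;
    t.maxPerRequest = maxPerRequest;
    t.requestsIssued = 0;
    t.callback = cb;
    t.done = (count == 0);
    if (!t.done) issueNext(t);
}

// What a DMA-complete ISR might do: restart for the next chunk, or finish.
inline void onDmaComplete(AsyncTransfer &t) {
    if (t.remaining > 0) {
        issueNext(t);
        return;
    }
    t.done = true;
    if (t.callback) t.callback();
}
```

The caller only sees one start and one completion; the callback fires once, after the last chunk, no matter how many requests were needed in between.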

I now need to go back and retest the T3.5... I may also want to see how the SSD1306 is working, as I have not retested it with the updated chained DMAChannels that are now used. With those, 511 bytes is the maximum transfer and the driver was outputting 512, so it will need this new code to recover...
 
@admp - Not sure what you are responding to here?

Quick update: My first version to allow a transfer size larger than one DMA request worked on T3.6, but not on T3.5 SPI1/2...

So I needed to rework it to behave like the first transfer and spoon-feed the first item with PUSHR. It now appears to work on the different buses.
 
I am playing along big time. I have been using the library with spin and the clip buffer.
Just saw the second branch on your SPI GitHub that mentioned this update.
Glad I found it here.
This is dynamite stuff. Thanks for all the hard work mate.
I will load it up and let you know how it goes.
Perfect timing for me as I am just pinning down Frank's flexibard 3 and will be wanting the extra SPI for offboard.
Will be available for a bit of testing.
Hope this has been picked up upstream.
Love your work!
Thanks again.
 
Status update: <Not sure if I should continue here or create new thread>

Code that makes all SPI objects (SPI, SPI1, and SPI2) instances of one class, SPIClass, has been integrated into the official Teensy library and is now part of the Teensyduino 1.37 beta :D
This includes some helper functions to check if a pin is valid for MOSI, MISO, SCK...
Some of the details are up in the thread: https://forum.pjrc.com/threads/44509-Arduino-1-8-3

I have updated a couple of libraries to use this (mostly the SSD1306 libraries both Adafruit and Sparkfun).

First question is, what if anything else should go into the Teensyduino 1.37? Possible things include

a) Maybe some standard way to know which SPI busses are valid. The Arduino SAM/SAMD SPI driver, defines: SPI_INTERFACES_COUNT
But, another way would be like our wire library and do a set of defines like:
#define SPI_IMPLEMENT_SPI
#define SPI_IMPLEMENT_SPI1
...

a1) Maybe update library properties to say: architectures=avr instead of *
That way if in user library, it won't use it if you are building for SAM or SAMD
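For the define-based variant in a), a library could test the macros at compile time. A small sketch, assuming the SPI_IMPLEMENT_* names suggested above; they are defined by hand here only so the example compiles on its own, and spiBusCount is a hypothetical helper.

```cpp
// Hypothetical: in practice these would be defined by SPI.h for the selected board.
#define SPI_IMPLEMENT_SPI
#define SPI_IMPLEMENT_SPI1

// Count the buses a library can rely on, based on the defines above.
inline int spiBusCount() {
    int n = 0;
#ifdef SPI_IMPLEMENT_SPI
    ++n;
#endif
#ifdef SPI_IMPLEMENT_SPI1
    ++n;
#endif
#ifdef SPI_IMPLEMENT_SPI2
    ++n;
#endif
    return n;
}
```

A single count such as SPI_INTERFACES_COUNT gives the same number directly; per-bus defines additionally let a library guard code for one specific bus.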

b) Not sure if it would make sense to add helper functions like my version has, which return a pointer (or reference) to the SPI register set. Said another way, does it make sense to make the function port() public instead of private? Suppose I wish (I do) to update the ili9341_t3 library to work with all three buses. The two main things I need are the port data, to get the right PUSHR/POPR registers, and the depth of the queue, which is part of the internal hardware structure... Or I could have the user library just know this as well...

I am assuming the next ones are below the line for this release
c) New Transfer functions
SPI.transfer(buf, retbuf, cnt) - either buf and/or retbuf can be null.
SPI.transfer16(buf, retbuf, cnt) - I added this later... Found it useful

I also added a method to set the fill value that is sent when NULL is passed for buf. Currently I have it defined like: void setTransferWriteFill(uint16_t ch ) {_transferWriteFill = ch;}
These functions, for example, allowed me to make a version of the ILI9341 library that keeps the bus pretty full without knowing the internals of SPI. For example, drawFastVLine could be implemented like:
Code:
		beginSPITransaction();
		setAddr(x, y, x, y+h-1);
		writecommand_cont(ILI9341_RAMWR);
		setDataMode();
		_spi->setTransferWriteFill(color);  // Set the transfer word.
		_spi->transfer16(NULL, NULL, h);	// Send out the number of words called for.		
		endSPITransaction();
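To show the NULL-buffer semantics in isolation, here is a host-side stand-in for the SPI object. MockSpi is hypothetical and exists only for illustration; its two methods mirror the names and arguments described in this post, and the real implementation drives hardware rather than a vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side stand-in for the SPI object, only to illustrate the described semantics.
class MockSpi {
public:
    void setTransferWriteFill(std::uint16_t ch) { _transferWriteFill = ch; }

    // buf == nullptr: send the fill word `count` times.
    // retbuf == nullptr: discard whatever is received.
    void transfer16(const std::uint16_t *buf, std::uint16_t *retbuf, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            std::uint16_t out = buf ? buf[i] : _transferWriteFill;
            sent.push_back(out);
            if (retbuf) retbuf[i] = 0;  // the mock has nothing on MISO, so it reads zeros
        }
    }

    std::vector<std::uint16_t> sent;  // words that would have gone out on MOSI

private:
    std::uint16_t _transferWriteFill = 0;
};
```

Setting the fill to a colour and calling transfer16(nullptr, nullptr, h) on the mock records h copies of that colour, which is the pattern the drawFastVLine example relies on.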

d) ASYNC transfers - I think it is actually working pretty well, but we probably need/want to run this up the flagpole some more... Also you expressed concern about having a callback be called in interrupt context, which I can understand. However, I don't think it would be the first one: IntervalTimer, attachInterrupt for a pin, ...

Again I think it is great that we are setup to only have one SPI class.

Thanks!
 
Kurt,

Just to clarify, this library is able to run async transfers on all three SPI ports (Teensy 3.6)? I'm working on an acquisition engine that needs to pull data from two A/D channels (4 converters in each channel) at a burst rate of around 40 Mb/S. The payload is only 96 bits per channel, per sample. But the high bit rate is to ensure that there is adequate time to perform other tasks with the data once it has been acquired.

Having a callback upon completion of transfer is absolutely a must, and I was already researching this when I stumbled across this thread.

From the sounds of your testing with the multiple displays, I don't have any doubt that the speed is certainly there for the A/D transfers. But the data also has to be compensated, aggregated, and the raw data is being streamed to an SD card. And the Teensy will need to answer data retrieval requests and send raw data to another Teensy over the third SPI port. So the more time the micro-controller has to deal with those tasks, the better.

I do understand that the SPI class does not have slave support, nor was I planning on implementing a full slave. Rather, I was thinking of using standard serial as an inter-processor communication channel and using SPI for the bulk data transfers. That is a little way down the road, and for the purposes of testing and validation, I'll have my hands plenty full for now.

But I would like to get a little clarification on the status of this library. Oh, and I am also a fan of being able to re-map the SPI pins. In this case, I need to use SPI1 on Pins 5/6/20/21 because I need the FTM clock input functionality that seems to only be accessible through pins 0/1 on the Teensy 3.6. So this is one of those cases where re-mapping functionality would be a necessity.


Thanks,
Russ
 
Again, with the normal caveat that it is unclear whether some of this will ever make it into the mainline SPI library: yes....

Also, there is an existing library, dmaspi (https://github.com/crteensy/DmaSpi), that allows you to do everything. There are issues with that library on the T3.5 with SPI1/2, but those will be resolved at some point... (There is only one DMA source for SPI1 or SPI2, which can be either RX or TX, so it has to be worked a different way.)

The main reason I am hoping to get the ASYNC stuff into the mainline SPI library is that it hopefully makes it easier for programs to take advantage of it. That is, with my version you simply call SPI.transfer(buf, retbuf, cnt, callback), where several of these parameters are optional. For example, you don't have to have a callback (pass NULL), and you can query to find out if it is done.... It also tries to hide the details: for example, with my T3.5 implementation I can do a maximum of 511 bytes/words per transfer, so my internal ISR knows this and will restart the DMA to do the next chunk if needed...

Both my version of SPI and dmaspi allow for alternate pins. If I remember correctly, dmaspi relies on the underlying SPI library to set things up....

Also, on the concurrent SPI transfers: yes, you can have all three active, but with some normal caveats. That is, they may not all be able to run at absolute top speed, as they may have to contend for the memory bus...

And of course you can roll your own DMA code using the DMAChannel.h/cpp files that are part of the core. My version(ili9341_t3n) of the ili9341_t3 library does this for the T3.6 to support a logical frame buffer.

So there are lots of options available.
 
Support SPI Client?

I am not sure, if maybe I should start different thread for this.

Now that mainline SPI library is going to support all of the SPI busses as one Class and hopefully over time support some additional transfer methods and the like.

There is another portion of the SPI setup, that I have been wondering about, and that is of having library support for Teensy as an SPI Client.

For example with me building a "RPI Hat" like board, where it might be useful to have the Host (RPI or Odroid or UP), be able to do fast communications with the Teensy. I am already doing it with Usart but might be nice with SPI. And in these cases I believe it will be best for the Host to be master...

I have found at least one library that does this: https://github.com/btmcmahan/Teensy-3.0-SPI-Master---Slave

But was thinking it would be nice if it could be supported directly in main library and also handled the different SPI busses. Would also like to have the library not the sketch handle ISR.

Current thoughts are define a standard callback function, maybe like:
uint16_t (*transferSlaveCB)(uint16_t tx);

Will probably need a different begin, maybe something like beginSlave
Maybe here is where I pass in a pointer to the Callback function.

Will probably need something like:
void setSlaveSetting(SPISettings settings);
To be able to update the CTAR_SLAVE settings... Not sure if the SPISettings object is completely appropriate or complete enough. Example may want to set the transfer size.

Note in the above I have the CB with 16 bits in and out as it will be the CTAR Slave that will decide how much of the PUSHR_SLAVE data is valid and likewise your call back can decide how much of the RX data is valid from the POPR.
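A sketch of how a library-owned ISR might use such a callback, written so it can run on a PC. Only the transferSlaveCB type comes from the proposal above; serviceSlaveWord, frameBits and invertWord are hypothetical names, and the masking is just one way the frame size chosen in CTAR_SLAVE could limit which bits count.

```cpp
#include <cstdint>

// Callback shape proposed above: receives one word, returns one word.
typedef std::uint16_t (*transferSlaveCB)(std::uint16_t tx);

// Hypothetical stand-in for what the ISR might do per received word.
// `rxWord` plays the role of a POPR read; the return value is what would be
// written to PUSHR_SLAVE. `frameBits` stands for the frame size set in CTAR_SLAVE.
inline std::uint16_t serviceSlaveWord(transferSlaveCB cb, std::uint16_t rxWord,
                                      unsigned frameBits) {
    std::uint16_t mask = 0xFFFF;
    if (frameBits < 16) {
        mask = static_cast<std::uint16_t>((1u << frameBits) - 1u);
    }
    if (!cb) return 0;  // no callback registered: reply with zeros
    std::uint16_t reply = cb(static_cast<std::uint16_t>(rxWord & mask));
    return static_cast<std::uint16_t>(reply & mask);
}

// Example callback: reply with the received word inverted.
inline std::uint16_t invertWord(std::uint16_t rx) {
    return static_cast<std::uint16_t>(~rx);
}
```

With 8-bit frames only the low byte of both the received word and the reply survives, which is the "callback decides how much is valid" idea expressed as a mask.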

Notes to self (and others)

Only SPI CS of channel 0 is valid for CS for client mode: Which I believe on T3.2 is only pins 10 or 2...

To work in client mode, you must call SPI.setCS(pin) for one of these pins, which sets the pin into SPI mode (mode 2)

I assume that then when the SPI is active (not halted and MSTR is not set), when the SPI detects that the CS pin is asserted, it will process data on MISO/MOSI/SCK where the host is the one driving these.

Wondering: Are there ways to detect when our CS pin is asserted. digitalRead on this pin will probably not work as we are not in mode 1... Don't see any status bits giving me this.
Is this important? That is if the host says it is going to send me 30 bytes and only sends 20 and then drops CS, should I be able to detect this? ...

Thoughts? Useful?
 
Update on SPI slave mode - I have been playing with it. I have a test program that ties SPI to SPI1 on a T3.6. I have been trying to figure out whether it makes sense to have some form of API where the caller just provides a callback function. It works fine for write-type operations, but read is a little more interesting.

That is, if I set up the slave SPI ISR to interrupt on SPI_RSER_RFDF_RE, and I then do a POPR of the data to pass off to the callback, which returns something to push back, the data that I push actually arrives maybe 3 or 4 POPRs later on the host... I was expecting one... Still investigating.

My current test app, is working like emulating a set of registers. Right now I am doing Read/Write of 16 entries. So I have it where I first run a command word (Type transfer, start addr, cnt), followed by writes of that many entries... So the init code in my test app, set up Host register and client registers, so I can transfer both ways...

After init they looked like:
Code:
     0      1      2      3      4      5      6      7      8      9 
     0  65535  65534  65533  65532  65531  65530  65529  65528  65527
I do a transfer to send 10 words, which in theory should come close to swapping the two arrays.
Code:
0=2a0(3) 0=0(3) ffff=1(3) fffe=2(3) fffd=3(3) fffc=4(3) fffb=5(3) fffa=6(3) fff9=7(3) fff8=8(3) fff7=9(3) 
     0      0      0  65535  65534  65533  65532  65531  65530  65529 
     0      1      2      3      4      5      6      7      8      9
Note: in the first line, ignore the (3) stuff. It is data showing which words were sent from the host and which ones were pushed back. So I sent 2a0 (command: transfer 10 units starting at 0) and 0 was returned.

Could be as simple as timing... That is, if SPI0 sends 3 items before SPI1 gets and processes the interrupt....
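One way to picture the timing hypothesis is a toy model in which each reply reaches the host a fixed number of words late. This is only a sketch: hostReceives, lag and fill are hypothetical names, and the model says nothing about why the hardware would behave this way.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of the timing hypothesis: each reply the client pushes reaches the
// host `lag` words late, and the host reads `fill` until the first reply arrives.
inline std::vector<std::uint16_t> hostReceives(const std::vector<std::uint16_t> &clientRegs,
                                               std::size_t lag, std::uint16_t fill) {
    std::vector<std::uint16_t> seen(clientRegs.size(), fill);
    for (std::size_t i = lag; i < clientRegs.size(); ++i) {
        seen[i] = clientRegs[i - lag];
    }
    return seen;
}
```

Feeding in the array that started as 0, 65535, 65534, ... with an assumed lag of two data words and a fill of 0 gives 0 0 0 65535 65534 65533 65532 65531 65530 65529, the same as the row printed after the transfer. That is consistent with a fixed delay, though it does not prove timing is the cause.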

Now back to playing.
 
On the topic of SPI buses, I'm now finding a need to debug/troubleshoot my SPI.

What cheap logic analyzers are you using to debug your Teensy SPI projects?

Thanks so much!
 