All, I've uploaded the new v7 library, and modified the top post accordingly. I'll try to summarize the changes here.
Clocking
The library now supports all Teensyduino F_BUS frequencies:
60MHz, 56MHz, 48MHz, 36MHz, 24MHz, 16MHz, 8MHz, 4MHz, 2MHz
In addition, new rates have been added:
I2C_RATE_1800, I2C_RATE_2800, I2C_RATE_3000
The supported rates depend on the F_BUS setting, which in turn depends on the F_CPU setting. The current F_CPU -> F_BUS mapping (Teensyduino 1.21) is as follows. For a given F_BUS, if an unsupported rate is requested, then the highest available rate is used (unsupported rates always fall off from the high end).
Code:
              I2C_RATE (kHz)
F_CPU  F_BUS  3000 2800 2400 2000 1800 1500 1200 1000  800  600  400  300  200  100
-----  -----  ---------------------------------------------------------------------
 168M    56M     -    y    y    y    y    y    y    y    y    y    y    y    y    y
 144M    48M     -    -    y    y    y    y    y    y    y    y    y    y    y    y
 120M    60M     y    y    y    y    y    y    y    y    y    y    y    y    y    y
  96M    48M     -    -    y    y    y    y    y    y    y    y    y    y    y    y
  72M    36M     -    -    -    -    y    y    y    y    y    y    y    y    y    y
  48M    48M     -    -    y    y    y    y    y    y    y    y    y    y    y    y
  24M    24M     -    -    -    -    -    -    y    y    y    y    y    y    y    y
  16M    16M     -    -    -    -    -    -    -    -    y    y    y    y    y    y
   8M     8M     -    -    -    -    -    -    -    -    -    -    y    y    y    y
   4M     4M     -    -    -    -    -    -    -    -    -    -    -    -    y    y
   2M     2M     -    -    -    -    -    -    -    -    -    -    -    -    -    y
(y = supported, - = not supported)
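The fallback rule can be sketched in a few lines. This is a standalone illustration built only from the table above, not the library's actual setRate() code. The names `maxRateForBus` and `effectiveRate` are hypothetical, rates are given as the numeric suffix of the I2C_RATE_xxx names, and treating an F_BUS that falls between table rows as the next lower row is my assumption.

```cpp
#include <cstdint>

// Highest supported I2C_RATE (numeric suffix) for a given F_BUS, read off the
// table above. Hypothetical helper, not part of the library. An F_BUS between
// two table rows is treated as the lower row (assumption).
constexpr uint32_t maxRateForBus(uint32_t fBusHz) {
    return fBusHz >= 60000000 ? 3000
         : fBusHz >= 56000000 ? 2800
         : fBusHz >= 48000000 ? 2400
         : fBusHz >= 36000000 ? 1800
         : fBusHz >= 24000000 ? 1000
         : fBusHz >= 16000000 ? 800
         : fBusHz >=  8000000 ? 400
         : fBusHz >=  4000000 ? 200
         : 100;
}

// Rate that ends up being used: the requested rate, capped at the highest
// rate the current F_BUS supports.
constexpr uint32_t effectiveRate(uint32_t fBusHz, uint32_t requestedRate) {
    return requestedRate > maxRateForBus(fBusHz) ? maxRateForBus(fBusHz)
                                                 : requestedRate;
}
```

In this sketch, asking for 3000 at F_BUS = 48MHz gives 2400, and any request at F_BUS = 2MHz gives 100.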
The rates are not directly equivalent to SCL clock speeds. I've measured the SCL rates for the 48/56/60MHz F_BUS speeds on a logic analyzer (refer to notes in .cpp file, setRate function), and the results are inconsistent. The peripheral limits the actual SCL speeds to well below the theoretical speeds. To get a better idea of throughput I've measured the transfer time for a 128 byte transfer across different F_CPU / F_BUS / I2C_RATE combinations (specifically the interesting overclock speeds). This is shown below.
A few takeaways here:
1) To the first order, it doesn't really matter what F_CPU / F_BUS combination you use, for a given I2C_RATE the variation is only slight.
2) The 48MHz bus speed seems to work slightly better than the irregular 56MHz / 60MHz speeds.
3) The F_CPU appears to be more important than F_BUS, as the peak performance is at the 144MHz / 48MHz corner.
4) The difference from the standard 100kHz Arduino rate to the peak rate is about 10x. Likewise, the difference from the standard 400kHz rate to the peak rate is about 2.5x.
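For a rough sense of scale, here is what the transfer time would be if SCL really ran at the nominal rate. This is a minimal sketch, assuming 9 clocks per byte (8 data bits plus the ACK bit) and one address byte, and ignoring START/STOP and any clock stretching. `idealTransferMicros` is a hypothetical helper, not something from the library.

```cpp
#include <cstdint>

// Ideal time in microseconds to move dataBytes plus one address byte at a
// given SCL frequency, counting 9 SCL clocks per byte (8 data bits + ACK).
// START/STOP and clock stretching are ignored (assumption).
constexpr uint64_t idealTransferMicros(uint32_t dataBytes, uint32_t sclHz) {
    return (static_cast<uint64_t>(dataBytes) + 1) * 9 * 1000000 / sclHz;
}
```

Under these assumptions a 128-byte transfer needs 11610 us at 100kHz and about 2902 us at 400kHz. Since the actual SCL falls well below the nominal rate at the high end, measured times there will be longer than this ideal figure.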
New Operating Modes
Old versions of the library were exclusively interrupt based. There are now two new modes of operation: DMA and Immediate. These modes are only for Master operation.
The library supports DMA mode transfers for Master send and receive. DMA mode has little to no effect on I2C operating speed, however it does greatly reduce the number of ISR calls needed to service the I2C. You can observe this in the following captures, whereby the bottom plot in each indicates the ISR activity.
[Capture: Interrupt Mode]
[Capture: DMA Mode]
The gain that can be expected from this depends on a particular application's traffic. For many applications it may be marginal. On a 256-byte transfer at 400kHz, I've measured about a 10% increase in available CPU time when using DMA. So for a large background task such as driving a display, or using I2C as a networking channel, it may be beneficial. For small-message, intermittent traffic such as sensors, there may be little benefit.
Unfortunately there is no DMA mode for Slave operation. I'll elaborate a bit on this and other DMA .. issues .. below.
Also, if no DMA channel is available, the library automatically falls back to standard interrupt mode.
The other new mode is Immediate. In this mode the ISR is not used, the call will loop and wait for the I2C operations to complete. As such this mode is always blocking regardless of call. Configuring for Immediate mode essentially makes the library operate the same as the standard Wire library. However this mode is also utilized in special cases involving priority (see next).
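The mode rules described so far (new modes are Master-only, and DMA falls back to interrupt mode when no channel is free) can be summarized in a small sketch. This is an illustration of the rules as stated in this post, not the library's code. `Role`, `OpMode` and `effectiveOpMode` are hypothetical stand-ins for the real I2C_OP_MODE_ISR / I2C_OP_MODE_DMA / I2C_OP_MODE_IMM settings.

```cpp
// Hypothetical stand-ins for the library's mode settings.
enum class Role { Master, Slave };
enum class OpMode { Isr, Dma, Imm };

// Mode that would actually be used for a given request:
//  - Slave operation always runs in ISR mode
//  - DMA without a free DMA channel falls back to ISR mode
//  - otherwise the requested mode is used as-is
constexpr OpMode effectiveOpMode(Role role, OpMode requested, bool dmaChannelFree) {
    return role == Role::Slave                           ? OpMode::Isr
         : (requested == OpMode::Dma && !dmaChannelFree) ? OpMode::Isr
         : requested;
}
```

So a Master asking for DMA with a free channel gets DMA, the same request with no channel free gets ISR, and a Slave gets ISR no matter what is requested.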
Priority Escalation
Since the library is generally configured to be interrupt based, there is always the problem of someone launching I2C calls from inside a higher-priority interrupt, thereby blocking the I2C ISR. To fix this, prior to engaging the ISR, the library will first try to determine the priority of the calling function, and if necessary it will adjust the I2C priority to a high enough level to exceed that of the calling function. If this is not possible (eg. calling function has a priority of zero), then it will revert to Immediate mode. As such it should not be possible to block the I2C from running.
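The decision described above can be sketched as plain logic. This is a standalone model, not the library's implementation. It assumes the ARM NVIC convention that a lower number means a more urgent priority, a priority step of 16, and that code running outside any interrupt is represented by 256. The names `Action`, `Decision` and `resolvePriority` are hypothetical.

```cpp
// What to do before starting an interrupt-driven transfer.
enum class Action { KeepIsr, RaiseI2cPriority, UseImmediate };

struct Decision {
    Action action;
    int i2cPriority;  // priority the I2C interrupt ends up with
};

// Smallest usable priority increment (assumption: 16).
constexpr int kPriorityStep = 16;

// callerPriority: priority of the code making the I2C call, 0 = most urgent,
//                 256 = not inside any interrupt (assumption).
// i2cPriority:    current priority of the I2C interrupt.
inline Decision resolvePriority(int callerPriority, int i2cPriority) {
    if (i2cPriority < callerPriority) {
        // The I2C ISR can already preempt the caller, nothing to change.
        return {Action::KeepIsr, i2cPriority};
    }
    if (callerPriority >= kPriorityStep) {
        // Move the I2C ISR one step above the caller so it can run.
        return {Action::RaiseI2cPriority, callerPriority - kPriorityStep};
    }
    // Caller is already at the top (e.g. priority 0): no room to go higher,
    // so the transfer has to run in Immediate mode instead.
    return {Action::UseImmediate, i2cPriority};
}
```

Note that an equal priority is treated the same as a higher one here, since an interrupt cannot preempt another one at its own level.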
New Functions
The following functions have been added to support the above features:
1) Added new begin() functions to allow setting the initial operating mode. This just adds an operating mode argument to the existing functions. If the mode argument is not specified then begin() calls will default to ISR mode (same as previous behavior):
- begin(i2c_mode mode, uint8_t address, i2c_pins pins, i2c_pullup pullup, i2c_rate rate, i2c_op_mode opMode) - Master or Single-address Slave
- begin(i2c_mode mode, uint8_t address1, uint8_t address2, i2c_pins pins, i2c_pullup pullup, i2c_rate rate, i2c_op_mode opMode) - Address-range Slave
- whereby i2c_op_mode can take the following values: I2C_OP_MODE_ISR, I2C_OP_MODE_DMA, I2C_OP_MODE_IMM
2) Added new functions:
- uint8_t setOpMode(i2c_op_mode opMode) - used to change operating mode on the fly (only allowed when bus is idle)
- void sendTransmission() - non-blocking Tx with implicit I2C_STOP, added for symmetry with endTransmission()
- uint8_t setRate(uint32_t busFreq, i2c_rate rate) - used to set I2C clock dividers to get desired rate. busFreq allows devices which alter their running frequency to recalibrate the I2C rates to the new freq. This form uses an i2c_rate enum argument.
- uint8_t setRate(uint32_t busFreq, uint32_t i2cFreq) - used to set I2C clock dividers to get desired SCL freq. busFreq allows devices which alter their running frequency to recalibrate the I2C rates to the new freq. This form uses a uint32_t frequency argument (quantized to nearest i2c_rate based on empirical measurements)
3) Added new Wire compatibility functions (mostly uint8_t type casts):
- void setClock(uint32_t i2cFreq) - (note: this is actually a degenerate form of setRate() with busFreq == F_BUS)
- uint8_t endTransmission(uint8_t sendStop)
- uint8_t requestFrom(uint8_t addr, uint8_t len)
- uint8_t requestFrom(uint8_t addr, uint8_t len, uint8_t sendStop)
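To illustrate what "quantized to nearest i2c_rate" means for the frequency form of setRate(), here is a minimal sketch that snaps a requested frequency to the nearest nominal rate. Note that the library quantizes based on empirical measurements, so its results may differ from this nominal version. `kRatesKHz` and `nearestRateKHz` are hypothetical names, and breaking ties toward the lower rate is my choice.

```cpp
#include <cstdint>

// Nominal rates (numeric suffix of the I2C_RATE_xxx names), lowest first.
const uint32_t kRatesKHz[] = {100,  200,  300,  400,  600,  800,  1000,
                              1200, 1500, 1800, 2000, 2400, 2800, 3000};

// Returns the nominal rate closest to the requested frequency in Hz.
// On a tie the lower rate wins, since it is visited first.
inline uint32_t nearestRateKHz(uint32_t i2cFreqHz) {
    uint32_t best = kRatesKHz[0];
    int64_t bestDiff = INT64_MAX;
    for (uint32_t rate : kRatesKHz) {
        int64_t diff = static_cast<int64_t>(rate) * 1000 -
                       static_cast<int64_t>(i2cFreqHz);
        if (diff < 0) diff = -diff;
        if (diff < bestDiff) {
            bestDiff = diff;
            best = rate;
        }
    }
    return best;
}
```

With this sketch a request for 390000 maps to 400, anything below 100kHz maps to 100, and anything above 3MHz maps to 3000. A rate picked this way would still be subject to the F_BUS limits described under Clocking.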
Fixes and Cleanup
- Some bug fixes were done in Slave range code and in arbitration (note: arbitration has never been vetted on this library, it is on my to-do list).
- Removed the I2C1 defines as they are redundant now that kinetis.h has them
- Completely removed all debug code and the rbuf dependency
- Cleaned and reworked the examples to simplify the code, test new things, and eliminate debug. In the examples I tried to group the Wire calls into a tight block so it is obvious which commands are being used.
- Added an interrupt example to test running I2C from inside an ISR.
To-Do
1) When I get some time I'm going to investigate the interesting anti-lockup technique that Swap_File posted in #173:
https://forum.pjrc.com/threads/21680-New-I2C-library-for-Teensy3?p=58368&viewfull=1#post58368
2) Verify arbitration
Other DMA Stuff
Ok, so now on the topic of DMA I have to rant a bit:
Unless I am missing a magic bit setting somewhere (which really should not exist because it would have to be labelled "Breaks I2C Protocol Bit"), the DMA implementation with this I2C peripheral is absolute ****. You see, one of the fundamental elements of the I2C protocol is the notion of an Acknowledge bit. When you talk to a Slave you will either get an ACK or a NACK. When Slaves have a problem or if they don't exist you get a NACK. When you get a NACK you STOP what you're doing. Now DMA sets up an automated transfer from point A to point B. You would think if you were in the business of moving things from point A to point B and you got a NACK you would STOP. Perhaps something that would result in a DMA error, yes? NO!! This DMA system will happily blast out whatever you told it to regardless of the ACK/NACK responses.
Is this a problem? Technically (ideally) NO. For well constructed Slaves if they don't recognize their address at the beginning of the message they should ignore the entire message - even if it involves a string of 1000 bytes and 1000 NACKs. More importantly though, for the Master device it tosses the entire notion of verified transmission out the window - the Master in this case can only know if the message was sent, not if it was received.
To circumvent this pile, I've implemented the following workaround - when in DMA mode, the first and last bytes of the messages are sent via ISR-based routines. The bulk of the message in the middle is sent via DMA. The reason for this is that if the first byte (address) NACKs then you know the Slave doesn't exist, and it will STOP immediately. The last byte can also generate a NACK, which would tell you if the Slave died or had some kind of error somewhere between the address and the end of the message. There might be corner cases which are not caught, but for most cases it should work well enough. However, to mitigate the overhead of this nonsense I've set the minimum overall message length to 5 bytes when in DMA mode, eg. Address (ISR) - 3 bytes data (DMA) - last byte data (ISR). Messages below that size will automatically transfer in ISR mode, even if DMA mode is configured. That's for Master Tx mode, and for Master Rx mode a similar approach is used.
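The split described above can be sketched as a small planning function. This only models the byte counts from the description (first and last byte via ISR, the middle via DMA, minimum overall length of 5 including the address), it is not the library's code. `TransferPlan`, `kMinDmaLen` and `planMasterTx` are hypothetical names.

```cpp
#include <cstddef>

// How many bytes of a Master Tx message go through each path.
struct TransferPlan {
    size_t isrHead;  // bytes sent by the ISR before DMA starts (the address)
    size_t dma;      // bytes sent by DMA in the middle
    size_t isrTail;  // bytes sent by the ISR after DMA finishes (last byte)
};

// Minimum overall message length (address included) for DMA to be used.
constexpr size_t kMinDmaLen = 5;

// totalLen counts the address byte plus all data bytes.
inline TransferPlan planMasterTx(size_t totalLen, bool dmaConfigured) {
    if (!dmaConfigured || totalLen < kMinDmaLen) {
        // Too short (or DMA not configured): everything goes through the ISR.
        return {totalLen, 0, 0};
    }
    // Address via ISR, middle via DMA, last byte via ISR so that a NACK at
    // either end can still be detected.
    return {1, totalLen - 2, 1};
}
```

For the 5-byte minimum this gives 1 + 3 + 1, matching the example above, while a 4-byte message stays entirely in ISR mode.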
Slave mode is a different story. I am unable to figure out a way to make Slave mode work with DMA. Slave Tx/Rx transmissions are indeterminate in size, and DMA blocks normal ISR code from running, so there is no apparent way to terminate a transfer. Stupidly, even major bus events like a STOP do not trigger an ISR interrupt when DMA is active. In theory a similar fix to the rising-SDA hack used by the normal ISR Slave Rx code could be used to terminate a transfer, but at that point you are running an ISR on every byte, so where is the benefit of DMA versus the normal ISR code? Because of this, Slave mode will only work in ISR mode regardless of requested mode in begin() or setOpMode().
Another caveat - Timeouts may behave erratically when using DMA. This is because the DMA bulk transfer doesn't trigger the ISR, so it never gets a chance to detect the timeout condition and exit early. It may detect the timeout on the last byte when the ISR re-engages, but depending on bulk transfer length it could be any amount of time.
Note that I'm not trying to dissuade anyone from using DMA mode (I use it myself), and when it works it could be of some benefit, and really it is essentially free so why not. However just be aware there are workarounds in play (similar to Slave mode hack). So report any problems you have.