propose adding timeout and NAK error counters to library
@nox771 I'd like to add some uint32_t error counters for timeouts (each of which triggers a call to resetBus()) as well as address NAKs and data NAKs, and I imagine these would be worth adding to the library. I don't want to fork and diverge. Why I propose this: if a timeout and recovery (with resetBus()) happens with I2C_AUTO_RETRY defined, the timeout is "fixed", so it's not returned by status() and I can't see and log it in my application code. The NAKs do get returned by status(), but if I'm adding the timeout counter anyway, those are the other two solid errors, so why not log all three? The counters would stop at the max of 0xFFFFFFFF rather than wrap to zero, and could be cleared by the application if desired. 2**32 is a lot of errors (about 4.3 billion), so that should be large enough. In a realistic use scenario I'm getting 4500 I2C messages per second (100 kHz SCL). If they were all errors, it would take about 265 hours to fill a 32-bit int.
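For anyone who wants to check the 265-hour figure, here is a quick sketch of the arithmetic. The function name hoursToSaturate is just something I made up for illustration; it is not part of i2c_t3.

```cpp
#include <stdint.h>

// Hypothetical helper (not part of i2c_t3): hours until a uint32_t counter
// reaches 0xFFFFFFFF if it is incremented errorsPerSecond times per second.
constexpr double hoursToSaturate(double errorsPerSecond)
{
    // counts available / counts per second = seconds; divide by 3600 for hours
    return (static_cast<double>(UINT32_MAX) / errorsPerSecond) / 3600.0;
}
```

At 4500 errors per second this works out to a little over 265 hours, or roughly 11 days of nothing but errors.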
So I am asking for input from you and others before I do this (with the hope of it getting merged into master). These would be small changes with little impact on the library's performance: the code only increments a counter when an error occurs, and three uint32_t counters would take 12 bytes of data space. I'd put them in a struct for error counts, or they could be added to the existing i2cStruct. Since that's a private struct, there would be simple functions to view and clear them.
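To make the idea concrete, here is a minimal sketch of what I have in mind, written as standalone code rather than as a patch. All of the names (I2CErrorCounts, saturatingIncrement, clearErrorCounts) are placeholders I invented for this post; none of them exist in i2c_t3, and the real version would live behind accessor functions on the Wire object.

```cpp
#include <stdint.h>

// Hypothetical struct, not part of i2c_t3: one counter per "solid" error.
struct I2CErrorCounts
{
    uint32_t resetBus;  // timeouts that led to a resetBus() call
    uint32_t addrNak;   // address NAKs
    uint32_t dataNak;   // data NAKs
};

// Three 32-bit counters should occupy 12 bytes of data space.
static_assert(sizeof(I2CErrorCounts) == 12, "expected 12 bytes");

// Increment a counter but stick at 0xFFFFFFFF instead of wrapping to zero.
inline void saturatingIncrement(uint32_t &counter)
{
    if (counter < UINT32_MAX)
    {
        ++counter;
    }
}

// Let the application clear all counters, e.g. after logging them.
inline void clearErrorCounts(I2CErrorCounts &counts)
{
    counts.resetBus = 0;
    counts.addrNak = 0;
    counts.dataNak = 0;
}
```

The only cost in the error path is one compare and one increment, and nothing at all runs when a transfer succeeds.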
I don't want to step on any toes proposing such a change.
I have a first hack at this in my fork of i2c_t3 if anyone wants to try it. The new function is Wire.resetBusCountRead(). I'm using it to test the Systronix_PCA9548A library since that MUX seems to rather easily get into a state where it sticks SDA low and there is no recovery other than the wonderful resetBus() of @nox771. Here is typical output after a TyQt reset which apparently left the mux in such a state. But note that it does recover! Woohoo!
You can see that in the 9548A lib I have set up text strings indexed by the Wire.status() return codes, so the description is printed along with the code. This is a single-master system, so I am surprised to see the ARB_LOST error. It also surprises me that this mux is so inclined to get stuck: just resetting the Teensy will do it about one time in ten, I'm guessing because the reset terminates an I2C message in progress.
Code:
PCA9548A Library Test Code at 0x70
write CFG: 1
init failed with return of 0x04
I2C_TIMEOUT
Interval is 1 sec, Setup Complete!
Send Q/q for quiet, V/v for verbose output.
control write failed with return of 0x07, I2C_ARB_LOST
control write failed with return of 0x04, I2C_TIMEOUT
control write failed with return of 0x07, I2C_ARB_LOST
et:3 Good:1152 1152/sec bad:4 busReset: 2
et:4 Good:5760 2880/sec bad:4 busReset: 2