wire.endTransmission() hangs if it can't connect with I2C slave device?

Status
Not open for further replies.

bboyes

Well-known member
I have narrowed a hanging Teensy3 program down to this line of code:
flag = Wire.endTransmission();
If that line is present, and the I2C slave connection is bad (in my case it's a new PC board being debugged), rather than failing and returning with an error code, execution appears to hang.

The error returns are at http://arduino.cc/en/Reference/WireEndTransmission

Obviously not a good thing in a control program.

Has anyone else seen this? To test it, just try writing to an I2C slave and open the SDA line. I'll investigate further. I wonder if Teensy++2/2 behave differently.
 
OK, the Teensy2 library does not fail the same way. It returns an error code of 2 = NACK on address, which would be the first part of the attempted transmission. This would be appropriate.
 
Anyone know where the Teensy3 source code for this I2C Wire would be? I can find the folder arduino-1.0.5\hardware\teensy\cores\teensy3 but it's not obvious which file has the Wire library for I2C. Paul would be the obvious expert here.
 
Look in libraries/Wire.

I have often considered adding a timeout in endTransmission(). If the I2C bus is stuck low, technically that means another I2C master has control of the bus and Teensy is supposed to wait until it's done and lets the lines return to high. But in practice, it usually just means there's a hardware problem and endTransmission() just waits forever.

I believe endTransmission() returns a code to indicate success or failure. But does anyone bother to check it?
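To illustrate the timeout idea, here is a minimal sketch of a bounded wait. It is not the Wire library's actual code: `waitWithTimeout`, `WaitResult`, and both callables are hypothetical names. `now_ms` stands in for Arduino's millis(), and `done` stands in for whatever bus-status check the library polls. On timeout it returns 4, matching the existing "other error" code.

```cpp
#include <stdint.h>

// Hypothetical status values; 4 mirrors Wire's "other error" code.
enum WaitResult : uint8_t { WAIT_OK = 0, WAIT_TIMEOUT = 4 };

// Poll done() until it returns true or timeout_ms has elapsed.
// now_ms() is an assumed millisecond clock (millis() on Arduino).
template <typename DoneFn, typename ClockFn>
WaitResult waitWithTimeout(DoneFn done, ClockFn now_ms, uint32_t timeout_ms) {
    const uint32_t start = now_ms();
    while (!done()) {
        // Unsigned subtraction stays correct when the counter wraps around.
        if (static_cast<uint32_t>(now_ms() - start) >= timeout_ms) {
            return WAIT_TIMEOUT;
        }
    }
    return WAIT_OK;
}
```

A caller would replace each unbounded `while` loop on a bus flag with a call like this and pass the result back up, so a stuck line produces an error return instead of a hang.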
 
OK, the Teensy2 library does not fail the same way. It returns an error code of 2 = NACK on address, which would be the first part of the attempted transmission. This would be appropriate.

Do you have pullup resistors connected to Teensy 3.0? Can you please check with a voltmeter if both lines are at 3.3V while endTransmission() is stuck?
 
Yes, I do. This system is an environmental control for animals. If it fails, they will be hurt or die (too hot or too cold). It has happened with a competitor's product, and the animals were worth about $10K each. In our case they will not be worth that much, but there will be a few dozen per system. There are multiple Teensy modules, one per zone, which will have different temp and humidity settings.

So the point is: this system can't fail, or if it does it has to do so in a safe manner. So we check all return values so that we know we can believe sensor data, and that we are properly driving heaters and lights.

This will also be expected to have a service life of 5-10 years so it has to be reliable.
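As a rough sketch of the return-value checking described above, the status byte from Wire.endTransmission() can be decoded as follows. The code values are the ones listed on the Arduino reference page linked earlier in the thread; the helper names `describeEndTransmission` and `transferSucceeded` are hypothetical.

```cpp
#include <stdint.h>

// Decode the status byte returned by Wire.endTransmission().
const char* describeEndTransmission(uint8_t code) {
    switch (code) {
        case 0:  return "success";
        case 1:  return "data too long for transmit buffer";
        case 2:  return "NACK on address";
        case 3:  return "NACK on data";
        case 4:  return "other error";
        default: return "unknown code";
    }
}

// Hypothetical policy helper: only a zero status means the write can be trusted.
bool transferSucceeded(uint8_t code) {
    return code == 0;
}
```

In a control loop, any non-zero status would be logged and counted, and repeated failures would trigger the fail-safe mode. This only helps if the call returns at all, which is the point of this thread.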
 
Yes, we have 10K pullups, which are a little on the weak side, but the signals look good at 100 kbit/s. Teensy2 does not hang in this way.

Oddly I just tried another Teensy3 module (my tech is working on the board and Teensy in use earlier) hooked to nothing and of course I don't see any I2C signals because it has no pullups. But it does fail with an error 4, which is "other error". At least it isn't a hang. This Teensy3 is at a different virtual COM port too... why is that? I guess one COM port for each Teensy serial number?

OK, I got another Teensy3 (COM 17) hooked up to nothing with pullups and it is failing but returning error 2, NAK on address, which is correct.

If I open the clock line, so no pullup any more, it hangs. The data line is high, and clock is floating/low.

So the problem is a little different than I first thought. It seems to be as you said, there is a wait forever if at least one of the lines is stuck low. This is harder to achieve with Teensy2 since it has the weak pullups built in.

In any case we don't ever want Teensy3 to hang. We will also have a watchdog, but this could just create an endless loop. We'd rather have Teensy return - if the bus is stuck, there's no value in waiting forever. We'd like to know there is an error, so that our code can try to send an alert and if necessary go to a fail safe mode.

Can this be easily done as a library change? Would Arduino want to incorporate it into their code?

This leads me to wonder if we should split our I2C devices across the alternate I2C pins (Teensy3 16 & 17) so that we might survive a partial sensor/control bus failure? This sort of exception handling is always tough.

Thanks for the helpful info. Could the lib be changed so that it does not wait forever but returns an error? Code 4 would be OK. Ideally there would be a "stuck bus" code, but maybe this happens so rarely it isn't useful information.

We had a peer TWI network on another project, with four nodes, on AVRs, and tested millions of messages and never saw a stuck-bus error. That was all C code, not Arduino, and we never waited forever. A stuck bus would have broken the sensor/control data channel anyway, but each node could go to a safe state.

There is an NXP app note about bidirectionally buffering and voltage-level isolating the I2C bus with a single N-FET such as BSS138, and in that case an unpowered node on one side doesn't hamper operation of the other, but it does nothing for a line stuck low. Maybe we isolate two groups of somewhat redundant sensors in hopes that at least half will work at all times if one part is unpowered. The unpowered portion would of course lose its pullups, so that needs to be considered.
 
When I reconnect the clock pullup on the hung Teensy3, after a long pause (minutes), both lines are high on a scope, but execution does not recover.
 
The source code for Wire is interesting. Having an eternal hang in an embedded system is never a good idea; more so if it is deliberate(!). I'm surprised to see that this is the way Wire has been for quite a while. So we will need to fix this if we are to continue using Arduino for development vs rolling our own custom C/C++ code. I will keep this thread updated when we learn/do more.
 
There is an alternative I2C library available on the forum that may address your concerns. For example, it includes a timeout and restart.

Not sure it'll help but it's worth checking it out.
 
I too am getting a NACK on every endTransmission() when using #include <i2c_t3.h>. How do I use the wire.cpp in my sketch? Does it replace something in the wire.h library, or does it go in the i2c_t3 folder? I'm still learning my way around this environment, so better to ask before I leap.
 
Nope, Wire and i2c_t3 are completely different libraries. Don't copy the file to i2c_t3.
I'd look for a hardware problem and use i2c_t3.

Do you use pullup resistors? Which value? And which hardware is attached?
 
It is an OpenPipe breakout used for emulating all sorts of bagpipes. An old schematic showed 10K on SDA and SCL to VCC. I can't get it apart, but at the connector I measure 1.7M; there is a part on it I don't recognize (image attached) that may be throwing off the measurement. I'm trying to contact the maker now to see if measuring at the connector will give me a valid result. openpiipe breakout signal leads.jpg
 
I got a chance to scope the signals and they look good. I also tried replacing the wire.cpp in the Teensy Wire library, which makes sense since Paul was the one who made the modifications. As an aid in my diagnosis, I also reduced the I2C speed to 100 kHz. The same problem remains: I always get a NACK on endTransmission(). I printed the MPR121 datasheet and the I2C protocol so I can try to make sense of the scope trace. Hopefully something will jump out at me.
 
I had a bunch of issues with the wire library hanging running on Arduinos. I had one master and 4 slaves on a 25' long bus. Never do that again.

I forgot if I tried Paul's library but I got 100% of the hanging to stop by the following...

1) reducing the communication down to what was necessary
2) I used this WSWire library... https://github.com/steamfire/WSWireLib
3) I reduced the I2C speed to 50 kHz as follows (see Gammon's website)
... it looks like you can drop the speed down to 12.5 kHz; not sure if I tried that

in setup..

void setup() {
  Serial.begin(9600);
  Wire.begin();
  TWBR = 152;  // 50 kHz SCL, assuming a 16 MHz AVR with TWI prescaler 1
}

Also, make sure the WSWire library matches the new speed. There is a macro in twi.h called TWI_FREQ. See Gammon's website for the formula.
#ifndef twi_h
#define twi_h

#include <inttypes.h>

//#define ATMEGA8

#ifndef TWI_FREQ
#define TWI_FREQ 50000L //changed to 50khz... was 100000L or 100khz
#endif
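
For reference, the relationship between TWBR and bus speed on ATmega parts is SCL = F_CPU / (16 + 2 * TWBR * prescaler). The sketch below solves that for TWBR. The function name `twbrFor` is hypothetical, and the examples assume a 16 MHz CPU clock as on an Uno.

```cpp
#include <assert.h>
#include <stdint.h>

// Solve SCL = F_CPU / (16 + 2 * TWBR * prescaler) for TWBR.
// Returns -1 if the result does not fit the 8-bit TWBR register
// or the requested SCL is too fast for the given CPU clock.
int32_t twbrFor(uint32_t f_cpu_hz, uint32_t scl_hz, uint32_t prescaler) {
    assert(scl_hz > 0 && prescaler > 0);
    const uint32_t ratio = f_cpu_hz / scl_hz;
    if (ratio < 16) {
        return -1;  // SCL faster than F_CPU / 16 is not reachable
    }
    const uint32_t twbr = (ratio - 16) / (2 * prescaler);
    return twbr <= 255 ? static_cast<int32_t>(twbr) : -1;
}
```

With a 16 MHz clock and prescaler 1, `twbrFor(16000000, 50000, 1)` gives 152 and `twbrFor(16000000, 100000, 1)` gives 72. For 12.5 kHz the value no longer fits in 8 bits at prescaler 1, so a prescaler of 4 is needed, which gives 158.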
 
I've not used it, but I recall seeing this device on tindie which claims to be able to allow you to extend i2c wires long distances over ethernet cables:

Also there is this i2c mux which has support for resetting each of the 4 i2c lines it is muxed to:
 
Those are some cool chips. Not sure why I didn't go that route. I ended up designing my own board around this bus extender chip. Once I fixed the problem I didn't populate the boards...

P82B715TD

 
I'll give it a try. I also dug out an Arduino Uno and ran the OpenPipe sample with its I2C library. I added a return code check to Wire.endTransmission() and it too is getting 2 = NACK. I ran into some other problems, so I wasn't able to diagnose it further, but now I'm thinking that the NACK is "normal". The MPR121 datasheet doesn't give detailed flows.
 
I tried an I2C scanner and every address NACKed. Since everything looked correct, I swapped the SCL and SDA leads. At that point I got an ACK from my device and NACKs from all others. I've contacted the maker since it was his wiring diagram I followed. This was done using the i2c_t3 library.
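
For anyone repeating this test, the core of an I2C scanner looks roughly like the sketch below. It is written against a generic `probe` callable so it can be compiled off-target; on real hardware `probe(addr)` would be something like Wire.beginTransmission(addr) followed by returning Wire.endTransmission(). The name `scanBus` and its signature are hypothetical.

```cpp
#include <stddef.h>
#include <stdint.h>

// Probe every normal 7-bit address and record those that ACK.
// probe(addr) must return 0 on ACK and non-zero otherwise, matching
// the convention of Wire.endTransmission().
// Returns the number of addresses written to found[].
template <typename ProbeFn>
size_t scanBus(ProbeFn probe, uint8_t* found, size_t capacity) {
    size_t count = 0;
    // 0x00-0x07 and 0x78-0x7F are reserved by the I2C specification.
    for (uint8_t addr = 0x08; addr <= 0x77; ++addr) {
        if (probe(addr) == 0 && count < capacity) {
            found[count++] = addr;
        }
    }
    return count;
}
```

If every address NACKs, as happened here before the leads were swapped, the problem is usually wiring or pullups and not the target device's address.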
 