QNEthernet connect() grief

quiver

I make interfacing devices that use a mix of UDP and TCP, and usually have a small webserver for configuration. I was using embedded T3.2's, then T4's with Wiznet W5500's, but these days I largely use the Teensy 4.1 with its native ethernet.

Last year I moved a couple of projects from the NativeEthernet library to QNEthernet because NativeEthernet stalled when it received UDP packets with zero-length payloads. This took a very long time to diagnose, and it only happened with particular control equipment! But QNEthernet fixed it, and I've since got into AsyncWebServer_Teensy41, so I'm fairly locked into QNEthernet.

In a new application I need to make TCP client connections, but I'm finding that QNEthernet just isn't reliable at all. The following program runs for hours without a failure on NativeEthernet. On QNEthernet I'm lucky to get a few minutes; more often than not it fails in the first 30 seconds.

The problem appears to be in the connect() function: most of the time it returns 0 or -1, with no discernible pattern. For the purposes of this test I'm just pointing the program at my MacBook Pro and using nc -lvk 80 in a terminal window as my receiver.

I love how well documented QNEthernet is on GitHub and I've tried to follow Shawn's design principles to the letter. Most other issues I've seen here appear to relate to the write function, not connect().

Here's my test code. Obviously, when I run the test with NativeEthernet I have to adapt it a little to swap out the functions it doesn't support.

C++:
#include <QNEthernet.h>

using namespace qindesign::network; // Must be before IP Address declarations

EthernetClient client;

IPAddress thisIP(192,168,10,74);
IPAddress mask(255,255,255,0);
IPAddress gwIP(192,168,10,254);
IPAddress remoteIP(192,168,10,79);

uint32_t timeMgmt;
uint16_t failRate, counter;
bool online;

void setup() {

  Serial.begin(115200); // Serial Monitor

  if (!Ethernet.begin(thisIP,mask,gwIP)) {
    Serial.println("Failed to start Ethernet");
  }

} // END SETUP

void loop() {

  if (!online && Ethernet.linkState()) {
    online = true;
    Serial.println("System Online.");
  }

  if (millis()>timeMgmt) {
    timeMgmt = millis() + 500;
    if (online) {
      char txBuf[1024];
      char numAsChar[8];
      uint16_t packetSize = 0;
      memset(txBuf,0x00,sizeof(txBuf));
      itoa(counter,numAsChar,10);
      strcat(txBuf,"Let's count: ");
      strcat(txBuf,numAsChar);
      strcat(txBuf,"\r\n");
      for (uint16_t i=0;i<sizeof(txBuf);i++) {
        if (txBuf[i]==0x00) {
          packetSize = i;
          break;
        }
      }
      int8_t connectStatus = client.connect(remoteIP,80);
      if (connectStatus==1) {
        client.writeFully(txBuf,packetSize);
        client.flush();
        client.close();
        Serial.print("Tx: ");
        for (uint16_t i=0;i<packetSize;i++) Serial.print(txBuf[i]);
      } else {
        Serial.print("Connection failed - error: ");
        Serial.println(connectStatus);
        failRate++;
      }
      Serial.print("Failures: ");
      Serial.print(failRate);
      Serial.print(" in ");
      Serial.println(counter);
      counter++;
    }
  }
} // END LOOP

Is there something I can try? Can you reproduce the same issue?
 
Hi, quiver. Thanks for using QNEthernet.

Source is here: https://github.com/ssilverman/QNEthernet/blob/master/src/QNEthernetClient.cpp
Side note: connect() returns an int, not an int8_t.

-1 means a timeout and 0 means "some other failure", such as a NULL IP address or some other problem connecting or reserving resources.
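
To make those return values easier to read in a log, here's a minimal sketch of how they could be mapped to names. ConnectResult, classifyConnectStatus(), and describe() are made-up names for illustration and are not part of QNEthernet. Treating every value other than 1 and -1 as "other failure" is an assumption on my part.

```cpp
// Readable names for the connect() results described above.
// ConnectResult, classifyConnectStatus(), and describe() are hypothetical
// helpers for illustration; they are not part of QNEthernet.
enum class ConnectResult { kConnected, kTimeout, kOtherFailure };

// Takes an int because connect() returns an int, not an int8_t.
ConnectResult classifyConnectStatus(int status) {
  if (status == 1) return ConnectResult::kConnected;
  if (status == -1) return ConnectResult::kTimeout;
  // 0 (e.g. a NULL IP address or no resources); any other value is
  // also treated as a failure here, which is an assumption.
  return ConnectResult::kOtherFailure;
}

// Returns a short description suitable for printing.
const char* describe(ConnectResult r) {
  switch (r) {
    case ConnectResult::kConnected: return "connected";
    case ConnectResult::kTimeout:   return "timed out";
    default:                        return "other failure";
  }
}
```

In a sketch, the result of client.connect(remoteIP, 80) could be passed straight to classifyConnectStatus() and the description printed in place of the raw number.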

I was experimenting with why this wasn't working and discovered that each time I restarted nc -lvk 80, the connection was successful. Next, I tried a different server and then I saw no failures. The command I used was: python3 -m http.server 80
(See: https://www.digitalocean.com/community/tutorials/python-simplehttpserver-http-server)
You will, of course, see lots of "Bad request" errors because of the lack of a valid HTTP request.

Along the way, I took the liberty of improving the code (it’s improved to my eyes, at least):
C++:
// https://forum.pjrc.com/index.php?threads/qnethernet-connect-grief.74344/

#include <algorithm>  // std::max

#include <QNEthernet.h>

using namespace qindesign::network;

// Interval between connection attempts, in milliseconds.
constexpr uint32_t kConnectInterval = 500;

// One TCP connection.
EthernetClient client;

// Static IP configuration
const IPAddress thisIP(192,168,10,74);
const IPAddress mask(255,255,255,0);
const IPAddress gwIP(192,168,10,254);
const IPAddress remoteIP(192,168,10,79);

elapsedMillis connectTimer;  // Keeps track of when to connect next
unsigned int attempts = 0;
unsigned int failures = 0;
// Note: I'm using unsigned ints because it's easy to do "%u" in printf()

bool networkChanged = false;  // Set to true upon network change
bool networkUp = false;  // The current state of the network

char txBuf[1024];

// Program setup.
void setup() {
  Serial.begin(115200);
  while (!Serial && millis() < 4000) {
    // Wait for Serial
  }

  // Set up some listeners before starting Ethernet

  Ethernet.onLinkState([](bool flag) {
    if (flag) {
      printf("Link ON\r\n");
    } else {
      printf("Link OFF\r\n");
    }
    networkChanged = true;
    networkUp = flag && (Ethernet.localIP() != INADDR_NONE);
  });

  Ethernet.onAddressChanged([]() {
    IPAddress ip = Ethernet.localIP();
    bool hasIP = (ip != INADDR_NONE);
    if (hasIP) {
      IPAddress subnet = Ethernet.subnetMask();
      IPAddress gw = Ethernet.gatewayIP();

      printf("Address changed:\r\n"
             "\tLocal IP = %u.%u.%u.%u\r\n"
             "\tSubnet   = %u.%u.%u.%u\r\n"
             "\tGateway  = %u.%u.%u.%u\r\n",
             ip[0], ip[1], ip[2], ip[3],
             subnet[0], subnet[1], subnet[2], subnet[3],
             gw[0], gw[1], gw[2], gw[3]);
    } else {
      printf("Address changed: No IP address\r\n");
    }
    networkChanged = true;
    networkUp = Ethernet.linkState() && hasIP;
  });

  if (!Ethernet.begin(thisIP, mask, gwIP)) {
    printf("Failed to start Ethernet!\r\n");
  }
}  // END SETUP

// Main program loop.
void loop() {
  // Watch for any network changes
  if (networkChanged) {
    networkChanged = false;
    if (networkUp) {
      printf("Network UP\r\n");
    } else {
      printf("Network DOWN\r\n");
    }
  }

  // Connect and do work every so often
  if (connectTimer >= kConnectInterval) {
    connectTimer = 0;

    if (networkUp) {
      doConnect();
    }
  }
}  // END LOOP

// Performs a connection. This assumes the network is up.
void doConnect() {
  int packetSize = snprintf(txBuf, sizeof(txBuf),
                            "Let's count: %u\r\n",
                            attempts);
  packetSize = std::max(packetSize, 0);  // Negative means an error

  int connectStatus = client.connect(remoteIP, 80);
  attempts++;

  switch (connectStatus) {
    case 1:
      client.writeFully(txBuf, packetSize);
      client.flush();
      client.close();

      printf("Tx: ");
      for (int i = 0; i < packetSize; i++) {
        printf("%c", txBuf[i]);
      }

      break;

    default:
      printf("Connection failed: status=%d\r\n", connectStatus);
      failures++;
  }

  printf("Failures: %u/%u\r\n", failures, attempts);
}

I'm unsure why the nc command isn't working well. In any case, you could also try pointing your program at an HTTP server.
 
Hi Shawn, thanks for this library and for such a comprehensive response!

My code above, which I drastically reduced from the main program to try to rule out other factors, gave me about the same amount of grief (with nc -lvk 80) as my main code, which talks to a device via a REST API. With your Python-based web server suggestion, though, it is flawless, as you said. I'm now talking to the device again, and that's flawless too.

I think perhaps the only thing I've changed in my main code is that I now call client.closeOutput() while I wait for a reply from the receiving device. I hadn't been closing the connection before that, instead leaving it for the TCP receive part of my code to close it after a response (or a timeout). I also hadn't been waiting for the link to come up before I started trying to connect; perhaps that threw it somehow.

I'm still puzzled about why netcat behaves differently with your library than with NativeEthernet. If I find the time I'll bust open Wireshark and try to suss it out.

In any case, it works! And I'm very grateful. Thanks again.
 
It would be amazing (and appreciated) if you could do a little sleuthing into why NativeEthernet seems to behave differently with netcat than QNEthernet does. (E.g. with Wireshark, as you suggest.)

That closeOutput() call does a TCP “half close”. I sometimes use it in my HTTP servers, at the suggestion of one of the HTTP specs. I don’t think it’s a common thing to do, but who knows? I’d go with the way you had it before adding the closeOutput() call.
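
For anyone unfamiliar with the pattern, here's a rough sketch of a request/response exchange that uses a half close: send the request, call closeOutput() to signal that nothing more will be sent, then keep reading until the peer closes or a timeout passes. requestWithHalfClose() and its parameters are hypothetical names. It's written as a template so that it only assumes the client offers writeFully(), flush(), closeOutput(), available(), read(), connected(), and close(), and that nowMs is something callable that returns a millisecond count. Treat it as an illustration under those assumptions, not as a recommended design.

```cpp
#include <stddef.h>
#include <stdint.h>

// Sends a request, half-closes the sending side, then collects the reply
// until the peer closes, the buffer fills, or the timeout expires.
// Returns the number of reply bytes stored.
//
// Hypothetical helper, not part of QNEthernet. Client is assumed to
// provide the member functions used below; nowMs is assumed to return
// the current time in milliseconds.
template <typename Client, typename Clock>
size_t requestWithHalfClose(Client& client,
                            const char* req, size_t reqLen,
                            char* reply, size_t replyCap,
                            uint32_t timeoutMs, Clock nowMs) {
  client.writeFully(req, reqLen);
  client.flush();
  client.closeOutput();  // Half close: done sending, still able to receive

  size_t n = 0;
  const uint32_t start = nowMs();
  // Unsigned subtraction keeps the timeout check correct across rollover
  while (n < replyCap && (nowMs() - start) < timeoutMs) {
    if (client.available() > 0) {
      int c = client.read();
      if (c >= 0) {
        reply[n++] = static_cast<char>(c);
      }
    } else if (!client.connected()) {
      break;  // Peer closed and nothing is left to read
    }
  }

  client.close();
  return n;
}
```

On a Teensy, the call might look like requestWithHalfClose(client, txBuf, packetSize, rxBuf, sizeof(rxBuf), 2000, millis), where rxBuf is a buffer you supply and the 2000 ms timeout is an arbitrary choice.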

(An interesting link: https://www.excentis.com/blog/tcp-half-close-a-cool-feature-that-is-now-broken/)

I’m glad you got it working more reliably.
 