TEENSY 4.1 REDUCED ENCODER PERFORMANCE WITH NATIVE ETHERNET

I have recently migrated our film scanner core circuit board from Teensy 3.6 to 4.1 with native Ethernet. We use dual Teensy processors: one for transport/encoder/triggering operations, and one for LED lamp house color balance/exposure timing and digital camera I/O operations. The board in a standard configuration looks like this:

[Image: XENA SERIES 6 CONTROLLER BOARD.jpg]

We have completed our bench testing with great success. The power and accuracy of the T4.1 is unbelievable! But our initial production alpha testing has revealed a potentially serious issue. In the transition to T4.1, the highest encoder input rate we can track without losing counts has dropped to approximately 14 kHz. I used my own C-based encoder interrupt routine, and have just replaced it with Stoffregen's assembly-language-optimized library, which brings that rate up to about 16.5 kHz. We need to run film at native 24 frames per second with 2000 quadrature pulses per frame (24 x 2000 = 48,000 pulses per second) for audio clock triggering at 48 kHz. With the T3.6, we were able to achieve this and then some.

After extensive evaluation, I believe I have isolated the problem to the Teensy native Ethernet library server object, which appears to be using significantly more processor resources than our prior T3.6 setup with Wiznet WIZ811MJ and WIZ812MJ (W5100).

To demonstrate the issue independent of my script, I used the Stoffregen Encoder Library 'BASIC' sketch to test with:

Test #1. 'BASIC' Encoder library as written:

With or without the use of ENCODER_USE_INTERRUPTS or ENCODER_OPTIMIZE_INTERRUPTS, I got an impressive 350 kHz with no lost counts, at which point my servo motor stalled on the load it was connected to. So the real limit is probably far higher than what I was able to measure. I verify the Teensy encoder input accuracy by sending the same encoder A/B channel outputs simultaneously to the Teensy and to my Galil Servo Motion Controller, which accepts encoder rates up to 12 MHz. Each output is buffered by separate logic ICs. The motion controller receives 5 V TTL, and the Teensy channels run through a voltage divider circuit for LVTTL compatibility.

Test #2. Addition of Native Ethernet Server Object:

Code:
/* Encoder Library - Basic Example
 * http://www.pjrc.com/teensy/td_libs_Encoder.html
 *
 * This example code is in the public domain.
 */

#define ENCODER_USE_INTERRUPTS
#define ENCODER_OPTIMIZE_INTERRUPTS

#include <Encoder.h>
#include <NativeEthernet.h>

// Change these two numbers to the pins connected to your encoder.
//   Best Performance: both pins have interrupt capability
//   Good Performance: only the first pin has interrupt capability
//   Low Performance:  neither pin has interrupt capability
//   avoid using pins with LEDs attached
Encoder myEnc(4, 3);

// Minimal Ethernet server setup added for Test #2
EthernetServer*        server;
byte                   mac[] = {0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xEF};
IPAddress              ip(198,0,0,5);

void setup() {
 
  server = new EthernetServer(12345);
  Ethernet.begin(mac, ip);

  Serial.begin(9600);
  Serial.println("Basic Encoder Test:");
}

long oldPosition  = -999;

void loop() {
  long newPosition = myEnc.read();
  if (newPosition != oldPosition) {
    oldPosition = newPosition;
    Serial.println(newPosition);
  }
}

I added minimal code to initialize an Ethernet server object. With no connection to a socket, the performance is the same as in my full processor script, which is about 16.5 kHz with Paul's Encoder library as written. A simple call to Ethernet.begin() brings the encoder performance down to this slow rate.

My ethernet bandwidth requirement is quite minimal. It handles basic setting of internal variables, logic states, LCD display, etc. The protocol is text-based, and packets never contain more than 10 characters, send or return.

QUESTION #1: What am I doing wrong?

QUESTION #2: Is there any way I can modify the Ethernet library to either slow down the engine to leave more resources for the Encoder interrupt routine, or possibly drop its interrupt priority (does it even use interrupts?) to a lower level so the Encoder interrupt routines always complete?

SETUP:
Windows 10 Pro For Workstations (Gen 2, March 2021)
*Arduino 1.8.13
*Teensyduino 1.53
Motherboard: Asrock 621A WS w/Intel(R) I210 ethernet port. Locked at 100/100 send and return.
*(Curious Note: If I install current Arduino 1.8.19 and Teensyduino 1.58, my sketch compiles and loads, but the ethernet does not function when running.)

I appreciate any and all help on this problem.
 
The latest Teensyduino version is 1.59.

NativeEthernet is no longer supported. You should try QNEthernet, which is actively supported by its author on this forum.
He spends a lot of time on this forum, so would be very much able to help you.

QNEthernet can be installed via the Arduino Library Manager or can be downloaded here.
Also study this thread on this forum (started by Shawn Silverman)

 
If you're not using interrupts, i.e. you're polling the digital inputs, and you've got up to 48 kHz on those inputs, it would only take about 20 us of the CPU being busy for you to miss an edge. If you haven't tried using interrupts, you could do that first, but you really should be using the QuadEncoder library, which uses Teensy 4.x QuadTimer to do the quadrature counting in hardware. That will work up to quite high frequencies with zero CPU usage.
 
Thank you, BriComp. I'll download that library and try it. Hopefully, it will just "drop in" to my existing code. I see from the discussion you linked me to that it doesn't use timers. I wonder if that will free up the encoder interrupts.

To Joepasquariello: Yes, I do use hardware interrupts. As I mentioned in my original post, I'm getting over 350 kHz with the Stoffregen Encoder library running solo. The culprit seems to be the Native Ethernet object, and BriComp has referred me to QNEthernet, which I hope will solve the problem.

When I get the new library working, I'll post my progress onto this discussion.

Best regards.
 
Okay, I guess if it's not broken don't fix it, but I'll try one more time. QuadEncoder is installed with Teensyduino, and using it would eliminate all encoder-related interrupts, which are going to be a big chunk of your processing time. You just configure it for your pins, then call the read() function whenever you want a running quadrature count, just like you do with Encoder, except there are no interrupts and no CPU usage. When you have time, try the example sketches. I think it will work up to 150 MHz.
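
For anyone who wants to try it, a minimal sketch along the lines of the library's own examples might look like the following. It is only a sketch under assumptions: the pins (4 and 3) are copied from the Encoder test earlier in the thread, channel 1 is an arbitrary choice out of the four hardware channels, and the variable name hwEnc is made up. Check the QuadEncoder documentation for the pins supported on your board before relying on it.

```cpp
#include "QuadEncoder.h"

// Arguments: hardware channel (1 of 4), phase A pin, phase B pin,
// pullups required (0 = none). Pins 4 and 3 are assumed here because they
// match the Encoder example in this thread; verify that they are supported.
QuadEncoder hwEnc(1, 4, 3, 0);

int32_t oldPosition = -999;

void setup() {
  Serial.begin(9600);
  hwEnc.setInitConfig();  // load the default configuration
  hwEnc.init();           // start counting in hardware
}

void loop() {
  // read() returns the running count kept by the hardware counter,
  // so no encoder interrupt service routine is involved.
  int32_t newPosition = hwEnc.read();
  if (newPosition != oldPosition) {
    oldPosition = newPosition;
    Serial.println(newPosition);
  }
}
```

The loop is deliberately the same shape as the Encoder 'BASIC' example, so swapping one library for the other should mostly be a matter of changing the declaration and the two setup calls.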
 
PROBLEM SOLVED!!!! The QNEthernet library dropped into my sketch seamlessly, and my encoder is smokin'! I just tested at 250 kHz with no loss of quadrature counts. I need to do more testing with the Ethernet communication, but it looks good. The one thing I noticed is that the server responses to client inquiries have about 100-150 ms more delay than I'm used to. Is that because the library is more "software" oriented?

Thanks again for all your help!
 
The client delays are because the data isn’t flushed immediately when sending. Other Ethernet libraries seem to send data right away. Instead, in the QNEthernet library, data is only sent when an internal count is exceeded (I call it a “timer”, but I didn’t want to confuse the issue), about 250ms, or when the client is “flushed”. Call flush() on your client when you’d like data to be sent immediately.

See also: https://github.com/ssilverman/QNEthernet#write-immediacy (and the previous parts in that section).
 
Is this similar to the "Nagle Algorithm" which attempts to improve efficiency by buffering small data chunks into larger packets when using TCP/IP? i.e. improving ratio of data vs overhead / housekeeping etc. First thing to disable on a real-time system.
 
Not quite, but I agree it feels similar. Nagle’s algorithm uses receiver acknowledgement. What I’m referring to limits data by slots of time or until buffer full or a flush. It’s how the underlying lwIP stack chooses to buffer data.

Note also that you can disable Nagle’s algorithm via a call to the socket’s setNoDelay(flag) method. That won’t affect the buffering I mention above, however.
See also: https://github.com/ssilverman/QNEthernet#tcp-socket-options
 