TCP vs UDP (vs higher level protocol?) for sending ADC data over ethernet

mrm

Hi all

I have written a sketch that collects about 10 kS/s (10,000 samples per second) from an ADC. It does so with non-blocking code in the loop: I'm not using an ISR, but handling the timing myself, checking once per loop whether the ADC has data ready via its own IRQ pin, reading it over 15 MHz SPI if so, and continuing with no delay if not.
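For anyone picturing the pattern, here's a minimal sketch of that polling approach. The pin numbers, SPI mode, active-low DRDY behaviour, and readAdc() are all illustrative assumptions, not the actual project code:

Code:
#include <SPI.h>

const int ADC_DRDY_PIN = 9;   // assumed: ADC data-ready / IRQ pin (active low)
const int ADC_CS_PIN   = 10;  // assumed: ADC chip select

void setup() {
  pinMode(ADC_DRDY_PIN, INPUT);
  pinMode(ADC_CS_PIN, OUTPUT);
  digitalWrite(ADC_CS_PIN, HIGH);
  SPI.begin();
}

uint32_t readAdc() {
  SPI.beginTransaction(SPISettings(15000000, MSBFIRST, SPI_MODE1));  // mode is an assumption
  digitalWrite(ADC_CS_PIN, LOW);
  uint32_t v = 0;
  v |= (uint32_t)SPI.transfer(0) << 16;  // 24-bit sample arrives MSB first
  v |= (uint32_t)SPI.transfer(0) << 8;
  v |= SPI.transfer(0);
  digitalWrite(ADC_CS_PIN, HIGH);
  SPI.endTransaction();
  return v;
}

void loop() {
  // Non-blocking: read only when the ADC says data is ready; otherwise
  // fall straight through so the rest of the loop keeps running.
  if (digitalRead(ADC_DRDY_PIN) == LOW) {
    uint32_t sample = readAdc();
    (void)sample;  // ... store the sample (see the buffering sketch below) ...
  }
}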

I plan to buffer 1 s of data in an SRAM array of 10,000 elements, then, once the array is full, immediately transfer it via Ethernet whilst continuing to collect the next 10 kS into a second array. Hopefully this Ethernet transfer can happen within one additional second, otherwise we risk overwriting the first array as the process repeats.
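A sketch of that double-buffer ("ping-pong") idea, with BUF_LEN matching the 10,000-sample plan; sendBufferOverEthernet() is a hypothetical stand-in for the actual transfer code:

Code:
const size_t BUF_LEN = 10000;   // 1 s of samples at 10 kS/s
uint32_t buf[2][BUF_LEN];
size_t  writeIdx = 0;           // next free slot in the active buffer
uint8_t active   = 0;           // index of the buffer being filled

void sendBufferOverEthernet(const uint32_t *data, size_t bytes) {
  // hypothetical: hand the full buffer off to the Ethernet code
}

void storeSample(uint32_t sample) {
  buf[active][writeIdx++] = sample;
  if (writeIdx == BUF_LEN) {
    // Buffer full: flip to the other buffer and send the full one. The
    // send must complete within the next second, or this buffer will be
    // overwritten on the following flip.
    uint8_t full = active;
    active ^= 1;
    writeIdx = 0;
    sendBufferOverEthernet(buf[full], sizeof(buf[full]));
  }
}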

I have included the NativeEthernet library, and the device connects to Ethernet using Ethernet.begin(mac, ip, dns, gateway, subnet). So far I can ping it whilst it's reading from the ADC...

First question: is this buffer->send idea the most sensible approach? Perhaps it's the only approach?

Second question: will I be able to achieve this with TCP, or would I be better advised to use UDP? Ideally I would use something higher-level like MQTT, although I may just use MQTT for messaging status updates, flags, etc.

Final question: re code efficiency / timing, are there any useful tips or gotchas I should be aware of? E.g. implementing some kind of configurable buffer size?
 
I’m really curious what the results will be. Some random thoughts and questions:
1. How big is each sample?
2. Why are you waiting for 10000 samples (1 second)? How did you choose this number? I would experiment with multiples of, say, the UDP packet data size (that is, if using UDP; see the sizing sketch after this list).
3. There are two Ethernet libraries, NativeEthernet and QNEthernet. The first uses a 1ms timer to move the stack along (and is backed by FNET). QNEthernet (this is the one I wrote) doesn't use a timer (and is backed by lwIP); instead it takes a non-blocking approach by hooking into the yield() function. The stack also gets serviced after every call to loop(), and whenever you call Ethernet.loop() (not often needed, because many of the internal functions call it). I'm curious how the different approaches/libraries would impact the project.
4. Do you need to “stream” the data at all, say to listen to it or something in real time? Or do you just want to offload the data without losing any? There are various real-time-style streaming protocols too that might give you some ideas.
5. Since you’re using a polling/looping approach, you won’t necessarily be able to “transmit while collecting.” I suggest looking for some ADC examples that use DMA.
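To make point 2 concrete, here's the kind of arithmetic meant (the standard 1500-byte MTU and 4-byte-per-sample figure are assumptions for illustration):

Code:
constexpr size_t MTU          = 1500;
constexpr size_t UDP_PAYLOAD  = MTU - 20 - 8;  // minus IPv4 + UDP headers = 1472 bytes
constexpr size_t SAMPLE_BYTES = 4;             // one uint32_t per sample
constexpr size_t SAMPLES_PER_DATAGRAM = UDP_PAYLOAD / SAMPLE_BYTES;  // 368
// A buffer of e.g. 27 * 368 = 9936 samples maps onto whole datagrams,
// rather than an arbitrary 10000.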
 
Thanks Shawn (sorry for the delay in replying)

1. Each sample is 24-bit. I also need to send some metadata (e.g. a serial number for each reading), so I will use a 32-bit value.
2. 10,000 samples is just an arbitrary value. It could be less, or more.
3. I didn't know that NativeEthernet uses a 1ms timer. I'm just reading your documentation on GitHub for QNEthernet. (Thank you for that - it's very nice and clear.) Do you mind if I get back to you here with any additional questions if something doesn't work as expected? (More likely, if I can't make it work?)
4. There is no requirement to stream the data. Actually, I believe it may be necessary to send the last 1 s worth of readings upon request. Not sure yet, but there may be a repeating request, or it may just be one-shot requests at the press of a button, etc. I'm working to someone else's spec, and this part is still to be decided. When you say "there are various real-time-style streaming protocols"... can you elaborate? Do you mean higher level than TCP / UDP?
5. I considered DMA but have been essentially scared off by the lack of documentation and the possible complexity. Will keep it in mind though!

Cheers
 
UPDATE for anyone interested (Shawn if you are reading this - see (7) below)

1) So far I've used NativeEthernet library, created a TCP client on the Teensy and sent an array of uint32_t data using client.write.
2) I was able to send 10,000 ADC readings (2x uint32_t arrays of 5000 elements) in two packets.
3) I benchmarked sending one array of 10,000 32-bit values, which translates to 40,000 bytes. This took 77 microseconds, according to the micros() function: I simply grabbed a value from micros() before client.write, then compared it with the micros() value after that line of code. I don't really understand how it can be sent that fast. Can someone explain? 1 / 77 microseconds = 12.987 kHz, which would be the rate at which I could send 40,000 bytes. Multiply 40,000 bytes by this rate to get bytes per second, giving 519 MB per second. This is clearly not possible with a 10/100 interface (or a gigabit one, for that matter) - so what gives?
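The measurement described was essentially this (variable names are illustrative; as the replies below explain, this times how long client.write() takes to return, not how long the data takes to reach the wire):

Code:
uint32_t t0 = micros();
client.write((const uint8_t *)readings, sizeof(readings));  // 40,000 bytes
uint32_t t1 = micros();
Serial.printf("client.write took %lu us\n", (unsigned long)(t1 - t0));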

4) Anyway, to do the above I hit up against the maximum socket size of 2048 bytes; however, I found out that NativeEthernet allows you to set the socket size like this:

Code:
Ethernet.setSocketSize(40000);  // set the socket buffer size to 40,000 bytes

40,000 bytes is exactly how many bytes are required to send the data, unencoded.

5) I added some test encoding like this:
Code:
    client.write("startmsg");
    client.write((const byte*) reading0, sizeof(reading0));
    client.write("endmsg");

(reading0 is a uint32_t array of 5000 elements.)

6) I noticed that the bytes seem to be sent backwards: when you send a single 32-bit value, with the MSB on the left, the binary value is split into 4 octets and you get the rightmost octet first in the sequence. I'm sure there's a sensible reason for this, relating to how client.write converts a 32-bit value into 4 bytes. Maybe this can be changed? I'm sure I could change it in the source data.

7) Shawn - re your point above about the 1ms timer in the Ethernet code: I measured timings in the loop using the micros() function and found that the loop was executing in a relatively small number of microseconds - and see my point above about the number of microseconds for a TCP write to take place. Not sure where this 1 ms comes into it? Or maybe my timing measurement method is incorrect?
 
I don't really understand how it can be sent that fast. Can someone explain?

My guess is the time you're measuring is very short because you are measuring the time to write the data to a buffer, as opposed to the time for the data to actually go out on the wire. I think if you look at the source, you'll see that the write function returns before the data has been sent.

Teensy is little-endian (least-significant byte first), whereas network transfer is usually big-endian (most-significant byte first). Whatever device is going to receive this data will typically then convert to its own endianness.
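As an illustration of what the receiver has to do with the stream as currently sent (Teensy's native little-endian order), here's a hypothetical reassembly helper; the actual receiver in this thread is Node-RED, so this is C-style pseudocode for the idea:

Code:
// Rebuild one 32-bit sample from 4 bytes that arrived LSB-first.
uint32_t decodeLittleEndian(const uint8_t b[4]) {
  return  (uint32_t)b[0]
       | ((uint32_t)b[1] << 8)
       | ((uint32_t)b[2] << 16)
       | ((uint32_t)b[3] << 24);
}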
 
Teensy is little-endian (least-significant byte first), whereas network transfer is usually big-endian (most-significant byte first). Whatever device is going to receive this data will typically then convert to its own endianness.

I know it's possible to piece the bytes back together in the opposite order at the receiving end, but...

When you send a 32-bit value out from Teensy, can it (the NativeEthernet library?) be configured to send the most significant *byte* first?

Reason: the ADC data is stored most-significant-BIT first, so when the 32-bit value gets split into 4 bytes, we get this bizarre order:

[bits 7 - 0] [bits 15 - 8] [bits 23 - 16] [bits 31 - 24]


My guess is the time you're measuring is very short because you are measuring the time to write the data to a buffer, as opposed to the time for the data to actually go out on the wire. I think if you look at the source, you'll see that the write function returns before the data has been sent.

Thank you, that makes sense. Either way, it's good news because it means the loop is not blocked for long and I can continue to take ADC readings at a reasonably high rate of 10,000 times per second.
 
Hi, @mrm. Here are some notes:

4. I just know that they exist and may have interesting things they do to keep data streaming in "real time" that might be of interest.

5. DMA might be fast, but you're right that it's hard to get started with it. It may not be applicable here, just be aware it exists should the need arise.

3) @joepasquariello is right that you may be timing just the part that sends to the internal buffer(s). However, there's more: it may be the case that not all the data gets sent or queued. When using any of the `write` calls, you need to check how many bytes were actually "written" (meaning sent to a buffer or sent on the wire, such that the calling application no longer needs to worry about them). (See https://forum.pjrc.com/threads/6910...rocess-a-POST)?p=296736&viewfull=1#post296736 and also https://forum.pjrc.com/threads/6846...nt-misbehaving?p=295438&viewfull=1#post295438 (and the surrounding posts).) Also see this section in the QNEthernet README: https://github.com/ssilverman/QNEth...e9/README.md#how-to-write-data-to-connections. That's generally applicable to anything that uses the `Print` API. (There's a sketch of this kind of checking after the question below.)

Question: Have you verified that all the data actually gets sent by reading it at the receiving end?
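A sketch of the checking described in (3): keep writing until every byte has been accepted. This is roughly what QNEthernet's writeFully() does for you; timeout and error handling are omitted for brevity, so treat it as an outline rather than production code:

Code:
#include <Client.h>

size_t writeAll(Client &client, const uint8_t *buf, size_t len) {
  size_t sent = 0;
  while (sent < len && client.connected()) {
    sent += client.write(buf + sent, len - sent);  // may accept fewer bytes
    yield();  // give the network stack time to drain its buffers
  }
  return sent;  // equals len on success
}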

6) Teensy is little-endian, and so the LSB gets sent first (that's how it's stored in memory, being little-endian). (Looks like Joe answered first.)
To send in "network order" (big-endian), you can do this:

Code:
uint32_t value = something;
write(value >> 24);  // most significant byte first ("network order")
write(value >> 16);  // write(uint8_t) keeps only the low byte of each shift
write(value >> 8);
write(value);        // least significant byte last

Endianness determines byte order, not bit order. Bit order is dictated by other things.

7) NativeEthernet uses a 1ms timer to call its internal "Ethernet update and check if there's data" code. It runs inside an ISR, and so anything that it calls, including any data listeners, runs within an ISR context (I hope I have that right; there are probably subtleties in the implementation). That might affect your application if it needs precise timing. This is all internal to NativeEthernet and not necessarily related to application code.
 
Shawn, thanks for the really useful comments.

Re (3): yes, I have verified at the receiving end that the exact number of bytes I expect are indeed received. I am using Node-RED as a TCP server, and its debug panel shows the number of bytes for a given transmission. Strangely, with larger packets, the debug messages in Node-RED are split into two messages, but the numbers add up. Maybe it's just a Node-RED thing, or perhaps a timing thing. Or maybe the Teensy Ethernet chip is doing that?

That said, thanks for the pointer. I will definitely plan to check the number of bytes "written" out.

6) I'll do something similar - I may even rearrange the value using binary operations like that at the point it's stored from the ADC (see the sketch at the end of this post).

7) The application does need precise timing. I still have some work to do on this!
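For (6), one way to do that rearranging, reusing the hypothetical names from the buffering sketch earlier; __builtin_bswap32 is a GCC builtin available on Teensy, and shifts and masks would work just as well:

Code:
// Variant of storeSample() from the buffering sketch above: swap to
// big-endian once, at store time, so the whole buffer can later be sent
// as-is with a single client.write().
void storeSampleBigEndian(uint32_t sample) {
  buf[active][writeIdx++] = __builtin_bswap32(sample);
  // ... buffer-full flip as before ...
}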
 
Great! Sounds like it's working out for you and that you've found the tools you need. That writeFully() thing is there to make your code more robust, even if most of the time no bytes need to be retransmitted.

If you feel like it, would you be willing to try out QNEthernet in addition? My hope is that it behaves the same. I’d love to get more data on its use and ease-of-use in real-world projects. (No worries if you don’t have the time.) The README and examples should provide all the info you need.
 
I’d love to do that. I’m going to spend some time tomorrow trying to get QNEthernet up and running and will report back here.
 