max speed over usb serial (from PC to teensy2)

Status
Not open for further replies.

mathieu
i'm trying to set up an application that will send a constant flow of data from the PC to the teensy and i need as much speed as i can get.
in theory that's 12 Mbit/s as stated here
and that would be plenty for me.
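
For scale, here is a rough sketch of what 12 Mbit/s means in bytes per second. It assumes full-speed USB with 64-byte bulk packets and 1 ms frames, and it assumes the commonly quoted ceiling of 19 such packets per frame from the USB 2.0 specification. The constant and function names are made up for this illustration.

```c
/* Full-speed USB signals at 12 Mbit/s and is organised in 1 ms frames. */
#define FS_BIT_RATE            12000000UL
#define FS_FRAMES_PER_SEC      1000UL
#define BULK_PACKET_SIZE       64UL
/* Assumption: at most 19 bulk packets of 64 bytes fit into one frame. */
#define BULK_PACKETS_PER_FRAME 19UL

/* Raw signalling rate in bytes per second, ignoring all protocol overhead. */
static unsigned long raw_bytes_per_sec(void)
{
    return FS_BIT_RATE / 8UL;
}

/* Best-case bulk payload rate once packet framing is accounted for. */
static unsigned long bulk_bytes_per_sec(void)
{
    return BULK_PACKETS_PER_FRAME * BULK_PACKET_SIZE * FS_FRAMES_PER_SEC;
}
```

Under those assumptions the raw rate is 1.5 Mbyte/s and the best-case bulk payload is a little over 1.2 Mbyte/s, so a measured ~1 Mbyte/s is already close to what the bus can deliver.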

i have a teensy2 connected to a linux pc (tested on windows as well) and with your usb_serial example found here:
http://www.pjrc.com/teensy/usb_serial.html
i manage to get to the ~1Mbyte/s transfer rate you mention at the bottom of the page.

but i need that speed the other way around (PC to T2). i was wondering if there was a rx test program as you do for tx? or if anybody ran any benchmarks?

i can't reproduce that speed with rx (data from PC to T2). i have a barebones serial read loop on the teensy and the computer is sending packets of 64 to 256 bytes. i reach a max transfer rate of 315 kbytes/s.
if i add a Serial.available() check on top of that (to make it more proper), the rate drops even further (to ~215 kbytes/s)

anybody experimented with this?
thanks!
 
Most of this slowness is from the overhead of reading 1 byte at a time.

For sending from Teensy to PC, there's a write function that transfers multiple bytes. The overhead is only slightly higher, but the bytes are copied to the USB packet buffer very rapidly, so it's a lot faster when sending more than 1 byte.

So far, an optimized receive function has never been written.
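
To put the per-byte overhead in numbers, a throughput figure can be converted into a CPU cycle budget per byte. This is only a back-of-the-envelope sketch: it assumes the Teensy 2.0 is clocked at 16 MHz, and `F_CPU_HZ` and `cycles_per_byte` are names invented for this example.

```c
/* Assumption: the Teensy 2.0 runs its AVR at 16 MHz. */
#define F_CPU_HZ 16000000UL

/* CPU cycles available per received byte at a given throughput
   (integer division, so the result is rounded down). */
static unsigned long cycles_per_byte(unsigned long bytes_per_sec)
{
    if (bytes_per_sec == 0) {
        return 0;   /* avoid dividing by zero */
    }
    return F_CPU_HZ / bytes_per_sec;
}
```

With the rates reported above, 315 kbytes/s works out to about 50 cycles per byte and 215 kbytes/s to about 74, while 1 Mbyte/s leaves only 16 cycles per byte. That leaves very little room for a function call and interrupt save/restore on every single byte.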
 
thanks for your answer paul.
the overhead you mention is built into the serial.read() function? (that's the only thing i call on teensy)
ok, so no way around the slowness unless we write a multiple (64) byte read function on the teensy side?
do you have any pointers/advice on how to get started with this?
and of course if anyone has worked on it, i'll take it :)
thanks.
 
Here's my thoughts... if you want to give it a try.

First, you'll probably start with usb_serial_getchar(), keeping this code intact:

Code:
        intr_state = SREG;
        cli();
        if (!usb_configuration) {
                SREG = intr_state;
                return -1;
        }
        UENUM = CDC_RX_ENDPOINT;
        retry:
        c = UEINTX;
        if (!(c & (1<<RWAL))) {
                // no data in buffer
                if (c & (1<<RXOUTI)) {
                        UEINTX = 0x6B;
                        goto retry;
                }
                SREG = intr_state;
                return -1;
        }

This is the code you'll try replacing.....

Code:
        // take one byte out of the buffer
        c = UEDATX;
        // if buffer completely used, release it
        if (!(UEINTX & (1<<RWAL))) UEINTX = 0x6B;

Probably the simplest thing to do is read UEBCLX, to find out how many bytes are available in the currently viewable packet. You already know a packet is pending with at least 1 byte, so you should always get a number between 1 and 64.

A very simple approach would be to use a fixed 64 byte buffer, and just read UEDATX for each byte that's available and put it into the buffer. Then at the end, you can return that number, so the caller can know how many bytes were received.

If you always read every byte, the "!(UEINTX & (1<<RWAL))" test will always be true, so you could just always do "UEINTX = 0x6B" after reading the bytes, to return the buffer to the USB.

If you allow the user (...you...) to pass in a pointer to a buffer and its size, of course you'll have to check if the buffer is large enough, only copy the data that fits, and avoid returning the packet buffer if you didn't read all its bytes.
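
The buffer-and-size variant described above can be sketched on a PC before touching the hardware. The version below is a host-side simulation, not AVR code: the `sim_*` helpers are hypothetical stand-ins for reading UEBCLX, reading UEDATX and writing 0x6B to UEINTX, and `read_packet` is a name invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side model of one pending packet in the endpoint FIFO. */
static uint8_t sim_fifo[64];
static uint8_t sim_count;     /* unread bytes: plays the role of UEBCLX */
static uint8_t sim_pos;       /* index of the next byte to hand out */
static int     sim_released;  /* set once the packet buffer is given back */

static void sim_load_packet(const uint8_t *data, uint8_t len)
{
    if (len > 64) len = 64;   /* a full-speed bulk packet holds at most 64 bytes */
    memcpy(sim_fifo, data, len);
    sim_count = len;
    sim_pos = 0;
    sim_released = 0;
}

static uint8_t sim_read_uebclx(void) { return sim_count; }
static uint8_t sim_read_uedatx(void) { sim_count--; return sim_fifo[sim_pos++]; }
static void    sim_release(void)     { sim_released = 1; }  /* UEINTX = 0x6B */

/* Copy up to `size` bytes of the pending packet into `dst` and return
   the number of bytes copied. Assumes a packet is already pending. */
static uint8_t read_packet(uint8_t *dst, uint8_t size)
{
    uint8_t avail = sim_read_uebclx();
    uint8_t n = (avail < size) ? avail : size;  /* only copy what fits */
    uint8_t i;

    for (i = 0; i < n; i++) {
        dst[i] = sim_read_uedatx();
    }
    /* Give the packet buffer back only if it was drained completely;
       otherwise the leftover bytes stay available for the next call. */
    if (n == avail) {
        sim_release();
    }
    return n;
}
```

For example, loading a 10-byte packet and calling `read_packet(out, 4)` copies 4 bytes and keeps the packet; a second call with a larger buffer copies the remaining 6 and releases it. On the Teensy the same logic would sit after the RWAL check, with interrupts still disabled.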

The simplest way to read bytes is with a loop. A faster way involves a switch/case block with the read code unrolled. See the highly optimized usb_serial_write() function for an example. The code size is larger, but it's much faster. I'd recommend getting it working first the simplest way possible, and later attempt the switch/case optimization if you need it. Doing that really requires verifying the compiled output with "avr-objdump -d" on the generated .elf file, so you can check if the compiler really is generating a long sequence of only the 2 necessary instructions, with a computed jump into the sequence. Lots of seemingly random stuff will cause the compiler to generate much slower code. But even a simple loop with looping overhead is probably much faster than calling usb_serial_getchar() for each byte.
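
As a rough illustration of the switch/case idea, here is a cut-down host-side sketch with 8 cases instead of 64. It is not the code from usb_serial_write(): `sim_read_uedatx` is a hypothetical stand-in for reading the UEDATX register, and `read_unrolled8` is a name made up for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for UEDATX: each read pops the next byte
   of a simulated packet. */
static const uint8_t *sim_src;
static uint8_t sim_read_uedatx(void) { return *sim_src++; }

/* Copy n bytes (1..8) with no per-byte loop test. The switch jumps into
   the middle of the unrolled sequence and execution falls through to the
   end. Any other n copies nothing. A real version would have 64 cases. */
static void read_unrolled8(uint8_t *dst, uint8_t n)
{
    switch (n) {
    case 8: *dst++ = sim_read_uedatx(); /* fall through */
    case 7: *dst++ = sim_read_uedatx(); /* fall through */
    case 6: *dst++ = sim_read_uedatx(); /* fall through */
    case 5: *dst++ = sim_read_uedatx(); /* fall through */
    case 4: *dst++ = sim_read_uedatx(); /* fall through */
    case 3: *dst++ = sim_read_uedatx(); /* fall through */
    case 2: *dst++ = sim_read_uedatx(); /* fall through */
    case 1: *dst++ = sim_read_uedatx(); /* fall through */
    default: break;
    }
}
```

Whether the compiler actually turns this into a computed jump plus a run of load/store pairs on AVR is exactly what has to be checked in the disassembly; the C source alone does not guarantee it.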

A really good approach would allow the user to pass in a pointer to any size buffer and automatically read as many packets as necessary, or as available, probably with interrupts enabled briefly between each packet. That's how the usb_serial_write() function works... and let me tell you, many hours of work went into those 148 lines of code... analyzing generated code, testing & benchmarking, etc. It's a lot of work!
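
The multi-packet idea can also be prototyped on a PC first. The sketch below is a host-side simulation only: the `sim_*` queue is a hypothetical model of the receive endpoint, `read_bytes` is a name invented for this example, and the places where real AVR code would disable and restore interrupts are marked in comments.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_PACKETS 4

/* Hypothetical model of the receive endpoint: a small queue of packets,
   of which only the first one is visible at any time. */
struct sim_packet { const uint8_t *data; uint8_t len; };
static struct sim_packet sim_queue[SIM_MAX_PACKETS];
static uint8_t sim_head, sim_tail, sim_pos;

static void sim_queue_packet(const uint8_t *data, uint8_t len)
{
    if (sim_tail < SIM_MAX_PACKETS) {
        sim_queue[sim_tail].data = data;
        sim_queue[sim_tail].len = len;
        sim_tail++;
    }
}

static int sim_packet_pending(void) { return sim_head < sim_tail; }
/* plays the role of UEBCLX */
static uint8_t sim_bytes_left(void) { return sim_queue[sim_head].len - sim_pos; }
/* plays the role of UEDATX */
static uint8_t sim_read_byte(void) { return sim_queue[sim_head].data[sim_pos++]; }
/* plays the role of UEINTX = 0x6B */
static void sim_release_packet(void) { sim_head++; sim_pos = 0; }

/* Fill `dst` with up to `size` bytes, crossing packet boundaries.
   Returns the number of bytes copied; stops early if no packet is pending. */
static size_t read_bytes(uint8_t *dst, size_t size)
{
    size_t total = 0;

    while (total < size && sim_packet_pending()) {
        /* on the AVR: save SREG and cli() here */
        uint8_t avail = sim_bytes_left();
        size_t room = size - total;
        uint8_t n = (room < avail) ? (uint8_t)room : avail;
        uint8_t i;

        for (i = 0; i < n; i++) {
            dst[total++] = sim_read_byte();
        }
        if (n == avail) {
            sim_release_packet();   /* packet fully consumed */
        }
        /* on the AVR: restore SREG here so interrupts run between packets */
    }
    return total;
}
```

With a 3-byte and a 4-byte packet queued, `read_bytes(out, 5)` returns 5 and leaves 2 bytes in the second packet for the next call. Timeouts and the peek byte are left out of this sketch.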

If you get something simple to function reliably, even a fixed buffer and single packet read with slower looping, I hope you'll share the confirmed-working code here? Hopefully this description will give you a good start?
 
that's a ton of info, i really appreciate the effort paul. we'll get our hands dirty and will be sure to report back if we manage to get anywhere.
cheers! :)
 
hi paul, sorry it took a while to get back to you. i didn't get the chance to spend as much time as i would want on this. nevertheless, i did manage to get a significant bump in transfer speed by doing what you explained. i went for quick and easy (and maybe dirty) so not sure the code is bug proof. i copied your read() function and made a read64() version that looks at how many bytes are in the buffer, puts those bytes into a buffer passed as parameter via a for loop, and returns bytes read. the code is below (goes into usb_api.cpp + need to declare the function in .h).
with that, i was able to go from 150 kbytes/s to about 780 kbytes/s so i'm quite happy :).
i believe the theoretical limit is 1.5 Mbyte/s (12 Mbit/s) and if optimized as well as your write() function, we could probably get to the same results i.e. ~1 Mbyte/s so an increase of about 30%. if you ever get around to doing that, i'd be interested in the code.
in the meantime, thanks again for your help on this topic.

Code:
int usb_serial_class::read64(uint8_t *buffer)
{
        uint8_t c, n, intr_state, i;

        // a byte held back by peek() has to be delivered first
        if (peek_buf >= 0) {
                buffer[0] = peek_buf;
                peek_buf = -1;
                return 1;
        }
        intr_state = SREG;
        cli();
        if (!usb_configuration) {
                SREG = intr_state;
                return -1;
        }
        UENUM = CDC_RX_ENDPOINT;
        retry:
        c = UEINTX;
        if (!(c & (1<<RWAL))) {
                // no data in buffer
                if (c & (1<<RXOUTI)) {
                        UEINTX = 0x6B;
                        goto retry;
                }
                SREG = intr_state;
                return -1;
        }
        // read the packet size only now: UEBCLX is valid once the RX
        // endpoint is selected and a packet is known to be pending
        // (buffer must have room for at least 64 bytes)
        n = UEBCLX;
        // take n bytes out of the buffer
        for (i=0; i<n; i++) {
                buffer[i] = UEDATX;
        }
        // buffer is drained, so release it
        UEINTX = 0x6B;
        SREG = intr_state;
        return n;
}
 
we could probably get to the same results i.e. ~1Mbyte/s so an increase of about 30%. if you ever get around to doing that, i'd be interested in the code.

I did it about 2 weeks ago!

Well, I didn't fully unroll the loop, but it does achieve about 1 Mbyte/sec on Teensy 2.0.

This code is in Serial.readBytes() in 1.14-rc2.
 