Teensy 3.0 to PC communication via USB serial (>256000 baud)

Status
Not open for further replies.

kingforger

Active member
I made a quick VC++ command line program to communicate with the teensy 3.0. I can send basic commands back and forth at 256000 baud. Yay.

Question: How do I take advantage of the 12Mbps communication of the teensy? I'm using serial in my VC++ program, and the library I'm using only goes up to 256000 baud. Is there a similarly easy way (like serial) to transfer data quickly, or to set 12000000 baud?

Basically, I have a 4GB SD card attached to the teensy. I want to be able to dump all the data from the card to the PC going through the teensy. But doing it at 256000 baud is just too slow...

Oh, and my code to establish a serial connection is just like this...not that I think it is relevant...ignore the crazy long timeouts:

void EstablishSerialConnection()
{
    // Open the Teensy's virtual COM port with exclusive read/write access.
    hSerial = CreateFile(_T("COM7"),
                         GENERIC_READ | GENERIC_WRITE,
                         0,                      // no sharing
                         0,                      // default security
                         OPEN_EXISTING,
                         FILE_ATTRIBUTE_NORMAL,
                         0);

    if (hSerial == INVALID_HANDLE_VALUE) {
        if (GetLastError() == ERROR_FILE_NOT_FOUND) {
            cout << "COM7 does not appear to exist!" << endl;
        }
        cout << "invalid serial port handle thingy..." << endl;
        return;  // the calls below need a valid handle
    }

    // Start from the port's current settings, then change only what we need.
    DCB dcbSerialParams = {0};
    dcbSerialParams.DCBlength = sizeof(dcbSerialParams);

    if (!GetCommState(hSerial, &dcbSerialParams)) {
        cout << "Error getting serial port comm state" << endl;
    }

    dcbSerialParams.BaudRate = CBR_256000;
    dcbSerialParams.ByteSize = 8;
    dcbSerialParams.StopBits = ONESTOPBIT;
    dcbSerialParams.Parity = NOPARITY;

    if (!SetCommState(hSerial, &dcbSerialParams)) {
        cout << "Error setting serial port parameters" << endl;
    }

    // All timeout values are in milliseconds.
    COMMTIMEOUTS timeouts = {0};
    timeouts.ReadIntervalTimeout = 500;
    timeouts.ReadTotalTimeoutConstant = 500;
    timeouts.ReadTotalTimeoutMultiplier = 100;
    timeouts.WriteTotalTimeoutConstant = 500;
    timeouts.WriteTotalTimeoutMultiplier = 100;

    if (!SetCommTimeouts(hSerial, &timeouts)) {
        cout << "Error setting serial port timeouts" << endl;
    }
}
 
Just so I'm clear, you are wanting to communicate from the PC to the Teensy 3.0 with USB but not relay the traffic to a real UART, right?

If so, what does your Teensy code look like? Have you speed tested the code? What data rate (for a given data transmission size) are you getting?

dcbSerialParams.BaudRate=CBR_256000;

If you are just doing PC to Teensy 3.0 I don't think the baud setting does much. It gets passed down to the Teensy (I think) but it's only for people who are using a real UART to relay the traffic.
 
When using Teensy's USB Serial, the actual baud rate is always 12 Mbit/sec, regardless of what baud rate you configure. Even if you configure to 300 baud with SetCommState(), you'll get exactly the same speed. The only thing this actually changes is the number you can read with Serial.baud() on the Teensy side.

However, USB has end-to-end flow control built in. The benefit is you can't lose data (or it's quite difficult to do so), but the downside is the entire system runs slower than the maximum speed if you do things inefficiently.

To send from Teensy to your PC quickly, the first and most important step is using Serial.write(buffer, length). If you write one byte at a time, there's a lot of overhead which can slow down even Teensy 3.0. The USB packet size is 64 bytes, so lengths that are multiples of 64 are most efficient. Even 30 or 40 bytes at a time is dramatically more efficient.
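A minimal sketch of the chunked-write idea, written so it also compiles off the board. The helper name sendInChunks and the Sink callable are hypothetical, made up for this example; on the Teensy the sink would simply forward to Serial.write(buffer, length).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: hands `len` bytes to `sink` in blocks of at most
// `chunk` bytes. `sink(ptr, n)` stands in for Serial.write(ptr, n) and must
// return the number of bytes it accepted.
template <typename Sink>
std::size_t sendInChunks(const std::uint8_t* data, std::size_t len,
                         std::size_t chunk, Sink sink) {
    std::size_t sent = 0;
    while (sent < len) {
        const std::size_t n = std::min(chunk, len - sent);
        const std::size_t accepted = sink(data + sent, n);
        if (accepted == 0) break;  // sink refused data; let the caller decide
        sent += accepted;
    }
    return sent;
}
```

Picking a chunk size that is a multiple of 64 (512, for instance) follows the packet-size advice above; only the last block of a buffer ends up shorter.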

On the PC side, you should read with ReadFile using a large buffer. You might think a PC is so fast that reading one byte at a time would not matter, but it does. Windows has a lot of overhead. There's also user process scheduling latency on Windows, which is terrible for speed if you're doing the input 1 byte at a time. You really need to read with a substantial buffer.
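As a rough illustration of reading with a substantial buffer, the loop below keeps requesting large chunks and copes with reads that return fewer bytes than requested. The names readUntilFull and Reader are hypothetical; the reader callable stands in for a thin wrapper around ReadFile, so the loop itself has no Windows dependency.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: keeps asking `reader` for data until `total` bytes
// have arrived or the reader reports 0 bytes (for example, a timeout).
// `reader(dst, maxBytes)` stands in for a thin wrapper around ReadFile and
// returns how many bytes it actually stored, which may be fewer than asked.
template <typename Reader>
std::vector<std::uint8_t> readUntilFull(std::size_t total, std::size_t chunk,
                                        Reader reader) {
    std::vector<std::uint8_t> out(total);
    std::size_t got = 0;
    while (got < total) {
        const std::size_t want = (total - got < chunk) ? total - got : chunk;
        const std::size_t n = reader(out.data() + got, want);
        if (n == 0) break;  // nothing arrived before the timeout
        got += n;
    }
    out.resize(got);  // keep only the bytes that were really received
    return out;
}
```

On Windows, the reader would typically call ReadFile(hSerial, dst, (DWORD)want, &n, NULL) and return n, with the chunk size set to tens of kilobytes rather than a single byte.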

The other thing that absolutely kills performance is any sort of command-response protocol, where you send a command from the PC and wait for the Teensy to reply, and then send the next command after fully receiving the prior reply. In a best case scenario, you'll usually wait until the next 1ms USB frame (enough time to send about 1000 to 1200 bytes). But it can be far worse. Microsoft's drivers have a very simple, unsophisticated design that really hinders this sort of usage (I could write a lot about that, but I'll save it for another time). If you're going to send a large amount of data, build a protocol where the Teensy keeps transmitting and does not need to wait for commands to do so for each little piece.
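A back-of-the-envelope estimate shows why this hurts so much. Assuming every chunk costs exactly one round trip and a round trip lasts one 1ms frame (the best case described above), the ceiling is one chunk per millisecond. The function name is hypothetical and the numbers are an estimate, not a measurement.

```cpp
// Upper bound on throughput, in bytes per second, when every chunk must wait
// for one command/reply round trip that lasts roundTripMs milliseconds.
constexpr double commandResponseLimit(double chunkBytes, double roundTripMs) {
    return chunkBytes * 1000.0 / roundTripMs;
}
```

With 64-byte chunks and a 1ms round trip, commandResponseLimit(64, 1) gives 64000 bytes/sec, a small fraction of what the bus can carry in that same frame.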

If you follow these steps, it should be possible to achieve about 1 MByte/sec or 8 Mbit/sec speed, at least sending fixed data from a buffer. I'd suggest doing that first and benchmarking it, then adding in the SD stuff to see how much of a speed hit you take by also using the SD library. The USB code on Teensy 3.0 is actually quite efficient and supports substantial buffering, so hopefully you can read from SD while the USB is working to send.
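To put the 1 MByte/sec figure in perspective for a full 4GB card dump, here is a small estimate. It assumes 4GB means 4e9 bytes and 1 MByte/sec means 1e6 bytes/sec, and it compares against what a real 256000 baud UART with 8N1 framing would carry. Both function names are hypothetical.

```cpp
// Seconds needed to move totalBytes at a sustained rate of bytesPerSecond.
constexpr double transferSeconds(double totalBytes, double bytesPerSecond) {
    return totalBytes / bytesPerSecond;
}

// Payload rate of a real UART with 8N1 framing: each data byte costs 10 bits
// on the wire (start bit + 8 data bits + stop bit).
constexpr double uartBytesPerSecond(double baud) {
    return baud / 10.0;
}
```

Under those assumptions the dump takes 4000 seconds (a bit over an hour) at 1 MByte/sec, versus 156250 seconds (roughly 43 hours) over a genuine 256000 baud link.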
 
I'm only concerned about the speed of teensy-to-PC communication (not the other way). Thank you for all that info Paul, but I still have one question that remains: If I specified 256000 baud on my PC-side program, will the PC still only read the incoming data at 256000 baud or will it run as fast as it can anyway (the USB speed of the teensy)?

Oh yes, and I was definitely planning to read on PC side with a large buffer. My buffer will be about 32768 bytes. And I also have no plans to do commands back and forth. I'll just send one command to the teensy to let it know that I want the data, and then the teensy will straight out dump the data until the file has been fully read while the PC keeps reading.

1MB/sec would be fabulous.
 
Picking up where this thread left off since it's basically the same question I have.

I'm trying to read from an SD card and transfer over USB to the PC.

It takes me 22 seconds to transfer a 3MB file, or about 128K/sec. I was hoping to get closer to 1MB/sec (8x faster).

I'm using a 512 byte buffer on both ends. I took the SD card reading out of the equation, and am now just trying to rapidly send the same buffer in RAM to the PC as fast as possible. BTW, the sdcard read seems to add almost no overhead.

I'm using Processing with RXTX. I've tried different baud rates on the serial port in Processing. It doesn't seem to make a difference. (9600,115200,1000000,2000000)

Any ideas?

I'm looking for a way to capture the data on linux so I can take RXTX out of the equation and see if that's the problem.

T3 code:
Code:
//test serial rate limiting: send 3000320 bytes as 5860 chunks of 512 bytes
  for (int i=0;i<(3000320/512);i++) {
    Serial.write(buff1,512);
  }

Processing code. It doesn't make a difference whether I write the data to a file or not; I've tried commenting that part out:
Code:
		myPort = new Serial(this, Serial.list()[0], 2000000);

	public void receiveAndWriteFile2(Serial serial, String sOutputFilename) {
		int nBytes = -1, totalBytes = 0;
		int chunkSize = 512, fileSize=3000320;
		byte [] buffBytes = new byte[chunkSize];
		ByteBuffer tmpBuff = ByteBuffer.wrap(buffBytes);
		BinaryFileWriter bfw = new BinaryFileWriter(sOutputFilename);
		readFileInfo(serial);
		
		//try & wait until we have chunkSize available
		delay(50);
		System.out.println("avail: " + serial.available());
		long startTime=System.currentTimeMillis();
		
		for (int i=0;totalBytes<(fileSize-chunkSize);i++) {
			if (serial.available() >= chunkSize) {
				nBytes = serial.readBytes(buffBytes);
				totalBytes+=nBytes;
				//System.out.println("read" + i + " nBytes: " + nBytes + " totalBytes: " + totalBytes);
				tmpBuff.rewind();
				bfw.write(tmpBuff);
				//bfw.sync();
			} else {
				//System.out.println("sleep: " + i);
				delay(1);
			}
		}
		delay(50);
		//read last chunk
		if (serial.available() > 0) {
			nBytes = serial.readBytes(buffBytes);
			totalBytes+=nBytes;
			//System.out.println(" last read, nBytes: " + nBytes + " totalBytes: " + totalBytes);
			tmpBuff.rewind();
			//shorten to # read
			bfw.write(ByteBuffer.wrap(buffBytes, 0, nBytes));
			//bfw.sync();
		} else {
			System.out.println("nothing avail on last read");
		}
		System.out.println("Time elapsed: " + (System.currentTimeMillis() - startTime));
		
		System.out.println("done sending: " + totalBytes);
		bfw.close();
	}
 
Any tips appreciated. I realize the code is not easily runnable the way I pasted it in. I'm working on putting together an arduino & processing sketch for testing just the USB speed. Will post when I finish it.
 
... Microsoft's drivers have a very simple, unsophisticated design that really hinders this sort of usage (I could write a lot about that, but I'll save it for another time).

Ha ha, I'm curious to hear if you've written more on the subject!

It seems you have written your own windows usb serial driver and I am looking for more information about the timing characteristics, in particular. For example, I've read about the FTDI windows drivers having a polling/buffering interval settable (default 16ms, settable to 1ms, etc), and wonder what kind of interval your driver uses. Presumably the shortest possible, but just wanted to see. My application is scientific research on timing, so I'm interested in the shortest latency/lowest jitter possible, using the teensy to collect responses, and must be visible as a serial port to the experiment control software. (I'm also assuming serial + your driver will have better timing than HID + windows keyboard/HID driver.)

(I'm not a windows person, so I hope you'll forgive what may be simple questions.) Thanks.
 
It seems you have written your own windows usb serial driver

Nope, it's just Microsoft's USBSER.SYS.

I only wrote a Windows-style installer, because so many people had trouble with installing the raw INF file.

and I am looking for more information about the timing characteristics, in particular. For example, I've read about the FTDI windows drivers having a polling/buffering interval settable (default 16ms, settable to 1ms, etc), and wonder what kind of interval your driver uses.

That setting is actually on the Teensy side, not the Windows driver. It might appear to be a Windows driver setting, because with FTDI that's the only thing you can control. But even with FTDI, their driver merely sends that setting to their chip. That timing is actually controlled entirely on the device side.

On Teensy 3.1, this timing defaults to 5ms. With Teensy, you have far more control over this than you do with FTDI. If you've used Serial.print() or Serial.write() to send data to your PC, you can use Serial.send_now() to effectively set this time to zero. This gives you the best of both worlds, where data is aggregated into larger packets for efficient bandwidth usage, and you can have the lowest possible transmit latency (limited only by the host controller chip & drivers in your PC) at the specific points in your application where you want data you've written to get back up to the PC as soon as possible.

The FTDI chip can't do this, because it's just a USB to serial converter without any programming for your specific application. It can't know and understand your data, so it has no idea when it should wait for more versus making any partial USB packet available to the PC immediately. But Teensy can, if you use Serial.send_now() right after you've finished writing your timing-sensitive message.
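A minimal sketch of that write-then-flush pattern. The wrapper name sendUrgent is hypothetical; it is a template only so it can be tried off the board with a stand-in object, and it assumes the port type offers write(buffer, length) and send_now() as described above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: queue a message, then ask the port to transmit any
// partially filled packet right away. `Port` is any type with write(buf, len)
// and send_now(); on a Teensy that would be the USB `Serial` object.
template <typename Port>
void sendUrgent(Port& port, const std::uint8_t* msg, std::size_t len) {
    port.write(msg, len);  // data is buffered and may wait for more bytes
    port.send_now();       // flush now instead of waiting for the timeout
}
```

On the Teensy this would be called as sendUrgent(Serial, msg, len) only at the points where latency matters; ordinary bulk writes would skip the flush.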


My application is scientific research on timing, so I'm interested in the shortest latency/lowest jitter possible, using the teensy to collect responses, and must be visible as a serial port to the experiment control software.

While Teensy's Serial.send_now() can give you the best USB latency performance possible, if you have a demanding application, you really should not design your communication with all timing on the PC side based on the arrival of USB data. You'll also suffer the 1ms scale latency of USB bandwidth scheduling, and occasionally you'll suffer operating system scheduling latency.

You really should consider a design where precise, low-jitter timestamping of data is done on Teensy and encoded into your data stream. Your software on the PC side should parse the timing Teensy put into the data, rather than using the PC's clock, after the non-realtime operating system and all sorts of latency-inducing factors have influenced the exact timing of the data's arrival to your PC software.

You should also use a streaming scheme, where Teensy never waits for a message from the PC before sending each piece of data. Even if you have to encode more bytes into your data stream to identify which data is which, the improvement in bandwidth is usually well worth the extra bytes.
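One way to picture timestamping on the device plus a self-identifying stream is a small fixed-size record. The 7-byte layout, the Sample struct and the function names below are all hypothetical, chosen only for illustration; the timestamp is assumed to be a 32-bit microsecond counter such as the value micros() returns on the Teensy.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 7-byte record: channel id, device timestamp in microseconds
// (little-endian, e.g. from micros() on the Teensy), and a 16-bit value.
struct Sample {
    std::uint8_t channel;
    std::uint32_t timestampUs;
    std::uint16_t value;
};

// Device side: pack a sample into bytes ready for Serial.write().
std::array<std::uint8_t, 7> encode(const Sample& s) {
    return {{s.channel,
             static_cast<std::uint8_t>(s.timestampUs),
             static_cast<std::uint8_t>(s.timestampUs >> 8),
             static_cast<std::uint8_t>(s.timestampUs >> 16),
             static_cast<std::uint8_t>(s.timestampUs >> 24),
             static_cast<std::uint8_t>(s.value),
             static_cast<std::uint8_t>(s.value >> 8)}};
}

// PC side: recover the sample, including the time recorded on the device.
Sample decode(const std::array<std::uint8_t, 7>& b) {
    Sample s;
    s.channel = b[0];
    s.timestampUs = static_cast<std::uint32_t>(b[1]) |
                    (static_cast<std::uint32_t>(b[2]) << 8) |
                    (static_cast<std::uint32_t>(b[3]) << 16) |
                    (static_cast<std::uint32_t>(b[4]) << 24);
    s.value = static_cast<std::uint16_t>(b[5] | (b[6] << 8));
    return s;
}

// Interval between two device timestamps; unsigned subtraction stays correct
// across one wrap of the 32-bit microsecond counter (about 71.6 minutes).
std::uint32_t elapsedUs(std::uint32_t earlier, std::uint32_t later) {
    return later - earlier;
}
```

The PC software then works from timestampUs differences rather than arrival times. This sketch relies on fixed-length records and has no resynchronization marker, so a real protocol would probably want some framing on top.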
 
Thank you for the detailed reply! I'd thought of time stamping at the teensy, but not the streaming idea. I do need to drive additional events based on time of input so the latency is still relevant. I'll look into usb streaming protocols.
 
Hi,

I'm interested in the fastest possible streaming of chunks of data from a Teensy 3.6 to a Windows PC, and have read this (old) thread with interest. Salient points would seem to be:

To send from Teensy to your PC quickly, the first and most important step is using Serial.write(buffer, length). If you write one byte at a time, there's a lot of overhead which can slow down even Teensy 3.0. The USB packet size is 64 bytes, so lengths that are multiples of 64 are most efficient. Even 30 or 40 bytes at a time is dramatically more efficient.

If you've used Serial.print() or Serial.write() to send data to your PC, you can use Serial.send_now() to effectively set this time to zero. This gives you the best of both worlds, where data is aggregated into larger packets for efficient bandwidth usage, and you can have the lowest possible transmit latency (limited only by the host controller chip & drivers in your PC) at the specific points in your application where you want data you've written to get back up to the PC as soon as possible.

So clearly I should use Serial.write(buffer, length), followed by a Serial.send_now(). But is there any advantage to be had by optimising the buffer size? Obviously a multiple of 64 bytes is important, but I could choose 64, 128, 256 all the way to 65536 given the amount of RAM available and my requirements for latency. What would be best?

Thanks, Ian
 
If you want fastest possible throughput, don't ever call send_now(). Doing so forces a partial packet to be transmitted. If you're streaming a lot of data, the USB serial code will automatically combine the first data from your next write with the leftover data from your prior write into the same USB packet, for the most efficient transmission. Calling send_now() improves latency if you're not going to send more data for several milliseconds.
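The packet-count effect can be seen with a toy model. This is not Teensy's actual USB code, just a hypothetical counter that assumes 64-byte packets, automatic transmission of every full packet, and a forced partial packet on each sendNow(); the timeout-based flush is deliberately left out.

```cpp
#include <cstddef>

// Toy model (not Teensy's actual USB code): bytes accumulate in a 64-byte
// packet; a full packet is sent automatically, and sendNow() forces out
// whatever partial packet is pending. The real timeout flush is not modeled.
class PacketModel {
public:
    void write(std::size_t len) {
        pending_ += len;
        packets_ += pending_ / kPacketSize;  // every full packet goes out
        pending_ %= kPacketSize;             // leftover waits for more data
    }
    void sendNow() {
        if (pending_ > 0) {  // partial packet is transmitted as-is
            ++packets_;
            pending_ = 0;
        }
    }
    std::size_t packets() const { return packets_; }

private:
    static constexpr std::size_t kPacketSize = 64;
    std::size_t pending_ = 0;
    std::size_t packets_ = 0;
};
```

In this model, 64 writes of 100 bytes each take 100 packets when streamed back to back, but 128 packets when every write is followed by sendNow(), because each write then leaves a 36-byte partial packet.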

Writing in multiples of 64 isn't a huge issue. Teensy 3.6 is so much faster than the full bandwidth of 12 MBit/sec USB. You could probably even write 1 byte at a time and still fully saturate the USB bandwidth.

Currently there is no USB device support for Teensy 3.6's faster USB port, but I'm planning to work on that in July. At least one person did some benchmarking with the USB host code and got 20 MByte/sec speed, which is an encouraging sign we can look forward to great performance when that project is working.
 