Teensy 3.0 as logic analyzer, instruction timing



wangnick
11-18-2012, 09:44 AM
Dear all,

I'm working on rewriting the Arduino Logic Analyzer from Andrew Gillham (https://github.com/gillham/logic_analyzer, see also http://letsmakerobots.com/node/31422) to run on the Teensy 3.0.

I'm already able to capture 8 channels at up to 13.7 MHz into 12 KB of SRAM, using:

byte logicdata[MAX_CAPTURE_SIZE];
unsigned int logicIndex;
...
// 7 cycles on Teensy at 96 MHz: 72.916 ns, 13.7 Msps
while (logicIndex) {
    logicdata[--logicIndex] = CHANPORT;
}

I measure this by making an Arduino generate a 1 MHz square wave on one port, where every 256 us one high value is replaced by a low, at which time I set a second port to high.

However, I seem to observe that my tight Teensy 3.0 loop starts out taking 7 cycles per iteration and then degrades to 8 cycles at higher (or lower?) memory addresses:
http://sebastian.wangnick.de/olg-teensy3.png

Real time between cursors is 256 us. OLG shows a delta between Cursor 1 and Cursor 2 of 249.08 us, and between Cursor 2 and Cursor 3 of 223.64 us. OLG believes that a sample lasts 72.916 ns, so we captured 3416 samples between C1 and C2 and 3067 samples between C2 and C3. So, between C1 and C2 sampling actually took 74.94 ns per sample on average (that's 7.2 cycles), and between C2 and C3 sampling took 83.47 ns on average (that's 8 cycles).
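
The arithmetic above can be checked with a small host-side sketch. This is only an illustration of the calculation, assuming the 96 MHz clock mentioned in the post; the function names samplesBetween and cyclesPerSample are made up for this example and are not part of the analyzer code.

```cpp
#include <cassert>
#include <cmath>

// Assumed CPU clock: Teensy 3.0 at 96 MHz, as stated in the post.
const double kCpuHz = 96e6;

// Samples the client displays between two cursors, given the time delta it
// shows and the sample period it assumes (here 7 cycles at 96 MHz).
long samplesBetween(double shownDeltaSec, double assumedPeriodSec) {
    return std::lround(shownDeltaSec / assumedPeriodSec);
}

// Average CPU cycles per sample, given the real elapsed time between the
// cursors and the number of samples captured in that time.
double cyclesPerSample(double realDeltaSec, long samples) {
    assert(samples > 0);
    return realDeltaSec / static_cast<double>(samples) * kCpuHz;
}
```

With an assumed period of 7.0 / 96e6 seconds, samplesBetween(249.08e-6, ...) gives 3416 and samplesBetween(223.64e-6, ...) gives 3067; cyclesPerSample(256e-6, 3416) is about 7.2 and cyclesPerSample(256e-6, 3067) is about 8.0.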

Any idea why the tight loop initially takes 7 cycles and later 8 cycles?

Also, I've been padding out the tight loops with NOP operations for lower sampling speeds. For instance, this loop usually takes 10 cycles:

while (logicIndex) {
    // Three NOPs don't work, they make the loop longer than 10 cycles
    logicdata[--logicIndex] = CHANPORT^waitCount;
    __asm__("NOP\n\tNOP");
}

The following loop usually takes 20 cycles:

while (logicIndex) {
    logicdata[--logicIndex] = CHANPORT;
    __asm__("NOP\n\tNOP\n\tNOP");
    __asm__("NOP\n\tNOP");
}

How come one NOP sometimes takes 1 cycle and sometimes 2? Or does the branch prediction get worse when the jump address lies further away?

Kind regards,
Sebastian

PS: I also observe quite a lot of noise on Channel 0. Any clue what might be causing this at those frequencies? I've already tried other ports, both on the Teensy and on the Arduino, but in vain. Here is the setup: [attachment 51]

btmcmahan
04-10-2014, 09:16 PM
Hey, did you ever get this working well? I'm very interested in using my Teensy 3.0 as a logic analyzer.

Jp3141
04-10-2014, 09:50 PM
There are a number of reasons you will observe jitter in these types of systems when using advanced processors.

The ARM processor in the Teensy 3.0 runs quite fast (48 to 96 MHz). Flash memory can't run that fast, so there is a small cache in that part of the MCU. Timing will vary depending on whether or not the instruction is in the cache (look up the FMC, the flash memory controller, here: http://cache.freescale.com/files/32bit/doc/data_sheet/K20P64M72SF1.pdf?&Parent_nodeId=&Parent_pageType= )

The processor also has a small pipeline. It is possible that not every pass through your short loop repeats the same pipeline fetching pattern. I don't think this is what wangnick is seeing.

USB (and the microseconds timer?) generates interrupts on the processor. These take larger chunks of time (100 ns?), so they are probably not what he was seeing either.

It is possible to disable interrupts, and with more effort, probably possible to disable the cache (or copy and run code from RAM).
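
A minimal sketch of the last suggestion, applied to the capture loop from the first post. It assumes a Teensyduino core that provides the FASTRUN attribute (which places a function in RAM) and the standard Arduino noInterrupts()/interrupts() calls. Mapping CHANPORT to GPIOD_PDIR is an assumption, as is the function name captureNoInterrupts. The host-side stubs (including fakePort) exist only so the sketch also compiles off-target. Whether this removes the 7-to-8-cycle drift has not been verified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(ARDUINO)
  #include <Arduino.h>
  #ifndef FASTRUN
    #define FASTRUN                 // fallback if the core has no FASTRUN
  #endif
  #ifndef CHANPORT
    #define CHANPORT GPIOD_PDIR     // assumption: channels wired to port D
  #endif
#else
  // Host-side stubs (hypothetical), only so the sketch compiles off-target.
  #define FASTRUN
  static inline void noInterrupts() {}
  static inline void interrupts() {}
  static volatile uint8_t fakePort = 0xA5;
  #define CHANPORT fakePort
#endif

const size_t MAX_CAPTURE_SIZE = 12 * 1024;   // 12 KB, as in the first post
static uint8_t logicdata[MAX_CAPTURE_SIZE];

// Capture `count` samples with interrupts masked. On Teensy, FASTRUN asks
// the linker to run this function from RAM rather than from cached flash.
FASTRUN void captureNoInterrupts(size_t count) {
    assert(count <= MAX_CAPTURE_SIZE);       // caller must respect the buffer
    size_t logicIndex = count;
    noInterrupts();                          // no USB or timer interrupts
    while (logicIndex) {
        logicdata[--logicIndex] = CHANPORT;  // low 8 bits of the port
    }
    interrupts();
}
```

Note that the buffer fills from the top index downward, as in the original loop, so logicdata[count - 1] holds the first sample. While interrupts are masked, USB traffic and millis()/micros() updates are delayed, so the capture should be kept short.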