Tips for precise timing?

Alkamist

Well-known member
I'm working on a controller for a GameCube game. It uses a Teensy++ 2.0 to act as the controller.

When I started experimenting with making macros for the game (chains of inputs that are precisely spaced in time), I found that elapsedMillis() and elapsedMicros() seem to be very loose with their timings. It seems like the more complicated the code the board is running, the more jittery the timing.

What's strange to me is that the timing jitter is around one frame at 60 fps, so around 16 ms. What's even stranger is that inputs sometimes show up a frame before they're supposed to.

I even did really simple tests with a padhack of a GameCube controller. I used a Teensy 3.2 patched into two different buttons on a GameCube controller and simply had the Teensy complete the circuit for those buttons at specific intervals, with the same results: sometimes I would get a frame-perfect input, sometimes it would be a frame early, and sometimes it would be a frame late.

I thought it might be a problem with the way the input is polled in the game, but I had the same results with two different games. I also did tests printing out timings to serial while testing in game, and according to the serial, the timing is almost perfect.

Does anyone know why this might be? Is there a way to get around it? I would like the inputs to show up on the exact frame I want. I don't think I can use interrupt-based timing tricks to achieve it, however, because I am using a library that is itself time critical for communicating with the GameCube.
 
Maybe you've got stuff running from interrupts? elapsedMillis can't update while an interrupt routine runs. Or maybe a lot of other code is running between each time you actually check elapsedMillis, and that much time really has elapsed. Very hard to guess without knowing much about how your code is structured.

On the 8-bit AVR of the Teensy++ 2.0 there are unfortunately not a lot of options if you're already using interrupts and have a lot of code to run. AVR doesn't support priority nested interrupts like you get on ARM with a Teensy 3.2.
 
I'm using the NicoHood Nintendo library. I'm not sure how it works, but I'm guessing it has some interrupt routines. The thing is, even when I was testing on a simple padhack of a GameCube controller, I was having this problem.

I wired a Teensy 3.2 up to a couple GameCube controller buttons. Then I would run a single elapsedMillis and reference it for the timings, and when it is time to push a button, I would switch the pin mode of each button connector to OUTPUT.

There would be ±1 frame of jitter even with very little code running and no interrupt routines. The only thing I can think of is that switching a pin to OUTPUT can possibly mess with timing too. If it doesn't, I don't see a reason why the timing should be so far off.
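
For clarity, the "press" I'm describing is just flipping the pin between a high-impedance input (released) and a grounded output (pressed); a minimal sketch, with an arbitrary pin and hold time:
Code:
    const int BUTTON_PIN = 2;   // wired to one of the pad's button contacts (arbitrary)

    void setup() {
        pinMode(BUTTON_PIN, INPUT);        // high impedance = button released
    }

    void pressButton() {
        pinMode(BUTTON_PIN, OUTPUT);
        digitalWrite(BUTTON_PIN, LOW);     // sink the line to ground = button pressed
    }

    void releaseButton() {
        pinMode(BUTTON_PIN, INPUT);        // float the line again = button released
    }

    void loop() {
        pressButton();
        delay(17);                         // hold for roughly one 60 fps frame
        releaseButton();
        delay(500);
    }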

Is there any sort of a solution with an external time unit or something of the sort?
 
Unfortunately on 8 bit AVR your options are very limited, and the small range of options you do have involve digging very deeply into how all the code works.
 
That's a bummer. Fortunately, by exploiting a mechanic within the specific game I'm making this for, I am able to get frame-perfect output on the macro I want about 85% of the time.

You say there are limited options on an 8-bit AVR, which unfortunately I am stuck with because of the library I am using. I'm curious, though: are there easier and more accessible solutions if I were using a more advanced chip?
 
With the more advanced CPUs you can have multiple interrupt priorities, so it would have been possible to push the timer and whatever else was triggering to different levels (timer higher). Things would then likely have kept working, with a good chance of success, without changing any of the code actually doing the work.

For the T2 you'd need to find out what is hogging interrupt time and tweak it. If this library is bit-banging the waveform to the GameCube and simply blocks interrupts while doing so, that would certainly cause the results you see.
 
You say there are limited options on an 8-bit AVR, which unfortunately I am stuck with because of the library I am using.

With AVR, usually the only solution is to dig deeply into all the code and add checks for input during parts where it spends too much time.

I'm curious, though: are there easier and more accessible solutions if I were using a more advanced chip?

Yes, as GremlinWrangler mentioned, you get nested priority interrupts on more advanced chips. On AVR, when any interrupt runs, all others are blocked. On 32-bit ARM, when an interrupt runs, only others of the same and lower priority are blocked; anything of higher priority can still interrupt. By choosing priority levels well (Teensyduino comes with pretty good default priorities already configured), your short time-sensitive stuff can get very low-latency access, while your longer but still timing-sensitive stuff runs at a lower priority, and the main program can still do the less critical tasks.
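
On Teensy 3.x, adjusting a priority is a one-line call. As a minimal illustration (assuming, hypothetically, that a pin interrupt on port A is your time-critical one):
Code:
    // Teensy 3.x: lower number = higher priority (0-255, in steps of 16).
    // Raise a hypothetical time-critical port A pin interrupt above the rest,
    // so longer, lower-priority interrupt routines can no longer delay it.
    void setup() {
        NVIC_SET_PRIORITY(IRQ_PORTA, 16);
    }

    void loop() {
    }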
 
Thanks for the help!

I'm having a whole new problem now. So this whole time I've been testing this controller on an official GameCube to USB adapter on my PC going to Dolphin (a GameCube emulator). This has been working really well. Today, however, I just tested it on a vanilla GameCube for the first time. Unfortunately I am getting some strange behavior.

When I plug into the actual GameCube, my controller is read, but certain things aren't right. The two big things I notice are that my macros are running at about 1/4 speed, and that it is possible and very easy to push buttons too quickly so they don't register. It sort of leads me to believe that the board is running slowly.

This makes no sense to me, though, because the GameCube is still reading commands from the controller, which would require it to be running at the full 16 MHz with this library. I tested this with an Arduino Nano and still had the same problem. I even tested powering the board from USB instead of the GameCube's power, and the result is the same. I'm using the Bounce2 library for my button debouncing, so I'm guessing that is playing a part in buttons being pushed too fast to register. Keep in mind, though, it still works perfectly on my PC regardless.

I'm starting to think that it is just a problem inherent to the NicoHood library itself. Maybe it went unnoticed because the library wasn't being pushed as hard as I'm pushing it.

Short of learning assembly and diving really deep into processor programming so I can write my own GameCube library, I am at a loss for what to do. If anyone can provide any help, or maybe just give me some troubleshooting ideas, I would be very grateful. Thanks!
 
I am not familiar with this software at all, but I took a quick look at https://github.com/NicoHood/Nintendo/blob/master/src/Gamecube.c and found this critical section
Code:
    // Don't want interrupts getting in the way
    uint8_t oldSREG = SREG;
    cli();

    // Read in data from the console
    // After receiving the init command you have max 80us to respond (for the data command)!

    // [snipped] ... lots of commands sent, responses parsed, etc. --
    // lots of communication happening ...

    // End of time sensitive code
    SREG = oldSREG;

So the library you are using turns off all interrupts on a regular basis. Among the interrupts it turns off are the timer ticks your code relies on. Much of the time, the timer ticks before or after this critical section, so everything works OK. But if the timer ticks while interrupts are off, it has to wait to be serviced until the end of the long critical section. Sometimes it is just a little late and you end up one frame behind. Other times it is later still, skipping over a frame and firing right before the next one, which puts you one frame ahead. Maybe that is why you sometimes experience rapid fire. This effect is directly related to the amount of time that interrupts are turned off, which in turn is related to how quickly the particular GameCube host system responds to commands.

Now, that was all just pure speculation, but I think it is still more likely than a crystal-clocked microcontroller speeding up and slowing down dramatically.

slomobile
 
Thank you for the info.

I ended up contacting the guy who wrote the library, and he said it was possible that the GameCube console polls more often than the Dolphin emulator. If that were the case, I guess the interrupt-disabling code would be triggered more often, which would lead to slower timings.

Unfortunately, that means the only solution for me is to do away with all code that has anything to do with timing. Maybe eventually I'll think about writing my own library for a faster processor, but right now at my current level, that isn't possible.
 
Well, it's not as dramatic as all that. Interrupts are disabled and re-enabled with code that looks like what I just showed you. Search the library source code for similar sections (find cli()) and note the function names. There might only be the one section. The guy who wrote the library was kind enough to tell you the maximum time those sections will last: 80us. So just be careful how you place your timings around the critical library function calls. Check your running timer immediately before the call to gc_write(...). If it is less than 80us from the end of your interval, call your function now. If it is more than 80us, wait for the usual place where you call your function. That will keep timing errors to the minimum possible: no worse than 80us early. There are probably ways to eliminate even that error if you have a frame timing reference.

I'd like to point out that you still haven't posted any of your code for us to troubleshoot, so that makes an accurate diagnosis very difficult.
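
In sketch form, with placeholder names (gc_write stands in for whichever library call starts the blackout, and fireInput for your scheduled button change):
Code:
    #include <elapsedMillis.h>

    elapsedMicros t;                                  // running interval timer
    const unsigned long INTERVAL_US = 4UL * 16667UL;  // e.g. 4 frames at 60 fps
    bool fired = false;

    void fireInput() { /* placeholder: apply the scheduled button change */ }

    // Call this immediately before gc_write(...):
    void serviceMacro() {
        // If the interval would expire during the ~80us blackout, fire early now.
        if (!fired && (unsigned long)t >= INTERVAL_US - 80) {
            fireInput();
            fired = true;
        }
    }

    void setup() {}
    void loop() { serviceMacro(); /* then the library poll, e.g. gc_write(...) */ }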
 
I haven't posted code because my code is spread out among a bunch of different files and is very class-based. A really boiled-down version of what I'm doing is the following:

- I want to make a macro that, when it gets a signal, pushes one input instantly and then a different one 4 frames (at 60 fps) later.
- Make an elapsedMillis that is tied to that macro.
- When the signal arrives, reset the elapsedMillis to 0.
- Check the elapsedMillis with >= to determine when to trigger the other inputs.
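
In code, that plan boils down to something like this (the signal and button functions here are placeholders for my real ones):
Code:
    #include <elapsedMillis.h>

    const unsigned long FRAME_MS = 17;   // ~one frame at 60 fps (16.67 ms)
    elapsedMillis macroTimer;
    bool macroArmed = false;

    bool signalReceived()   { return false; }  // placeholder for the real trigger
    void pressFirstInput()  { }                // placeholder: immediate input
    void pressSecondInput() { }                // placeholder: input 4 frames later

    void setup() {}

    void loop() {
        if (signalReceived()) {
            pressFirstInput();
            macroTimer = 0;              // reset the timer tied to this macro
            macroArmed = true;
        }
        if (macroArmed && macroTimer >= 4 * FRAME_MS) {
            pressSecondInput();
            macroArmed = false;
        }
    }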

I'm afraid your explanation goes over my head a little, since I am not that experienced with microcontrollers. I'm not sure exactly how to integrate what you said into my strategy above. I guess I am unclear about how millis() and micros() work under the hood, since elapsedMillis is based on them. I'll try to wrap my head around what you said earlier when I get the chance, but if you could spell it out a bit more, it would most likely help a lot.

So I'm guessing that every time the console polls my controller, it eats up to 80us of time from my internal timer? Do I just add back that time every time it is polled, or something?
 
I'm starting to understand a little more what is happening with some reading. I managed to find a link that talks about pretty much my exact problem but with a different library. https://community.particle.io/t/millis-and-micros-lose-time-when-interrupts-are-disabled/15416

My problem is that I am trying to keep track of millis() or micros() for long periods of time (at most 8 frames at 60 fps). This timing isn't precise because, from what I've read, the library code disables interrupts for a very long time. It seems a real GameCube polls much more often than Dolphin, which means it is more prone to eating timer interrupts. This causes the timing to be drastically different on the two systems.

In the thread I linked, someone says the following:

So you want the timer to keep ticking when the source of ticks is disabled. Sure, that's easy.

What are the legit reasons for disabling interrupts? The only case I can think of is timing-sensitive code, and the solution there is to put that code itself in a high-priority interrupt so that it executes without interruption while still letting the other system interrupts be queued rather than dropped.

Is it possible to adapt the library I'm using to this strategy instead of disabling interrupts altogether? I suppose something like this isn't possible on an AVR, since they don't have nested priority interrupts.
 
That's pretty much it.

As noted above, you can work around the long 'no interrupt' period by having the library check the timer values when it starts and, knowing how long it'll be on hold, potentially change the button state it sends so the input fires early. Or even deliberately delay its response to the poll so that it hits the right frame on time.

The other option is to look inside the code and, knowing what other interrupts you might have (i.e. not trying to run servo motors or lots of serial data), change the bit-banging chunk of code so it can keep interrupts active. The problem then is you have to check your timing for the pulses rather than just running delay loops.

How often does the polling trigger? Is it even triggering fast enough to achieve single-frame precision? A way to check is to have a counter that gets incremented inside the interrupt loop and read once a second. Note that there are lots of problems there, since it's perfectly possible your counter will increment while being read, but for human fault-finding that could be fine. If you can get that working, it may be possible to stop using timers and instead sync your code to polls: press button, wait one poll, release, wait X polls, press another button. It depends whether polling is directly tied to game frames.

There are a number of traps where two different blocks of code use the same variables, but when your failure mode is 'macro doesn't fire right' rather than 'reactor explodes and kills us all', you don't have to spend so much thought on it.
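
For the counter, something like this keeps the read from tearing on an 8-bit chip (where the increment actually lives depends on the library):
Code:
    volatile uint16_t pollCount = 0;   // do pollCount++ wherever the poll is serviced

    // Read-and-reset once a second; the noInterrupts() bracket stops an interrupt
    // from updating the 16-bit value halfway through the read on an 8-bit AVR.
    uint16_t takePollCount() {
        noInterrupts();
        uint16_t n = pollCount;
        pollCount = 0;
        interrupts();
        return n;
    }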
 
Thanks for the info!

I think I would want to stay working with time instead of polls, because I want my controller to work with a few other games of this type, so I'm not sure I want to commit to the concept of changing timing based on different poll rates. I might do that if I have to though.

I'll try to do some experimentation with finding out how fast the game is polling. From what I read, for this particular game, it is twice per frame. There still seems to be a discrepancy between the console and Dolphin though.

I'm still confused about how to implement what you guys are talking about, my apologies. I think there is confusion because I asked about two problems in this thread. One is that there is ±1 frame of jitter even in good conditions. The bigger problem, however (which I think is related to the jitter), is that millis() and micros() run very slowly on an actual GameCube console. In that case, my macro that should delay for 4 frames delays for about 16 frames. I'm thinking it is because the higher frequency of polling is eating more timer interrupts. I'm just not sure where in my code to implement a fix for this.

I'll try reading your response more thoroughly when I get the chance, and maybe write some example code of what I'm doing.
 
Just thought of a problem I don't know how to fix for testing. I won't be able to print out how many polls there are per second, because currently it isn't possible to track how long a second is since millis() and micros() are running slowly.

EDIT: Did some testing manually, by printing out the number of polls and resetting it every time I connect a pin by hand. I looked at a stopwatch for my timing and made the connection every second.

I found that Dolphin is running the no-interrupt section of the library 850 times per second, while the GameCube console is running it 2200 times per second.

That being the case, if each call of this section takes 80 microseconds, the GameCube is forcing the processor to be in no-interrupt mode for 176,000 microseconds out of every second, roughly 17.6% of the time. No wonder my timing is way off.

EDIT 2: I just tried making a correction timer that is referenced alongside micros() to get a corrected time. If I add 160us to the correction timer every time the no-interrupt section is called, it corrects micros() so it is almost in sync with real time. The problem is, even when it's synced with real time on Dolphin, it still runs slowly on a console. It doesn't seem to be something that a constant addition of time can account for, unless I'm doing something wrong.
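
For reference, my correction attempt was essentially this (the 160us per poll is an empirical guess):
Code:
    volatile unsigned long lostMicros = 0;   // time assumed lost to the blackout

    // ...incremented after each call into the library's no-interrupt section:
    //        lostMicros += 160;

    unsigned long correctedMicros() {
        return micros() + lostMicros;
    }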
 
Duh - didn't think about the fact you can't just report the counts per second...

Looks like the polling rate is driven by the game, which is a wrinkle:
http://www.seas.upenn.edu/~gland/gamecube.html
That might allow really clever code to work out what game is running and what state it's in, but otherwise it's just another complication.

Running 2200 times a second will certainly mean the controller is never going to appear sluggish, and it looks like roughly 20% of the time is spent with interrupts disabled just from the controller loop. Now, the controller won't be the only thing disabling interrupts; there are also interrupts firing from USB, the timers, and probably some other code. So it's quite possible the controller sometimes exits the controller code with more interrupts pending that it has to service before it gets back to the core timer functions, and misses one or more rollovers of the timer that drives the millis functions. While there are code-based ways to get this working, the easiest solution may be a second Teensy that runs the timing functions and drives 'button' presses to the one running the controller code and spending most of its time with interrupts disabled.

This is by no means an unknown problem, but most of the solutions I know use hardware that isn't in a T2. Looking again at that interface spec, it seems to use 4us per bit, divided up into 1us slots, so you are only getting 16-ish instructions per slot. That really doesn't leave a lot of time to release interrupts while it runs those delays. A similar problem drove the design of the OctoWS LED pixel library, which uses T3.x hardware to let the hardware drive bits out while the CPU is doing other stuff. Porting this code to a faster and more capable micro is certainly possible, but you'd need a scope or at least a logic analyzer to see how your timing looked, so it depends how committed you are to what I think was very much a just-for-fun project.

Doing things within just a Teensy 2 I think means looking to some other time base: either tampering with the hardware timers to generate a frame clock directly, or counting polls.
 
Thank you for the info!

I definitely think splitting the library off to its own processor is the way to go. What is the easiest way to communicate between processors? Ideally it would use as few pins as possible.

How I envision it working is to have a Teensy that communicates via USB, probably with DirectInput. Would it be possible to connect a Teensy to an AVR, possibly an Arduino Nano, and get commands from there? It would be really clean to have at heart a DirectInput controller that interfaces with another processor that acts as a converter. That way, I could unplug it from the converter and use it directly as DirectInput for other games on the computer.
 
Most serial methods will have their own latency/interrupt issues. One option is to use your second micro purely as a timer, so you can keep all your existing wiring: have an output that you set high at macro start, and the second CPU just sets one or more pins after the right time interval for your macro. Within your code you then just track what state those timer reply pins are in to hit the timing marks.

To get even more precise, have the timer micro listen to the GameCube command line (or mod the library so it sets a pin while busy) so it can tell if it's going to need to fire early to best hit the target time.
It would certainly be possible to split the processing more evenly to get a USB input device as well, but that'll mean changing a lot of what you've already done.
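
A bare-bones version of the timer micro could be as small as this (pins, interval, and handshake are all arbitrary choices):
Code:
    const int START_PIN = 2;   // driven high by the main micro at macro start
    const int FIRE_PIN  = 3;   // polled by the main micro to know when to act

    void setup() {
        pinMode(START_PIN, INPUT);
        pinMode(FIRE_PIN, OUTPUT);
        digitalWrite(FIRE_PIN, LOW);
    }

    void loop() {
        if (digitalRead(START_PIN) == HIGH) {
            delay(4 * 17);                             // ~4 frames at 60 fps;
            digitalWrite(FIRE_PIN, HIGH);              // this chip is free to block
            while (digitalRead(START_PIN) == HIGH) {}  // wait for acknowledge
            digitalWrite(FIRE_PIN, LOW);
        }
    }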
 
Most of my code is pretty modular and abstracted, so making seemingly large changes can be fairly simple depending on the change. If I were to use a second microcontroller just as a timer, I would like to find some way to keep all of my logic inside the main microcontroller. Telling a microcontroller when to start a macro and getting the inputs back later seems clunky to me, because it would require lots of wire connections for everything timing-related I would want to do. Ideally I wouldn't have just one macro, and several other logic components are based on millis().

It would be cool to make my own version of micros() or millis() that I can use; that way I could just substitute it into my timer object and everything would work with minimal changes.

Would there still be latency/interrupt issues if I used something like a Teensy 3.6 for all of my logic, and just passed the button presses via serial to another microcontroller that is running the NicoHood library?
 
The tricky part comes from knowing the NicoHood library spends a lot of time busy.

If you send data from the master controller, you will get a byte or two of hardware FIFO and then run the risk of overflowing. A better solution is to sync to the end of a controller poll cycle, either by monitoring the pins or by having the controller library set a flag when it's done that your main code watches, so it knows it has time to go get button states and format things up for the next poll.
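
The flag version is tiny; something like this (setting the flag at the end of the poll is a modification to the library, not stock behaviour):
Code:
    volatile bool pollDone = false;   // set true at the end of the library's poll routine

    void setup() {}

    void loop() {
        if (pollDone) {
            pollDone = false;
            // Quiet window: read buttons, run macro logic,
            // and stage the report for the next poll here.
        }
    }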

Serial is probably the easiest interface to implement, though SPI and I2C would also work. The key thing in whatever you choose is that you want hardware pins available (you already have one library doing bit-banging) and the ability to move your 60-odd bits of data between worst-case polls.

Would suggest you don't need a Teensy 3.6. While it's great, buying a pair of LCs or 3.2s will give you slightly better plug-and-play library support and a spare for when you break something. Even if you don't need that spare, having it is great peace of mind.
 
Thanks for the help!

I've managed to get decently consistent timing on both the emulator and the console with I2C. I'm still having the problem where inputs sometimes come a frame early or late. I don't yet understand the solution everyone posted for that, but I'll go back and read it in more detail when I get the chance.
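
For anyone who finds this later: the hand-off is just the stock Wire library. My master (logic) side is essentially this; the address and one-byte button layout are my own choices:
Code:
    #include <Wire.h>

    const uint8_t CONTROLLER_ADDR = 8;   // arbitrary I2C address of the NicoHood-side micro

    void setup() {
        Wire.begin();                    // this chip is the I2C master
    }

    void sendButtons(uint8_t buttons) {  // one packed byte of button states
        Wire.beginTransmission(CONTROLLER_ADDR);
        Wire.write(buttons);
        Wire.endTransmission();
    }

    void loop() {
        // e.g. sendButtons(0x01);       // hypothetical layout: bit 0 = A
    }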
 
Talking about precise timing, does anyone know if elapsedMicros() handles overflow by rollover or saturation?

i.e. after counting up to 4294967295, does it become 0 or stay there?

BTW, 4294967296 microseconds is only about 72 minutes. Overflow for elapsedMillis() is possible but much rarer (about 50 days).
 
Sometimes it is easiest to look at the code to answer that.

If you look in the file <where you installed Arduino>\hardware\teensy\avr\cores\teensy3\elapsedMillis.h, you will see:
Code:
operator unsigned long () const { return millis() - ms; }
Note: ms is defined as unsigned long.

So the subtraction correctly handles millis() itself wrapping, and the answer to your question is rollover: the count does not saturate, it simply wraps around.
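
A quick worked example of why the subtraction still gives the right answer across a wrap:
Code:
    // Unsigned 32-bit subtraction is wrap-safe: (now - start) is correct
    // even when the counter has rolled over between the two readings.
    unsigned long start   = 4294967000UL;  // 296 counts before the 32-bit rollover
    unsigned long now     = 1704UL;        // 2000 counts later, after wrapping
    unsigned long elapsed = now - start;   // == 2000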
 