Teensy 3.1 disable branch prediction, disable system interrupts

Status
Not open for further replies.
Hi,

I am developing a simple RGB PAL signal generator in Teensyduino.

I am using a 64 µs IntervalTimer to trigger the start of each picture line and then output pixels (bits) from RAM to a simple resistor DAC attached to the Teensy pins. The hardware is basically similar to the AVGA project (http://avga.prometheus4.com/index.php?p=2-3). I can see a picture on the TV, but there are still some graphical bugs and glitches. These are my two biggest problems:

1. At the beginning of my line drawing routine, there is one IF which decides between outputting empty lines and picture data. As a result, the first 3 or 4 lines or so are delayed and the image is skewed a little. I assume this is caused by the branch prediction strategy changing, so I want to disable branch prediction for this IF, for my routine, or altogether. According to the Cortex-R4 documentation (http://infocenter.arm.com/help/topic/com.arm.doc.ddi0363g/Bgbficch.html), branch prediction can be disabled on the R4. What about the M4? Is it possible to disable branch prediction on it?

2. The second problem is graphical glitches, which I think are generated by other interrupts running on the Teensy (Teensyduino USB loader/serial terminal?). I am disabling interrupts inside my drawing routine, but I think the routine is not called as regularly as I need, because other interrupts are running at the moment it should start. Is there a list of the Teensy system interrupt routines, and a way to turn them off? I cannot disable interrupts altogether, because I need them for the 64 µs IntervalTimer and I also need the button firmware upload functionality.
 
Forum Rule: Always post complete source code & details to reproduce any issue!

Re-posted from right above your post ;-)
 
I've pushed the latest version of the code to git. The IF is now on line 72.
Here is the video:
The "Ext." text at the beginning is an overlay from my TV.
You can see that the behaviour of the artifacts changes after 10 s. This is the difference between the running loop (during the delay() calls) and the endless for(;;) loop at the end.

Focus on the top of the "Hello everybody!" text (distorted by branch prediction, I suppose) and the waves slowly moving from top to bottom (IMHO some kind of delay caused by other interrupt activity).
 
I've managed to fix the first problem (distorted first lines) by changing the code to draw empty lines from memory instead of using a different function, and it seems good now.
But the second problem persists. There must be interrupts for the USB autoload functionality, maybe for the serial port, and maybe for something else. Does anyone know where to find further info?

Please also notice a third problem. The lower part of the image is shorter than the rest of the picture. That margin changes with the location of the videoBuffer array in memory. It seems that reading from RAM or calculating addresses is faster for the second half of the Teensy RAM. Is this possible?
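The fix for the first problem described above (drawing empty lines from memory so that every line takes the same code path) could be sketched roughly as follows. This is only a host-side illustration, not the actual code from the repository: the sizes, the names lineTable, blankLine, buildLineTable and emitLine are all hypothetical, the 312-line count is an assumed non-interlaced PAL field, and emitLine sums bytes as a stand-in for writing pixels to the port.

```cpp
#include <cstdint>

// Hypothetical dimensions, chosen only for illustration.
constexpr int kBytesPerLine = 40;
constexpr int kTotalLines   = 312;  // assumed line count of one non-interlaced PAL field
constexpr int kFirstVisible = 40;   // assumed first line that carries picture data
constexpr int kVisibleLines = 240;

static uint8_t videoBuffer[kVisibleLines][kBytesPerLine];
static const uint8_t blankLine[kBytesPerLine] = {};  // all zeros, i.e. a black line

// One source pointer per video line, filled in once before video starts.
static const uint8_t* lineTable[kTotalLines];

void buildLineTable() {
    for (int line = 0; line < kTotalLines; ++line) {
        int row = line - kFirstVisible;
        bool visible = (row >= 0) && (row < kVisibleLines);
        lineTable[line] = visible ? videoBuffer[row] : blankLine;
    }
}

// Per-line routine: no decision between "empty" and "picture" is made here,
// so every line runs the same instructions. The sum stands in for pixel output.
uint32_t emitLine(int line) {
    const uint8_t* src = lineTable[line];
    uint32_t sum = 0;
    for (int i = 0; i < kBytesPerLine; ++i) {
        sum += src[i];
    }
    return sum;
}
```

The decision is moved out of the timing-critical routine into buildLineTable(), which runs once. Whether this removes the skew on real hardware would still need to be checked on the TV or with a scope.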
 
I have managed to disable the USB interrupts with this code:
Code:
#define __CHECK_DEVICE_DEFINES
#define __Vendor_SysTickConfig 1
#include <core_cm4.h>
NVIC_DisableIRQ(73);
I have also disabled IRQ74 which fixed another minor graphics glitch. I am not sure which interrupt is responsible for what.
 
I see that values like 'const int videoSyncPin = 14;' are used in calls such as 'digitalWriteFast(videoSyncPin, LOW);'. If you make them #define values, I wonder whether they compile differently, to a faster hardcoded version with specific PORT## access?

What happens if you use this from a copy transferred to RAM? :: "unsigned char videoTextFont[][6] PROGMEM = {"

Do you have this working on 'lesser' 16 MHz AVRs?
 
I probably have an older IDE (Arduino 1.0.6, Teensyduino 1.20), so I don't have the "No USB" option.
I think that const int and #define are both hardcoded as constants into the instructions by the compiler. On the TV screen it works the same. I don't have an oscilloscope, so I cannot confirm whether the timing has changed.
I have never tried this code on 16 MHz AVRs, but there are projects like AVGA or Uzebox that work in a similar way.
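As a rough illustration of the const int versus #define point above: in C++ both forms are compile-time constants at the language level, which can be shown on a desktop compiler. This is only a sketch; VIDEO_SYNC_PIN_MACRO and the PinMask template are hypothetical stand-ins, not anything from Teensyduino, and this does not prove that digitalWriteFast produces identical machine code for both. Only the generated asm would show that.

```cpp
// Two ways of naming the same pin number.
#define VIDEO_SYNC_PIN_MACRO 14
const int videoSyncPin = 14;

// A const int with a constant initializer is an integral constant
// expression, so it is accepted wherever a compile-time value is required.
static_assert(videoSyncPin == VIDEO_SYNC_PIN_MACRO, "same compile-time value");

// Hypothetical stand-in for a pin-to-bit lookup that is resolved at compile
// time. A template argument must be a constant expression, so this only
// compiles if the compiler treats the pin number as a constant.
template <int Pin>
struct PinMask {
    static constexpr unsigned value = 1u << (Pin % 32);
};

unsigned maskFromConst() { return PinMask<videoSyncPin>::value; }
unsigned maskFromMacro() { return PinMask<VIDEO_SYNC_PIN_MACRO>::value; }
```

Both functions return the same mask, which is consistent with the picture looking the same on the TV with either form.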
 
No USB - interesting - I just started when 1.21 was in beta and followed on to 1.22. It uses an updated compiler version, for better or worse.
The consts may indeed resolve the same way - the intermediate asm would show it - but I've not checked that here.
I wasn't sure about the AVRs - I clicked the web link when I went for the code and only skimmed it before closing the window.
 
In another thread, a method was given for running code directly out of RAM. This could be important for your timing because code is usually executed from flash memory which is relatively slow and if I understand correctly the timing is not necessarily consistent, varying based on what's happening with wear leveling and caching.

Also, if branch prediction is really biting you, you could try to recode critical sections in a branch-free way. For example, by using logical expressions in combination with arithmetic. I don't know if the compiler is smart enough to see through this sort of thing.

Code:
if (a < 10) ++a; else a = 0;

becomes

Code:
a += (a < 10) - a * (a >= 10);

Not so readable, but that's what comments are for. ;-)
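To check that the rewrite above really behaves the same as the original if/else, the two forms can be compared over a range of values on a desktop compiler. This is a minimal sketch; the function names stepBranchy, stepBranchFree and formsAgree are made up for the comparison. It says nothing about whether the compiler actually emits a branch for either form on the M4, which would have to be checked in the generated asm.

```cpp
// Original form: the compiler may emit a conditional branch here.
int stepBranchy(int a) {
    if (a < 10) ++a; else a = 0;
    return a;
}

// Arithmetic form from the post: each comparison evaluates to 0 or 1.
// a < 10  -> adds 1 - 0     = +1
// a >= 10 -> adds 0 - a * 1 = -a, leaving 0
int stepBranchFree(int a) {
    a += (a < 10) - a * (a >= 10);
    return a;
}

// Returns true when both forms agree for every value in [lo, hi].
bool formsAgree(int lo, int hi) {
    for (int a = lo; a <= hi; ++a) {
        if (stepBranchy(a) != stepBranchFree(a)) return false;
    }
    return true;
}
```

Calling formsAgree(-1000, 1000) exercises both sides of the comparison, including the a == 9 and a == 10 boundary.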
 