Is FASTRUN broken in Arduino 1.6.7?

Status
Not open for further replies.
I haven't gone past Arduino 1.6.5 myself, but I imagine that if you code by C++ rules and always put forward declarations of the functions at the beginning of the file, it may help things out.

I imagine what is going on is that the Arduino preprocessor is trying to be 'helpful' and provide forward references for your functions. However, the Arduino team have been rewriting this, and the new version likely doesn't understand Teensy-specific things like FASTRUN. It also doesn't work (IMHO) if you have #ifdef's for different processors and include different include files.

I'm coming around to the point of moving everything into libraries written in real C/C++, and just putting the top-level includes in the .ino/.pde file, with the loop/setup functions calling into the real code in the library.
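A minimal sketch of the forward-declaration idea above, in case it helps. The names `sampleIsr` and `isrCount` are made up for illustration, and the empty `FASTRUN` fallback is only there so the file also compiles off-Teensy. I have not tested this against Arduino 1.6.7 itself.

```cpp
#include <stdint.h>

#if defined(ARDUINO)
#include <Arduino.h>
#endif

// Assumption for illustration: if the core does not provide FASTRUN
// (e.g. when compiling on a PC), define it as nothing.
#ifndef FASTRUN
#define FASTRUN
#endif

// Hand-written prototype near the top of the file, so the code does not
// depend on the IDE generating one for a FASTRUN function.
FASTRUN void sampleIsr();

// Hypothetical state touched by the handler.
volatile uint32_t isrCount = 0;

// Definition carries the same attribute as the prototype.
FASTRUN void sampleIsr() {
  isrCount = isrCount + 1;
}

void setup() {
  isrCount = 0;
}

void loop() {
  // Stands in for the real interrupt source in this sketch.
  sampleIsr();
}
```

In a real project the call in `loop()` would be replaced by attaching `sampleIsr` to whatever timer or pin interrupt is in use.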
 
If I remove the FASTRUN then it works. I really need the FASTRUN to be in there, though!

If you run Teensy at 96 MHz, you may not _need_ FASTRUN: the code does not run faster from RAM (according to a statement by Paul), thanks to the MCU's smart access to flash.
Only if you run at, say, 144 MHz do you see some differences, but unrolling loops and other speed-up tricks outperform FASTRUN.
In a couple of cases FASTRUN slowed my programs down (sic!).
 

It is usually run at 120 MHz. I haven't really got any loops in the ISRs; they just run through everything once each time. But I measured the speed of the main loop with and without FASTRUN on the ISRs, and the loop runs at 3 kHz with FASTRUN instead of 2.2 kHz or so without.

Mind you, this was on much earlier versions of Teensyduino and Arduino.

I think I have fixed it with forward references to those functions. (It compiles, but I am not at the actual module to test it.)
 

??
Could you please give a link to these statements?

I use "FASTRUN" sometimes, and do time measurements in these cases.
I never noticed that FASTRUN is "not faster". Maybe there are some cases, but I never saw this behavior.

- FASTRUN is independent of CPU frequency
- FASTRUN is independent of compiler optimization levels

It just tells the compiler/linker to place the code in RAM, so it executes from there.

(After a quick search, see for example this thread: https://forum.pjrc.com/threads/27959-FLOPS-not-scaling-to-F_CPU)
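For anyone who wants to do the same kind of time measurement, here is a rough sketch of an A/B comparison. The function names `workRam`, `workFlash`, and `timeMicros` are hypothetical, the workload is arbitrary, and the `micros()` and `FASTRUN` fallbacks exist only so the code compiles on a PC. The `noinline` attribute on the flash copy is a GCC/Clang extension, added so inlining does not skew the comparison.

```cpp
#include <stdint.h>

#if defined(ARDUINO)
#include <Arduino.h>
#else
// Host fallback (assumption, not Teensy code): emulate micros().
#include <chrono>
static uint32_t micros() {
  using namespace std::chrono;
  static const steady_clock::time_point t0 = steady_clock::now();
  return (uint32_t)duration_cast<microseconds>(steady_clock::now() - t0).count();
}
#endif

#ifndef FASTRUN
#define FASTRUN  // empty off-Teensy, so both copies behave the same there
#endif

// Same workload twice: one copy requested in RAM, one left in flash.
FASTRUN uint32_t workRam(uint32_t n);
__attribute__((noinline)) uint32_t workFlash(uint32_t n);

FASTRUN uint32_t workRam(uint32_t n) {
  uint32_t acc = 0;
  for (uint32_t i = 0; i < n; ++i) acc = acc * 31u + i;
  return acc;
}

__attribute__((noinline)) uint32_t workFlash(uint32_t n) {
  uint32_t acc = 0;
  for (uint32_t i = 0; i < n; ++i) acc = acc * 31u + i;
  return acc;
}

typedef uint32_t (*WorkFn)(uint32_t);

// Runs fn(n) once, stores the result in *sink so the call is not
// optimized away, and returns the elapsed time in microseconds.
uint32_t timeMicros(WorkFn fn, uint32_t n, volatile uint32_t *sink) {
  uint32_t start = micros();
  *sink = fn(n);
  return micros() - start;
}
```

On the board you would call `timeMicros(workRam, ...)` and `timeMicros(workFlash, ...)` from `setup()` and print both numbers, ideally at each F_CPU setting you care about. Swap in your own routine for the toy loop, since the result depends heavily on what the code does.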
 

I recall that Paul said that up to, say, 96 MHz there is no real improvement from putting code into RAM.
As all my programs run at 144 MHz, I use FASTRUN extensively, especially for FFT, etc. However, tuning the execution with a logic analyzer, I found instances where I could get faster execution when not using FASTRUN, i.e. when the code is run from flash. I guess that this could happen when a routine in RAM 'interferes' with the execution of routines in flash. I cannot give references or an example, but only wanted to indicate that the use of compiler instructions does not guarantee the desired result and that independent verification is required (e.g. the speed-up from 2.2 to 3 kHz reported by neutron7).
 

Yeah, possibly there are some rare cases where execution in RAM does not give a speedup.
Could you post an example? Perhaps something is wrong.
Of course there are some limits, for example when the code makes a lot of calls to flash routines (nano library), like floating point or others. But it would be hard for me to imagine that it isn't at least as fast as in flash.
Maybe I'm wrong? So please, post an example.

And of course, obvious things like trying to inline a "FASTRUN" function do not work as intended.

edit:
P.S. Or very small flash routines that fit into the cache (which is RAM too).
 
Does the software floating point library run in RAM? That may be where an FFT/DCT spends its time.
 
To clarify... if the FFT uses floats for a lot of looping math (versus trig lookup tables and fixed point math), the software floating point library functions and transcendentals will be running in flash.
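To illustrate the fixed-point side of that distinction, here is a generic Q15 multiply. This is only a sketch of the usual technique, not code from any particular FFT library, and `q15_mul` is a made-up name. The point is that it uses nothing but integer operations, so no floating point library routine gets called.

```cpp
#include <stdint.h>

// Q15 format: a 16-bit integer raw value represents raw / 32768,
// giving a range of [-1, 1).
static inline int16_t q15_mul(int16_t a, int16_t b) {
  int32_t p = (int32_t)a * (int32_t)b;  // product in Q30
  p += 1 << 14;                         // add half an LSB to round
  p >>= 15;                             // scale back to Q15
  if (p > 32767) p = 32767;             // saturate the -1 * -1 case
  return (int16_t)p;
}
```

For example, 0.5 is 16384 in Q15, and `q15_mul(16384, 16384)` gives 8192, which is 0.25.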
 
In my case, I only use a fixed-point q15 FFT, and put the tables into RAM (I know how to manipulate the CMSIS code), AND I'm not using the Arduino IDE. BTW, my FFT in RAM IS faster than when executed from flash. All data and code locations were verified in the map file.

Of course, if RAM were larger, I would run my whole program in RAM, but unfortunately most parts of my program have to run from flash.
 