Thread: Teensy 4: Global vs local variables speed of execution

  1. #1

    Teensy 4: Global vs local variables speed of execution

    I'm a novice in programming and I'm using mostly global variables in my code to make it more readable. It's hard enough as it is to wrap my mind around what my code is doing so keeping the variables up top helps at the moment and I need them to keep their values in scope of the main loop.

    I'm wondering if this has a major impact on performance on the teensy 4 since it's so fast?

    My project is time critical, I need it to be fast.

    Is there any difference in processing speed with global and local variables and how big of a difference is it?

    The reason I'm asking is because I've seen the memory layout of the teensy 4 and it seems they are stored in different places.

  2. #2
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,268
    Quote Originally Posted by frankzappa View Post
    I'm wondering if this has a major impact on performance
    Usually there is little or no difference in speed. Though as a general rule, you should prefer non-static local variables. Often the compiler can keep all of them in registers and never need to perform memory loads & stores, which is worthwhile for performance.

    Another general rule is using no more than about 6 to 8 local variables in any given area of code which needs to run fast. The compiler has only 13 registers to use (plus 3 are dedicated to specific non-variable purposes) and it almost always needs to use a few of them for addresses and other stuff which isn't explicit but is implied by your code. Use of global and static variables also temporarily consumes registers. If you use too many local variables, more than will fit into the CPU's registers, then the compiler is forced to allocate some of your local variables in memory on the stack, which is slower. Some of the most intense optimization work involves pushing these limits, to intentionally use as many local variables as possible, where you compile and then read a disassembly of the generated code to check how many registers the compiler really allocated. I have done much of that work in the audio library, and I can tell you it is very tedious, but can lead to impressive performance gains. But as a general rule to follow for less time consuming coding, keep in mind there's a certain number of local variables "alive" in any given section of your code which forces the compiler to generate slower code. Stay under that threshold.
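
    A minimal sketch of the local-variable idea above, written as plain C++ so it also builds off-target. The names g_total, sumIntoGlobal and sumViaLocal are hypothetical. Whether the second form is actually faster depends on the compiler and optimization level, so it is worth measuring rather than assuming.

```cpp
#include <cstddef>
#include <cstdint>

uint32_t g_total = 0;  // global: normally lives in memory, not in a register

// Accumulates directly into the global. Since `data` could legally point at
// g_total, the compiler may need to store g_total on every iteration.
void sumIntoGlobal(const uint32_t *data, size_t n) {
    g_total = 0;
    for (size_t i = 0; i < n; ++i) g_total += data[i];
}

// Accumulates into a local, which the compiler can usually keep in a
// register for the whole loop, then writes the global once at the end.
void sumViaLocal(const uint32_t *data, size_t n) {
    uint32_t total = 0;
    for (size_t i = 0; i < n; ++i) total += data[i];
    g_total = total;
}
```

    Both functions produce the same result; only the number of memory accesses the compiler has to emit can differ.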

    32 bit unsigned integers are the fastest variable type, so use "uint32_t" where you can. You might intuitively think 8 or 16 bits could be faster, but they are actually sometimes slower (and never faster). Especially when used as function inputs and return values, the compiler is sometimes forced to add logical AND instructions to limit to 8 or 16 bits.
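
    As a rough illustration of why narrow types can cost extra instructions, here is a sketch with two hypothetical helpers, add8 and add32. The 8 bit version has to wrap its result at 256, which is the truncation step described above; the exact instruction the compiler emits for it is an assumption and varies by compiler.

```cpp
#include <cstdint>

// The sum is computed at int width and must then be truncated to 8 bits,
// which on ARM typically costs an extra AND or zero-extend instruction.
uint8_t add8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(a + b);
}

// The sum already has the natural register width, so no truncation is needed.
uint32_t add32(uint32_t a, uint32_t b) {
    return a + b;
}
```

    With inputs 200 and 100, add8 wraps around to 44 while add32 returns 300, so switching widths is only safe where the wrap-around behaviour is not relied upon.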

    32 bit float can also be surprisingly fast, so ignore ancient microcontroller advice to never use floating point. Teensy 4.0 has a FPU that makes 32 bit float about the same speed as 32 bit integers. If you have an algorithm that naturally makes sense to use float, don't be shy to try it. The FPU has a large group of registers dedicated to floats, so using them can free up some of the normal 13/16 integer registers. But there is only 1 FPU, so you can get at most 1 float operation per cycle, compared to the 2 integer ALUs which allow M7 to (sometimes) perform 2 integer operations per cycle.

    64 bit double is also supported by the FPU, but at half the speed as 32 bit float. One gotcha is the compiler assumes any float constants without a trailing "f" are 64 bits, and when you use them in math with 32 bit numbers, the compiler "promotes" everything to 64 bit math. So in your code, write "3.3f" for 3.3, so the compiler treats it as a 32 bit float.
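
    The promotion rule can be checked at compile time. This is a small sketch with hypothetical function names (scaleViaDouble, scaleViaFloat); the static_assert lines only confirm the types the language assigns to each expression, not how fast either version runs.

```cpp
#include <type_traits>

// 3.3 is a double constant, so x is promoted and the multiply is done in
// 64 bit math before the result is converted back to float.
float scaleViaDouble(float x) { return static_cast<float>(x * 3.3); }

// 3.3f is a float constant, so the whole expression stays in 32 bit math.
float scaleViaFloat(float x) { return x * 3.3f; }

static_assert(std::is_same<decltype(1.0f * 3.3), double>::value,
              "float * double constant is evaluated as double");
static_assert(std::is_same<decltype(1.0f * 3.3f), float>::value,
              "float * float constant stays float");
```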

    As far as memory is concerned, there are 5 areas. In order from fastest to slowest, they are DTCM, ITCM, DMAMEM, PROGMEM/FLASHMEM, and EXTMEM. By default, variables are allocated in DTCM and code goes into ITCM. So you automatically get the very best performance without doing anything special. To use those other slower memory areas, you have to add those keywords to your variables & functions.
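
    A sketch of how those keywords are typically placed on declarations. DMAMEM, PROGMEM and FLASHMEM are macros provided by the Teensy core; the empty fallback definitions below are an assumption added only so the snippet also compiles on a desktop compiler, and the variable and function names are hypothetical.

```cpp
#include <cstdint>

// Fallbacks for building off-target; on a Teensy 4.x the core headers
// already define these macros and these blocks are skipped.
#ifndef DMAMEM
#define DMAMEM
#endif
#ifndef PROGMEM
#define PROGMEM
#endif
#ifndef FLASHMEM
#define FLASHMEM
#endif

uint32_t fastCounter;                              // no keyword: default placement (DTCM)
DMAMEM uint8_t bigBuffer[4096];                    // large buffer moved to the slower DMAMEM region
PROGMEM const uint16_t gainTable[4] = {1, 2, 4, 8}; // constant data kept in flash

// Rarely used code can be left in flash instead of taking up ITCM.
FLASHMEM void rarelyCalledSetup() {
    fastCounter = gainTable[3];
}
```

    Anything in the hot path is usually best left without a keyword, so that it stays in the fast default memory.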

    When working on performance critical code, you should use the ARM_DWT_CYCCNT cycle counter to benchmark your code. It's a 32 bit integer which increments every clock cycle. It's inside the ARM core, so reading it is very fast, but keeping a copy throughout your performance-critical code does either consume one of the precious registers or require a write to memory. Still, the cost is low and the insight you gain from benchmarking as you experiment is worthwhile. Just read it before and after and subtract the 2 to get the number of cycles your code took to execute. If running inside an interrupt or other timing critical place, maybe store it to a global variable so you can print the number to the serial monitor at a less critical moment.

    You can also use digitalWriteFast() before and after performance critical code and watch the pulse width with an oscilloscope or logic analyzer, or estimate it by DC voltage on a multimeter if the code is run at known intervals. If you read through some of the driver level code in the libraries, you'll find commented-out copies of digitalWriteFast() which were used for testing & benchmarking.
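
    The multimeter estimate works because the average voltage of a pulse train is proportional to its duty cycle. A minimal sketch of that arithmetic, assuming an ideal pin that sits at 0 V when low and at a known high level, with a hypothetical helper name:

```cpp
// Estimates how long the pin was high during each period.
// avgVolts:     DC reading from the multimeter
// highVolts:    pin voltage while high (for example 3.3)
// periodMicros: known interval at which the code runs
double busyTimeMicros(double avgVolts, double highVolts, double periodMicros) {
    double dutyCycle = avgVolts / highVolts;  // fraction of the period spent high
    return dutyCycle * periodMicros;
}
```

    For example, a reading of 0.33 V on a 3.3 V pin, with the code running once every 1000 microseconds, corresponds to roughly 100 microseconds of execution time per run.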

  3. #3
    Senior Member
    Join Date
    Jul 2014
    Posts
    2,695
    Quote Originally Posted by PaulStoffregen View Post

    32 bit unsigned integers are the fastest variable type, so use "uint32_t" where you can. You might intuitively think 8 or 16 bits could be faster, but they are actually sometimes slower (and never faster). Especially when used as function inputs and return values, the compiler is sometimes forced to add logical AND instructions to limit to 8 or 16 bits.
    So, let's get rid of the stupid 8 bit bool!
    I personally had thought of using 16 bit, but maybe I should only use 32 bit variables.

  4. #4
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
    Location
    Ayer Massachussetts
    Posts
    3,764
    Quote Originally Posted by PaulStoffregen View Post
    32 bit unsigned integers are the fastest variable type, so use "uint32_t" where you can. You might intuitively think 8 or 16 bits could be faster, but they are actually sometimes slower (and never faster). Especially when used as function inputs and return values, the compiler is sometimes forced to add logical AND instructions to limit to 8 or 16 bits.
    Well, caching introduces a nuance. Note, I haven't looked in detail at the Teensy 4.x, but in terms of what I've seen over the years for other cached systems, you need to consider the caching behavior of memory.

    On the Teensy 4.0/4.1, if you have PROGMEM variables, or variables held in the 4.1's psram or flash memory chips, the values are cached. This means that it is slow to get to the real memory. A small subset of the values for memory are kept in the cache in the chips. The cache is fairly small, but it holds a lot of the most recent values. So, if you have a lot of structures in cached memory, it is better to have the values in the structure be as small as possible. By doing so, you allow more values to occupy the cache. This in turn makes things go faster.

    So what you want to do is in the structures, use uint8_t or uint16_t, but when you load these values into scalar values, use uint32_t. The loads will fill the entire 32-bits, and the stores will store the bottom 8 or 16-bits. The instructions on the ARM are 32-bits, and for scalar variables, the compiler does not have to do sign/zero extension to limit the scalar value to 8 or 16 bits. Furthermore, when you set up the structure, you want to order the fields so that they go from smallest to largest. This eliminates some extra holes that might have been added to properly align things. For example on other systems if you do:

    Code:
    struct foo {
        uint8_t a;
        uint32_t b;
    };
    The structure might be allocated as 8 bytes, with a hole between the fields a and b. This depends on the instruction set of the chips, and the ABI (application binary interface) that GCC supports for those chips. I believe ARM does some padding in this case, but I've worked on so many different chips and used many different ABI's that I never bothered to learn the ARM instruction set or ABI (hey for me, Teensys are a hobby, not a job -- so I can tell you the details on various PowerPC's but not ARMs).
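
    The padding can be inspected directly with sizeof and offsetof. This sketch assumes a typical ABI where uint32_t is 4-byte aligned (true for the usual ARM and x86-64 toolchains); the Mixed and Sorted structs are hypothetical additions to show the effect of field order.

```cpp
#include <cstddef>
#include <cstdint>

struct Foo {
    uint8_t a;
    uint32_t b;   // placed at the next offset that satisfies its alignment
};

struct Mixed {    // small fields separated by a large one
    uint8_t a;
    uint32_t b;
    uint8_t c;    // followed by tail padding so arrays of Mixed stay aligned
};

struct Sorted {   // same fields, small ones grouped together
    uint8_t a;
    uint8_t c;
    uint32_t b;
};

// Bytes in Foo that hold no data (the "hole" plus any tail padding).
constexpr size_t fooPadding = sizeof(Foo) - (sizeof(uint8_t) + sizeof(uint32_t));

static_assert(offsetof(Foo, b) == alignof(uint32_t),
              "b starts at the first offset that satisfies its alignment");
```

    With 4-byte alignment, Foo takes 8 bytes (3 of them padding), Mixed takes 12 and Sorted takes 8, so grouping the small fields saves a third of the space in this case.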

    The MEM1 region is the fastest memory (it runs at the speed at which the other memories can be accessed through their caches). Static and global variables are in MEM1, as is the stack.

    Now the MEM2 region (malloc area and DMAMEM) runs at 1/4 CPU speed, but it is cached. So you see some caching effects, but not as many as you would with the slower psram/flash memories on the Teensy 4.1.

  5. #5
    Thank you for this reply, I don't think I have ever learned so much from a single response on a forum.

    I will have to research everything you said and rethink the way I'm coding this. Limiting the number of variables in scope in the fast part of the code makes sense. Not an easy thing to do though.

    I'm making a piezo electronic drum with multiple sensors, I used your example as inspiration although I've added a lot more features.

    Could you post a small example how to use the ARM_DWT_CYCCNT or a link to where I can read about it?

  6. #6
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    463
    Less scope is better, so much so that when you are starting out, you should program without using ANY global variables. It will force better coding practices.

    For example, "for (int i = 0 ; i < foo; ++i) {...." is slightly faster and far less likely to create a bug than a global i. Focus on readability and correctness. Usually speed is only important in some small loop - find that loop and only optimize it for speed.

  7. #7
    Quote Originally Posted by jonr View Post
    Less scope is better, so much so that when you are starting out, you should program without using ANY global variables. It will force better coding practices.

    For example, "for (int i = 0 ; i < foo; ++i) {...." is slightly faster and far less likely to create a bug than a global i. Focus on readability and correctness. Usually speed is only important in some small loop - find that loop and only optimize it for speed.
    I use incrementing for loops in a few places and use a global "i", haha. So do I declare "i" just before the for loop every time? It seemed much more convenient to just have it on top since I use it a few times. I imagined it would be faster to have it already declared instead of declaring it all the time.

    So say I'm reading a sensor and I want to find a peak and store the peak.

    Should I make a function that passes the value instead of using a static int?

    Sometimes I need the value to be passed on multiple times through the loop.

    For instance, if I find a peak and I want to immediately search for the next peak, I'm adjusting the threshold so it decays exponentially along the previous peak, so it only triggers on new peaks and ignores the aftershock of the previous peak. How would I do that without using a static?

  8. #8
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,268
    Quote Originally Posted by frankzappa View Post
    So say I’m reading a sensor and I want to find a peak and store the peak.

    Should I make a function that passes the value instead of using a static int?
    Often the best way to answer these questions is by experimentation. Small details matter. Rarely are there universal answers that apply in all cases. That's why you should use ARM_DWT_CYCCNT or digitalWriteFast() to precisely measure the actual performance.

  9. #9
    Junior Member
    Join Date
    Sep 2019
    Location
    Sevilla, Spain
    Posts
    15
    very interesting and useful information

  10. #10
    Quote Originally Posted by WMXZ View Post
    So, lets get rid of the stupid 8 bit bool!
    I, personally thought, to use 16 bit, but maybe should only use 32bit variables.
    I'm using bools in my code because I assumed it would be faster to check if something is true or false. Are you saying that is not "true"? Pun intended.

  11. #11
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,942
    using ARM_DWT_CYCCNT:
    Code:
    uint32_t myTime = ARM_DWT_CYCCNT;
    // do something
    // ARM_DWT_CYCCNT can wrap in use, but the uint32_t math will return the proper
    // difference as long as fewer than 2^32 cycles elapse. For longer tests than
    // 2^32/F_CPU_ACTUAL {7.158278 secs at 600 MHz} use micros()
    myTime = ARM_DWT_CYCCNT - myTime - 2;
    Above takes off the '2' cycles Paul noted. Also noted that bool is possibly as fast as uint32_t - but no faster.
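
    The wrap-around behaviour mentioned in the comment can be checked off-target with plain unsigned arithmetic. This is a sketch with hypothetical helper names; it does not read the real ARM_DWT_CYCCNT register, it only models two readings of a 32 bit counter.

```cpp
#include <cstdint>

// Difference between two readings of a free-running 32 bit counter.
// Unsigned subtraction is modulo 2^32, so the result is still correct when
// the counter wrapped once between the two readings.
constexpr uint32_t elapsedCycles(uint32_t start, uint32_t end) {
    return end - start;
}

// Seconds until a 32 bit cycle counter wraps at the given clock frequency.
constexpr double wrapSeconds(double cpuHz) {
    return 4294967296.0 / cpuHz;  // 2^32 cycles
}

static_assert(elapsedCycles(0xFFFFFFF0u, 0x00000010u) == 0x20u,
              "a reading taken just after a wrap still gives 32 cycles");
```

    At 600 MHz, wrapSeconds gives about 7.158 seconds, which matches the limit quoted in the comment above.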

  12. #12
    Quote Originally Posted by defragster View Post
    using ARM_DWT_CYCCNT:
    Code:
    uint32_t myTime = ARM_DWT_CYCCNT;
    // do something
    // ARM_DWT_CYCCNT can wrap in use, but the uint32_t math will return the proper
    // difference as long as fewer than 2^32 cycles elapse. For longer tests than
    // 2^32/F_CPU_ACTUAL {7.158278 secs at 600 MHz} use micros()
    myTime = ARM_DWT_CYCCNT - myTime - 2;
    Above takes off the '2' cycles Paul noted. Also noted that bool is possibly as fast as uint32_t - but no faster.
    Thanks for the example. Much appreciated.

    Say you have a sensor value, a uint32_t that you want to test. You want to check if it's greater than a threshold. Clearly that must be slower than checking whether a bool is true or false? Just checking for a zero or 1 sounds faster to me, but the processor works in mysterious ways (to me).

  13. #13
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,942
    Quote Originally Posted by frankzappa View Post
    Thanks for the example. Much appreciated.

    Say you have a sensor value, a uint32_t that you want to test. You want to check if it’s greater than a threshold. Clearly that must be slower than checking whether a bool is true or false? Just checking for a zero or 1 sounds faster to me, but the processor works in mysterious ways (to me).
    Doesn't assigning that bool need a compare in the first place? And then the 'extra' bool still has to be tested? That may not save anything, depending on how and where it is tested.

    As @KurtE notes - best to get it working properly first - with reasonable care and clarity in writing it.

    With the 1062 in the T_4.x the processor has the possibility of dual integer execution - so if two unrelated things need to be done, they might run at the same time. So 'wasting' a cycle here or there might not actually affect perf, as the processor may be doing something else on the same clock cycle.

  14. #14
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    463
    +1 on measuring timing. There are some non-intuitive things that even experts don't expect.

    Yes, do use lots of functions and pass all of the needed variables to them.
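
    One way to keep state between calls without a static or global is to put it in a small struct and pass that struct to the function. The sketch below applies this to a decaying peak threshold; PeakTracker, updatePeak and the simple decay rule are all hypothetical and only illustrate the structure, not the actual algorithm used in the drum project.

```cpp
// All state the detector needs between samples lives here, so the caller
// owns it and can keep one tracker per sensor.
struct PeakTracker {
    float threshold;  // current trigger level
    float floorLevel; // level the threshold decays back toward
    float decay;      // per-sample multiplier between 0 and 1
};

// Processes one sample. Returns true if the sample exceeded the decayed
// threshold, in which case the threshold is raised to the new peak.
inline bool updatePeak(PeakTracker &t, float sample) {
    // Exponential decay of the threshold toward the floor level.
    t.threshold = t.floorLevel + (t.threshold - t.floorLevel) * t.decay;
    if (sample > t.threshold) {
        t.threshold = sample;
        return true;
    }
    return false;
}
```

    A caller would declare one PeakTracker per sensor (for example in an array) and call updatePeak once per reading, so nothing inside the function needs to be static.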

  15. #15
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    7,105
    Again sometimes your differences are next to nothing... Better to first worry about getting your code working properly and then if/when necessary, find out where your performance issues are and work on those.

    But as for local (stack) versus Global... Really not a whole lot of differences. When I have questions about things like that I just run a couple of simple tests.

    Example:
    Code:
    void setup() {
      while(!Serial && millis() < 5000) ; // wait up to 5 s for the serial monitor
      Serial.begin(115200);
    }
    
    // Counter, sum and timer for the "G" (global) test
    uint32_t g_cnt;
    uint32_t g_sum;
    elapsedMicros g_em;
    
    void loop() {
      // "G": loop counter and sum are global variables
      g_sum = 0;
      g_em = 0;
      for (g_cnt = 0; g_cnt < 10000000; g_cnt++) g_sum += g_cnt;
      Serial.printf("G %u %u\n", (uint32_t)g_em, g_sum); // elapsed microseconds, sum
      Serial.flush();
      delay(250);
      // "L": the same loop using only local variables
      uint32_t l_sum = 0;
      elapsedMicros l_em = 0;
      for (uint32_t l_cnt = 0; l_cnt < 10000000; l_cnt++) l_sum += l_cnt;
      Serial.printf("L %u %u\n", (uint32_t)l_em, l_sum);
      Serial.flush();
    }
    Note: this was run on a T4.1.

    Code:
    G 33336 2280707264
    L 33335 2280707264
    G 33336 2280707264
    L 33335 2280707264
    G 33336 2280707264
    L 33335 2280707264
    Your mileage may vary.
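
    The printed numbers can be sanity-checked with a little arithmetic. This sketch assumes the board was running at 600 MHz (the clock speed mentioned elsewhere in the thread, not stated for this run); the helper names are hypothetical.

```cpp
#include <cstdint>

// Sum of 0 .. n-1 in 32 bit unsigned math, wrapping modulo 2^32 exactly as
// the benchmark loops do.
uint32_t wrappedSum(uint32_t n) {
    uint32_t sum = 0;
    for (uint32_t i = 0; i < n; ++i) sum += i;
    return sum;
}

// Average number of CPU cycles spent per loop iteration.
double cyclesPerIteration(double elapsedMicros, double cpuHz, double iterations) {
    return elapsedMicros * 1e-6 * cpuHz / iterations;
}
```

    wrappedSum(10000000) reproduces the printed sum 2280707264, and 33335 microseconds for 10 million iterations at 600 MHz works out to about 2 cycles per iteration for both the global and the local version.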

  16. #16
    Quote Originally Posted by KurtE View Post
    Again sometimes your differences are next to nothing... Better to first worry about getting your code working properly and then if/when necessary, find out where your performance issues are and work on those.

    But as for local (stack) versus Global... Really not a whole lot of differences. When I have questions about things like that I just run a couple of simple tests.

    So in your example there was no difference if I'm understanding correctly.

    What about arrays? Most of my variables are arrays. For instance in my peak tracking loop I have 5 arrays with 10 values each.

    This works very well at the moment but I'm only using 2 sensors. I will be doing this with 10 sensors. I'm getting about 25 sensor readings per millisecond for each sensor. I'm hoping to be able to read each sensor 25 times per millisecond with 10 sensors.
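
    To put the target in perspective, here is the sample budget worked out as a sketch. It assumes a 600 MHz clock and that the readings are taken one after another; the helper names are hypothetical.

```cpp
#include <cstdint>

// Total readings per second for all sensors combined.
constexpr uint32_t totalSamplesPerSecond(uint32_t perSensorPerMs, uint32_t sensors) {
    return perSensorPerMs * 1000u * sensors;
}

// CPU cycles available per reading if the readings are spread evenly.
constexpr uint32_t cyclesPerSample(uint32_t cpuHz, uint32_t samplesPerSecond) {
    return cpuHz / samplesPerSecond;
}

static_assert(totalSamplesPerSecond(25, 10) == 250000u, "10 sensors at 25 readings/ms each");
static_assert(cyclesPerSample(600000000u, 250000u) == 2400u, "budget per reading at 600 MHz");
```

    So 10 sensors at 25 readings per millisecond is 250,000 readings per second in total, which leaves about 4 microseconds, or 2400 cycles, per reading for both the ADC conversion and the processing.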

    I haven't figured out how to do faster ADC readings yet. I've looked at the pedvide ADC library, but it's just too complicated for me to understand at the moment.

    I'm a bit lost at the moment and not getting anywhere. I just don't know what I need to study/learn to understand this.

    So far I've learned some C syntax, but when I look at those libraries it's mostly just gibberish.

    I know it's probably very easy once you know it but I just don't know where to learn.

  17. #17
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,942
    @KurtE's example shows no difference in timing or values whether using local or global variables.

    This edit shows that it can do two things in the same time:
    Code:
    G 33335 2280707264
    L 33335 2280707264
    
    #2L 33335 2280707264
    #2G _same 2280707264
    Added third combined for()::
    Code:
    void setup() {
      while(!Serial && millis() < 5000) ; 
      Serial.begin(115200);
    }
    
    uint32_t g_cnt;
    uint32_t g_sum;
    elapsedMicros g_em;
    
    void loop() {
      g_sum = 0;
      g_em = 0;
      for (g_cnt = 0; g_cnt < 10000000; g_cnt++) g_sum += g_cnt;
      Serial.printf("G %u %u\n", (uint32_t)g_em, g_sum);
      Serial.flush();
      delay(250);
      uint32_t l_sum = 0;
      elapsedMicros l_em = 0;
      for (uint32_t l_cnt = 0; l_cnt < 10000000; l_cnt++) l_sum += l_cnt;
      Serial.printf("L %u %u\n", (uint32_t)l_em, l_sum);
      
      Serial.flush();
      delay(250);
      g_sum = 0;
      l_sum = 0;
      l_em = 0;
      g_cnt = 10000000/2;
      for (uint32_t l_cnt = 1; l_cnt < 10000000; l_cnt++) {
        l_sum += l_cnt;
        g_sum += g_cnt;
      }
      Serial.printf("\n#2L %u %u\n", (uint32_t)l_em, l_sum);
      Serial.printf("#2G _same %u\n\n",  g_sum);
      Serial.flush();
      delay(2500);
    }
    Small edit to get the exact result value from the #2G sum - had to make sure the calcs were different so the compiler wouldn't optimize it away.

  18. #18
    Senior Member
    Join Date
    Jul 2014
    Posts
    2,695
    Quote Originally Posted by defragster View Post
    so the compiler wouldn't optimize it away.
    Exactly, all these "programming style" speed issues should take compiler optimization into account.

  19. #19
    Senior Member
    Join Date
    Nov 2012
    Posts
    1,375
    I'm getting about 25 sensor readings per millisecond for each sensor
    What are these sensors measuring?

    Pete

  20. #20
    Quote Originally Posted by el_supremo View Post
    What are these sensors measuring?

    Pete
    Piezo sensors measuring vibration. It's an electronic drum.

    Here is a picture showing the peak tracking in action: https://www.dropbox.com/s/vavkp8vdib...test.heic?dl=0

    The green line is the threshold and every time it detects a peak it adapts to the previous peak and then decays so it can catch new peaks almost immediately after a previous peak. You can see when it catches a peak when the green threshold line goes up. Note that the first peaks are cut off on this scope because the scope is not showing the sensor output when it's idling.

    On the picture I'm playing buzz rolls which are very fast, only a few milliseconds between peaks. I have not set the threshold and decay settings properly here (and the electric circuit needs to be cleaned up) but it's looking quite promising.

  21. #21
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    22,268
    Now that we've talked about so much optimization, and more context about the application has come out along the way, this might be a good moment to mention the famous saying "premature optimization is the root of all evil." I had always believed this was from Knuth, but apparently it came earlier. From that article

    What Hoare and Knuth are really saying is that software engineers should worry about other issues (such as good algorithm design and good implementations of those algorithms) before they worry about micro-optimizations such as how many CPU cycles a particular statement consumes.
    Especially in an application like this, where timing is relative to the damping of mechanical resonance in drum pads, which might as well be eternity relative to the time scale of these sorts of optimizations, this wise advice is as on-point as ever. Sure, it's interesting and sometimes even fun to think of optimizations on this level (not always so much fun to repeatedly check a disassembly of the compiled code), but it's important to keep a sense of context. On a 600 MHz dual-issue processor, we're talking about optimizations on the scale of a small fraction of a microsecond, for an application paced by mechanical phenomena which occurs over several milliseconds and human movements which occur on a scale of dozens to hundreds of milliseconds.

    This particular application could probably benefit much more from focusing on using the Teensy 4.0 & 4.1 ADC ETC to sequence automatic ADC measurements than details of code optimization.

  22. #22
    Quote Originally Posted by PaulStoffregen View Post
    On a 600 MHz dual-issue processor, we're talking about optimizations on the scale of a small fraction of a microsecond, for an application paced by mechanical phenomena which occurs over several milliseconds and human movements which occur on a scale of dozens to hundreds of milliseconds.

    This particular application could probably benefit much more from focusing on using the Teensy 4.0 & 4.1 ADC ETC to sequence automatic ADC measurements than details of code optimization.
    You are absolutely right. What I need to learn is how to use the ADC to take faster measurements. I need a way to take about 200 ADC measurements every millisecond to be able to do what I want (I'm also tracking the time difference of arrival of peaks between multiple sensors).

    Right now my code takes an adc measurement of all sensors and stores them in an array then checks the values, does peak tracking then goes back, takes new adc etc. in a loop.

    The reason I was thinking of doing it as fast as possible is to not slow down the time between measurements so I can take at least 200 measurements per millisecond.

    The best way to do it, however, would be to take ADC readings independently from the code, but I don't know how to do it.

  23. #23
    Quote Originally Posted by PaulStoffregen View Post
    Especially in an application like this, where timing is relative to the damping of mechanical resonance in drum pads, which might as well be eternity relative to the time scale of these sorts of optimizations, this wise advice is as on-point as ever. Sure, it's interesting and sometimes even fun to think of optimizations on this level (not always so much fun to repeatedly check a disassembly of the compiled code), it's important to keep a sense of context. On a 600 MHz dual-issue processor, we're talking about optimizations on the scale of a small fraction of a microsecond, for an application paced by mechanical phenomena which occurs over several milliseconds and human movements which occur on a scale of dozens to hundreds of milliseconds.

    This particular application could probably benefit much more from focusing on using the Teensy 4.0 & 4.1 ADC ETC to sequence automatic ADC measurements than details of code optimization.
    You are underestimating latency problems with electronic drumming. The latency from the teensy besides the scanning time of the sensor which is 2ms has to be pretty much zero ms.

    A pro level electronic drum should have a latency of below 5ms. The best ones are doing it in 3.5ms.

    You have latency from the stick impact to the piezo reacting which is about 0.5-1.5ms (the piezo is under a 35mm foam cushion that takes time to deflect).

    Then there is the scan latency of about 2 ms so that a peak can occur; my code runs during this time and a bit after.

    After that you need to send a MIDI signal which is traditionally 0.95ms however I'm hoping the teensy 4 can do it in 0.125ms.

    Then you send the signal to a PC with a sound card that has latency. A good one will have below 3ms and the best ones can do it in 1ms.

    Then the PC plays a sound in your headphones.

    As you can see the latency adds up and most of it is out of my control so the teensy has to do this really really fast.
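
    Adding up the figures quoted in this post gives a rough budget. The sketch below only sums the stated ranges; the struct and function names are hypothetical, and the MIDI and sound card ranges use the best and worst values mentioned above.

```cpp
struct LatencyRange {
    double minMs;
    double maxMs;
};

// Sums the best-case and worst-case latency over all stages.
LatencyRange totalLatency(const LatencyRange *stages, int count) {
    LatencyRange total{0.0, 0.0};
    for (int i = 0; i < count; ++i) {
        total.minMs += stages[i].minMs;
        total.maxMs += stages[i].maxMs;
    }
    return total;
}

const LatencyRange kStages[] = {
    {0.5, 1.5},     // stick impact until the piezo reacts
    {2.0, 2.0},     // scan time to let the peak occur
    {0.125, 0.95},  // sending the MIDI message
    {1.0, 3.0},     // PC sound card
};
```

    With these numbers the total lands between roughly 3.6 ms and 7.5 ms before any processing time on the Teensy is counted, which is why there is little room left against a 5 ms target.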

  24. #24
    However, my reason for these optimisation worries was not drumming latency, but that I'm executing code between ADC readings and I need the ADC to be very fast. As I said, 200 readings per ms would be optimal.

  25. #25
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    11,942
    Look at the ADC library included in TeensyDuino Installer: pedvide.github.io/ADC/docs/Teensy_4_html/namespace_a_d_c__settings.html
    > there may be a better example with more tricks
    > maybe 12 bits isn't needed - faster
    > maybe averaging is required - slower

    Is the desired 200 ksps across all 8 pins, or for each pin? The code below manages 704,208 combined single-sample reads per second for the pins in the array list, i.e. 88K reads of 8 pins.

    A quick hack to this sample "...\hardware\teensy\avr\libraries\ADC\examples\readAllPins\readAllPins.ino" is doing a 12 bit read with single sample ( no avg ) ::
    Code:
    A0: 0.34. A1: 0.00. A2: 0.62. A3: 0.94. A4: 0.99. A5: 1.08. A6: 1.12. A7: 1.06. 
    	8 pins read 88026 times per second
    Hacked code follows - adjusted for running on T_4.1 <#elif defined(ADC_TEENSY_4_1) // Teensy 4.1> - it reads all it can of 8 pins, then once per second reports value read and how many times all were similarly read in prior second::
    Code:
    /* Hacked version of the ADC library's readAllPins example.
       It reads every pin in adc_pins[] as fast as it can and, once per second,
       prints the last value read on each pin and how many full passes over
       the pins were completed in the prior second.
    */
    
    #include <ADC.h>
    #include <ADC_util.h>
    
    ADC *adc = new ADC(); // adc object
    
    #if defined(ADC_TEENSY_LC) // teensy LC
    #define PINS 13
    #define PINS_DIFF 2
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12};
    uint8_t adc_pins_diff[] = {A10, A11};
    
    #elif defined(ADC_TEENSY_3_0) // teensy 3.0
    #define PINS 14
    #define PINS_DIFF 4
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13};
    uint8_t adc_pins_diff[] = {A10, A11, A12, A13};
    
    #elif defined(ADC_TEENSY_3_1) || defined(ADC_TEENSY_3_2) // teensy 3.1/3.2
    #define PINS 21
    #define PINS_DIFF 4
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13,
                          A14, A15, A16, A17, A18, A19, A20
                         };
    uint8_t adc_pins_diff[] = {A10, A11, A12, A13};
    
    #elif defined(ADC_TEENSY_3_5) // Teensy 3.5
    #define PINS 27
    #define PINS_DIFF 2
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10,
                          A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24, A25, A26
                         };
    uint8_t adc_pins_diff[] = {A10, A11};
    
    #elif defined(ADC_TEENSY_3_6) // Teensy 3.6
    #define PINS 25
    #define PINS_DIFF 2
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10,
                          A11, A12, A13, A14, A15, A16, A17, A18, A19, A20, A21, A22, A23, A24
                         };
    uint8_t adc_pins_diff[] = {A10, A11};
    
    #elif defined(ADC_TEENSY_4_0) // Teensy 4.0
    #define PINS 14
    #define DIG_PINS 10
    #define PINS_DIFF 0
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11, A12, A13};
    uint8_t adc_pins_dig[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9};
    uint8_t adc_pins_diff[] = {};
    
    #elif defined(ADC_TEENSY_4_1) // Teensy 4.1
    #define PINS 8
    #define DIG_PINS 10
    #define PINS_DIFF 0
    uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7};
    uint8_t adc_pins_dig[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9};
    uint8_t adc_pins_diff[] = {};
    #endif // defined
    
    void setup()
    {
    
      pinMode(LED_BUILTIN, OUTPUT);
    
      for (int i = 0; i < PINS; i++)
      {
        pinMode(adc_pins[i], INPUT);
      }
    
      Serial.begin(9600);
    
      ///// ADC0 ////
      adc->adc0->setAveraging(1);                                    // set number of averages
      adc->adc0->setResolution(12);                                   // set bits of resolution
      adc->adc0->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED); // change the conversion speed
      adc->adc0->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED);     // change the sampling speed
    
      ////// ADC1 /////
    #ifdef ADC_DUAL_ADCS
      adc->adc1->setAveraging(1);                                    // set number of averages
      adc->adc1->setResolution(12);                                   // set bits of resolution
      adc->adc1->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED); // change the conversion speed
      adc->adc1->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED);     // change the sampling speed
    #endif
    
      delay(500);
    }
    
    int value = 0;
    int pin = 0;
    
    uint32_t lC = 0;
    uint32_t lShow = 0;
    elapsedMillis lT = 0;
    
    void loop()
    {
      lC++;
      if ( lT >= 1000 ) {
        lShow = lC;
        lC = 0;
      }
      int i;
      for (i = 0; i < PINS; i++)
      {
        value = adc->analogRead(adc_pins[i]); // read a new value, will return ADC_ERROR_VALUE if the comparison is false.
        if ( lShow ) {
          Serial.print("A");
          Serial.print(i);
          Serial.print(": ");
          Serial.print(value * 3.3 / adc->adc0->getMaxValue(), 2);
          Serial.print(". ");
          if (i == 9)
          {
            Serial.println();
          }
          else if (i == 11)
          {
            Serial.print("\t");
          }
          else if (i == 13)
          {
            Serial.print("\t");
          }
          else if (i == 22)
          {
            Serial.println();
          }
        }
      }
      if ( lShow ) {
        Serial.printf("\n\t%u pins read %u times per second \n", i, lShow );
        lShow = 0;
        lT = 0;
      }
      // the actual parameters for the temperature sensor depend on the board type and
      // on the actual batch. The printed value is only an approximation
      //Serial.print("Temperature sensor (approx.): ");
      //value = adc->analogRead(ADC_INTERNAL_SOURCE::TEMP_SENSOR); // read a new value, will return ADC_ERROR_VALUE if the comparison is false.
      //Serial.print(": ");
      //float volts = value*3.3/adc->adc0->getMaxValue();
      //Serial.print(25-(volts-0.72)/1.7*1000, 2); // slope is 1.6 for T3.0
      //Serial.println(" C.");
    
      // Print errors, if any.
      if (adc->adc0->fail_flag != ADC_ERROR::CLEAR)
      {
        Serial.print("ADC0: ");
        Serial.println(getStringADCError(adc->adc0->fail_flag));
      }
    #ifdef ADC_DUAL_ADCS
      if (adc->adc1->fail_flag != ADC_ERROR::CLEAR)
      {
        Serial.print("ADC1: ");
        Serial.println(getStringADCError(adc->adc1->fail_flag));
      }
    #endif
      adc->resetError();
    }
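
    The reported throughput can be related back to the 200 readings per millisecond target with a short calculation. This is a sketch using only the numbers printed above; the helper names are hypothetical.

```cpp
#include <cstdint>

// Full passes over all pins per second, given the combined read rate.
constexpr uint32_t scansPerSecond(uint32_t totalReadsPerSecond, uint32_t pins) {
    return totalReadsPerSecond / pins;
}

// Average time spent on one analogRead, in nanoseconds.
constexpr double nanosPerRead(uint32_t totalReadsPerSecond) {
    return 1e9 / totalReadsPerSecond;
}

static_assert(scansPerSecond(704208u, 8u) == 88026u,
              "matches the '8 pins read 88026 times per second' output");
```

    Each read takes about 1.42 microseconds on average. That is about 3.5 times the rate needed if 200,000 readings per second is the total across all pins, but well short of it if 200,000 per second is needed on every pin (1.6 million per second combined).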
