Teensy 4: Global vs local variables speed of execution

@frankzappa - What you are doing sounds good. My main suggestion when you are starting out is to try things in small bits and pieces and, most importantly, have fun!

Again, when I know I want to read 10 ADC pins, I start with simple experiments... Even before you decide to use the ADC library, you can try different things with the Arduino built-in analog functions.

The simplest is something like:
Code:
const uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9};
void setup()
{
  Serial.begin(9600);
  while (!Serial && millis() < 4000) ; // millis() must be called; wait up to 4 s for the serial monitor
}

void loop()
{
  elapsedMicros em = 0;
  uint32_t sensorValue[10];

  for (int i = 0; i < (int)sizeof(adc_pins); i++)  // sizeof equals the pin count only because the elements are uint8_t
  {
    sensorValue[i] = analogRead(adc_pins[i]);
  }
  Serial.println(em, DEC);
  delay(1000);
}

And it prints 173 (microseconds for all 10 reads) the majority of the time...

You can then extend it and see what setting the resolution and the number of averages does to this.
Again, nothing special here:

Code:
const uint8_t adc_pins[] = {A0, A1, A2, A3, A4, A5, A6, A7, A8, A9};
void setup()
{
  Serial.begin(9600);
  while (!Serial && millis() < 4000) ; // millis() must be called; wait up to 4 s for the serial monitor
}

void loop()
{
  elapsedMicros em = 0;
  uint32_t sensorValue[10];

  for (int anal_res = 8; anal_res < 14; anal_res += 2) {
    analogReadRes(anal_res);
    for (int anal_avg = 2; anal_avg < 64; anal_avg *= 2) {
      analogReadAveraging(anal_avg);
      em = 0;
      for (int i = 0; i < (int)sizeof(adc_pins); i++)  // sizeof equals the pin count only because the elements are uint8_t
      {
        sensorValue[i] = analogRead(adc_pins[i]);
      }
      Serial.printf("(%d:%d)=%u ", anal_res, anal_avg, (uint32_t)em);
      Serial.flush(); // make sure not to influence the next run
      delay(5);
    }
  }
  Serial.printf("\n");

  delay(1000);
}
Example output:
Code:
(8:2)=32 (8:4)=109 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1358 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=110 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1357 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=109 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1357 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=109 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1356 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=110 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1357 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=109 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1357 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=110 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1357 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615 
(8:2)=32 (8:4)=110 (8:8)=214 (8:16)=423 (8:32)=842 (10:2)=48 (10:4)=174 (10:8)=343 (10:16)=681 (10:32)=1357 (12:2)=56 (12:4)=206 (12:8)=407 (12:16)=810 (12:32)=1615
As you can see, the settings for the number of bits of resolution and for how many samples are averaged make a big difference in speed.

So then you need to decide how important each of these is to your usage.
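
One way to make that decision concrete is to start from a time budget per scan and look up what fits. A minimal sketch, using only the 10-bit numbers from the example output above (microseconds for a 10-pin scan); the struct and function names are made up for illustration, and the table would need re-measuring for any other setup:

```cpp
#include <cstdint>

// One measured point: averaging setting and the time for one 10-pin scan.
struct AveragingTiming {
  uint8_t averages;
  uint32_t scanMicros;
};

// Taken from the (10:x) entries in the example output above, worst case seen.
// Must stay sorted by increasing scan time for the lookup below to work.
constexpr AveragingTiming kTenBitTimings[] = {
  {2, 48}, {4, 174}, {8, 343}, {16, 681}, {32, 1358},
};

// Returns the largest averaging setting whose 10-pin scan fits in the budget,
// or 0 if even the fastest measured setting is too slow.
uint8_t maxAveragingWithinBudget(uint32_t budgetMicros) {
  uint8_t best = 0;
  for (const AveragingTiming& t : kTenBitTimings) {
    if (t.scanMicros <= budgetMicros) best = t.averages;
  }
  return best;
}
```

So if a scan may take at most 200 us, 4 averages is the most that fits at 10 bits, and at 40 us none of the measured settings fit.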

You can then go to the ADC library and see that it has similar settings, actually even more settings that you can try. And then, as I mentioned, you can work on using both ADCs...
In the above code all of these reads use a single ADC...

Again the goal is to have fun!
 
Updating the prior p#28 sample to use that paired 'startSingleRead': it completes ONCE and then quits ... ???
Adjustments haven't helped yet, including checking for ADC errors:
Code:
T:\tCode\ADC\readAllPins_10Dual\readAllPins_10Dual.ino Jun 29 2020 14:17:26
A0: 0.17. A1: 0.00. A2: 11.49. A3: 6954532.00. 
	4 pins read 1 times per second at us=2857002

Trying to take it into GDB to debug ... and that starts with this failure? Need to update GDB to current and see if it helps:
Code:
a Reading symbols from T:\TEMP\\arduino_build_readAllPins_10Dual.ino\readAllPins_10Dual.ino.elf...done.
(gdb) target remote \\.\COM49
Remote debugging using \\.\COM49
operator delete[] (ptr=<optimized out>) at T:\arduino-1.8.13\hardware\teensy\avr\cores\teensy4\new.cpp:55
55              free(ptr);
(gdb) l
50              free(ptr);
51      }
52
53      void operator delete[](void * ptr, size_t size)
54      {
55              free(ptr);
56      }
57
(gdb)

Changing the pin order in the array (I have a single pot) gives this error, which confirms the library validates which analog pin works on which ADC#:
Code:
T:\tCode\ADC\readAllPins_10Dual\readAllPins_10Dual.ino Jun 29 2020 14:29:12
ADC1: Wrong pin
 
IMO, the best way to read 10 pins quickly. There are other ways if you want to do something else while the conversions are being done.

Code:
for (int i = 0; i < PINS; i += 2)
 {
      ADC::Sync_result sr = adc->analogSyncRead(adc_pins[i], adc_pins[i + 1]);  // read two at once
      value[i] = sr.result_adc0;
      value[i+1] = sr.result_adc1;
 }

Will be just slightly faster if you unroll the loop.
 

THANKS >> @jonr: That's the answer - it is working where the prior scheme was hanging something - and the GDB didn't help me see where ...

>> updated to post #55
 
<REPLACING ABOVE> Subtle change for error check still finds error but not every cycle - takes out some overhead - 3.2K more reads per second.
Code:
ADC1: Wrong pin
P#24: 13854508.00V <P#25: 13854508.00V <P#25: 3.29V <P#18: 2.95V <P#19: 0.18V <P#20: 0.18V <P#21: 0.05V <P#22: 0.05V <P#23: 0.03V <P#14: 0.03V <
	10 pins read 157556 times per second at us=125300000

@jonr: That's the answer - it is working where the prior scheme was hanging something - and the GDB didn't help me see where ... The core part:
Code:
  uint32_t lastR[PINS + 1];
  lastR[PINS] = micros();
  for (int i = 0; i < PINS; i += 2)
  {
    //digitalWriteFast( LED_BUILTIN, !digitalReadFast( LED_BUILTIN));
    ADC::Sync_result sr = adc->analogSyncRead(adc_pins[i], adc_pins[i + 1]);  // read two at once
    lastR[i] = sr.result_adc0;
    lastR[i + 1] = sr.result_adc1;
  }

This reads all 10 pins and prints every second:
Code:
P#24: 3.29V <P#16: 0.05V <P#25: 3.29V <P#18: 2.97V <P#19: 0.17V <P#20: 0.17V <P#21: 0.05V <P#22: 0.05V <P#23: 0.03V <P#14: 0.03V <
	10 pins read 134392 times per second at us=15300000

Adding delayMicroseconds(32); to loop() slows it down to 25272 reads of all 10 per second - so it reads two more pins than before, with more time to spare:
Code:
#include <ADC.h>
#include <ADC_util.h>

ADC *adc = new ADC(); // adc object

#define PINS 10  // MUST BE EVEN to read on paired ADC's
const uint32_t adc_pins[] = {A10, A2, A11, A4, A5, A6, A7, A8, A9, A0};

void setup()
{
  pinMode(LED_BUILTIN, OUTPUT);
  Serial.begin(9600);
  while (!Serial && millis() < 4000 );
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
  ///// ADC0 ////
  adc->adc0->setAveraging(1);                                    // set number of averages
  adc->adc0->setResolution(10);                                   // set bits of resolution
  adc->adc0->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED ); // change the conversion speed
  adc->adc0->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED );     // change the sampling speed
  ////// ADC1 /////
  adc->adc1->setAveraging(1);                                    // set number of averages
  adc->adc1->setResolution(10);                                   // set bits of resolution
  adc->adc1->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED ); // change the conversion speed
  adc->adc1->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED );     // change the sampling speed

  //Serial.println(" ENTER to START \n");
  //while ( !Serial.available() );

  delay(500);
}

int value = 0;
int pin = 0;

uint32_t lC = 0;
uint32_t lShow = 0;
elapsedMillis lT = 0;
void errCkADC();


void loop()
{
  //delayMicroseconds(32); // Benchmark guess at the time per loop to maintain 25+K reads of 8 pins per second

  lC++;
  if ( lT >= 1000 ) {
    lShow = lC;
    lC = 0;
  }
  uint32_t lastR[PINS + 1];
  lastR[PINS] = micros();
  for (int i = 0; i < PINS; i += 2)
  {
    //digitalWriteFast( LED_BUILTIN, !digitalReadFast( LED_BUILTIN));
    ADC::Sync_result sr = adc->analogSyncRead(adc_pins[i], adc_pins[i + 1]);  // read two at once
    lastR[i] = sr.result_adc0;
    lastR[i + 1] = sr.result_adc1;
  }

  if ( lShow ) {
    errCkADC();
    for (int i = 0; i < PINS; i++)
    {
      Serial.print("P#");
      Serial.print(adc_pins[i]);
      Serial.print(": ");
      Serial.print(lastR[i] * 3.3f / adc->adc0->getMaxValue(), 2);
      Serial.print("V <");
    }
    Serial.printf("\n\t%u pins read %u times per second at us=%lu\n", PINS, lShow, lastR[PINS] );
    lShow = 0;
    lT = 0;
  }
}

void errCkADC() {
  // Print errors, if any.
  if (adc->adc0->fail_flag != ADC_ERROR::CLEAR)
  {
    Serial.print("ADC0: ");
    Serial.println(getStringADCError(adc->adc0->fail_flag));
  }
  if (adc->adc1->fail_flag != ADC_ERROR::CLEAR)
  {
    Serial.print("ADC1: ");
    Serial.println(getStringADCError(adc->adc1->fail_flag));
  }
  adc->resetError();
}
 
Here is a trick that should work. Say you only need to read 1 pin, but want it fast and with 0.5 bits less noise. No sense in leaving an ADC idle when using it will reduce noise.

Code:
ADC::Sync_result sr = adc->analogSyncRead(A0,A0);     // read twice at the same time
data = (uint32_t)(sr.result_adc0 + sr.result_adc1) / 2;

An alternative would be to wire A0 and A1 to your source and use:

Code:
ADC::Sync_result sr = adc->analogSyncRead(A0,A1);     // read same signal twice
data = (uint32_t)(sr.result_adc0 + sr.result_adc1) / 2;
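
For where the "0.5 bits" figure comes from: if the noise on the two readings is independent (an assumption, and it may only partly hold when both ADCs sample the same pin), averaging n samples cuts the noise by sqrt(n), which is worth 0.5 * log2(n) bits. A small sketch in plain C++ so it can be checked on a PC; both function names are made up for illustration:

```cpp
#include <cmath>
#include <cstdint>

// Bits of noise reduction from averaging n samples, assuming the noise on
// each sample is independent: noise drops by sqrt(n), i.e. 0.5 * log2(n) bits.
double bitsGainedByAveraging(unsigned n) {
  return 0.5 * std::log2(static_cast<double>(n));
}

// Average of two ADC readings, done in 32-bit arithmetic so the sum
// cannot overflow a 16-bit type.
uint32_t averagePair(uint16_t a, uint16_t b) {
  return (static_cast<uint32_t>(a) + b) / 2u;
}
```

Two samples give 0.5 bits, four give 1 bit, and so on, which is also why each doubling of hardware averaging roughly doubles the read time for only half a bit.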
 
> try >> 1 shift instead of / 2

No difference, the compiler is smart about such things. On the other hand, pedvide uses some odd/slow types, so a (uint32_t) cast is needed in this case. Added.
 
Oh it’s very much fun. It’s frustrating at times but once I accomplish something and it works great it’s very rewarding 😃

I understand, I was thinking to test how much speed I need and find a happy medium between speed and accuracy by using averages and whatnot. Obviously speed is useless if I get too much noise and bad accuracy.

This thread has derailed from a question about processor speed of execution to you guys helping me with my project which is awesome.
 
Is there a way to reduce noise with multiple sensors?
 
It is great that you are trying out different things here, but again I'm not really sure what you are trying to accomplish, or how much faster you believe this code is versus simply using analogRead(pin).
...
I am not sure how much of it is fully described, but if I remember correctly, you can do something like:
Code:
uint32_t sensorValue[10];

// Note: I would probably have two different lists as
// not all ADC pins are on both ADCs...
for (int i = 0; i < PINS; i+=2)
  {
    // start reads on both ADCs
    adc->adc0->startSingleRead(adc_pins[i]);
    adc->adc1->startSingleRead(adc_pins[i+1]);

    // wait for both to complete
    while (!adc->adc0->isComplete() || adc->adc1->isComplete()) ; 

    // now get both results
    sensorValue[i] = adc->adc0->readSingle();
    sensorValue[i+1] = adc->adc1->readSingle();
  }
}
Again, I typed this in on the fly and it has been a while since I played with this. But this should more or less double your speed for reading 10 ADC pins.

...
Hope that helps some...

Code with some testing outside the post editor ... there was a '!' (NOT) missing in the while(), and perhaps the use of the ADC functions was off ... The version below is actually faster than the last one posted by another 10K per second.
Just replace the central for(){} in prior post #55 with this. It marks a spot between start and complete where some work could be done during the wait {wc is not defined or displayed in this snippet}:
Code:
  for (int i = 0; i < PINS; i += 2)
  { // reads >> 10 pins read 145200 times per second
    // start reads on both ADCs
    adc->startSynchronizedSingleRead(adc_pins[i], adc_pins[i + 1]);
    // wait for both to complete
    { 
      // some amount of work on prior value testing could be done here
      // substituting this line for below averages 124 counts waiting for each read
      // while (adc->adc0->isConverting() || adc->adc1->isConverting()) wc++;
      while (adc->adc0->isConverting() || adc->adc1->isConverting());
    }
    // now get both results
    lastR[i] = adc->adc0->readSingle();
    lastR[i + 1] = adc->adc1->readSingle();
  }
Output:
Code:
P#24: 3.29V <P#16: 0.05V <P#25: 3.29V <P#18: 2.98V <P#19: 3.14V <P#20: 0.16V <P#21: 0.18V <P#22: 0.07V <P#23: 0.05V <P#14: 0.04V <
	10 pins read 145200 times per second at us=5300005
 
> Is there a way to reduce noise with multiple sensors?

Multiple sensors will give different readings by nature and location in the drum head - and more overhead to read them and reduce to a meaningful value.

I just bumped the above fastest-yet code to 12 bit resolution and it drops the per second read count of all 10 analog values to:
Code:
P#24: 3.29V <P#16: 0.05V <P#25: 3.29V <P#18: 2.99V <P#19: 3.14V <P#20: 0.16V <P#21: 0.18V <P#22: 0.07V <P#23: 0.05V <P#14: 0.04V <
	10 pins read 135373 times per second at us=10300007

That was an easy way to improve resolution by 2 bits ... ideally that is at least one bit of effective improvement.

These are the kinds of things that can be done after it is working.

Adding a 32 us delay to each loop pass drops it to reading all 10 values at 25.3K/sec and lets the commented-out wait counter "wc++" reach an average of 139 instead of the 124 just posted.

Not sure if that is enough time for the needed compares and calculations - but that is over 1.3M samples per sec at 12 bit resolution. Any added double compares of same or second sensor will cut that in half to start.
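
Those numbers can be sanity-checked with a bit of arithmetic: one scan at 135373 scans/sec takes about 7.4 us, so adding 32 us per pass should land near 25K scans/sec, and 135373 scans of 10 pins is about 1.35M samples. A rough sketch, with function names made up for illustration and ignoring any overhead of the delay call itself:

```cpp
#include <cstdint>

// Scans per second after adding a fixed delay to every scan.
double scansPerSecondWithDelay(double baseScansPerSecond, double addedDelayMicros) {
  const double basePeriodMicros = 1e6 / baseScansPerSecond;  // time for one scan
  return 1e6 / (basePeriodMicros + addedDelayMicros);
}

// Individual ADC samples per second for a given scan rate and pin count.
uint32_t samplesPerSecond(uint32_t scansPerSecond, uint32_t pinsPerScan) {
  return scansPerSecond * pinsPerScan;
}
```

With 135373 scans/sec and a 32 us delay this estimates roughly 25.4K scans/sec, in line with the 25.3K measured.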

Also, reordering that core loop to do a priming pair read first would likely have the first data pair ready for calculation while the next pair is being sampled - so all but the last pair would already be processed when the last sample is done, without any conditionals.
 
Yes, that is the fun of it, and yes, there can be a lot of frustration in figuring out why some things don't work...

As you mentioned, having the fastest raw speed may not be any good if it gives you the wrong answer. Note: again, you can and maybe should look at how the different options within the ADC library influence the speed of your setup as well. They should behave very much along the same lines. There are a few other options that the library breaks out that can change these timings as well.

But again which approach works best for your application, depends on your application.

For example, once you get farther along in your understanding and decide you need even more performance reading the analog pins, you can then maybe look at doing it using DMA.

And if you need multiple ADC pins read fast, then there are additional tricks the T4.x can do, which are NOT yet in the ADC library.

That is, you can set up the ADCs to chain one pin after another and still do DMA...

BUT this can be really confusing to set up. To get this to work there are several sections of the manual you need to read, and reread and reread and pull hair out and reread... We do have bits and pieces of this built into some of our examples, like doing DMA reads using a timer.

Chapter 65 - ADC - is the main chapter for understanding a lot of the ADC system. (I suggest at least browsing through this chapter to understand some of the basics of what the hardware does.)

But to chain reads and the like you need to look at
Chapter 66 ADC_ETC (External Trigger Control) - a lot of this has to do with being able to set up for Touch Screen Control (Chapter 67)...

But one of the interesting things you can do with the ADC_ETC is to set up the chaining of reads. Again, I would need to reread and look at examples again, but there are multiple triggers you can define (8 of them, I think). Trying to remember if some are used for ADC1 and others for ADC2.... But the interesting thing is that for each of the triggers, you can set up a chain of up to 8 logical pins...

And you also need to use the XBar system (Chapters 59-61) to glue some of the pieces together...

We did work on a couple of examples that did some of this, but first you probably want to work more with the basic pieces :D

But thought I would at least mention there are lots of things one can do, if you really need to make things work as fast as possible.
 
I see now that there are way more possibilities under the surface.

I think I will start with the more basic stuff and see later if something more advanced is necessary.

Thanks for the tip about reading the manual. For some reason I didn't think of finding the manual.
 
Thanks for the examples, but I don't think I need that much time to evaluate stuff between reads. All I'm doing is a few conditional statements and storing the time and value of peaks. I don't store them on all reads, only if they go from rising to falling and only if the new peak is bigger than a previous peak.
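
That peak rule (record a peak only when the signal turns from rising to falling, and only if it beats the biggest peak so far) might be sketched roughly like this. `PeakTracker` and its member names are invented for illustration, not taken from any library, and it is plain C++ so the logic can be tried on a PC before it goes into a sketch:

```cpp
#include <cstdint>

// Tracks the largest local peak seen on one sensor and when it occurred.
struct PeakTracker {
  uint16_t previous = 0;        // last sample value
  uint32_t previousTimeUs = 0;  // timestamp of the last sample
  bool rising = false;          // true while the signal has been going up
  uint16_t bestPeak = 0;        // largest peak recorded so far
  uint32_t bestPeakTimeUs = 0;  // timestamp of that peak

  // Feed one sample. Returns true when a new largest peak was recorded.
  bool update(uint16_t value, uint32_t nowUs) {
    bool recorded = false;
    if (value > previous) {
      rising = true;
    } else if (value < previous) {
      // Rising-to-falling transition: the previous sample was a local peak.
      if (rising && previous > bestPeak) {
        bestPeak = previous;
        bestPeakTimeUs = previousTimeUs;
        recorded = true;
      }
      rising = false;
    }
    // Equal values leave the direction unchanged, so a flat top is
    // timestamped at its last sample.
    previous = value;
    previousTimeUs = nowUs;
    return recorded;
  }
};
```

One tracker per sensor would be updated right after each read, with something like micros() supplying the timestamp; the cost per sample is a couple of compares and, rarely, two stores.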

As someone mentioned, multiplying a value for 10 sensors took 17 nanoseconds and that is the most processor intensive stuff I will do during ADC reads. There will be more complicated calculations after all peaks are caught but that is a single shot calculation and the sensors don't need to be read during that time so not that time critical.

I could probably use the extra time to take more accurate readings or averages. Will see what works best.
 
The start read and return in post #61 : adc->startSynchronizedSingleRead
>> then do calc from prior read
then get values
... repeat

That would give time to ask for more averaging - the longer time for the sample just gets used to perform the prior calc. I ran it with avg=2 and of course it then takes twice as long - which just gives more time to do the math.

And of course, as noted, the repeat loop could be unrolled - with only 5 twin reads and the calc on them it could be made linear, with the only conditional being the wait: while (adc->adc0->isConverting() || adc->adc1->isConverting());


Have fun
 
> Will be just slightly faster if you unroll the loop.

I think I will try this. What does it mean to unroll the loop?
 
Instead of the loop from the 2nd code in post #61:
Code:
for (i<pins) {
 start read
 do something - if data from prior read
 read next data
}

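Unrolled would look more like this, where the code from the 2nd code in post #61 is repeated inline:
Code:
adc->startSynchronizedSingleRead(adc_pins[0], adc_pins[1]);  // start read of pair #1
  while (adc->adc0->isConverting() || adc->adc1->isConverting()); // wait for pair #1 to complete
  lastR[0] = adc->adc0->readSingle();  // read data pair #1
  lastR[1] = adc->adc1->readSingle();
adc->startSynchronizedSingleRead(adc_pins[2], adc_pins[3]);  // start read of pair #2
// do calcs on pair #1 here, while pair #2 converts

  while (adc->adc0->isConverting() || adc->adc1->isConverting()); // wait for pair #2 to complete
  lastR[2] = adc->adc0->readSingle();  // read data pair #2
  lastR[3] = adc->adc1->readSingle();
adc->startSynchronizedSingleRead(adc_pins[4], adc_pins[5]);  // start read of pair #3
// do calcs on pair #2 here

  while (adc->adc0->isConverting() || adc->adc1->isConverting()); // wait for pair #3 to complete
  lastR[4] = adc->adc0->readSingle();  // read data pair #3
  lastR[5] = adc->adc1->readSingle();
adc->startSynchronizedSingleRead(adc_pins[6], adc_pins[7]);  // start read of pair #4
// do calcs on pair #3 here

  while (adc->adc0->isConverting() || adc->adc1->isConverting()); // wait for pair #4 to complete
  lastR[6] = adc->adc0->readSingle();  // read data pair #4
  lastR[7] = adc->adc1->readSingle();
adc->startSynchronizedSingleRead(adc_pins[8], adc_pins[9]);  // start read of pair #5
// do calcs on pair #4 here

  while (adc->adc0->isConverting() || adc->adc1->isConverting()); // wait for pair #5 to complete
  lastR[8] = adc->adc0->readSingle();  // read data pair #5
  lastR[9] = adc->adc1->readSingle();
// do calcs on pair #5 here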

Conditionals are expensive and looping/indexing is expensive; unrolling reduces that to just the necessary 'wait' and allows inline processing of the 5 pairs of data as they are read.
Hardcoding the index numbers into a const array adc_pins[] should let the compiler resolve the pin numbers at compile time.
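
For trying the ordering out before touching hardware, the same "start the next pair, then work on the previous one" flow can be run on a PC against a stand-in. This is only a sketch: `FakeDualAdc` and `scanPipelined` are made-up names, and on a Teensy the three methods would wrap adc->startSynchronizedSingleRead(), isConverting() and readSingle(). It keeps the loop (and one conditional) for brevity; unrolling it by hand gives the straight-line form above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the two hardware ADCs. It "converts" instantly
// and logs the order of operations so the pipelining can be inspected.
struct FakeDualAdc {
  uint8_t pinA = 0, pinB = 0;
  std::vector<std::string> events;

  void startPair(uint8_t a, uint8_t b) { pinA = a; pinB = b; events.push_back("start"); }
  bool busy() const { return false; }
  std::pair<uint16_t, uint16_t> readPair() {
    events.push_back("read");
    // Fake reading: pin number times 10, so results are easy to check.
    return std::make_pair(static_cast<uint16_t>(pinA * 10),
                          static_cast<uint16_t>(pinB * 10));
  }
};

// Reads `count` pins two at a time (count must be even). The next pair is
// started before `work` runs on the pair that just finished.
template <typename Adc, typename Work>
void scanPipelined(Adc& adc, const uint8_t* pins, std::size_t count,
                   uint16_t* out, Work work) {
  if (count < 2) return;
  adc.startPair(pins[0], pins[1]);  // priming read
  for (std::size_t i = 0; i < count; i += 2) {
    while (adc.busy()) {}           // wait for the pair in flight
    std::pair<uint16_t, uint16_t> r = adc.readPair();
    if (i + 2 < count) adc.startPair(pins[i + 2], pins[i + 3]);  // kick off next pair
    out[i] = r.first;
    out[i + 1] = r.second;
    work(i);                        // runs while the next pair converts
  }
}
```

Calling it with 10 pins and a `work` callback that appends "calc" to `events` gives the sequence start, read, start, calc, read, start, calc, ... ending in read, calc: every calc except the last one happens with a conversion already in flight.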
 
So you are basically doing stuff on the previous sensor readings while waiting for the next ones to complete.

I don't think it's necessary. It's not that important to get many readings, the most important thing is that the 10 readings are as close together (in time) as possible. If all 10 sensors could be read at the same time I could get away with maybe only 10 readings per sensor per millisecond. More is better of course but I don't think more than 25 would benefit that much other than the 10 sensor readings are much closer together with faster sensor readings.
 
Yes, during the wait for one set of readings - which is a time fixed by the resolution and any averaging of the values - there is a block of time now available while the hardware is busy.

Anyhow - just popped back to see KurtE's idea work and the next effect - just another option.

Some short calcs there taking 60-500 ns won't affect the read times much and will allow more work later or next readings sooner.

But taking out the jump, inc and test overhead of the loop with unrolling will speed the time between readings. With only 5 to do it isn't too much to type/read/repeat and will also allow using constants for the pin#'s to pass.

Running the for loop to read the five pairs takes 4351 CPU cycles.
Adding a delayNanoseconds(60); in between takes only 4362 cycles

If the loop were unrolled that would go down some. Seeing 4261 cycles with some edits to the above pseudo code to feed the compiler.
Adding 200 ns of work only adds 78 cycles to the read series.
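
To put those cycle counts in time units: assuming the Teensy 4 is running at its default 600 MHz, one cycle is about 1.67 ns, so 4351 cycles is roughly 7.25 us for the five paired reads and 78 cycles is 130 ns. A tiny helper for that conversion (the function name is made up; on hardware, counts like these are typically taken from ARM_DWT_CYCCNT):

```cpp
#include <cstdint>

// Converts a CPU cycle count to nanoseconds for a given clock frequency.
// Integer math, so the result is truncated to whole nanoseconds.
constexpr uint64_t cyclesToNanoseconds(uint64_t cycles, uint64_t cpuHz) {
  return cycles * 1000000000ULL / cpuHz;
}
```

For example, cyclesToNanoseconds(4351, 600000000) comes out at 7251 ns, which matches the roughly 135K-145K scans per second seen earlier.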
 
I’m not sure how long the stuff I do between the 10 reads takes. I do a few nested loops and "if else". I also use some elapsedMicros to read the timings of peaks, and I store the biggest value of the peaks. I compare some values. I was assuming that stuff is pretty much instantaneous. I certainly didn’t think it takes microseconds rather than nanoseconds.

After 2 ms of "scan time" has passed I plan to do some more calculations. I then have up to, say, a millisecond to decide which note was played by doing some weighted calculations between the sensors. I don’t plan to do anything that has to do with math between analog reads.
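
For reference, the per-read bookkeeping described above could look roughly like the sketch below. This is only one reading of that description: the PeakTracker struct, its field names, and the strongest() helper are hypothetical, and the timestamp is passed in as a plain number so the logic does not depend on any Arduino call. The work per sensor is one compare and, at most, two stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const size_t kSensors = 10;  // assumed sensor count, matching PINS in the thread

// Hypothetical bookkeeping for one scan window: the largest reading seen on
// each sensor and the time (in microseconds) at which it was seen.
struct PeakTracker {
  uint16_t peak[kSensors] = {};
  uint32_t peakAtUs[kSensors] = {};

  // Call once per sensor after each read; stores the value only if it is
  // larger than the current peak for that sensor.
  void update(size_t sensor, uint16_t value, uint32_t nowUs) {
    assert(sensor < kSensors);
    if (value > peak[sensor]) {
      peak[sensor] = value;
      peakAtUs[sensor] = nowUs;
    }
  }

  // Index of the sensor with the largest peak (lowest index wins ties).
  size_t strongest() const {
    size_t best = 0;
    for (size_t i = 1; i < kSensors; i++) {
      if (peak[i] > peak[best]) best = i;
    }
    return best;
  }
};
```

The heavier weighted calculation between sensors would then run once, after the scan window closes, on the stored peaks.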

You guys have given me so many examples it’s hard to keep track of everything.

I have to sit down and go through everything and try to wrap my head around it.
 
Have fun. The prior loop code is decent; unrolling is only minimally better at the core. Not sure it can go faster given the two ADCs are both in use.

Just make sure to do the error check - at least in debug - as it will identify if an analog pin can't be read on the assigned ADC unit as compiled.
 
Thanks man. Really appreciate your help.

Was this the final code you suggested? A bit hard to keep track because so many examples were posted.

Code:
#include <ADC.h>
#include <ADC_util.h>

ADC *adc = new ADC(); // adc object

#define PINS 10  // MUST BE EVEN to read on paired ADC's
const uint32_t adc_pins[] = {A10, A2, A11, A4, A5, A6, A7, A8, A9, A0};

void setup()
{
  pinMode(LED_BUILTIN, OUTPUT);
  Serial.begin(9600);
  while (!Serial && millis() < 4000 );
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
  ///// ADC0 ////
  adc->adc0->setAveraging(1);                                    // set number of averages
  adc->adc0->setResolution(10);                                   // set bits of resolution
  adc->adc0->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED ); // change the conversion speed
  adc->adc0->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED );     // change the sampling speed
  ////// ADC1 /////
  adc->adc1->setAveraging(1);                                    // set number of averages
  adc->adc1->setResolution(10);                                   // set bits of resolution
  adc->adc1->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED ); // change the conversion speed
  adc->adc1->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED );     // change the sampling speed

  //BSerial.println(" ENTER to START \n");
  //while ( !Serial.available() );

  delay(500);
}

int value = 0;
int pin = 0;

uint32_t lC = 0;
uint32_t lShow = 0;
elapsedMillis lT = 0;
void errCkADC();


void loop()
{
  //delayMicroseconds(32); // Benchmark guess at the time per loop to maintain 25+K reads of 8 pins per second

  lC++;
  if ( lT >= 1000 ) {
    lShow = lC;
    lC = 0;
  }
  uint32_t lastR[PINS + 1];
  lastR[PINS] = micros();
  for (int i = 0; i < PINS; i += 2)
  {
    //digitalWriteFast( LED_BUILTIN, !digitalReadFast( LED_BUILTIN));
    ADC::Sync_result sr = adc->analogSyncRead(adc_pins[i], adc_pins[i + 1]);  // read two at once
    lastR[i] = sr.result_adc0;
    lastR[i + 1] = sr.result_adc1;
  }

  if ( lShow ) {
    errCkADC();
    for (int i = 0; i < PINS; i++)
    {
      Serial.print("P#");
      Serial.print(adc_pins[i]);
      Serial.print(": ");
      Serial.print(lastR[i] * 3.3f / adc->adc0->getMaxValue(), 2);
      Serial.print("V <");
    }
    Serial.printf("\n\t%u pins read %u times per second at us=%lu\n", PINS, lShow, lastR[PINS] );
    lShow = 0;
    lT = 0;
  }
}

void errCkADC() {
  // Print errors, if any.
  if (adc->adc0->fail_flag != ADC_ERROR::CLEAR)
  {
    Serial.print("ADC0: ");
    Serial.println(getStringADCError(adc->adc0->fail_flag));
  }
  if (adc->adc1->fail_flag != ADC_ERROR::CLEAR)
  {
    Serial.print("ADC1: ");
    Serial.println(getStringADCError(adc->adc1->fail_flag));
  }
  adc->resetError();
}
 
Ok, so I've tried the suggested code and it works great. I get much faster readings. It went from 50 readings to hundreds.

However I get too much noise. I haven't done any low pass filtering in the circuit yet (will try) but since it worked well before I don't think it will help much. I have an op amp buffer before the ADC so low impedance going in but I will try adding some capacitors here and there in the circuit to see if it helps.

I'm thinking maybe there is a good way to constantly store say the 5 last readings in a buffer for the sensors and average the different sensors out using the 5 previous readings? This would be the same thing as using more averaging settings on the ADC but without the added delay between multiple sensors.

I know there is the ring buffer library, and maybe the ADC library even has a built-in function for this?

Any suggestions?
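
A minimal sketch of the "average the last 5 readings per sensor" idea, in case it helps to see it spelled out. The MovingAverage class is hypothetical and not part of the ADC library or any ring buffer library; it keeps a small circular buffer and a running sum per channel, so each new reading costs one subtract, one add and one divide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not from any library: keeps the last DEPTH raw
// readings for each of CHANNELS sensors and returns their mean.
template <size_t CHANNELS, size_t DEPTH>
class MovingAverage {
public:
  // Store a new reading for one channel and return the average of the
  // readings seen so far (at most the last DEPTH of them).
  uint32_t add(size_t ch, uint16_t sample) {
    assert(ch < CHANNELS);
    Channel &c = chan_[ch];
    c.sum -= c.buf[c.head];         // drop the oldest reading (zero until the buffer fills)
    c.buf[c.head] = sample;         // overwrite it with the newest one
    c.sum += sample;
    c.head = (c.head + 1) % DEPTH;  // advance the circular write position
    if (c.count < DEPTH) c.count++;
    return c.sum / c.count;
  }

private:
  struct Channel {
    uint16_t buf[DEPTH] = {};
    uint32_t sum = 0;
    size_t head = 0;
    size_t count = 0;
  };
  Channel chan_[CHANNELS];
};
```

With a global `MovingAverage<10, 5> avg;`, the read loop could store `avg.add(i, adc->adc0->readSingle())` instead of the raw value. Note that this smooths over time, so short peaks will be flattened somewhat.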
 
The last code I have here was asking for 12 bit ADC reads - with 1 avg - not sure if that gives a more stable 10 bit value? It takes longer - but with the dual read improvement it offset the longer read time.
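
If the rest of the code expects 10 bit values, a 12 bit reading can be reduced by dropping the two lowest bits. The helper below is only an illustration (to10Bit is a made-up name) and rounds to nearest rather than truncating; whether the result is actually more stable than a native 10 bit read is something to measure, not something this shows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: scale a 12-bit ADC reading (0..4095) down to
// 10 bits (0..1023), rounding to the nearest step instead of truncating.
inline uint16_t to10Bit(uint16_t raw12) {
  assert(raw12 <= 4095);
  uint16_t r = static_cast<uint16_t>((raw12 + 2u) >> 2);
  return r > 1023 ? 1023 : r;  // clamp: the top few codes would round up to 1024
}
```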

Here is the code I have last edited - there is an #ifdef for loop versus unrolled with cycle count around the two methods.

YMMV - not sure what I poked at last - but here it is - don't forget to call errCkADC() during debug to test that pin order in array works with the ADC it gets assigned to - perhaps in the 1 second update "if ( lT >= 1000 ) {":
Code:
#include <ADC.h>
#include <ADC_util.h>

ADC *adc = new ADC(); // adc object

#define PINS 10  // MUST BE EVEN to read on paired ADC's
const uint32_t adc_pins[] = {A10, A2, A11, A4, A5, A6, A7, A8, A9, A0};

void setup()
{
  pinMode(LED_BUILTIN, OUTPUT);
  Serial.begin(9600);
  while (!Serial && millis() < 4000 );
  Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
  ///// ADC0 ////
  adc->adc0->setAveraging(1);                                    // set number of averages
  adc->adc0->setResolution(12);                                   // set bits of resolution
  adc->adc0->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED ); // change the conversion speed
  adc->adc0->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED );     // change the sampling speed
  ////// ADC1 /////
  adc->adc1->setAveraging(1);                                    // set number of averages
  adc->adc1->setResolution(12);                                   // set bits of resolution
  adc->adc1->setConversionSpeed(ADC_CONVERSION_SPEED::HIGH_SPEED ); // change the conversion speed
  adc->adc1->setSamplingSpeed(ADC_SAMPLING_SPEED::HIGH_SPEED );     // change the sampling speed

  //BSerial.println(" ENTER to START \n");
  //while ( !Serial.available() );

  delay(500);
}

int value = 0;
int pin = 0;

uint32_t lC = 0;
uint32_t lShow = 0;
elapsedMillis lT = 0;
void errCkADC( uint32_t vv ); // prototype must match the definition below
uint32_t priorR[PINS + 1];

uint32_t wc = 0;
void loop()
{
  // delayMicroseconds(32); // Benchmark guess at the time per loop to maintain 25+K reads of 8 pins per second

  lC++;
  if ( lT >= 1000 ) {
    lShow = lC;
    lC = 0;
  }
  uint32_t lastR[PINS + 1];
  lastR[PINS] = micros();
  uint32_t r10c = ARM_DWT_CYCCNT;
#if 0
  for (int i = 0; i < PINS; i += 2)
  { // reads >> 10 pins read 145200 times per second
    // start reads on both ADCs
    adc->startSynchronizedSingleRead(adc_pins[i], adc_pins[i + 1]);
    // wait for both to complete
    {
      // some amount of work on prior value testing could be done here
      // substituting this line for below averages 124 counts waiting for each read
      delayNanoseconds(60); // stand-in for a small amount of work done while the ADCs convert
      //while (adc->adc0->isConverting() || adc->adc1->isConverting()) wc++;

      while (adc->adc0->isConverting() || adc->adc1->isConverting());
    }
    // now get both results
    lastR[i] = adc->adc0->readSingle();
    lastR[i + 1] = adc->adc1->readSingle();
  }
#else
  {
    #define someWork //{ delayNanoseconds(50);}
    uint32_t ii=0;
    adc->startSynchronizedSingleRead(A10, A2);
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A11, A4);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A5, A6);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A7, A8);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A9, A0);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();
  }
#endif
  wc += ARM_DWT_CYCCNT - r10c;

  if ( lShow ) {
    errCkADC( 999 );
    for (int i = 0; i < PINS; i++)
    {
      Serial.print("P#");
      Serial.print(adc_pins[i]);
      Serial.print(": ");
      Serial.print(lastR[i] * 3.3f / adc->adc0->getMaxValue(), 2);
      Serial.print("V <");
    }
    Serial.printf("\n\t%u pins read %u times per second at us=%lu [%lu waitCnt]\n", PINS, lShow, lastR[PINS], wc / lShow );
    lShow = 0;
    lT = 0;
    wc = 0;
  }
}

void errCkADC( uint32_t vv ) {
  // Print errors, if any.
  if (adc->adc0->fail_flag != ADC_ERROR::CLEAR)
  {
    Serial.print(vv);
    Serial.print("<< ADC0: ");
    Serial.println(getStringADCError(adc->adc0->fail_flag));
  }
  if (adc->adc1->fail_flag != ADC_ERROR::CLEAR)
  {
    Serial.print(vv);
    Serial.print("<< ADC1: ");
    Serial.println(getStringADCError(adc->adc1->fail_flag));
  }
  adc->resetError();
}
 