Teensy 4: Global vs local variables speed of execution

Status
Not open for further replies.
Ah, I understand.

I'm not that familiar with the preprocessor directives, I'm going to have to read about them a bit. I've only seen them used to include header files and whatnot.

I wish I had gone to computer science school or something, it's a steep learning curve when you are trying to learn by yourself reading stuff online.

The preprocessor, as used here, is just text replacement. A macro can take parameters, but that's not done there, as I just wanted to be able to put in code or, when it is commented out, take out that "someWork" line.
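A minimal sketch of that text-replacement idea in plain C++. The names `SOME_WORK`, `workCalls` and `runThreeSteps` are made up for illustration and are not from the posted code; the counter only exists so the effect of the macro can be observed.

```cpp
#include <cstdint>

// Hypothetical counter so the effect of the macro is observable.
static uint32_t workCalls = 0;

// The preprocessor pastes this text wherever SOME_WORK appears.
// Defining it as nothing (#define SOME_WORK) would make those lines vanish.
#define SOME_WORK { ++workCalls; }

uint32_t runThreeSteps() {
    workCalls = 0;
    SOME_WORK   // expands to { ++workCalls; }
    SOME_WORK
    SOME_WORK
    return workCalls;
}
```

With the macro defined as shown, the three placeholder lines each become a small block of code; with an empty definition they compile to nothing.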

For your development it might be okay to put a function call there to keep the "unrolled reading" code uncluttered: one function for the _Global overhead work, on a timer or however you want to manage it so it only runs as desired. Then turn 'someWork' into a function someWork( uint32_t pairNumber ); and call it 5 times, once after each pair is updated, to do what is relevant. Each call will do the same thing, I assume: comparing the sensor data as desired and updating peaks, etc. Ideally it returns before the next read is complete so that all reads are as close together in time as possible.
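A rough sketch of what such a per-pair function could look like, assuming 10 sensors read as 5 pairs and a simple running-peak update. `lastR` is the array name from the thread; `peakR` and `kPairs` are hypothetical, and the real peak logic would be whatever the project needs.

```cpp
#include <cstdint>

constexpr uint32_t kPairs = 5;        // assumption: 10 sensors read as 5 pairs
uint16_t lastR[2 * kPairs] = {0};     // latest reading per sensor (name from the thread)
uint16_t peakR[2 * kPairs] = {0};     // hypothetical per-sensor peak store

// Hypothetical per-pair work: update the running peak of the two
// sensors in this pair. Kept short so it can return before the next
// conversion finishes.
void someWork(uint32_t pairNumber) {
    const uint32_t first = 2 * pairNumber;
    for (uint32_t ch = first; ch < first + 2; ++ch) {
        if (lastR[ch] > peakR[ch]) {
            peakR[ch] = lastR[ch];
        }
    }
}
```

The call would go where the macro sits now, for example `someWork(0);` right after the second pair has been started, so it works on the pair that was just stored.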

Necessity is the mother of invention ... I learned BASIC first on my own before signing up for my CS degree. Unfortunately the school signed up 1599 others at the same time and spent not 2 but 3 years weeding out the overage - and Sr. year still weren't happy with the counts so they revamped the last required course with some real learning. Then I took my first job programming in ... interpreted BASIC ... to log penicillin production data and monitor the control system.
 
Well I've tried some low pass RC filters and it's still not that great but slightly improved.

Strange thing is when I use averaging of 2 I get worse results with the RC filters.

Maybe my expectations are too high.

Here is a screenshot capturing the noise, top pic is regular analog read.

Middle is ADC library with 10 bit and averaging of 1.

The lowest one is 10 bit averaging of 1 and low pass filters.

It's fine I guess but the analog read version was much better.

What bothers me are the high spikes that shoot up 4 values.

Oh and I'm only reading 2 sensors here so it's actually sampling at a very high speed per sensor.

https://www.dropbox.com/s/3p7bmigvqvokcsn/filter.heic?dl=0

BTW I tried the 12 bit one but it's giving me really high numbers, how can I convert it to show me 0-1023?
 
Right shift the number by two bits.

Or call analogReadResolution(10).
 
Never mind, I was doing 1023 - sensorValue to give me an inverted signal, which didn't work with a 12 bit value because it's a bigger number.

Indeed - you need to account for the extra bits. A 12 bit conversion takes a tad longer, but ideally the 10th bit is less jumpy after a right shift of 2 (equivalently, dividing by 4). Then treat the result the same as the 10 bit samples.
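A small sketch of the arithmetic, in plain C++ so it can be tried off the board. The function names are made up here; the point is only that the inversion constant has to match the resolution in use.

```cpp
#include <cstdint>

// Drop the two least significant bits: 0..4095 becomes 0..1023.
constexpr uint16_t to10Bit(uint16_t raw12) {
    return static_cast<uint16_t>(raw12 >> 2);
}

// Invert around full scale. fullScale must match the resolution in use:
// 1023 for 10 bit values, 4095 for raw 12 bit values.
constexpr uint16_t invert(uint16_t value, uint16_t fullScale) {
    return static_cast<uint16_t>(fullScale - value);
}
```

So either shift first and keep using 1023, as in `invert(to10Bit(raw), 1023)`, or invert the raw reading with 4095.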

Are those 1000 samples of sensor at rest or under a constant pressure? Going to avg of 2 on read really adds time (posted code will show that in clock cycles with change in setup) - which is okay if results are better and that time can be put to use.
 
That is just with everything connected and at rest.

Only way for it to not jump up and down 4-5 values is to use averaging of 2. Conversion speed/sampling speed doesn't seem to affect it for the better. I don't know, maybe reading that many sensors this fast isn't plausible.

I got it working great with the regular analog read but this seems too erratic.
 
Wish there were one or more of your sensors set up here so I could see. Never looked this close at it before - and only from the code end. Not sure of the effect on the reading with the fast transition from one input to another reading both at once. Certainly no reason for that to improve the readings. It is a high speed digital processor, not optimized for analog fidelity.

The upper old FOR loop could be modified easily to read them one at a time to compare the results, or even changed to use analogRead.

I wonder what it does if both are read for a single sample - avg(1), discarded, then both read again - repeating for all 10.

Having the read at 12 bits then /4 to get 10 bits should get rid of noise in the last bits, unless it is really bad. Not sure of the effect of the capacitor feeding the onchip measuring device - though at steady state that should not have jitter.
 
Looks like analogRead() normally uses averaging of 4 samples. Not sure about other settings.

Not sure what you expect, one or even a few bits of noise isn't bad.
 
But I see nothing that indicates the equivalent pedvide library _SPEED settings for the defaults of analogRead().
 
No that would be an added setting. Not sure if that changes the speed when doing the non-repeating reads here - but may only affect the continuous type? Easy to test with a couple of loops with the ADC code used - and changing the speed between loops.
 
12 bit looked the same but maybe it would improve when divided by 4. The 10 bit with two averages seems to be good enough. The problem is not when you have a strong signal, it is when you play very softly. Like just touching the drum head. The signal goes over the idle noise and you can detect it but it looks very erratic when you zoom in. I'll post a picture later. Not sure if that can be captured reliably in amplitude and timing.

I think the only way to do this is to detect when a value goes over the threshold, record all samples in an array for 2 milliseconds, and then look at the whole array afterwards to determine when the peak occurred in time and amplitude. Every spot in the array could be a time of 100 microseconds or so. If a smoothed out curve can be plotted somehow from the samples I could probably detect peaks pretty accurately. Catching the peak amplitude should be pretty easy, but with a noisy signal comparing two peaks to determine the timing difference is tricky. I guess taking the time when the signal crossed the threshold and the time when it went back below it, and using the midpoint between the two, could be an easy way, but the signal would still have to be smoothed out.
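A minimal sketch of the "capture, then analyse" idea, assuming the window has already been recorded into an array (2 ms at 100 microseconds per slot would be 20 entries). The 3-point moving average is just one simple smoothing choice, and `Peak`, `smoothAt`, `findPeak` and `indexToMicros` are hypothetical names.

```cpp
#include <cstdint>

struct Peak {
    uint32_t index;   // position in the capture buffer
    uint16_t value;   // smoothed amplitude at that position
};

// 3-point moving average at position i (edges use the neighbours available).
static uint16_t smoothAt(const uint16_t* s, uint32_t n, uint32_t i) {
    uint32_t sum = s[i];
    uint32_t count = 1;
    if (i > 0)     { sum += s[i - 1]; ++count; }
    if (i + 1 < n) { sum += s[i + 1]; ++count; }
    return static_cast<uint16_t>(sum / count);
}

// Scan the whole captured window and return the largest smoothed sample.
Peak findPeak(const uint16_t* samples, uint32_t n) {
    Peak best{0, 0};
    for (uint32_t i = 0; i < n; ++i) {
        const uint16_t v = smoothAt(samples, n, i);
        if (v > best.value) {
            best.index = i;
            best.value = v;
        }
    }
    return best;
}

// Convert a buffer index to time since the start of the capture.
constexpr uint32_t indexToMicros(uint32_t index, uint32_t samplePeriodUs) {
    return index * samplePeriodUs;
}
```

Running this on two sensors' buffers and subtracting the two peak times would give the timing difference, within the limits of the sample spacing.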

Yeah as I said maybe I expect too much. I'm comparing with a regular analog read which is pretty good. Again 10 bits with averaging of two is good enough I think. I'm not sure how many samples per millisecond that is.

Again, if you look at (I think) post 51, the example code I posted showed the different settings for analogRead() and how long it took to do a number of reads with each...
https://forum.pjrc.com/threads/6156...d-of-execution?p=244780&viewfull=1#post244780

Your example showed that 10 samples at 10bit with 2 averages took 48 microseconds per ADC. Hmmm then 10 bits with average of 2 should work. It looked decently low noise. I mistakenly thought it took 48 microsec. for one sample :p
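As a back-of-the-envelope check of the samples-per-millisecond question, here is the arithmetic as a small function. It assumes the 48 microseconds covers all 10 conversions on one ADC, that 10 sensors are split 5 per ADC, and that no extra time is spent between reads; the function name is made up.

```cpp
// Samples per millisecond for one sensor, given a measured burst.
//   burstMicros    : time for the whole burst on one ADC (48 in the example)
//   samplesInBurst : conversions in that burst (10 in the example)
//   sensorsPerAdc  : sensors sharing the ADC (assumed 5 for 10 sensors on 2 ADCs)
double samplesPerMsPerSensor(double burstMicros, unsigned samplesInBurst,
                             unsigned sensorsPerAdc) {
    const double microsPerSample = burstMicros / samplesInBurst;  // 4.8 us here
    return 1000.0 / (microsPerSample * sensorsPerAdc);            // about 41.7 here
}
```

Under those assumptions that is roughly 208 conversions per millisecond per ADC, or a bit over 40 per sensor once five sensors share it.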
 
I'm using a center point of value 500 for the signal. I'm catching the negative part of the signal and inverting it. This is the way the best edrums are triggered and I assumed it would have some advantage. However as far as I can see on my oscilloscope all signals I play look the same except that the peak is more narrow and thin if I play farther from the sensor (this can also be used to determine the position where you play).

Anyway I think if I set my voltage divider higher for the piezo negative reference point (maybe 2.7V or so) I can get much more resolution and still get some part of the whole signal in case I find it useful. When you strike the drum the signal goes negative, I'm not sure what effect reversing the polarity on the circuit would have if any but since the best edrums are done this way I assume there is a reason for it.

Also the response from this drum as I play it is very linear, which feels odd. It feels exponential. I think a natural response would be logarithmic (I believe I read this in a scientific paper somewhere). Is there a way to change a linear electric signal to logarithmic? Would this give me much more resolution in the weakly played notes?
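One way to try a logarithmic response in software is a simple remap of the reading, sketched below. `logCurve` is a hypothetical helper, and the particular curve (log of 1 + x, scaled back to the input range) is just one possible choice.

```cpp
#include <cmath>
#include <cstdint>

// Map a linear 0..maxIn value onto a logarithmic curve over the same range.
uint16_t logCurve(uint16_t x, uint16_t maxIn = 1023) {
    if (x >= maxIn) return maxIn;
    const double scaled = std::log1p(static_cast<double>(x)) /
                          std::log1p(static_cast<double>(maxIn));
    return static_cast<uint16_t>(std::lround(scaled * maxIn));
}
```

Note that a remap done after the ADC spreads the low values out but cannot add resolution the conversion did not capture; for that the change would have to happen in the analog circuit before the pin.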
 
Here is the worst case scenario that I want to track. This is a very weakly struck drum:

https://www.dropbox.com/s/il5tuzlr0k2wvvc/peak.heic?dl=0

The lower pic is the first peak zoomed in.

I guess I could record the microseconds when it goes over a threshold of, say, 520 and when it comes back down again, then calculate the middle point between the two to determine when the peak occurred. It would have to ignore the noise, maybe by ignoring signals that are above the threshold for only a very short duration.
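A sketch of that midpoint idea on a buffer of samples, with a minimum run length used to reject short noise spikes. The names and the choice of "first long enough run wins" are assumptions for illustration.

```cpp
#include <cstdint>

// Find the first run of samples above threshold lasting at least minRun samples.
// On success, writes the index midway between the rising and falling
// crossings to midIndex and returns true.
bool crossingMidpoint(const uint16_t* samples, uint32_t n,
                      uint16_t threshold, uint32_t minRun,
                      uint32_t& midIndex) {
    uint32_t runStart = 0;
    uint32_t runLength = 0;
    for (uint32_t i = 0; i <= n; ++i) {
        // Treat the end of the buffer as a falling crossing.
        const bool above = (i < n) && (samples[i] > threshold);
        if (above) {
            if (runLength == 0) runStart = i;
            ++runLength;
        } else {
            if (runLength > 0 && runLength >= minRun) {
                midIndex = runStart + (runLength - 1) / 2;
                return true;
            }
            runLength = 0;   // too short: treat as noise
        }
    }
    return false;
}
```

Multiplying the returned index by the sample spacing in microseconds gives the estimated peak time relative to the start of the buffer.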
 
That was 10 bits with an average of 2. However, this was with only two sensors (one shown), so with all ten there will be 5 times fewer measurements per sensor.
 
In some of these cases, the ADC library gives you greater control over the different settings that control the ADC timings...

Again, there are lots of details in the first messages by @pedvide in the thread: https://forum.pjrc.com/threads/25532-ADC-library-with-support-for-Teensy-4-3-x-and-LC

There are also lots of details in the T4.x IMXRT 1060 Reference Manual, in the ADC chapter (65). For example, section 65.5.4.5 talks about Sample Time and Total Conversion Time.

Simply put, there are some settings that the default analogRead code sets to hopefully good enough values. Some are set up based on your settings for resolution and averaging.

That is if you look at the sources, you see things like:

Code:
void analogReadRes(unsigned int bits)
{
  uint32_t tmp32, mode;

   if (bits == 8) {
    // 8 bit conversion (17 clocks) plus 8 clocks for input settling
    mode = ADC_CFG_MODE(0) | ADC_CFG_ADSTS(3);
  } else if (bits == 10) {
    // 10 bit conversion (17 clocks) plus 20 clocks for input settling
    mode = ADC_CFG_MODE(1) | ADC_CFG_ADSTS(2) | ADC_CFG_ADLSMP;
  } else {
    // 12 bit conversion (25 clocks) plus 24 clocks for input settling
    mode = ADC_CFG_MODE(2) | ADC_CFG_ADSTS(3) | ADC_CFG_ADLSMP;
  }

  tmp32  = (ADC1_CFG & (0xFFFFFC00));
  tmp32 |= (ADC1_CFG & (0x03));  // ADICLK
  tmp32 |= (ADC1_CFG & (0xE0));  // ADIV & ADLPC

  tmp32 |= mode; 
  ADC1_CFG = tmp32;
  
  tmp32  = (ADC2_CFG & (0xFFFFFC00));
  tmp32 |= (ADC2_CFG & (0x03));  // ADICLK
  tmp32 |= (ADC2_CFG & (0xE0));  // ADIV & ADLPC

  tmp32 |= mode; 
  ADC2_CFG = tmp32;
}
So it hard codes the ADSTS bits in the CFG register depending on the size of the conversion... Then if you look at the manual, you see that field defined as:
Code:
Defines the sample time duration. This has two modes, short and long. When long sample time is selected
(ADLSMP=1) this works for long sample time otherwise this works for short sample. This allows higher
impedance inputs to be accurately sampled or to maximize conversion speed for lower impedance inputs.
Longer sample times can also be used to lower overall power consumption when continuous conversions
are enabled if high conversion rates are not required.
ADSTS   Sample period (ADC clocks)
        ADLSMP=0b   ADLSMP=1b
00      2           12
01      4           16
10      6           20
11      8           24

..
Whereas the ADC library may allow you to, for example, set the sample size to 8 bits and set ADSTS to 0, if that is something you wish to try...

But you may need to look at the ADC library to see exactly how his settings map to the CFG value settings.
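The manual's ADSTS table boils down to a small formula, sketched here as a helper (the function name is made up). It reproduces the settling figures in the analogReadRes comments: 8 clocks for 8 bit, 20 for 10 bit and 24 for 12 bit.

```cpp
#include <cstdint>

// Sample period in ADC clocks for a 2-bit ADSTS field value (0..3),
// following the table quoted from the reference manual.
constexpr uint32_t samplePeriodClocks(uint32_t adsts, bool adlsmp) {
    return adlsmp ? 12u + 4u * (adsts & 3u)    // long sample:  12, 16, 20, 24
                  : 2u + 2u * (adsts & 3u);    // short sample:  2,  4,  6,  8
}
```

For instance, `ADC_CFG_ADSTS(2) | ADC_CFG_ADLSMP` in the 10 bit branch corresponds to `samplePeriodClocks(2, true)`.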
 
The last code I have here was asking for 12 bit ADC reads - with 1 avg - not sure if that gives a more stable 10 bit value? It takes longer - but with the dual read improvement it offset the longer read time.

Here is the code I have last edited - there is an #ifdef for loop versus unrolled with cycle count around the two methods.

YMMV - not sure what I poked at last - but here it is - don't forget to call errCkADC() during debug to test that pin order in array works with the ADC it gets assigned to - perhaps in the 1 second update "if ( lT >= 1000 ) {":

Hey defragster, could you help me break down this code. What exactly is happening in detail when it comes to the unrolled part? I'm going to explain what I think is happening and you can tell me if I'm understanding correctly.

You have defined "someWork" on top to be a delay of 50 nanoseconds (to represent code I can run between reads).

For the first pair of sensors you just start the reading of two sensors and while adc0 or adc 1 is converting you assign the value of adc 0 and adc 1 to the lastR array somehow but I don't quite understand what is happening here. I'm guessing this is within the while loop (I don't see the body of while).

For the next pair of sensors you start a new synchronised read, and here there is time to do some peak tracking code, which will be done on the previous readings. While adc0 and adc1 are converting you assign the readings to lastR again.

Code:
#else
  {
    #define someWork //{ delayNanoseconds(50);}
    uint32_t ii=0;
    adc->startSynchronizedSingleRead(A10, A2);
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A11, A4);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A5, A6);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A7, A8);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

    adc->startSynchronizedSingleRead(A9, A0);
    someWork
    while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();
  }
#endif

The part I don't understand is this:

Code:
while (adc->adc0->isConverting() || adc->adc1->isConverting());
    lastR[ii++] = adc->adc0->readSingle();
    lastR[ii++] = adc->adc1->readSingle();

Thanks
 
Let me try again:

It's simpler than I thought.

You start reading both ADCs. Then there is time to do some work on previous sensor readings.

Then there is a while loop with an empty body, which serves the purpose of waiting until the conversion of both ADCs has finished.

When the condition of the while loop becomes false (neither ADC is still converting), you assign the value of adc0 to lastR[0] and ii increments to 1.
Then you assign the value of adc1 to lastR[1] and ii increments to 2.
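To check the mechanics off the board, here is a small mock of the wait-then-store pattern. `MockAdc` and `readPair` are hypothetical stand-ins, not part of the ADC library. The trailing semicolon after `while (...)` is an empty loop body, so the loop just spins until both converters report done, and each `ii++` advances the index by one, so a pair lands in two adjacent slots.

```cpp
#include <cstdint>

// Hypothetical stand-in for one ADC: reports "converting" for a fixed
// number of polls, then returns a fixed result.
struct MockAdc {
    uint32_t pollsLeft;
    uint16_t result;
    bool isConverting() {
        if (pollsLeft == 0) return false;
        --pollsLeft;
        return true;
    }
    uint16_t readSingle() const { return result; }
};

// Same shape as the unrolled code: wait, then store both results.
uint32_t readPair(MockAdc& adc0, MockAdc& adc1, uint16_t* lastR, uint32_t ii) {
    while (adc0.isConverting() || adc1.isConverting())
        ;  // empty body: spin until neither ADC is converting
    lastR[ii++] = adc0.readSingle();  // stored at ii, then ii becomes ii + 1
    lastR[ii++] = adc1.readSingle();  // stored in the next slot
    return ii;
}
```

Starting from `ii = 0`, the first pair fills lastR[0] and lastR[1], the second pair lastR[2] and lastR[3], and so on up to lastR[9] for five pairs.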
 
Note that while the T4 doesn't have a pseudo differential mode, with two converters one can sample the signal and a reference voltage at about the same time. That gives a similar common mode noise/offset reduction benefit.
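In code the idea reduces to a subtraction of two readings taken together, sketched below with a made-up helper name. On the board the two values would presumably come from one synchronized read with the signal on one ADC and the reference on the other; that wiring is an assumption here.

```cpp
#include <cstdint>

// Difference between the signal and a reference sampled at about the same time.
// Noise or offset present on both inputs cancels in the subtraction.
constexpr int32_t pseudoDifferential(uint16_t signal, uint16_t reference) {
    return static_cast<int32_t>(signal) - static_cast<int32_t>(reference);
}
```

If both inputs pick up the same disturbance, say 4 counts, the result is unchanged because it is added to both terms.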
 