
# Thread: Exponential Moving Average - How does it work?

1. Originally Posted by jonr
Be careful when using 16 bits - your input sample range is limited. For example, -1024 to 1023 (??) for the code in #21. Some comments or checks would be a good idea.
Agreed, I added a check and more comments on the page: https://tttapa.github.io/Pages/Mathe...implementation

You can now verify the range:
Code:
```cpp
EMA<5, int_fast16_t> filter;

static_assert(filter.supports_range(-1024, 1023),
              "use a wider state or input type, or a smaller shift factor");
```
Originally Posted by WMXZ
if data are 16 bit, then IMO the state variable (effectively sample<<K) should be declared as 32 bit
As mentioned by Nominal Animal, a 32-bit state is much slower on an 8-bit MCU, and since a common use is to filter 10-bit analogRead values with a relatively small K, a 16-bit state is preferable.
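
For readers wondering where a limit like -1024 to 1023 comes from, here is a minimal sketch of such a range check. It is not the code from the linked page: the helper name `fits_signed_range` and its parameters are made up for illustration, and it assumes an N-bit unsigned state that holds roughly the input scaled by 2^K (C++14 or later).

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not the API of the linked page.
// Assumption: an N-bit unsigned state stores roughly input * 2^K, so only
// N - K bits remain for the input, one of which is the sign bit.
template <unsigned K, class state_t>
constexpr bool fits_signed_range(long long lo, long long hi) {
    constexpr unsigned state_bits = std::numeric_limits<state_t>::digits;
    static_assert(K + 1 < state_bits, "shift factor too large for state type");
    static_assert(state_bits - K - 1 < 63, "limit would overflow long long");
    constexpr long long limit = 1LL << (state_bits - K - 1);
    return lo <= hi && lo >= -limit && hi <= limit - 1;
}

// 16-bit state, K = 5: 11 bits remain, so -1024 to 1023 fits ...
static_assert(fits_signed_range<5, std::uint16_t>(-1024, 1023),
              "expected to fit");
// ... but anything wider does not.
static_assert(!fits_signed_range<5, std::uint16_t>(-1025, 1023),
              "expected not to fit");
```

With these assumptions, 10-bit analogRead values fit comfortably in a 16-bit state for K up to 5 or so, which is the case argued for above.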

Originally Posted by Nominal Animal
Note: It looks like the state has to be an exact-width type (so uintN_t and not uint_fastN_t), as any extra bits will mess up the modulo arithmetic.
I don't think this is an issue, especially if you explicitly cast the input to state_t before adding it. Even if you don't cast it, you're probably fine thanks to the usual arithmetic conversions:
[...]
Otherwise, the operand has integer type and integral conversions are applied to produce the common type, as follows:
- If both operands are signed or both are unsigned, the operand with lesser conversion rank is converted to the operand with the greater integer conversion rank
- Otherwise, if the unsigned operand's conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand's type.
- Otherwise, if the signed operand's type can represent all values of the unsigned operand, the unsigned operand is converted to the signed operand's type
- Otherwise, both operands are converted to the unsigned counterpart of the signed operand's type.

The conversion rank above increases in order bool, signed char, short, int, long, long long. The rank of any unsigned type is equal to the rank of the corresponding signed type.
Since the rank of the state type is never less than the rank of the input type, and since the state type is unsigned, the input will be converted to the unsigned state type before performing the addition.
The conversion from a signed integer to an unsigned integer effectively sign-extends the value to the full width of the unsigned type, because of the integral conversion rules:
If the destination type is unsigned, the resulting value is the smallest unsigned value equal to the source value modulo 2ⁿ where n is the number of bits used to represent the destination type.
Once the operand is converted, the C and C++ standards guarantee arithmetic modulo 2ⁿ for unsigned types, so it's fine.

The rules apply to the number of bits used to represent the type, which could be 32, for uint_fast16_t on ARM, for example.
uint_fast16_t doesn't guarantee arithmetic modulo 2¹⁶, but it does guarantee arithmetic modulo some power of two.
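
The argument above can be checked with a small sketch. The function name `add_sample` is made up for illustration; the explicit cast only makes visible the conversion that the usual arithmetic conversions would otherwise perform implicitly when the state type has at least the rank of `int`.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper: add a signed sample to an unsigned filter state.
// The cast converts the sample modulo 2^n, where n is the actual width of
// state_t (which may be more than 16 bits for uint_fast16_t).
template <class state_t, class input_t>
constexpr state_t add_sample(state_t state, input_t input) {
    static_assert(std::is_unsigned<state_t>::value,
                  "state_t must be an unsigned integer type");
    return static_cast<state_t>(state + static_cast<state_t>(input));
}

// -3 becomes 2^n - 3, and 5 + (2^n - 3) wraps around to 2 for any n.
static_assert(add_sample(std::uint32_t{5}, std::int16_t{-3}) == 2,
              "exact-width state");
static_assert(add_sample(std::uint_fast16_t{5}, std::int16_t{-3}) == 2,
              "fast state, whatever its width");
```

The result is the same for the exact-width and the fast type, because both wrap modulo a power of two that is at least 2¹⁶.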

2. Originally Posted by PieterP
As mentioned by Nominal Animal, a 32-bit state is much slower on an 8-bit MCU, and since a common use is to filter 10-bit analogRead values with a relatively small K, a 16-bit state is preferable.
This statement is far too application- and hardware-oriented (low-quality ADC and 8-bit MCU).
EMA is not only used to filter IMU accelerometer and other 10-bit data; it is also used in sound signal detection. There you can easily have mean levels of 8 bits and need a K of 10, so 16 bits are too few.

3. Originally Posted by WMXZ
This statement is far too application- and hardware-oriented (low-quality ADC and 8-bit MCU).
EMA is not only used to filter IMU accelerometer and other 10-bit data; it is also used in sound signal detection. There you can easily have mean levels of 8 bits and need a K of 10, so 16 bits are too few.
For my purposes, and as a beginner-friendly Arduino/Teensy sketch, using the same size for the state as for the input is a perfectly sensible default.
You can still supply your own types as a template parameter if you don't want to use the default. You could even change the default if you feel like that's the right thing to do for your application.

From a practical point of view, having a default state type that's larger than the input type requires some extra template boilerplate to convert one type to a larger type. It also cannot be done consistently: what should the state type be if the input is uint64_t?
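
To give an idea of the boilerplate meant here, this is a rough sketch of such a type mapping. The trait name `wider_unsigned` is hypothetical and not part of the linked implementation; it only covers the exact-width typedefs, so on some platforms plain `int` or `long` would need extra specializations.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trait (illustration only): map an integer type to the next
// wider unsigned type, which could then serve as the default filter state.
template <class T> struct wider_unsigned; // primary template left undefined
template <> struct wider_unsigned<std::int8_t>   { using type = std::uint16_t; };
template <> struct wider_unsigned<std::uint8_t>  { using type = std::uint16_t; };
template <> struct wider_unsigned<std::int16_t>  { using type = std::uint32_t; };
template <> struct wider_unsigned<std::uint16_t> { using type = std::uint32_t; };
template <> struct wider_unsigned<std::int32_t>  { using type = std::uint64_t; };
template <> struct wider_unsigned<std::uint32_t> { using type = std::uint64_t; };
// No specialization for 64-bit inputs: standard C++ offers no wider type,
// so wider_unsigned<std::uint64_t>::type simply fails to compile.

template <class T>
using wider_unsigned_t = typename wider_unsigned<T>::type;

static_assert(std::is_same<wider_unsigned_t<std::int16_t>, std::uint32_t>::value,
              "16-bit input maps to a 32-bit state");
```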

I expect users working with audio to know about data types and overflow, and I expect users to read the documentation, which clearly states the valid input range and provides two examples of how to compute it, as well as a compile-time check.
If a user ignores the documentation and removes the range checks, that's on them.

4. Originally Posted by WMXZ
This statement is far too application- and hardware-oriented (low-quality ADC and 8-bit MCU).
EMA is not only used to filter IMU accelerometer and other 10-bit data; it is also used in sound signal detection. There you can easily have mean levels of 8 bits and need a K of 10, so 16 bits are too few.
For that particular case, yes.

On a 32-bit microcontroller, it makes sense to use the 32-bit filter implementation only.
On 8-bit microcontrollers, it makes sense to use the 16-bit filter if it suffices, but switch to 32-bit when it doesn't.

The C polymorphic header implementation can be easily changed to do this automatically, so the user does not need to care.

I'm not exactly sure what the cleanest way to implement the selection in C++ would be. One option is to reuse part of the C preprocessor magic to select the desired state type (16- or 32-bit unsigned integer) and sample type (signed or unsigned 16- or 32-bit integer), and emit a suitable filter class definition, with the preprocessor supplying the types as template parameters. For verification I'd add an override macro, so that a unit test run on a host computer could compare the integer filter's output against the same PRNG sequence filtered using floating-point types.
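
As one possible alternative to the preprocessor route, the selection could be sketched with plain templates. Everything here is an assumption for illustration: the names `ema_state` and `Prefer32` are made up, and `sizeof(int) >= 4` is only a crude proxy for "this is a 32-bit microcontroller".

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical selector (illustration only): pick the unsigned state type
// from the number of input bits and the shift factor K.
// Prefer32 forces the 32-bit state, as suggested above for 32-bit MCUs.
template <unsigned InputBits, unsigned K, bool Prefer32 = (sizeof(int) >= 4)>
struct ema_state {
    static_assert(InputBits + K <= 32, "needs a state wider than 32 bits");
    using type = typename std::conditional<(!Prefer32 && InputBits + K <= 16),
                                           std::uint16_t, std::uint32_t>::type;
};

template <unsigned InputBits, unsigned K, bool Prefer32 = (sizeof(int) >= 4)>
using ema_state_t = typename ema_state<InputBits, K, Prefer32>::type;

// 8-bit MCU, 10-bit ADC, K = 5: the 16-bit state suffices.
static_assert(std::is_same<ema_state_t<10, 5, false>, std::uint16_t>::value,
              "small case");
// 8-bit MCU, 8-bit levels, K = 10: switch to 32 bits.
static_assert(std::is_same<ema_state_t<8, 10, false>, std::uint32_t>::value,
              "audio case");
```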

5. Originally Posted by Nominal Animal
On a 32-bit microcontroller, it makes sense to use the 32-bit filter implementation only.
That is why I use T3.x and T4.x. OK, there are still T2.x around and sold.

The question for me (on T4.x) is whether to keep, say, signal detection and the related averaging (to estimate background noise) in the integer domain, or to convert to floating point. But this is another topic and not related to the OP.

6. As the OP shows, floating point allows clearer code - use it if your application isn't near the CPU capacity limit.
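
To illustrate the difference in clarity, here is a side-by-side sketch of both variants with a smoothing factor of 1/2^K. The struct names are made up, and the integer version is only written in the spirit of the shift-based filter discussed in this thread (unsigned inputs, 32-bit state, rounding by adding half before the shift); it is not a copy of the linked implementation.

```cpp
#include <cstdint>

// Floating-point EMA: y += alpha * (x - y), with alpha = 1 / 2^K.
template <unsigned K>
struct EmaFloat {
    float y = 0.0f;
    float operator()(float x) {
        y += (x - y) / static_cast<float>(1u << K);
        return y;
    }
};

// Integer EMA with the same alpha, for unsigned inputs only.
// Assumption: input * 2^K fits in 32 bits, otherwise the state wraps silently.
template <unsigned K>
struct EmaUint {
    static_assert(K >= 1 && K < 32, "K must be between 1 and 31");
    std::uint32_t state = 0;
    std::uint32_t operator()(std::uint32_t x) {
        state += x;
        // Add half of 2^K before shifting, so the division rounds to nearest.
        const std::uint32_t out = (state + (std::uint32_t{1} << (K - 1))) >> K;
        state -= out;
        return out;
    }
};
```

Feeding a constant input to both, the float version approaches the input asymptotically, while the integer version settles on it exactly after a number of steps.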

7. Originally Posted by WMXZ
integer [..] or [..] floating point
Floats (binary32) have 24 bits of precision (23 stored plus an implicit leading bit), but a 254-bit range.
32-bit integers have 32 bits of precision and a 32-bit range, of course.
If the absolute maximum signal range is known, then 32-bit integer (fixed-point) arithmetic will give you better precision.
However, Teensy 4.x does have hardware double (binary64) support too, with 53 bits of precision and a 2046-bit range.

So, choosing what format to use is a balance between code maintainability, efficiency on the given architecture (Teensy 4.x having hardware support for both floats and doubles, and Teensy 3.5 and 3.6 for floats), precision, and required range.
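
The precision trade-off can be made concrete with a short check. Counting the implicit leading bit, a binary32 float has a 24-bit significand, so not every 32-bit integer survives a round trip through `float`. The helper name `exact_in_float` is made up for illustration, and IEEE 754 floats are assumed.

```cpp
#include <cstdint>
#include <limits>

// Significand widths as reported by the implementation (IEEE 754 assumed).
static_assert(std::numeric_limits<float>::digits == 24,
              "binary32 has a 24-bit significand");
static_assert(std::numeric_limits<double>::digits == 53,
              "binary64 has a 53-bit significand");

// Hypothetical helper: true if `value` is exactly representable as a float.
// The comparison is done in double, which holds every int32_t exactly.
constexpr bool exact_in_float(std::int32_t value) {
    return static_cast<double>(static_cast<float>(value)) ==
           static_cast<double>(value);
}

static_assert(exact_in_float(16777216), "2^24 is exact");
static_assert(!exact_in_float(16777217), "2^24 + 1 is rounded");
```

A 32-bit fixed-point state keeps all 32 bits, which is why it can be the more precise choice when the signal range is known in advance.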

Finally, it is obviously trivial to add a 64-bit integer filter implementation (next to the 16- and 32-bit ones, I mean; it is just a few characters' difference), for use on 32-bit microcontrollers like Teensy 3.x/4.x.

8. Originally Posted by Nominal Animal
Floats (binary32) have 24 bits of precision (23 stored plus an implicit leading bit), but a 254-bit range.
32-bit integers have 32 bits of precision and a 32-bit range, of course.
If the absolute maximum signal range is known, then 32-bit integer (fixed-point) arithmetic will give you better precision.
That is my reasoning too, so the decision to use floating point for ADC data (where the maximum signal range is known) is always pushed into the future. Also, in the past an FPU could consume more power than fixed-point arithmetic. That may have changed, but do any measurements exist?

9. (I haven't found any, and can't even find instruction timings for ARMv7 or FPv5 used on the i.MX RT1062 on Teensy 4.0 and 4.1.
As far as I know, one would have to microbenchmark the various approaches oneself to find out. I haven't done that.
Others on this forum might know more.)
