
# Thread: Fast changing range or bits of a number (e.g. 0-31 to 0-255)

1. ## Fast changing range or bits of a number (e.g. 0-31 to 0-255)

Hi all,

EDIT: Nope, this doesn't work for all ranges. It works for 4-bit to 2-bit and back to 4-bit, and for 8-bit to 4-bit and back to 8-bit, but not for others.

Just putting this here for future searches, because I had to come up with this on my own when I couldn't find any results in my searches. Maybe I was using the wrong terms? Anyway:

During my work with different image pixel formats, I've often had to scale color values up and down. For example, in most image formats each color channel takes 8 bits (0-255), but on some displays, such as the ILI9341, color channels are 5-bit (0-31) or 6-bit (0-63). Scaling down is straightforward and well documented: just a bit shift. However, I couldn't find any 'fast' solutions for scaling back up again.

Scaling down:
Scaling down is easy. You can shift-right >> by the correct number of bits and you get the correct value.
Code:
```
// 8-bit number (0-255)
uint8_t n = 255;
// Change to 5-bit by shifting right 3 bits
uint8_t m = n >> 3; // 31
```
Scaling up:
Scaling up is more difficult, because simply shifting left << will not give the correct values. The formula seems to be:
Code:
```
// 4-bit number (0-15)
uint8_t n = 15;
// Change to 8-bit by shifting left 4 bits and adding self
uint8_t m = (n << 4) + n; // 255
```
I've tested with a few different bit levels and it seems to hold up. Could someone verify this method results in correct values? Is this a known method?
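As a quick check of the EDIT above, the sketch below compares (n << b) + n against the correctly rounded scaling round(n × max_dst / max_src). The function name is mine, not from the thread; it's just an illustration:

```c
#include <stdint.h>

// Illustrative check (not from the thread): is (n << b) + n exact for
// scaling a-bit values up to (a+b)-bit values, for every input n?
static int shift_add_is_exact(unsigned a, unsigned b)
{
    unsigned max_src = (1u << a) - 1u;
    unsigned max_dst = (1u << (a + b)) - 1u;
    for (unsigned n = 0; n <= max_src; n++) {
        unsigned guess = (n << b) + n;
        // Correctly rounded n * max_dst / max_src, in integer math
        unsigned exact = (2u * n * max_dst + max_src) / (2u * max_src);
        if (guess != exact)
            return 0;
    }
    return 1;
}
```

Running this suggests the formula is exact only when the number of bits added equals the source width: shift_add_is_exact(4, 4) returns 1, while shift_add_is_exact(5, 3) (5-bit up to 8-bit) returns 0.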

2. This is better for scaling up, but still not ideal:

Code:
```
// b = bits to shift
// n = original number
uint8_t m = (n == 0) ? 0 : ((n + 1) << b) - 1;
```
Code:
```
// 4-bits to 1-bit and back to 4-bits
0    0    0
1    0    0
2    0    0
3    0    0
4    0    0
5    0    0
6    0    0
7    0    0
8    1    15
9    1    15
10   1    15
11   1    15
12   1    15
13   1    15
14   1    15
15   1    15

// 4-bits to 2-bits and back to 4-bits
// This is not ideal. The numbers in (brackets) would be better; they come from
// the original (n << b) + n formula, which only works in some cases.
0    0    0     (0)
1    0    0     (0)
2    0    0     (0)
3    0    0     (0)
4    1    7     (5)
5    1    7     (5)
6    1    7     (5)
7    1    7     (5)
8    2    11    (10)
9    2    11    (10)
10   2    11    (10)
11   2    11    (10)
12   3    15    (15)
13   3    15    (15)
14   3    15    (15)
15   3    15    (15)

// 4-bits to 3-bits and back to 4-bits
0    0    0
1    0    0
2    1    3
3    1    3
4    2    5
5    2    5
6    3    7
7    3    7
8    4    9
9    4    9
10   5    11
11   5    11
12   6    13
13   6    13
14   7    15
15   7    15
```
Probably the best we could hope for.
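For anyone who wants to reproduce the tables above, a short sketch (variable and function names are mine):

```c
#include <stdio.h>

// Reproduce the 4-bit round-trip tables above: scale down by b bits,
// then back up with ((down + 1) << b) - 1 (with the down == 0 special case).
static void round_trip_table(unsigned b)
{
    printf("// 4-bits to %u-bit%s and back to 4-bits\n",
           4 - b, (4 - b == 1) ? "" : "s");
    for (unsigned n = 0; n <= 15; n++) {
        unsigned down = n >> b;
        unsigned up   = (down == 0) ? 0 : ((down + 1) << b) - 1;
        printf("%-4u %-4u %u\n", n, down, up);
    }
}
```

Calling round_trip_table(3), round_trip_table(2) and round_trip_table(1) prints the three listings.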

3. Each shift right is a divide by 2, and is lossy: a zero comes in on the left and the rightmost bit goes 'away' to the bit bucket, whether it is a 0 or a 1.

Same but opposite with a left shift, a multiply by 2, where zeros come in on the right and the highest bit is shifted off and lost. A left shift by one is the same as adding a number to itself.

4. Yes, I understand completely what shifting does. That's not the issue.

I'm trying to find a 'fast' way to increase the number of bits which results in a linear interpolation.
So far the formula a = ((b+1)<<3)-1; is approximately linear most of the time, but not all the time (see the middle example above, 4-bit to 2-bit and back to 4-bit).
Take this very practical example:

The alpha of the pixel is 255 (full alpha):
a = 255;
If we shift right to change it to the range 0-31 (a 5-bit number):
b = a >> 3;
we get the correct value of b = 31 (still full alpha, the maximum 5-bit value).
However, if we want to change the range from 0-31 back to 0-255 and simply shift left again:
a = b << 3;
we get a = 248, which is no longer full alpha, so that pixel will no longer be solid.
If instead we use the formula above:
a = ((b+1)<<3)-1;
we get a = 255, which is 'solid' alpha again.

At the moment I use the following approach. It is accurate, but not at all fast. In my case I only need to calculate f once (one floating-point division), but I need to do a floating-point multiplication for every pixel. I am trying to find a faster method using only bit shifts and addition/subtraction if possible, since I assume those are faster on a Teensy, especially one without hardware floating point:
Code:
```
// Example, moving from 5 bits to 8 bits
// Note the floating-point literal: plain 255/31 would be integer division (= 8)
float f = 255.0f / 31.0f; // max 8-bit value divided by max 5-bit value

a = (uint8_t)(b * f); // per pixel. b is a 5-bit number, a becomes the 8-bit interpolation
```

5. That is a special case, not just a general shrinking of a number in bit size. It would only take a 256-byte table to look up the 8-bit values.

6. Yes, a LUT would be one solution for sure. We would actually only need a 32-byte LUT to go from 5-bit to 8-bit (upscaling).

But I disagree - it's not just a special case. It is fast linear interpolation (for specific number ranges) and could be applied to lots of things.
For example, I get an ADC input on a Teensy analogue pin. The range is 10 bits (in this example), so the maximum value is 1023.
I want to change the range to 13-bit so that I can do some math with another library that uses 13-bit ADC values.
If I shift the number 1023 (maximum for 10-bits) to the left 3 times, I get:
1023 << 3; // 8184
I get the value 8184, but what I should actually get is the maximum 13-bit number, 8191.
If I use the formula ((b+1)<<3)-1 I get:
((1023+1)<<3 )- 1; // 8191 (correct)
I get the correct value. The problem is that the formula is not linear over the whole range 0-1023. Take a number somewhere else, like 172:
((172+1)<<3 ) - 1; // 1383 (incorrect)
The actual number I should get (using the slow floating-point calculation) is:
f = 8191.0f / 1023; a = (uint16_t)(b * f); // 1377 (correct)

It might start to get impractical to have a 10-bit LUT. I'm trying to come up with a better 'fast' linear approximation.
The current approximation is OK for trivial stuff (like alpha blitting, perhaps), but not where accuracy matters (especially with lower numbers).
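For reference, the same correctly rounded result can be had without floating point, using one integer multiply, add and divide per value. A sketch for the 10-bit to 13-bit case (the function name is mine):

```c
#include <stdint.h>

// Correctly rounded 10-bit -> 13-bit scaling in pure integer math:
// a = round(b * 8191 / 1023), computed as (2*b*8191 + 1023) / (2*1023).
static uint16_t scale_10_to_13(uint16_t b)
{
    return (uint16_t)((2u * (uint32_t)b * 8191u + 1023u) / 2046u);
}
```

scale_10_to_13(1023) gives 8191 and scale_10_to_13(172) gives 1377, matching the floating-point version above.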

7. Just a visual to show the 'accuracy' of the current approximation.
This example changes a 6-bit number to 3-bits and back again.
Attachment 16501

8. TL;DR: This is well known. Add extra bits by duplicating the existing bits as needed, instead of padding with zero bits. So, abcde₂ becomes abcdeabcdeab₂ for a 5-to-12-bit conversion, for example.

Mathematically, to scale between Amin ≤ A ≤ Amax and Bmin ≤ B ≤ Bmax you use
B = Bmin + (A - Amin) × (Bmax - Bmin) / (Amax - Amin)
or, inversely,
A = Amin + (B - Bmin) × (Amax - Amin) / (Bmax - Bmin)

For a- and b-bit unsigned integers, this simplifies to
B = A × (2^b - 1) / (2^a - 1)
Note: the scaling factor is not 2^(b-a) (which is what you would obtain with a pure binary shift), but (2^b - 1) / (2^a - 1) instead.

Duplicating and concatenating the bits in A yields a very good approximation, affecting only how the mathematically exact real value is effectively rounded to an integer value.

In C:
If b < a, we can use B = A >> (a - b); .
If b > a but b < 2a, we can use B = (A << (b - a)) | (A >> (2*a - b)); .
If b ≥ 2a but b < 3a, we can use B = (A << (b - a)) | (A << (b - 2*a)) | (A >> (3*a - b)); .
And so on.
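Those cases can be rolled into one generic helper. This is just a sketch of the recipe above (the function name is mine, not from the post):

```c
#include <stdint.h>

// Widen an a-bit value x to b bits (b >= a) by repeating its bit
// pattern: whole copies first, then a partial copy to fill the low bits.
static uint32_t scale_up_bits(uint32_t x, unsigned a, unsigned b)
{
    uint32_t y = 0;
    int shift = (int)b - (int)a;   // position of the top copy
    while (shift >= 0) {
        y |= x << shift;           // whole copies of the a-bit pattern
        shift -= (int)a;
    }
    y |= x >> (unsigned)(-shift);  // partial copy for the remaining low bits
    return y;
}
```

For example, scale_up_bits(x, 2, 3) reduces to (x << 1) | (x >> 1), and scale_up_bits(x, 5, 12) to (x << 7) | (x << 2) | (x >> 3), matching the concrete cases given in this post.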

For example, to calculate a 3-bit unsigned integer y (0 ≤ y ≤ 7) from a 2-bit unsigned integer x (0 ≤ x ≤ 3), in C you can use
Code:
`y = (x << 1) | (x >> 1);`
To calculate a 12-bit unsigned integer y (0 ≤ y ≤ 4095) from a 5-bit unsigned integer x (0 ≤ x ≤ 31), in C you can use
Code:
`y = (x << 7) | (x << 2) | (x >> 3);`
To calculate an 8-bit unsigned integer y (0 ≤ y ≤ 255) from a 5-bit unsigned integer x (0 ≤ x ≤ 31), in C you can use
Code:
`y = (x << 3) | (x >> 2);`
For the cases when unsigned integer y has 2n bits and x has n bits,
Code:
`y = (x << n) | x;`
which yields the exact same value as
Code:
`y = (x << n) + x;`
when 0 ≤ x < 2^n.

On hardware architectures with a fast multiply and sufficiently wide integers, and when b is several times a, it can be better to approximate (2^b - 1) / (2^a - 1) by a multiplication by an integer constant followed by an optional right shift, i.e. mathematically use a factor of the form C / 2^c. Whether that wins depends on the compiler (version), the hardware, and the particular a and b, though.
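As a sketch of that last point: for a = 5, b = 8, the factor 255/31 ≈ 8.2258 can be approximated by 527/64. The constant below is one commonly used in RGB565-to-RGB888 conversion (it is not from this thread); with the small offset it matches the correctly rounded value for every 5-bit input, which is easy to verify exhaustively:

```c
#include <stdint.h>

// 5-bit -> 8-bit scaling via multiply and shift: 527/64 ~= 255/31.
// One integer multiply, add and shift per value; no floating point.
static uint8_t scale_5_to_8(uint8_t x)
{
    return (uint8_t)((x * 527u + 23u) >> 6);
}
```

scale_5_to_8(31) gives 255 and scale_5_to_8(0) gives 0; unlike the bit-replication form (x << 3) | (x >> 2), it also gives the correctly rounded 25 for x = 3.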

9. That is fantastic! The standard mathematical linear interpolation equations are well known, but this is the first time I've run into these approximations, so thank you!
I'll take a look when I get a chance later this week. Awesome.

10. You're welcome.

I tried to look for the sources, as I do remember seeing this somewhere discussing linear interpolation and bit hacks, but cannot find one right now.
You can find other similar useful "tricks" by looking for bit twiddling hacks; the list by Sean Eron Anderson might be the best-known collection of such hacks on the web.

11. Have bookmarked the link. Very useful.
Just did a bit of testing, and in most of the cases I'm interested in the approximation is dead-on, or only 1 out at most. This is great.

Example:

12. Originally Posted by Projectitis: "in most of the cases I'm interested in the approximation is dead-on, or only 1 out at most. This is great."
It is always ±1; it is only the 'rounding' that is affected. Consider the 5-bit to 8-bit scaling case:
Code:
```y = 255 · x / 31:

 x ║    y    │   exact  ║ rnd │  ∆   ║ apx │  ∆  │   error
═══╬═════════╪══════════╬═════╪══════╬═════╪═════╪═══════════
 0 ║    0    │    0     ║   0 │      ║   0 │     │
 1 ║  255/31 │    8.226 ║   8 │ +8   ║   8 │ +8  │ +0.225806
 2 ║  510/31 │   16.452 ║  16 │ +8   ║  16 │ +8  │ +0.451613
 3 ║  765/31 │   24.677 ║  25 │ +9 ! ║  24 │ +8  │ +0.677419
 4 ║ 1020/31 │   32.903 ║  33 │ +8   ║  33 │ +9 !│ -0.096774
 5 ║ 1275/31 │   41.129 ║  41 │ +8   ║  41 │ +8  │ +0.129032
 6 ║ 1530/31 │   49.355 ║  49 │ +8   ║  49 │ +8  │ +0.354839
 7 ║ 1785/31 │   57.581 ║  58 │ +9 ! ║  57 │ +8  │ +0.580645
 8 ║ 2040/31 │   65.806 ║  66 │ +8   ║  66 │ +9 !│ -0.193548
 9 ║ 2295/31 │   74.032 ║  74 │ +8   ║  74 │ +8  │ +0.032258
10 ║ 2550/31 │   82.258 ║  82 │ +8   ║  82 │ +8  │ +0.258065
11 ║ 2805/31 │   90.484 ║  90 │ +8   ║  90 │ +8  │ +0.483871
12 ║ 3060/31 │   98.710 ║  99 │ +9 ! ║  99 │ +9 !│ -0.290323
13 ║ 3315/31 │  106.935 ║ 107 │ +8   ║ 107 │ +8  │ -0.064516
14 ║ 3570/31 │  115.161 ║ 115 │ +8   ║ 115 │ +8  │ +0.161290
15 ║ 3825/31 │  123.387 ║ 123 │ +8   ║ 123 │ +8  │ +0.387097
16 ║ 4080/31 │  131.613 ║ 132 │ +9 ! ║ 132 │ +9 !│ -0.387097
17 ║ 4335/31 │  139.839 ║ 140 │ +8   ║ 140 │ +8  │ -0.161290
18 ║ 4590/31 │  148.065 ║ 148 │ +8   ║ 148 │ +8  │ +0.064516
19 ║ 4845/31 │  156.290 ║ 156 │ +8   ║ 156 │ +8  │ +0.290323
20 ║ 5100/31 │  164.516 ║ 165 │ +9 ! ║ 165 │ +9 !│ -0.483871
21 ║ 5355/31 │  172.742 ║ 173 │ +8   ║ 173 │ +8  │ -0.258065
22 ║ 5610/31 │  180.968 ║ 181 │ +8   ║ 181 │ +8  │ -0.032258
23 ║ 5865/31 │  189.194 ║ 189 │ +8   ║ 189 │ +8  │ +0.193548
24 ║ 6120/31 │  197.419 ║ 197 │ +8   ║ 198 │ +9 !│ -0.580645
25 ║ 6375/31 │  205.645 ║ 206 │ +9 ! ║ 206 │ +8  │ -0.354839
26 ║ 6630/31 │  213.871 ║ 214 │ +8   ║ 214 │ +8  │ -0.129032
27 ║ 6885/31 │  222.097 ║ 222 │ +8   ║ 222 │ +8  │ +0.096774
28 ║ 7140/31 │  230.323 ║ 230 │ +8   ║ 231 │ +9 !│ -0.677419
29 ║ 7395/31 │  238.548 ║ 239 │ +9 ! ║ 239 │ +8  │ -0.451613
30 ║ 7650/31 │  246.774 ║ 247 │ +8   ║ 247 │ +8  │ -0.225806
31 ║  255    │  255.000 ║ 255 │ +8   ║ 255 │ +8  │ +0.000000```
I've marked the interesting values with an exclamation mark, !. Note how on the exact 8-bit value, there are 3 or 4 deltas of +8 between each delta of +9; but on the approximation, there are always exactly 3 deltas of +8 between each delta of +9. The correctly rounded value is at most ±0.5 from the exact value, but the approximation is ±1.

In essence, the approximation constructs values where the change between successive values is regular, i.e. there are always N1 changes of ∆ followed by N2 changes of ∆+1, repeated over the entire range. This is a direct result of the binary construction. If you plot the error in the correctly rounded value (red) and the error in the approximation (blue), you'll see the structure in the two different rounding errors. The bit-repeating approximation yields a completely regular triangular "wave" (within ±1), whereas the correctly rounded value keeps to ±0.5 (with two or more different-sized "teeth").

Apologies for delving so deep into a relatively simple thing, but I personally find these approximations, and the error analysis of such approximations, interesting and actually useful in practice.

13. By all means. A little math doesn't scare me, and, yes, it's very interesting!
