Fixed-Point: Fractions via Integer Arithmetic
CS 301 Lecture, Dr. Lawlor
Ordinary integers can only represent integral values.
We'd often like to represent fractional values.
In binary, each digit is worth half as much as the preceeding
digit. So just like "1" is worth half as much as "10", "0.1"
should be worth half as much as "1": in other words, the digit just
past the "binary point" is worth 1/2. The next digit is worth 1/4.
You can represent something more complex like "two and three-eighths" as "10.011". That is, there's:
- a 1 in the 2's place (2=21)
- a 0 in the 1's place (1=20)
- a 0 in the (beyond the "binary point") 1/2's place (1/2=2-1),
- a 1 in the 1/4's place (1/4=2-2), and
- a 1 in the 1/8's place (1/8=2-3)
for a total of two plus 1/4 plus 1/8, or "2+3/8". Note that this
is a natural measurement in carpenter's fractional inches, but it's a
weird unnatural thing in metric-style decimal inches! That is,
fractions that are (negative) powers of two have a nice binary
representation, but look weird in decimal (1/16 = 0.0001base2 = 0.0625base10). Conversely, short decimal numbers
have a nice decimal representation, but often look weird as a binary
fraction (0.2base10 = 0.001100110011...base2).
You can even write fractions in hex, like 0x2.F = 2 and 15/16; or 0xF.03 = 15 and 3/256.
Place/bit Number | i
| ...
| 3
| 2
| 1
| 0
| .
|
-1
|
-2
|
-3
|
Decimal: Base-10
|
10i
|
...
|
1000
|
100
|
10
|
1
| Decimal point
|
1/10
|
1/100
|
1/1000
|
Binary: Base-2
|
2i |
...
|
8 = 23
|
4 = 22
|
2
|
1
| Binary point
|
1/2
|
1/4
|
1/8
|
Hex: Base-16
|
16i |
...
|
4096 = 163
|
256 = 162
|
16
|
1
| Radix point
|
1/16
|
1/256
|
1/4096
|
Fixed point Arithmetic
One very common trick for dealing with fractions is to just start
*thinking* of integers as fractions. For example, a lot of banks
apparently store values as an integer number of cents:
int v1=345; // value in cents
int v2=102; // value in cents
return v1+v2; // return value in cents
(Try this in NetRun now!)
This returns the integer 447, which is measured in cents ($4.47).
It makes sense that that adding cents to cents gives cents.
But multiplying cents doesn't work quite as well:
int v1=345; // value in cents
int v2=102; // interest in percent
return v1*v2; // return value in cents * percent (?)
(Try this in NetRun now!)
This returns an integer 35190, with units of "cents * percent".
Dividing by 100 to get rid of the percent gives 351.9 cents, which is
345 cents plus 2% interest, but you do need to remember to divide by
the percent!
Because you need to multiply and divide by the integer-to-fraction
scale factor, it's common in non-financial work to pick a power of two
as the scale factor. For example, if I pick 256 as my scale
factor between fractions and integers, then I can divide by 256 with a
bit shift by 8 bits. This tends to get called ".8 fixed point",
indicating there are 8 bits after the decimal point. Sometimes
you see all the bits listed, like "24.8 fixed point" for a 32-bit
number scaled by 256--this indicates there are 24 bits ahead of the
decimal point, and 8 bits after the decimal point.
int v1=(3.45*256); // value in .8 fixed point
int v2=(1.02*256); // value in .8 fixed point
int x=v1+v2; // result in .8 fixed point
std::cout<<"Result = "<<x/256.0<<"\n"; // scale back to decimal
return x; // see true value in hex
(Try this in NetRun now!)
This prints 4.46875 (decimal) and 0x4.78 (hex). Note that
the exact answer is 4.47 (decimal). We can get closer to the
exact answer by just using more bits, like this .16 version of the same
computation
float scale=256*256; // == 2**16
int v1=(3.45*scale); // value in .16 fixed point
int v2=(1.02*scale); // value in .16 fixed point
int x=v1+v2; // result in .16 fixed point
std::cout<<"Result (decimal) = "<<x/scale<<"\n"; // scale back to decimal
return x; // see true value in hex
(Try this in NetRun now!)
This prints 4.46999 (decimal) and 0x4.7851 (hex). We're closer,
but still not exactly right. It turns out that 4.47 is an
infinitely repeating number in binary, so even using 48 bits doesn't
capture it exactly:
double scale=pow(2,48); // == 2**48
long v1=(3.45*scale); // value in .48 fixed point
long v2=(1.02*scale); // value in .48 fixed point
long x=v1+v2; // result in .16 fixed point
std::cout.precision(20);
std::cout<<"Result (decimal) = "<<x/scale<<"\n"; // scale back to decimal
print_long(x); // see true value in hex
(Try this in NetRun now!)
This prints 4.4699999999999988631 (decimal) and 0x4.7851EB851EB8 (hex). That's probably close enough.
In almost any real computation, you end up needing to throw away some
bits--for example, if the result is irrational, you *must* throw away
some bits! In real science and engineering work, it's rare that
you need more than 4 or 5 decimal places in the final answer (otherwise
things like thermal expansion start dominating your
measurements). Understanding when this "roundoff" process occurs,
and how to manage it, is an important skill!