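A real number is represented by a floating point number $d$ of the form
$$d = 2^{\alpha}(1+m), \quad 0 \le m < 1, \quad -2^{10} < \alpha < 2^{10}.$$
If $\alpha > 1-2^{10}$, then $d$ is a normalized floating point number (the leading 1 is implicit and only $m$ is stored); otherwise ($\alpha = 1-2^{10}$) $d$ is denormalized. The special exponent $2^{10}$ is used to represent plus or minus infinity and NaN (Not a Number). A hardware float is made of 64 bits: 1 bit for the sign, 11 bits for the exponent and 52 bits for the mantissa $m$.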
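Examples of representations of the exponent, which is stored in 11 bits as $\alpha + 2^{10} - 1$: $\alpha=0$ is coded by 011 1111 1111, $\alpha=1$ by 100 0000 0000 and $\alpha=-4$ by 011 1111 1011.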
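The exponent encoding can be checked with a short Python sketch. It assumes that Python's `float` is an IEEE 754 64-bit double (the case on common platforms) and that the exponent is stored with a bias of $2^{10}-1$; the names `fields` and `alpha` are chosen here for illustration only.

```python
import struct

def fields(x: float) -> tuple:
    """Split a 64-bit float into (sign bit, stored exponent, mantissa bits)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # first bit
    exponent = (bits >> 52) & 0x7FF      # next 11 bits
    mantissa = bits & ((1 << 52) - 1)    # last 52 bits
    return sign, exponent, mantissa

def alpha(x: float) -> int:
    """Exponent alpha of a normalized float, assuming a bias of 2**10 - 1."""
    return fields(x)[1] - (2**10 - 1)

for x in (1.0, 3.1, 0.1):
    stored = fields(x)[1]
    # Print the value, its 11-bit exponent field and the decoded alpha.
    print(x, format(stored, "011b"), alpha(x))
```

For 1.0, 3.1 and 0.1 the decoded exponents are 0, 1 and -4 respectively.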
$2^{-52}$ = 0.2220446049250313e−15.
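We have
$$3.1 = 2\left(1+\frac{1}{2}+\frac{1}{2^{5}}+\frac{1}{2^{6}}+\frac{1}{2^{9}}+\frac{1}{2^{10}}+\cdots\right),$$
hence $\alpha=1$ and $m=\frac{1}{2}+\sum_{k=1}^{\infty}\left(\frac{1}{2^{4k+1}}+\frac{1}{2^{4k+2}}\right)$. Hence the hexadecimal and binary representation of 3.1 is: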
40 (01000000), 08 (00001000), cc (11001100), cc (11001100),
cc (11001100), cc (11001100), cc (11001100), cd (11001101);
the last four bits are 1101: the final bit is 1 and not 0 because the following binary digit is 1 (rounding up).
We have $3=2\cdot(1+1/2)$. Hence the hexadecimal and binary representation of 3 is:
40 (01000000), 08 (00001000), 00 (00000000), 00 (00000000),
00 (00000000), 00 (00000000), 00 (00000000), 00 (00000000).
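The two byte sequences above can be reproduced with the following Python sketch, assuming that Python's `float` is an IEEE 754 64-bit double; `hex_bytes` is a helper name introduced here for illustration.

```python
import struct

def hex_bytes(x: float) -> str:
    """Big-endian bytes of a 64-bit float, written as hexadecimal octets."""
    return " ".join(f"{b:02x}" for b in struct.pack(">d", x))

print(hex_bytes(3.1))  # 40 08 cc cc cc cc cc cd
print(hex_bytes(3.0))  # 40 08 00 00 00 00 00 00
```

Big-endian packing (`">d"`) is used so that the sign and exponent bits come first, in the same order as in the text.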
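For the representation of 0.1:
$$0.1 = 2^{-4}\left(1+\frac{1}{2}+\frac{1}{2^{4}}+\frac{1}{2^{5}}+\frac{1}{2^{8}}+\frac{1}{2^{9}}+\cdots\right),$$
hence $\alpha=-4$ and
$$m=\frac{1}{2}+\sum_{k=1}^{\infty}\left(\frac{1}{2^{4k}}+\frac{1}{2^{4k+1}}\right),$$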
therefore the representation of 0.1 is
3f (00111111), b9 (10111001), 99 (10011001), 99 (10011001),
99 (10011001), 99 (10011001), 99 (10011001), 9a (10011010);
the last four bits are 1010: the last two bits 01 became 10 because the following binary digit is 1 (rounding up).
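The rounding of the last mantissa bits of 0.1 can be illustrated with exact rational arithmetic. This is a sketch in Python, assuming that its `float` is an IEEE 754 64-bit double; the variable names are chosen for illustration.

```python
from fractions import Fraction

# Exact mantissa of 0.1 = 2**-4 * (1 + m), i.e. m = 3/5.
m = Fraction(1, 10) * 2**4 - 1
scaled = m * 2**52                       # m measured in units of 2**-52

truncated = scaled.numerator // scaled.denominator  # drop the extra digits
rounded = round(scaled)                  # round to the nearest integer

print(hex(truncated))   # 0x9999999999999
print(hex(rounded))     # 0x999999999999a
print((0.1).hex())      # 0x1.999999999999ap-4
```

The 52 stored bits agree with the rounded value, not with the truncated one.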
For the representation of $a=3.1-3$: computing $a$ is done by aligning the exponents (here there is nothing to do), then subtracting the mantissas and adjusting the exponent of the result to get a normalized float. The exponent is $\alpha=-4$ (which corresponds to $2\cdot 2^{-5}$) and the bits of the mantissa begin at $1/2=2\cdot 2^{-6}$: the bits of the mantissa are shifted 5 positions to the left and you get:
3f (00111111), b9 (10111001), 99 (10011001), 99 (10011001),
99 (10011001), 99 (10011001), 99 (10011001), a0 (10100000).
Therefore $a>0.1$: the two mantissas differ by $1/2^{50}+1/2^{51}$ (since 100000−11010=110), that is $a-0.1=2^{-4}\left(1/2^{50}+1/2^{51}\right)$.
This is the reason why:
floor(1/(3.1-3))
returns 9 and not 10 when Digits:=14.
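The same behaviour can be observed outside Xcas. The following Python sketch assumes that Python's `float` is an IEEE 754 64-bit double, like the hardware floats described here; it compares the stored values of 3.1−3 and 0.1 exactly and then evaluates the floor.

```python
import math
from fractions import Fraction

a = 3.1 - 3                            # hardware float subtraction
gap = Fraction(a) - Fraction(0.1)      # exact difference of the stored values

print(a.hex())                         # 0x1.99999999999a0p-4
print((0.1).hex())                     # 0x1.999999999999ap-4
print(gap == Fraction(6, 2**56))       # True: 6 units in the last place of 0.1
print(math.floor(1 / a))               # 9
```

Since the stored value of $a$ is slightly larger than the stored value of 0.1, the quotient $1/a$ is slightly smaller than 10 and its floor is 9.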