After changing Integer>>asFloat, let's inspect Fraction>>asFloat. The spirit is quite simple: perform an Euclidean division after shifting the numerator or the denominator enough to obtain a certain precision in the quotient. The quotient will be the mantissa of the Float and the remainder are the truncated bits. Then come a dissertation on the conditions when to round upper or not. Let's see the code:
asFloat
"Answer a Float that closely approximates the value of the receiver.
This implementation will answer the closest floating point number to
the receiver.
It uses the IEEE 754 round to nearest even mode
(can happen in case denominator is a power of two)"
| a b q r exponent floatExponent n ha hb hq q1 |
a := numerator abs.
b := denominator abs.
ha := a highBit.
hb := b highBit.
"If both numerator and denominator are represented exactly in floating point number,
then fastest thing to do is to use hardwired float division"
(ha < 54 and: [hb < 54]) ifTrue: [^numerator asFloat / denominator asFloat].
"Try and obtain a mantissa with 54 bits.
First guess is rough, we might get one more bit or one less"
exponent := ha - hb - 54.
exponent > 0
ifTrue: [b := b bitShift: exponent]
ifFalse: [a := a bitShift: exponent negated].
q := a quo: b.
r := a - (q * b).
hq := q highBit.
"check for gradual underflow, in which case we should use less bits"
floatExponent := exponent + hq - 1.
n := floatExponent > -1023
ifTrue: [54]
ifFalse: [54 + floatExponent + 1022].
hq > n
ifTrue: [exponent := exponent + hq - n.
r := (q bitAnd: (1 bitShift: hq - n) - 1) * b + r.
q := q bitShift: n - hq].
hq < n
ifTrue: [exponent := exponent + hq - n.
q1 := (r bitShift: n - hq) quo: b.
q := (q bitShift: n - hq) bitAnd: q1.
r := (r bitShift: n - hq) - (q1 * b)].
"check if we should round upward.
The case of exact half (q bitAnd: 1) isZero not & (r isZero)
will be handled by Integer>>asFloat"
((q bitAnd: 1) isZero or: [r isZero])
ifFalse: [q := q + 1].
^ (self positive
ifTrue: [q asFloat]
ifFalse: [q = 0
ifTrue: [Float negativeZero]
ifFalse: [q asFloat negated]])
timesTwoPower: exponent
"Answer a Float that closely approximates the value of the receiver.
This implementation will answer the closest floating point number to
the receiver.
It uses the IEEE 754 round to nearest even mode
(can happen in case denominator is a power of two)"
| a b q r exponent floatExponent n ha hb hq q1 |
a := numerator abs.
b := denominator abs.
ha := a highBit.
hb := b highBit.
"If both numerator and denominator are represented exactly in floating point number,
then fastest thing to do is to use hardwired float division"
(ha < 54 and: [hb < 54]) ifTrue: [^numerator asFloat / denominator asFloat].
"Try and obtain a mantissa with 54 bits.
First guess is rough, we might get one more bit or one less"
exponent := ha - hb - 54.
exponent > 0
ifTrue: [b := b bitShift: exponent]
ifFalse: [a := a bitShift: exponent negated].
q := a quo: b.
r := a - (q * b).
hq := q highBit.
"check for gradual underflow, in which case we should use less bits"
floatExponent := exponent + hq - 1.
n := floatExponent > -1023
ifTrue: [54]
ifFalse: [54 + floatExponent + 1022].
hq > n
ifTrue: [exponent := exponent + hq - n.
r := (q bitAnd: (1 bitShift: hq - n) - 1) * b + r.
q := q bitShift: n - hq].
hq < n
ifTrue: [exponent := exponent + hq - n.
q1 := (r bitShift: n - hq) quo: b.
q := (q bitShift: n - hq) bitAnd: q1.
r := (r bitShift: n - hq) - (q1 * b)].
"check if we should round upward.
The case of exact half (q bitAnd: 1) isZero not & (r isZero)
will be handled by Integer>>asFloat"
((q bitAnd: 1) isZero or: [r isZero])
ifFalse: [q := q + 1].
^ (self positive
ifTrue: [q asFloat]
ifFalse: [q = 0
ifTrue: [Float negativeZero]
ifFalse: [q asFloat negated]])
timesTwoPower: exponent
Let's review the three parts in yellow : first the comment (can happen in case denominator is a power of two) is obscure. It is supposed to explain that in case of tie, the algorithm will round to nearest even. The case of tie can only happen if the denominator is a power of two indeed, any other factor in the denominator will lead to an infinite series of digits and thus cannot produce a tie. But this information is not essential enough to be placed in the header. After all it is not essential at all.
The second part in yellow tells that the quotient could be one bit less. Let's see how false it is.
Let's write that b have hb bits, a ha bits and q hq bits, and the conditions of Euclidean division:
- (1<< (ha-1)) <= a, a < (1 << ha)
- b < (1 << hb)
- q < (1 << hq)
- a=(b*q+r)
- r < b
From the last two relations we have a < (b*q+b), that is a < (b * (q+1)).
We still have, q + 1 <= (1 << hq), thus b * (q+1) < ((1 < hb) * (1 << hq)), that is a < (1 < (hb+hq)).
But we also have (1 << (ha - 1)) <= a, thus ha - 1 < (hb + hq).
This can be written hq > (ha - hb - 1), or hq >= (ha - hb). So the quotient will have at least ha - hb bits, maybe more, but never less.
The third part in yellow is trying to perform an addition with bitAnd: - ouch ! It could eventually be bitOr:, but why not simply writing q := (q bitShift: n - q) + q1...
Fortunately, this is the case of one bit less which can never happen.
There are other things to be enhanced. The numbers 54 and 1022 are kind of magical and use implicit knowledge that 54 = (Float precision + 1) and -1022 = Float emin, the minimum possible exponent of a Float before gradual underflow occurs.
So we should better re-write this code and:
- change the heading comment;
- remove the unreachable branch;
- use the new floating point inquiry messages precision and emin;
- name the mantissa mantissa rather than q;
- avoid to compute the remainder r, but just check if it is zero or hasTruncatedBits;
- let the floating point hardware handle the case of negative zero by itself
(we just have to keep at least one bit in the mantissa).
asFloat
"Answer a Float that closely approximates the value of the receiver.
This implementation will answer the closest floating point number to the receiver.
In case of a tie, it will use the IEEE 754 round to nearest even mode.
In case of overflow, it will answer +/- Float infinity."
| a b mantissa exponent hasTruncatedBits lostBit n ha hb hm |
a := numerator abs.
b := denominator. "denominator is always positive"
ha := a highBit.
hb := b highBit.
"Number of bits to keep in mantissa plus one to handle rounding."
n := 1 + Float precision.
"If both numerator and denominator are represented exactly in floating point number,
then fastest thing to do is to use hardwired float division."
(ha < n and: [hb < n]) ifTrue: [^numerator asFloat / denominator asFloat].
"Shift the fraction by a power of two exponent so as to obtain a mantissa with n bits.
First guess is rough, the mantissa might have n+1 bits."
exponent := ha - hb - n.
exponent >= 0
ifTrue: [b := b bitShift: exponent]
ifFalse: [a := a bitShift: exponent negated].
mantissa := a quo: b.
hasTruncatedBits := a > (mantissa * b).
hm := mantissa highBit.
"Check for gradual underflow, in which case the mantissa will loose bits.
"Answer a Float that closely approximates the value of the receiver.
This implementation will answer the closest floating point number to the receiver.
In case of a tie, it will use the IEEE 754 round to nearest even mode.
In case of overflow, it will answer +/- Float infinity."
| a b mantissa exponent hasTruncatedBits lostBit n ha hb hm |
a := numerator abs.
b := denominator. "denominator is always positive"
ha := a highBit.
hb := b highBit.
"Number of bits to keep in mantissa plus one to handle rounding."
n := 1 + Float precision.
"If both numerator and denominator are represented exactly in floating point number,
then fastest thing to do is to use hardwired float division."
(ha < n and: [hb < n]) ifTrue: [^numerator asFloat / denominator asFloat].
"Shift the fraction by a power of two exponent so as to obtain a mantissa with n bits.
First guess is rough, the mantissa might have n+1 bits."
exponent := ha - hb - n.
exponent >= 0
ifTrue: [b := b bitShift: exponent]
ifFalse: [a := a bitShift: exponent negated].
mantissa := a quo: b.
hasTruncatedBits := a > (mantissa * b).
hm := mantissa highBit.
"Check for gradual underflow, in which case the mantissa will loose bits.
Keep at least 1 bit to let the underflow preserve the sign of zero."
lostBit := Float emin - (exponent + hm - 1).
lostBit > 0 ifTrue: [n := n - lostBit max: 1].
"Remove excess bits in the mantissa."
hm > n
ifTrue:
[exponent := exponent + hm - n.
hasTruncatedBits := hasTruncatedBits or: [mantissa anyBitOfMagnitudeFrom: 1 to: hm - n].
mantissa := mantissa bitShift: n - hm].
"Check if mantissa must be rounded upward.
The case of tie (mantissa odd & hasTruncatedBits not)
will be handled by Integer>>asFloat."
(hasTruncatedBits and: [mantissa odd])
ifTrue: [mantissa := mantissa + 1].
^ (self positive
ifTrue: [mantissa asFloat]
ifFalse: [mantissa asFloat negated])
timesTwoPower: exponent
lostBit := Float emin - (exponent + hm - 1).
lostBit > 0 ifTrue: [n := n - lostBit max: 1].
"Remove excess bits in the mantissa."
hm > n
ifTrue:
[exponent := exponent + hm - n.
hasTruncatedBits := hasTruncatedBits or: [mantissa anyBitOfMagnitudeFrom: 1 to: hm - n].
mantissa := mantissa bitShift: n - hm].
"Check if mantissa must be rounded upward.
The case of tie (mantissa odd & hasTruncatedBits not)
will be handled by Integer>>asFloat."
(hasTruncatedBits and: [mantissa odd])
ifTrue: [mantissa := mantissa + 1].
^ (self positive
ifTrue: [mantissa asFloat]
ifFalse: [mantissa asFloat negated])
timesTwoPower: exponent
That's it. No revolution, but I hope it's a bit clearer, and for the same price, a tiny bit more efficient.
I could also explain in this blog why it is better to use floating point division if both numerator and denominator can be converted asFloat exactly... Or why the mantissa loose bits in case of underflow... But, oh, I'm so lazy, ain't that boring enough?