Saturday, September 7, 2019

How to compare exact value of SmallInteger and Float in Smalltalk

Long time ago, mixed arithmetic has been very well thought in Scheme with the notion of exact arithmetic vs inexact arithmetic (the resulting value of an operation might be an approximation of the mathematically exact value).

The R5RS specification require that compare operations be transitive in §6.2.5 Numerical operations. Doing so preserve important mathematical properties, like equivalence relation or total ordering. These properties are more than welcome when building higher level abstractions.

No matter that the inexact operand value may already carry it's own error bounds: as an agnostic library, we have no license to waste ULPs, that would only worsen the task of application writers.

A consequence of transitivity requirement is that mixed arithmetic comparisons between integer and float shall better answer the result of comparing the exact value of operands, rather than convert the int to float, then compare the potentially inexact (rounded) values. I spent enough time to convince Smalltalk community that we should adopt such superior model too. Typically, early versions of Squeak did not fulfill the transitivity expectations (but neither did most Lisp implementations, nor any Smalltalk dialect I knew):

    | int1 int2 float |
    int1 := 16r100000000000000 "1 bitShift: 56".
    int2 := 16r100000000000001 "(1 bitShift: 56) + 1".

    float := 16r100000000000000.0 "1.0 timesTwoPower: 56".
    int1 = float & (float = int2) ==> (int1 = int2).

Nowadays, a majority of Smalltalk dialects will do the right thing, including the Squeak brand, the Cuis/Pharo derivatives, GNU Smalltalk and Dolphin Smalltalk. Visualworks does not abide, but I published Image side patches for 32bits version. I did not look at VisualAge for ages, the temporary licenses from Instantiations are not liberal enough to play with. Gemstone is deadly missing on my check list and has liberal enough licenses, but I have my own limitations and need to delegate... I'll take case of Smalltalk/X below.

Mathematics are one thing, but how do we technically achieve that feature at a reasonable price? Particularly in the statistically most common case of mixed SmallInteger op: Float.

With 32bits Smalltalk, and IEEE 754 double precision float providing a 53 bits precision significand, there is no problem, every conversion SmallInteger Float will be exact, and we can thus use naive conversion followed by float comparison. This is mostly done inside the VM in primitives or JIT.

At image side in Squeak, for LargeInteger and Fraction, the strategy is a bit more elaborated, because conversions can cost a lot! The idea is to test if the operand can be converted exactly (see isAnExactFloat in Squeak family), and if so, perform the conversion to Float, and compare floats (that's the fastest path). Else, exact arithmetic shall be used to perform the comparison (unless the Float is infinite or nan) and thus the Float is converted to it's exact value (a.k.a. asTrueFraction in Squeak or asExactFraction in gst).

With 64bits OpenSmalltalk VM which provides 61 bits signed SmallInteger, we now have to take care to not exceed the float precision. We had to rewrite the naive primitives and forbid an inexact conversion int64 double like what is done at image side for LargeInteger. A naive and simpler version of isAnExactFloat check has first been programmed, to eventually make the primitive fail: we just shift right by Float precision and check case of excessive high bits, rather than check the bit span highBit - lowBit + 1. False negative do not matter, they can be handled at image side where it is simpler to write more elaborated code. But those precautions were forgotten in the JIT, that's Issue 417.

In fact, all this started after observing that Smalltalk/X had the very same problem: the primitives were doing naive inexact conversions like in 64bits Visualworks instead of failing. But in Smalltalk/X, these are not really primitives, but rather inlined C-code that gets compiled dynamically and can be changed easily (see for example I thus started to think how I would resolve it in standard C so as to preserve speed. This had to be simple enough to be a viable evening hack (a few lines without too many compiler specific builtins).

So let's backtrack: when we convert 16r100000000000001 it might be converted upper or lower, and will loose last 4 bits (in OpenSmalltalk that's at most SmallInteger maxVal highBit - Float precision = 8). This specific integer loose only last bit and is transformed to 16r100000000000000.0 but is such tiny inexactness really going to be a problem if we compare to say -1000.0, 2.5, 1.0e100 or Float infinity? No, of course, the last few rounded bits won't change the comparison result... Unless the Float operand is sufficiently near the SmallInteger. How near?

Let's inquire this. In Squeak with simpler notations:
  • a is the SmallInteger
  • f is the Float whose exact value is b (either an Integer or a Fraction).
    f asTrueFraction = b.
    b asFloat = f.
    b asFloat = b.

Above propositions are obvious, since in Smalltalk:

    aFloat isAnExactFloat.

This was not true of those C using x87 extended 80bits extended precision float registers, but it is in C99 with appropriate flags, and easy when using SSE2 64bits registers.

And since they share same value:

    aFloat asTrueFraction isAnExactFloat.

More interestingly, the conversion asFloat has this property of monotonicity by virtue of IEEE754 standard (because we round the exact to nearest inexact value):
    a < b ==> (a asFloat <= b asFloat).

Even if different, a and b might be converted to same floating point value, the order can be blurred, but it can never be reversed! We can deduce from this property that:

    a asFloat < b asFloat ==> (a < b).


    a asFloat > b asFloat ==> (a > b).

The only case when we are in doubt is:

   a asFloat = b asFloat.

From which one cannot deduce anything about ordering of a and b...

We got the bases of a new algorithm: naively convert the SmallInteger a asFloat, then compare to the Float f value: if both float values are equal, there is an ambiguity to be resolved, else we perform comparison in floating point and we're done, even if conversion asFloat were inexact, we know that it was innocuous (unordered case of f = Float nan, as well as case of f = Float infinity included).

If there is ambiguity, we have to fallback to asTrueFraction conversion... But we have additional properties:

    a asFloat asTrueFraction isInteger.

That's a property of floating point representation, for any SmallInteger a, asFloat is either exact and thus preserve integerness, or will remove some precision (zero the last bits). In no way can we obtain a fractionPart, because that would mean adding more precision.
Only LargeInteger will loose this property when starting to overflow to Float infinity.


    a asFloat = f ==> f asTrueFraction isInteger.

Great! We thus do not need to deal with Fraction in case of ambiguity:

    f asTrueFraction isInteger ==> (f asTrueFraction = f asInteger).

With this knowledge, we can sketch the whole algorithm in Smalltalk:

    SmallInteger>>compareOpFloat: aFloat
        ^(self asFloat = aFloat)
            ifTrue: [self compareOp: aFloat asInteger]
            ifFalse: [self asFloat compareOp: aFloat]

So exactness costs only 1 additional comparison versus inexactness, and 1 additional conversion in case of unlucky ambiguity.
Note that aFloat asInteger could degenerate to a LargeInteger in Smalltalk due to upper rounding... For example, in 64 bits Squeak Images:

    SmallInteger maxVal asFloat asInteger isLarge.

Similarly in C, for int64 a, there would be some edge case of signed integer overflow when performing (int64) f , and such situation would deserve a uint64 handling in order to avoid undefined behavior... Not less complex in assembler for the JIT.
I may have a solution for it, but YAGNI, our OpenSmalltalk SmallInteger are only on 61 bits, so we can write it naivly in primitives and avoid the image side case of LargeInteger! Rather than testing inexactness and failing in edge case, the primitive can always answer the result of exact comparison at a very low price, both in term of speed and simplicity. There's no reason to not do it.

Or there might be some good reasons... After all, I hate the fact that Visualworks has hardcoded naive inexact conversion in primitive!
If we want to change the behavior, there is no other solution than checking the preconditions in slower Smalltalk code (either via double dispatching or type testing) BEFORE calling fast primitives, which is ruining the benchmarks of number crunching applications... It's far more efficient to call the fast primitive first, and handle failure cases in fallback code.
Even if I'm convinced that exact comparison is a better thing than inexact, it's stil questionable to hardwire it in the VM. Juan Vuletich, the author of Cuis gave a sounding echo to the little voice inside me telling to not cut too much the world of possibilities...

I have thus prototyped a very simple behavior with a single boolean VM parameter: either the primitives and jitter do perform mixed arithmetic, and if so they do it exactly for comparisons, or they don't perform any mixed arithmetic at all and always fallback to the image code, whatever the operation (arithmetic or comparison). This way, the unmixed arithmetic Float op: Float and SmallInteger op: SmallInteger are still very fast.
With the help of Eliot Miranda, the master of OpenSmalltalk VM, I hope that we consolidate these features soon, have our cake and eat it too. That's what every serious Smalltalk implementation should do. If some vendor decide to hardcode comparison of inexact (rounded) values, then please provide a way to desactivate it! If choosing exactness, everyone now knows how to do it without wasting too many CPU cycles nor engineering, all the material is licensed MIT, and unless I re-invented some well known prior work, I hope patent-free, thanks to this publication.

1 comment:

  1. I first wrote that Cuis did compare the exact value, err, it doesn't! It convert operands to inexact... I understand better why Juan did not want the "improved" algorithm.