Sunday, May 12, 2019

Tuning the LargeArithmetic sqrtRem

After further analysis, it turns out that sqrtRem algorithm is always faster than naive Newton-Raphson iterations for LargePositiveInteger in Squeak with 64 bits Opensmalltalk Spur VM. 

So rather than tuning the threshold, I just removed the threshold, which is something we all like: less code to write, less code to test, simple, small and beautiful code 😀.

So here is the updated benchmark:

n (bits)
100
200
500
1 000
2 000
5 000
10 000
20 000
50 000
100 000
200 000
500 000
SR(n)
4 800
9 200
13 490
22 120
49 130
233 170
895 920
3 691 610
25 134 780
107 222 200
456 219 390
3 130 480 810
SRa(n)
1 042
3 644
8 742
11 994
16 288
27 678
67 182
181 219
645 797
1 843 856
5 275 917
19 623 314
gain
78 %
60 %
35 %
46 %
67 %
88 %
93 %
95 %
97 %
98 %
99 %
99 %

That's pretty good, but still has the same pot-bellied look in low bits, probably related to high division costs.

While at it, I totally revamped various implementation of sqrt in Squeak. I don't give further details here, the commit message is already pretty long!

No comments:

Post a Comment