Smallissimo: Tuning the LargeArithmetic sqrtRem

Sunday, May 12, 2019

Tuning the LargeArithmetic sqrtRem

After further analysis, it turns out that sqrtRem algorithm is always faster than naive Newton-Raphson iterations for LargePositiveInteger in Squeak with 64 bits Opensmalltalk Spur VM.

So rather than tuning the threshold, I just removed the threshold, which is something we all like: less code to write, less code to test, simple, small and beautiful code 😀.

So here is the updated benchmark:

n (bits)	100	200	500	1 000	2 000	5 000	10 000	20 000	50 000	100 000	200 000	500 000
SR(n)	4 800	9 200	13 490	22 120	49 130	233 170	895 920	3 691 610	25 134 780	107 222 200	456 219 390	3 130 480 810
SRa(n)	1 042	3 644	8 742	11 994	16 288	27 678	67 182	181 219	645 797	1 843 856	5 275 917	19 623 314
gain	78 %	60 %	35 %	46 %	67 %	88 %	93 %	95 %	97 %	98 %	99 %	99 %

That's pretty good, but still has the same pot-bellied look in low bits, probably related to high division costs.

While at it, I totally revamped various implementation of sqrt in Squeak. I don't give further details here, the commit message is already pretty long!

Smallissimo

Sunday, May 12, 2019

Tuning the LargeArithmetic sqrtRem

No comments:

Post a Comment