Again, I will evaluate a suboptimal algorithm because it is nice and fun in Smalltalk.
I guess in other languages too, especially functional languages, but you know my favourite path.
The exploration started with this little factorial contest submitted by Paul Baumann.
We already know the fastest algorithm are the prime swing variant (read this wiki to get an explanation of swing by Peter Luschny), and simple and clever split recursive algorithm submitted by John Brant.
In both cases, only the odd products are performed, in a divide and conquer fashion which produces well balanced large integers (this is important for accelerated products like Karatsuba Toom or FFT). The final power of two being replaced by a bitShift: which saves many costly LargeInteger multiplications.
Our suboptimal variant of interest consists in evaluating the factorial from its decomposition in prime factors. We know that each prime <= n will be compose n factorial, but we also know the power of each of these primes thanks to Legendre. This algorithm can be naively transcribed in Smalltalk:
Integer>>primeFactorFactorial
"Recompose the factorial from the prime factors, knowing the power of each prime."
^((Integer primesUpTo: self + 1) collect: [:p |
p raisedToInteger: self - (self sumDigitsInBase: p) // (p - 1)]) product
"Recompose the factorial from the prime factors, knowing the power of each prime."
^((Integer primesUpTo: self + 1) collect: [:p |
p raisedToInteger: self - (self sumDigitsInBase: p) // (p - 1)]) product
primesUpTo: is a method found in Squeak which use a sieve of Eratosthenes, up to means not inclusive, so we need the disgracious + 1. Then Squeak lacks two methods, sumDigitsInBase: and product.
A naive product, could be simply:
Collection>>product
^self inject: 1 into: [:product :element | product * element]
With this definition, Collection new product is answering 1, and we don't need any protection for evaluating 0 primeFactorFactorial.
For sure, it won't be efficient and should better be replaced by a divide and conquer variant, but we'll see this in another post.
A naive sumDigitsInBase: could be:
Integer>>sumDigitsInBase: base
^(self printStringBase: base) inject: 0 into: [:sum :digit | sum + digit digitValue]
OK, we're done...
1000 primeFactorFactorial. -> Error: subscript is out of bounds: 303
Oops! Unfortunately, it's not only suboptimal, it won't work for primes above 36 because Squeak has no idea which character to use to represent digit 37. Anyway, even with unicode we would only have a limited set of characters. Huge, but limited.
It does not matter anyway, we can just reimplement printStringBase: and instead of streaming out digits converted to characters, just sum the digits.
A naive algorithm would be to just remove the last digit (the remainder by a division by base):
Integer>>sumDigitsInBase: base
"Answer the sum
"Answer the sum
| sum |
sum := 0.
q := self abs.
[q = 0]
whileFalse:
[sum := sum + (q - ((q := q // base) * base))].
^sum
Since we probably won't compute the factorial of a LargeInteger, this shall be enough.
For LargeInteger, we would use a divide and conquer method in order to avoid repeated LargeInteger divisions, and replace them by as many SmallInteger divisions as possible. This would be the exact image of printStringBase:, so we won't develop it here.
Now we are done with this very naive implementation. Let's see how it performs on COG VM:
Smalltalk garbageCollect.
Time millisecondsToRun: [1 to: 3000 do: [:x | x factorial]].
-> 29753 29780
Smalltalk garbageCollect.
Time millisecondsToRun: [1 to: 3000 do: [:x | x primeFactorFactorial]].
-> 19204 19136
Time millisecondsToRun: [1 to: 3000 do: [:x | x factorial]].
-> 29753 29780
Smalltalk garbageCollect.
Time millisecondsToRun: [1 to: 3000 do: [:x | x primeFactorFactorial]].
-> 19204 19136
Incredible, with this very naive code we already gain 33%. In fact, any implementation will probably beat factorial performance, but not its simplicity.
No comments:
Post a Comment