Tuesday, August 23, 2011

Smallapack progress on Squeak

This post is about Smallapack, a Smalltalk interface to LAPACK.
Smallapack deals with basic linear algebra (the BLAS) or more ellaborated linear algebra, like solving least squares problem, eigenvalues problems, singular values decompositions etc... (LAPACK).
It enables Smalltalk to deal efficiently with number crunching by delegating these operations to specially optimized external libraries, at the express condition that operations can be arrayed (a lot like Matlab interpreter for example).
Smallapack implementation was working for a long time in Cincom VW and Dolphin dialects, but I never was able to have it running reliably on Squeak. Until yesterday night, when I learned the stupid little detail that caused that mess:
Modern C ABI expect an 8 byte alignment for double * data (especially for some library optimized thru SSE2 instructions or such).
Squeak has a super feature that enables passing a ByteArray allocated in Object memory rather than an ExternalAddress pointing to some memory allocated on external heap. I abused this feature, because heap management requires malloc and free, and why bother with so much basic tasks, when I can just let the garbage collector take care of Object memory... Unfortunately, this excess of trust was my mistake:
Squeak objects are 4 bytes aligned, at least on all 32 bit VMs available to date.
Hence, some function calls will randomly crash the VM with a protection fault (OSX gdb says EXC_BAD_ACCESS). The garbage collector has a license to allocate on 4 byte boundary or to rellocate on 4 bytes boundary at any time, and thus there is not much we can do about it. Eliot told me that a new garbage collector with proper 8 byte alignment was in his plans, but not soon, Once more, I will depend on the tremendous amount of work this guy put in Smalltalk, but who doesn't? Viva Miranda!
So my bug was happening by example with cblas_dgemv on Mac OSX veclib if the matrix is transposed, and by extension with any LAPACK function calling a transposing DGEMV...
GOOD NEWS! Once discovered last night, the bug was quickly fixed this evening, it was just a matter of redefining two methods (arrayPointer) so that they transfer the data in external heap (storeInCSpace) before answering the pointer object...

It is very dangerous to use a (ByteArray new: byteSize) as a memory for holding a table of double and passing it as argument to an external procedure... Some external procedures will tolerate a 4 bytes alignment, some won't... You should always prefer an (ExternalAddress gcallocate: byteSize) for passing a double *. By virtue of libc malloc(), the ExternalAddress shall always contain a correctly aligned pointer.

No comments:

Post a Comment