Tuesday, August 30, 2011


After Smallapack, I decided to open another google code project for another already existing cross-dialect package - ArbitraryPrecisionFloat. This time, no FFI complications, the package is 100% "pure" Smalltalk (it does not mean 100% beautiful, I can't judge myself but severly). I will try to update the Wikipedia page because my favourite language deserves a bit more visibility.

Friday, August 26, 2011

Smallapack on google code

Finally I decided to gather notes about Smallapack on google code https://code.google.com/p/smallapack
The site is very young and does not contain much material yet, but it will be completed with installation notes, code snippets and a few insights on design.

Tuesday, August 23, 2011

Smallapack progress on Squeak

This post is about Smallapack, a Smalltalk interface to LAPACK.
Smallapack deals with basic linear algebra (the BLAS) or more ellaborated linear algebra, like solving least squares problem, eigenvalues problems, singular values decompositions etc... (LAPACK).
It enables Smalltalk to deal efficiently with number crunching by delegating these operations to specially optimized external libraries, at the express condition that operations can be arrayed (a lot like Matlab interpreter for example).
Smallapack implementation was working for a long time in Cincom VW and Dolphin dialects, but I never was able to have it running reliably on Squeak. Until yesterday night, when I learned the stupid little detail that caused that mess:
Modern C ABI expect an 8 byte alignment for double * data (especially for some library optimized thru SSE2 instructions or such).
Squeak has a super feature that enables passing a ByteArray allocated in Object memory rather than an ExternalAddress pointing to some memory allocated on external heap. I abused this feature, because heap management requires malloc and free, and why bother with so much basic tasks, when I can just let the garbage collector take care of Object memory... Unfortunately, this excess of trust was my mistake:
Squeak objects are 4 bytes aligned, at least on all 32 bit VMs available to date.
Hence, some function calls will randomly crash the VM with a protection fault (OSX gdb says EXC_BAD_ACCESS). The garbage collector has a license to allocate on 4 byte boundary or to rellocate on 4 bytes boundary at any time, and thus there is not much we can do about it. Eliot told me that a new garbage collector with proper 8 byte alignment was in his plans, but not soon, Once more, I will depend on the tremendous amount of work this guy put in Smalltalk, but who doesn't? Viva Miranda!
So my bug was happening by example with cblas_dgemv on Mac OSX veclib if the matrix is transposed, and by extension with any LAPACK function calling a transposing DGEMV...
GOOD NEWS! Once discovered last night, the bug was quickly fixed this evening, it was just a matter of redefining two methods (arrayPointer) so that they transfer the data in external heap (storeInCSpace) before answering the pointer object...

It is very dangerous to use a (ByteArray new: byteSize) as a memory for holding a table of double and passing it as argument to an external procedure... Some external procedures will tolerate a 4 bytes alignment, some won't... You should always prefer an (ExternalAddress gcallocate: byteSize) for passing a double *. By virtue of libc malloc(), the ExternalAddress shall always contain a correctly aligned pointer.

Tuesday, August 16, 2011

Funny design decision

I write this for sharing but also for reminding me later: if you ever have to develop in FORTRAN there is a gfortran option that SHOULD be your companion
With this option, every real variable will automatically be initialized with a signalling nan, and your program should stop when it will incorrectly use the variable. The SUN FORTRAN compiler did have such option for long, but it takes time to reach this original quality.
And indeed, g77 did not provide such option but only a poor man's
The former helps you finding your bugs while the later is more to help you hiding your bugs. What a funny design decision... 

I had to code array/matrix crunching code for tight loops and wanted to use Lapack/Blas for that. And I wanted the latest version with bug corrections. However, I discovered that release 3.3 of Lapack now contains FORTRAN 90 specific instructions. Of course, there are plenty sources of errors in FORTRAN 77 that have been addressed by FORTRAN 90... The ones that would have been most useful to me today are:
  • verification of subroutine interface signature (number of arguments, types and input/output intent)
  • passing an object (with number of dimensions, bounds and stride) rather than just the address of the raw array storage.
The former was traditionally  verified by 3rd party analyzers (forcheck, ftnchek).
But the later is a real improvement. There was a compiler option that did perform runtime checks
But it was more a joke than something useful because it uses the declared bounds, not the really allocated ones.
Unfortunately, if you want to profit by these features, your code has to adhere to FORTRAN 90 interface definitions. Lapack does not. It was written in FORTRAN 77 for historical reasons and I suspect it remains like this both for economical reasons and efficiency reasons (like passing an address is just faster than passing an array and going thru runtime checks). Since the FORTRAN 90 interface declarations wrappers to Lapack are not even kept up to date, there is not much to gain by using FORTRAN 90...  That just prevents me to use g77, f2c, or ftnchek for no good reason (except a recursive routine), so I thought, what a funny design decision...

Finally, I went thru edit/compile/print (gdb will not debug FORTRAN that easily, especially if you want to inspect multi-dimensional arrays, you first have to debug the debugger). As usual, some error was on my test code, while I was stupidly focusing on the more complex tested code. At the end of the day I had to admit that most of the time lost was in:
  • translating a few f90-ism back to f77 (to please g77);
  • and not testing uninitialized variables (no such option in g77).
All this, just because I decided to use g77... What a funny design decision!
And next time, I SHALL remember that ftnchek SHOULD always be the natural companion of such funny decisions.
Well, we are very far from Smalltalk. Our debugger rarely has to be debugged and our instance and temp variables are initialized to nil, so design quality was again ahead and the subject is closed. But would Smalltalk perform your number crunching? Not mine. Not even, C, C++, or FORTRAN will. That's why the Blas were invented. I eventually have Smallapack, a functioning interface to Blas/Lapack in Visualworks, but it's not up to date either (Lapack 3.0...) and debugging bare bone allocation problems through an additional DLLCC/VM layer would not help. What I would need (and maybe program one day in Smalltalk) is a FORTRAN interpreter with all the necessary bound checks...

Thursday, August 4, 2011

Lazy initialization of Shared Variable Bindings

One problem we encounter with package management in Squeak/Pharo, but also in other Smalltalk dialects is the initialisation of shared variables.

Generally a class initialization message is defined like this:
MyClass class>>initialize
    MyShared1 := SomeOtherClass new.
    self snipLotOfMoreCode
And by convention, this initialize method is executed at package load.

But what if the initialization of a package require another package which is not yet loaded (for example containing SomeOtherClass or less trivially, containing a message sent during the initialization process).

One work around often encountered  is to use a message indirection coupled with a lazy initialization to access the shared variable value, like this:
MyClass class>>someSharedInfo
    ^SomeSharedInfo ifNil: [self initializeSomeSharedInfo]

But why putting pressure on coder when the system could do it by itself ?
The idea is to create a new class for holding uninitialized variable bindings or value and associate an initialization method to be executed when first access will be attempted.

Associating an initialization method could be performed through use of pragmas, for example:
MyClass class>>initializeSomeSharedVariables
    MyShared1 := 5.
    MyShared2 := #yourself.

In Smalltalk, a shared variable is a binding (an Association) with the variable name as key (generally a Symbol) and a value (the value of the variable). The Smalltalk Compiler arrange so that binding is unique across all method accessing the variable, thus the value can effectively be shared.

Now, what happens if we access MyShared1 ? In Smalltalk, shared variable are generally not accessed by sending value/value: messages to the binding. Instead, the compiler produces a byte code that directly fetch the second inst.var. for reading and another similar byte code for writing the value. 
But this can be controlled in Squeak/Pharo by a compiler hook: just define your own binding class and two methods:
    "Return true if this variable binding is read protected, e.g., should not be accessed primitively but rather by sending #value messages"
    "Return true if this variable binding is write protected, e.g., should not be accessed primitively but rather by sending #value: messages"

So we now have many options to implement a lazy initialization...

Solution 1) instead of initializing value with nil, create an UnitializedSharedValue class pointing to the binding and an initialization message to be performed
And implement doesNotUnderstand: like this
ProtoObject subclass: #UnitializedSharedValue
    instanceVariableNames: 'binding initializer'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Utility'
setInitializer: anEvaluator
    "Set the evaluator responsible of initializing variable value.
    anEvaluator should understand the message value."
    initializer :=  anEvaluator
setBinding: aVariableBinding
    binding :=  aVariableBinding
doesNotUnderstand: aMessage
            [binding error: 'This variable binding does not have any known initializer'.
    initializer value.
    binding value == self
            [binding error: 'The initializer failed to initialize this binding'.
    ^binding value perform: aMessage selector withArguments: aMessage arguments

The problem with this kind of trick is that it is hard to catch all message sends (in Squeak/Pharo, class is not a true message send for example). And if you catch them, you then can't debug easily, because the debugger uses message sends too.

Solution 2) create AutoInitializedSharedVariableBinding that is Special Read Binding
and use an indirection to access the value:
LookupKey subclass: #AutoInitializedSharedVariableBinding
    instanceVariableNames: 'value undefined initializer'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Collections-Support'
    value := undefined := Object new.
setInitializer: anEvaluator
    initializer :=  anEvaluator
    ^value == undefined
        ifTrue: [value]
        ifFalse: [self initializeValue]
            [self error: 'This variable binding does not have any known initializer'.
    initializer value.
    value == undefined
            [self error: 'The initializer failed to initialize this binding'.

This will cost an indirection (the #value send) and one test at each access. Not worse than manual lazy initialization we saw first.

Solution 3) avoid the value == undefined test by using a become: trick. There will still be a value/value: indirection cost though, so it might not be worth the complications...
LookupKey subclass: #UninitializedVariableBinding
    instanceVariableNames: 'value'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Collections-Support'
key: aKey value: anEvaluator
    key := aKey.
    value := anEvaluator
setInitializer: anEvaluator
    value :=  anEvaluator
        ifNil: [UnitializedVariableException signal]
        ifNotNil: [self value: value value]
value: anObject
    self become: (InitializedVariableBinding key: key value: anObject).

LookupKey subclass: #InitializedVariableBinding
    instanceVariableNames: 'value'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Collections-Support'
key: aKey value: anObject
    key := aKey.
    value := anObject
setInitializer: anEvaluator
    self become: (UninitializedVariableBinding key: key value: anEvaluator).
value: anObject
    ^value := anObject

Note the return in last method so that we can chain MyShared1 := MyShared2 := someValue...

Solution 4) like in Cincom VisualWorks, let all SharedVariableBinding carry their own initialization code (which is like a clean block). But this does not solve the initialization order problems and this requires tools support (and also source code management support for storing/retrieving/executing the code associated to the shared variable binding).

The pragma handling common to all solutions:
Just create this class
InstructionClient subclass: #SharedInitializationFinder
    instanceVariableNames: 'usedBeforeDefined initializedBindingsByFetchingValue initializedBindingsBySendingValue stack stackPerPC pc'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Utility'`

With these class methods:
    "SharedInitializationFinder initialize."
    self registerForEvents
    SystemChangeNotifier uniqueInstance noMoreNotificationsFor: self.
    SystemChangeNotifier uniqueInstance notify: self ofAllSystemChangesUsing: #initializeEvent:.
scanMethod: aCompiledMethod
    "Answer with a list of variable bindings initialized by aCompiledMethod via sending #value: message."
    ^self new scanMethod: aCompiledMethod
initializeEvent: anEvent
    "Check if this system event defines or removes an automatic initialization."
    | aClass aSelector method |
    (anEvent itemKind = SystemChangeNotifier classKind and: [anEvent isRemoved])
        ifTrue: ["We should track initializations bound to this class..."].
    anEvent itemKind = SystemChangeNotifier methodKind ifTrue: [
        aClass := anEvent itemClass.
        aClass isMeta ifFalse: [^self]. "ignore instance methods"
        aClass := aClass theNonMetaClass.
        aSelector := anEvent itemSelector.
        aSelector numArgs > 0 ifTrue: [^self]. "ignore methods with arguments"
        (anEvent isRemoved) ifTrue: [
            "We should track initializations bound to this class and method..."].
        (anEvent isAdded or: [anEvent isModified]) ifTrue: [
            method := anEvent item.
            method pragmas do: [:pragma |
                | shared |
                pragma keyword == #initialize ifTrue: [
                    shared := self scanMethod: method.
                    shared do: [:binding | binding setInitializer: (MessageSend receiver: aClass selector: aSelector)]]]]].
On instance side, let the class process the bytecodes decoding (most are omitted), the most important being:
    usedBeforeDefined := Set new.
    initializedBindingsByFetchingValue := Set new.
    initializedBindingsBySendingValue := Set new.
    "Answer with the list of variable bindings being initialized by target CompiledMethod by directly storing into value."
    "Answer with the list of variable bindings being initialized by target CompiledMethod by directly sending value: message."
pushLiteralVariable: anAssociation
    "Push Contents Of anAssociation On Top Of Stack bytecode."
    stack addLast: anAssociation
popIntoLiteralVariable: anAssociation
    "Remove Top Of Stack And Store Into Literal Variable bytecode."
    initializedBindingsByFetchingValue add: anAssociation.
    stack removeLast
send: aSelector super: supered numArgs: numberArguments
    | theReceiver |
    1 to: numberArguments do: [:i | stack removeLast].
    theReceiver := stack removeLast.
    theReceiver ifNotNil: [
        aSelector == #value ifTrue: [usedBeforeDefined add: theReceiver].
        aSelector == #value: ifTrue: ["Protect against trivial infinite loops..."
            (usedBeforeDefined includes: theReceiver)
                ifFalse: [initializedBindingsBySendingValue add: theReceiver]]].
    stack addLast: nil
scanMethod: method
    | scanner end |
    self initializePC.
    scanner := InstructionStream on: method.
    end := method endPC.
    [(pc := scanner pc) <= end] whileTrue:
        [stackPerPC at: pc ifPresent: [:restoredStack | stack := restoredStack].
        scanner interpretNextInstructionFor: self].
    ^self initializedBindingsBySendingValue

Last, we need to define a hook, such that any binding could become an uninitialized or auto initialized variable binding, for example, solution 3) would be:
LookUpKey>>setInitializer: anEvaluator
    "Set the evaluator responsible of initializing variable value.
    anEvaluator should understand the message value."
    self become: (UninitializedVariableBinding key: key value: anEvaluator).

The nice thing with above hook is that each time you change the initialization method, the variables will be reset by the setInitializer: mechanism. No more manual doIt : self initializeThisOrThat.
Of course, there are more things to handle before getting a fully functioning framework:
1) what if you remove an initializer method/class ?
2) what if you define two or more initializer methods ?
Especially, if you define a second initialization method then, oops, remove it, the variables should switch back to the first initialization...
3) detecting unchanged initialization method and avoiding unecessary re-initialization would be smart.
But these are boring little details... (where the devil is hidden as you may know).