Sunday, April 15, 2012

Meteor Contest - Part 8 dumb and fast

Remember our clever trick of rotating the board 180° once 6th row is filled up?
The possible positions of pieces were simple, just one case for south edge, and another for all the other pairs of rows...
...and another two cases after we rotate the board 180°, because we have to take complementary orientations of the 6th piece into account in order to generate only half the solutions.

All in all, that was not so simple. So let's go back to the dumb case of generating all possible positions for all rows.
We keep the shift of south row pairs as soon as filled to avoid generating costly LargePositiveInteger, so we still generate all possible positions on two rows:

    ^pieces collect: [:aPieceMask |
        | possible iRot |
        possible := (Array new: twoRows) collect: [:freeCell | Array new: 12 withAll: (ShootoutMeteorPiece new mask: 0)].
        iRot := 0.
        self rotationsOf: aPieceMask do: [:rotated |
            iRot := iRot + 1.
            self placesFor: rotated do: [:shifted |
                (possible at: shifted lowBit) at: iRot put: (ShootoutMeteorPiece
                    mask: ((self hasEastOrWestIsland: shifted) ifTrue: [0] ifFalse: [shifted])
                    islands: (self islandsFor: (shifted bitOr: shifted - 1)))]].

Note that (shifted bitOr: shifted - 1) previously was in islandsFor: but we have to change it for finding north islands.

We just have to eliminate different cases for each of the 5 different twoRows offsets.
For this, we will have to detect the case of north edge again.
So our possible positions becomes:

    | positionsPerPiecePerCell thePieceWhichBreakSymmetry |
    positionsPerPiecePerCell := self possiblePositionsOnTwoRows.
    thePieceWhichBreakSymmetry := (1 to: pieces size) detectMax: [:i | (positionsPerPiecePerCell at: i) detectSum: [:orientations | orientations count: [:e | e mask > 0]]].
    positionsPerPiece := (1 to: 5) collect: [:iRow |
        | maxMaskForRow northRow |
        maxMaskForRow := (1 bitShift: (#(6 6 6 4 2) at: iRow) * ncol) - 1.
        northRow :=  southEdge bitShift: ((#(-1 -1 6 4 2) at: iRow) - 1 * ncol).
        (1 to: twoRows) collect: [:cellNumber |
            (1 to: pieces size) collect: [:pieceNumber |
                | orientations n |
                orientations := (positionsPerPiecePerCell at: pieceNumber) at: cellNumber.
                n := pieceNumber = thePieceWhichBreakSymmetry ifTrue: [6] ifFalse: [12].
                Array new: n streamContents: [:str |
                    1 to: n do: [:i |
                        | aPiece |
                        aPiece := orientations at: i.
                        (aPiece mask > 0 and: [aPiece mask <= maxMaskForRow])
                                [(iRow = 1 and: [cellNumber <= ncol])
                                    ifTrue: [(self hasSouthIsland: aPiece mask)
                                        ifFalse: [str nextPut: (ShootoutMeteorPiece mask: aPiece mask islands: 0)]]
                                    ifFalse: [(aPiece mask bitAnd: northRow) > 0
                                        ifTrue: [(self hasNorthIsland: aPiece mask row: iRow)
                                                [| isle |
                                                isle := iRow = 5
                                                    ifTrue: [0]
                                                    ifFalse: [self northIslandsFor: aPiece mask row: iRow].
                                                str nextPut: (ShootoutMeteorPiece mask: aPiece mask islands: isle)]]
                                        ifFalse: [str nextPut: aPiece]]]]]]]]

Quite a long method...
But we now offer the luxury of recomputing which piece shall better have half orientations removed (thePieceWhichBreakSymmetry), instead of hardcoding 6.
And we compute the case of northern islands (except on the last two rows, we'll see later why).

For detecting north islands, we change our existing island detection to stop a bit earlier:

fillMaskStartingAt: pos stoppingAbove: maxCell ifFoundEnough: exitBlock
    (fillMask bitAnd: pos) = 0 ifFalse: [^self].
    (pos > maxCell) ifTrue: [^exitBlock value].
    fillMask := fillMask + pos.
    (self canShiftE: pos) ifTrue: [self fillMaskStartingAt: (self shiftE: pos) stoppingAbove: maxCell ifFoundEnough: exitBlock].
    (self canShiftNE: pos) ifTrue: [self fillMaskStartingAt: (self shiftNE: pos) stoppingAbove: maxCell ifFoundEnough: exitBlock].
    (self canShiftNW: pos) ifTrue: [self fillMaskStartingAt: (self shiftNW: pos) stoppingAbove: maxCell ifFoundEnough: exitBlock].
    (self canShiftW: pos) ifTrue: [self fillMaskStartingAt: (self shiftW: pos) stoppingAbove: maxCell ifFoundEnough: exitBlock].

islandsFor: aPieceMask
    | islands aMask nextFreeCellMask open top |
    islands := 0.
    fillMask := aPieceMask.
    top := 1 << (fillMask highBit - 1 // ncol * ncol - 1).
    [(nextFreeCellMask := 1 + fillMask bitAnd: -1 - fillMask) <= top]
            [open := false.
            aMask := fillMask.
            self fillMaskStartingAt: nextFreeCellMask stoppingAbove: top ifFoundEnough: [open := true].
            open ifFalse: [islands := islands + (fillMask - aMask)]].

Finding case of bad northern island is same as bad southern island once we reverse the bits:
hasNorthIsland: aPieceMask row: iRow
    | bitReverse |
    bitReverse := (#(-1 -1 6 4 2) at: iRow) * ncol.
    ^self hasSouthIsland: (aPieceMask bitReverse: bitReverse)

And finding the possibly bad islands is also a bit more tricky on north:
northIslandsFor: aPieceMask row: iRow
    | filled isleSEW bitReverse isleNE isleNW |
    bitReverse := (#(-1 -1 6 4 2) at: iRow) * ncol.
    filled := aPieceMask bitOr: aPieceMask - 1.
    isleSEW := self islandsFor: filled.
    (aPieceMask bitAnd: (eastEdge bitOr: westEdge)) = 0 ifFalse: [^isleSEW].
    (isleSEW bitAnd: (eastEdge bitOr: westEdge)) = 0 ifFalse: [^isleSEW].
    (southEdge bitAnd: aPieceMask) = 0
        ifTrue: [filled := filled >> ncol << ncol].
    isleNE := ((self islandsFor: (filled bitReverse: bitReverse)) bitReverse: bitReverse) bitOr: isleSEW.
    isleNW := (1 << bitReverse - 1 - (isleNE bitOr: (aPieceMask bitOr: aPieceMask - 1))) bitOr: isleSEW.
    ^isleNW bitCount < isleNE bitCount
        ifTrue: [isleNW]
        ifFalse: [isleNE]

Why being so clever? This deserves some explanations.
Obviously, if we put a barrier on south with aPieceMask bitOr: aPieceMask - 1 then every remaining cell will be an island. We must isolate more useful information. 
Knowing that previous cells were filled for sure (dark gray), we first see if we can find an isle (light gray) on east or west like these cases:

or on south, like this case:

If the piece touches an east or west edge, then we're done because if island there is, it already has been handled by hasNorthIsland:row:
If the islands touches east or west edge, then we're done too, we gathered enough information.

Else, we then find isleNE only in north-east side by removing the south row bits which belong to previous pieces:

But the complementary island isleNW in north-west side might be smaller.

So we consider only the island with lesser bitCount, because it will fast up our tests inside inner loops.

Last thing, there was one case we forgot, which is when a piece touches both east and west edges, it then must have a multiple of 5 free cells below (and above), otherwise be eliminated.
hasEastOrWestIsland: aMask
    ^ (self hasInsetZero: southEdge * (eastEdge bitAnd: aMask))
        or: [(self hasInsetZero: southEdge * (westEdge bitAnd: aMask))
            or: [(aMask bitAnd: eastEdge) > 0 and: [(aMask bitAnd: westEdge) > 0 and: [(self findIsland: aMask) bitCount \\ 5 > 0]]]]

Let's see how many pieces*orientations we get for each pair of rows:
| possible |
possible := ShootoutMeteorBoard default instVarNamed: 'positionsPerPiece'.
possible collect: [:e | e inject: 0 into: [:s1 :e1 |
    s1 + (e1 inject: 0 into: [:s2 :e2 |
    s2 + (e2 size)])]]
-> #(487 600 594 326 19)

Yes, that's pretty few on the last pair, so this should accelerate the solving a lot! And that's also why we did not bother about island tests on the last two rows...

Now we modify the search algorithm again, and remove the bitReverse:
searchPuzzlesWithColorMask: colorMask boardMask: bMask rowOffset: rowOff pieces: pArray ifFound: solutionBlock
    | nextFreeCell possibles colorBit iRow boardMask |
    colorMask = 0 ifTrue: [ ^solutionBlock value: (self boardStringWithPieces: pieces) ].
    loopCount := loopCount + 1.
    boardMask := bMask.
    iRow := rowOff.
    [(nextFreeCell := (boardMask + 1) lowBit) > twoRows]
            [ iRow := iRow + 2.
            boardMask := boardMask >> twoRows ].
    possibles := (positionsPerPiece at: iRow // 2 + 1) at: nextFreeCell.
    colorBit := 1.
    1 to: pieces size do: [:pieceNumber |
        (colorMask bitAnd: colorBit) = 0
                [ | positions |
                positions := possibles at: pieceNumber.
                1 to: positions size do: [:i |
                    | aPiece |
                    ((aPiece := positions at: i) fitOnBoard: boardMask)
                            [pieces at: pieceNumber put: (aPiece forRow: iRow).
                            self addPiece: aPiece index: pieceNumber forRow: iRow.
                            self searchPuzzlesWithColorMask: colorMask - colorBit boardMask: boardMask + aPiece mask rowOffset: iRow pieces: pArray ifFound: solutionBlock.
                            self removePiece: aPiece index: pieceNumber forRow: iRow]]].
        colorBit := colorBit * 2].

Since we don't bitReverse anymore, we have to correct filling of board with a piece mask:
ShootoutMeteorPiece>>fillSolution: aString ncol: ncol withColor: c
    | offset |
    offset := row * ncol.
    mask bitsDo: [:k | aString at: offset + k put: c]

ShootoutMeteorBoardMorph>>fillPiece: aPiece forRow: row withColor: hColor
    | offset |
    offset := row * 5.
    aPiece mask bitsDo: [:k | (submorphs at: offset + k) fillColor: hColor].
    self changed.

Twice the same code, but we prefer to duplicate for CPU optimization.

If we try it without the instrumentation hooks, 
[ShootoutMeteorBoard solveDefault] timeToRun.
-> 507 (milliseconds)
That's a few milliseconds more with the hooks (525 to 530).
Only 6 times slower than the faster C++ candidate, great!
And we get:
545906 loops
The C++ solution only has
309378 loops

But there is more memory and more island tests involved, I'm not sure whether I can find a better balance in Smalltalk...

Here is an animation, you can see by yourself that first solution is found faster:
(ShootoutMeteorInstrumentedBoard default) forMovie: true; solvedPuzzleDo: [:e | ^self].

Ah, it's good to be dumb!
The materials is at

No comments:

Post a Comment