What he did was basically to remove laziness and polymorphism from the state monad using a fancy new GHC feature called indexed type families. Benchmarking the different implementation was done by calculating the Fibonacci sequence and printing the 400,000th element.

There are quite a number of such adaptive data types. They range from lists to maps to monads but they all share two fundamental drawbacks: (1) All usage combinations must be explicitly enumerated, (2) laziness must be eliminated. Fortunately for LHC, using whole-program optimization solve both problems (by static analysis and unboxing at a higher granularity).

I believe it's important to realise that polymorphism and laziness are two sides of the same coin. Destroy one and you are likely to inadvertently destroy the other. 'Is this a bad thing?' you might ask. Well, the short answer is "yes!". Laziness, when used correctly, is incredibly powerful. Let's have another look at the State monad benchmark.

The following program implements the above mentioned benchmark. It is 10 times faster and uses 50 times less memory than the most efficient strict version.

{-# OPTIONS_GHC -O2 -XBangPatterns #-}

import Control.Monad

import Control.Monad.State

main = print . last . fst $ fib 400000

fib n = unS (fibN n) (0,1::Int)

fibN n = replicateM' n oneFib

oneFib = (get >>= \(m,n) -> put (n,m+n) >> return m)

replicateM' 0 fn = return []

replicateM' n fn

= do !e ← fn

r ← replicateM' (n−1 ∷ Int) fn

return (e:r)

So, in conclusion: Constructor specialization is an interesting technique but its full power can only be realised as an interprocedural optimization pass.

Would you care to elaborate on "whole-program optimization solve both problems (by static analysis and unboxing at a higher granularity)." and on "polymorphism and laziness are two sides of the same coin. Destroy one and you are likely to inadvertently destroy the other".

ReplyDeleteI'm sure you're right, but I'd like to see how.

Simon