Notes on the LHC: The mess with variable ids.

Monday, January 12, 2009

Variable identification tags can carry four different kinds of information. In Haskell, we would write them like this:


data Id = Empty            -- Unused binding. Eg: '\ _ -> ...'.
        | Etherial Int     -- Internal variable. Only used when type-checking.
        | Anonymous Int    -- Anonymous variable created by the compiler.
        | Named Name       -- Named variable created by the user.

However, in LHC this data structure was unrolled and packed into an Int. The encoding was as follows:


  Empty        = zero
  Etherial     = negative numbers
  Anonymous    = even, positive numbers
  Named        = odd, positive numbers, used as keys in a global hash table.
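
The classification rule above can be sketched as a small decoder. This is only an illustration: `IdKind` and `classify` are hypothetical names, not LHC's actual code, and the post does not say how individual numbers were assigned within each range.

```haskell
-- Hypothetical sketch: recover the kind of an id from its packed Int form.
-- IdKind and classify are made-up names for illustration only.
data IdKind
  = EmptyKind      -- zero
  | EtherialKind   -- negative numbers
  | AnonymousKind  -- even, positive numbers
  | NamedKind      -- odd, positive numbers (keys into the global hash table)
  deriving (Eq, Show)

classify :: Int -> IdKind
classify n
  | n == 0    = EmptyKind
  | n < 0     = EtherialKind
  | even n    = AnonymousKind
  | otherwise = NamedKind
```

Note that nothing in the type `Int` records which of these four cases a value is supposed to be; the kind only exists by convention.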

This encoding gives us very fast operations on Sets and Maps, but it also punishes mistakes with a vengeance: since every id is a plain Int, the type checker cannot tell the four kinds apart. The increased performance is definitely not worth it, and we've been working on untangling the Ids from day one.
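
To illustrate the sort of mistake this invites, here is a minimal sketch. The names `PackedId`, `nextIdBuggy` and `nextAnonymous` are hypothetical, and the bug is invented for illustration rather than taken from LHC's history. Ordinary arithmetic on a packed id type-checks happily, yet it can silently move the id into a different category.

```haskell
-- Hypothetical sketch of a packed id; not LHC's actual definition.
newtype PackedId = PackedId Int
  deriving (Eq, Ord, Show)

-- Anonymous ids are the even, positive numbers.
isAnonymous :: PackedId -> Bool
isAnonymous (PackedId n) = n > 0 && even n

-- Named ids are the odd, positive numbers.
isNamed :: PackedId -> Bool
isNamed (PackedId n) = n > 0 && odd n

-- Invented bug: adding one looks like a reasonable way to get a fresh id,
-- but it turns an anonymous id into a named one.
nextIdBuggy :: PackedId -> PackedId
nextIdBuggy (PackedId n) = PackedId (n + 1)

-- Stepping by two keeps an anonymous id anonymous.
nextAnonymous :: PackedId -> PackedId
nextAnonymous (PackedId n) = PackedId (n + 2)
```

Here `nextIdBuggy (PackedId 4)` yields `PackedId 5`, which the encoding treats as a key into the global name table even though no name was ever registered for it. With the ADT, the equivalent slip would be a type error.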

As of today, I'm glad to say that we've finally restored the beautiful ADT and we can now hack without fear of segfaulting.

2 comments:

SamB, January 13, 2009 at 5:05 PM
At least, not at compile time ;-P.

Unknown, January 14, 2009 at 7:03 AM
I feel your pain. I made the same mistake in Happy and I'm still regretting it. Someday I should change it back.