Case short-circuiting.
Since case expressions in GRIN do not force the evaluation of the scrutinized value, they are usually preceded by a call to 'eval'. Then, after the 'eval' calls have been inlined, case-of-case patterns like this are very common:
do val <- case x of
            []         -> unit []
            CCons x xs -> unit (CCons x xs)
            Ffunc a    -> func a
   case val of
     []         -> jumpToNilCase
     CCons x xs -> jumpToConsCase x xs
This is obviously inefficient, since a Nil or Cons value ends up being scrutinized twice. In the GRIN paper, Boquist deals with this by implementing a case short-circuiting optimization after the GRIN code has been translated to machine code. However, optimizing at the machine-code level is quite tricky, and I'd much rather implement this optimization in GRIN itself. By making aggressive use of small functions, we can do exactly that:
do case x of
     []         -> jumpToNilCase
     CCons x xs -> jumpToConsCase x xs
     Ffunc a    -> do val <- func a; checkCase val

checkCase val =
  case val of
    []         -> jumpToNilCase
    CCons x xs -> jumpToConsCase x xs
Register allocation and tail calls.
Using a fixed calling convention is not necessary for a whole-program compiler like LHC. Instead, we choose to create a new calling convention for each procedure (this is easier than it sounds).
The obvious consequence is that procedures which tail-call each other must use identical conventions for their return values. This was deemed an unacceptable restriction in the GRIN paper, so all tail-calls were removed before register allocation took place, and a later optimization pass reintroduced them where possible.
I believe this is too much trouble for too little gain. The possible performance hit is outweighed by the ease of implementation and the guarantee that tail-calls remain tail-calls.
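To make the restriction concrete, here is a small sketch in GRIN-style pseudocode (walkA, walkB and CDone are made-up names, not taken from the paper). Because each tail-call hands its result straight back to the original caller, walkA and walkB must place their return value in the same registers:

walkA xs =
  case xs of
    []         -> unit CDone
    CCons y ys -> walkB ys   -- tail-call: walkA's caller receives walkB's result directly

walkB xs =
  case xs of
    []         -> unit CDone
    CCons y ys -> walkA ys   -- tail-call back into walkA; the return conventions must agree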
Simple node layout.
An unevaluated value is represented simply by a function name (or tag) and a fixed number of arguments. This node is then overwritten once it has been evaluated. However, the new value may be bigger than the space allocated for the unevaluated node.
One way to deal with this is to have two different node layouts: a fixed-size node for small values and a variable-size node for big values. This is the approach taken in the GRIN paper, and it understandably adds quite a bit of complexity.
Another method is to use indirections. This trades extra heap allocations for a smaller average node size and a much simpler implementation.
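As a rough sketch of the indirection approach (the Ind tag and this particular shape of eval are illustrative assumptions, not code from the paper): when the evaluated value does not fit into the original node, it is stored as a fresh node and the old node is overwritten with a small indirection, which eval then follows transparently:

eval p =
  do node <- fetch p
     case node of
       []         -> unit []
       CCons x xs -> unit (CCons x xs)
       Ffunc a    -> do val <- func a
                        p'  <- store val     -- the result may be larger than the thunk
                        update p (Ind p')    -- overwrite the thunk with an indirection
                        unit val
       Ind p'     -> eval p'                 -- follow indirections transparently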