RTLs are quite simple: a solitary RTL is just a single abstract machine instruction. A sequence of RTLs might look like so:

t := reg
t := mem * reg
t := t + reg
t := t * 10

Every RTL must satisfy one simple property, the machine invariant: each RTL maps to a single instruction on the target machine.
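To make this concrete, here is a minimal sketch of what such an RTL representation could look like in Haskell. All type and constructor names are invented for illustration, and the reading of `t := mem * reg` as a multiply involving a memory load is a guess:

```haskell
-- One RTL value stands for one abstract machine instruction.
data Loc = Temp Int | Reg Int | Mem Loc
  deriving (Eq, Show)

data Operand = L Loc | Imm Int
  deriving (Eq, Show)

data Op = Add | Mul
  deriving (Eq, Show)

data RTL
  = Move Loc Operand               -- loc := operand
  | BinOp Op Loc Operand Operand   -- loc := operand `op` operand
  deriving (Eq, Show)

-- The sequence from the text, roughly:
example :: [RTL]
example =
  [ Move (Temp 0) (L (Reg 1))                         -- t := reg
  , BinOp Mul (Temp 0) (L (Mem (Reg 2))) (L (Reg 1))  -- t := mem * reg
  , BinOp Add (Temp 0) (L (Temp 0)) (L (Reg 1))       -- t := t + reg
  , BinOp Mul (Temp 0) (L (Temp 0)) (Imm 10)          -- t := t * 10
  ]
```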
The compilation process works like this: the input language (for example C--, GRIN, or any AST for that matter) is fed to the code expander, which translates the AST into a simple sequence of RTLs. No attention is paid to the quality of the generated code at this point, which keeps the code expander's job simple and easy. The code expander is, however, a machine-dependent part of the backend: it requires knowledge of the target machine in order to establish the machine invariant.
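A sketch of what such a code expander could look like for a toy expression AST (all names here are hypothetical): it flattens the tree into one RTL per abstract instruction, inventing a fresh temporary for each intermediate result, and makes no attempt to produce good code:

```haskell
-- Toy source AST and toy RTLs for the sketch.
data Expr = Lit Int | Var String | Plus Expr Expr

data RTL = LoadImm Int Int    -- t<n> := constant
         | LoadVar Int String -- t<n> := variable
         | AddR Int Int Int   -- t<n> := t<a> + t<b>
  deriving (Eq, Show)

-- expand n e returns (instructions, temporary holding the result,
-- next fresh temporary number). The Int argument threads a counter
-- so every intermediate result gets its own temporary.
expand :: Int -> Expr -> ([RTL], Int, Int)
expand n (Lit i)    = ([LoadImm n i], n, n + 1)
expand n (Var v)    = ([LoadVar n v], n, n + 1)
expand n (Plus a b) =
  let (as, ta, n')  = expand n  a
      (bs, tb, n'') = expand n' b
  in (as ++ bs ++ [AddR n'' ta tb], n'', n'' + 1)
```

For instance, expanding `x + 1` starting from temporary 0 yields the naive sequence `t0 := x; t1 := 1; t2 := t0 + t1`, leaving all cleverness to the later optimization passes.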
After expansion, the code is optimized using e.g. peephole optimization, common subexpression elimination and dead code elimination. These optimization passes are machine-independent, but every one of them must make sure it does not violate the machine invariant.
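As an example, a machine-independent peephole pass over RTLs might fold an addition of two constants. Because each rewrite replaces RTLs with RTLs, the machine invariant holds by construction. This sketch (with invented names) naively assumes the folded temporaries are dead afterwards, which a real pass would verify with liveness information:

```haskell
-- Toy RTLs for the sketch.
data RTL = LoadImm Int Int  -- t<n> := constant
         | AddR Int Int Int -- t<n> := t<a> + t<b>
  deriving (Eq, Show)

-- Fold "t<a> := x; t<b> := y; t<d> := t<a> + t<b>" into
-- "t<d> := x + y". The result is again a legal RTL, so the
-- machine invariant is preserved. NOTE: dropping the two loads
-- assumes t<a> and t<b> are not used later.
peephole :: [RTL] -> [RTL]
peephole (LoadImm a x : LoadImm b y : AddR d a' b' : rest)
  | a == a' && b == b' = peephole (LoadImm d (x + y) : rest)
peephole (i : rest) = i : peephole rest
peephole []         = []
```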
After optimization comes code emission, the only other machine-dependent part of the compiler.
Applied to LHC, it is quite obvious that a new backend is necessary, and I believe this compilation method is both applicable and preferable. Combined with whole-program analysis and my plans for a runtime that is as small and simple as possible, it makes the compiler easier to retarget, and I want LHC to be a cross-compiler as well. Compiling via C is clearly not the way to go, so minimizing the burden of retargeting seems like a good strategy. My eventual goal is that you can take a Haskell application, compile it with LHC, and get nice assembly code for whatever architecture you wish.
The overall structure of the new backend I am thinking of is something like this (drawn very badly using Project Draw):
We will perform as many optimizations as possible while the whole program is in GRIN form, leaving the backend to do peephole optimization, dead code elimination and the like in a machine-independent way, and then emit assembly code.
As for an optimization engine in which to implement these passes, the approach described in "An Applicative Control-flow Graph based on Huet's Zipper" seems promising as well. Garbage collection also definitely needs addressing before we can get LHC to compile truly serious programs; this will take a little more time to think about and discuss with David. Eventually, we may even look into using Joao Dias' thesis research to automatically generate compiler backends, as I believe GHC HQ wants to do.