Taking magic out of GHC or: Tracing compilation by transformation
When I was planning to learn about compilers I heard one phrase that still sticks in my mind:
Taking compilers course is a good thing because it takes magic out of compilers.
It may have been put into words differently but this was the meaning. Having taken compilers course and learning about GHC’s internals I think this is so true. A compiler is no longer a black box. It’s just a very complicated program built according to some general rules. In this post I want to reveal a little bit of magic that GHC does behind the scenes when compiling your Haskell program.
Imagine you are writing an image processing algorithm and you want to check
whether pixel coordinates x
and y
are within the image. That’s simple: if
any of these coordinates is less than 0
or if x
equals or exceeds image
width or if y
equals or exceeds image height than coordinates are not within
the image. Here’s how we can express this condition in Haskell:
case (x < 0) || (x >= width) || (y < 0) || (y >= height) of
False -> e1 -- do this when (x,y) is within the image
True -> e2 -- do that when (x,y) is outside of image
When compiling a Haskell program GHC uses language called Core as an
intermediate representation. It’s a very simplified ((We usually call it
“desugared”, because simplifying Haskell to Core simply removes syntactic
sugar.)) form of Haskell that has only let and case expressions and type
annotations. You don’t need any knowledge of Core to understand this post but if
you want to learn more I suggest to start with a blog post by Gabriel
Gonzalez and then take a
look at Edward Z. Yang’s
post,
that also shows GHC’s other intermediate languages: STG and Cmm. You can see
Core representation to which your program was transformed by passing
-ddump-simpl
flag to GHC (I also use -dsuppress-al
l to get rid of all type
informations that obscure the output). If you compile above case expression with
optimisations (pass the -O2
option to GHC) you end up with following Core
representation:
case <# x 0# of _ {
False ->
case >=# x width of _ {
False ->
case <# y 0# of _ {
False ->
case >=# y height of _ {
False -> e1;
True -> e2;
};
True -> e2;
};
True -> e2;
};
True -> e2;
}
GHC turned our infix comparison operators into prefix notation. It also unboxed
integer variables. This can be noticed by #
appended to integer literals and
comparison operators. There’s also some syntactic change in case
expressions:
there are braces surrounding the branches, semicolon is used to delimit branches
from each other and there is a mysterious underscore after the keyword of
. We
can rewrite this in a more familiar Haskell syntax (I will also reverse the
order of True
and False
branches - it will be more readable):
case x < 0 of
True -> e2
False ->
case x >= width of
True -> e2
False ->
case y < 0 of
True -> e2
False ->
case y >= height of
True -> e2
False -> e1
The most noticeable thing however is that our original case
expression has
suddenly turned into four nested case
s, which resulted in duplicating
expression e2
four times. In this post I will show you how GHC arrived at this
representation.
A bit of theory
There are some things you need to know in order to understand how GHC
transformed the code. First is the definition of (||)
operator (logical or):
(||) :: Bool -> Bool -> Bool
(||) x y = case x of
True -> True
False -> y
When optimizations are turned on GHC performs inlining of short functions. This
means that function calls are replaced by function definitions and this will be
the case with (||)
function.
Second thing is case-to-case code transformation. Imagine a code fragment like this:
case (
case C of
B1 -> F1
B2 -> F2
) of
A1 -> E1
A2 -> E2
We have a case
expression nested within a scrutinee ((A scrutinee is an
expression which value is checked by the the case
expression. Scrutinee is
placed between words ‘case’ and ‘of’.)) of another case
expression. You may be
thinking that you would never write such a code and you are right. GHC however
compiles programs by performing subsequent Core-to-Core transformations and such
nesting of case
expressions is often generated during that process (as we will
see in a moment). If nested case
expressions appear in the Core representation
of a program they are turned inside out by case-of-case transformation: the
nested case
scrutinising C
becomes the outer case
expression and the outer
case expression is pushed into branches B1
and B2
:
case C of
B1 -> case F1 of
A1 -> E1
A2 -> E2
B2 -> case F2 of
A1 -> E1
A2 -> E2
You see that code for E1
and E2
has been duplicated. This is worst case
scenario. In real life programs one of the branches can often be simplified
using case-of-known-constructor transformation. See what happens when expression
returned by a branch of nested case is a constructor that is matched by outer
case (A1
in this example):
case (
case C of
B1 -> A1
B2 -> F2
) of
A1 -> E1
A2 -> E2
After performing case-of-case transformation we end up with:
case C of
B1 -> case A1 of
A1 -> E1
A2 -> E2
B2 -> case F2 of
A1 -> E1
A2 -> E2
In the first branch of outer case expression we are now matching A1
against
A1
. So we know that first branch will be taken and thus can get rid of this
case
expression reducing it to E1
:
case C of
B1 -> E1
B2 -> case F2 of
A1 -> E1
A2 -> E2
Thus only E1
was duplicated. We will see that happen a lot in a moment.
The fun begins
Knowing all this we can begin optimizing our code:
case (x < 0) || (x >= width) || (y < 0) || (y >= height) of
False -> e1
True -> e2
First thing that happens is inlining of (||)
operators. The call to (x < 0) || (x >= width)
is replaced by definition of (||)
:
case (
case (x < 0) of -- this case comes from definition of ||
True -> True
False -> (x >= width)
) || (y < 0) || (y >= height) of
False -> e1
True -> e2
Let’s inline next use of (||)
:
case (
case ( -- this case is introduced by inlining of second ||
case (x < 0) of
True -> True
False -> (x >= width)
) of
True -> True
False -> (y < 0)
) || (y >= height) of
False -> e1
True -> e2
One more inlining and where done with (||)
:
case (
case ( -- this case is introduced by inlining of last ||
case (
case (x < 0) of
True -> True
False -> (x >= width)
) of
True -> True
False -> (y < 0)
) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
We ended up with three case
s nested as scrutinees of other case
s - I told
you this will happen. Now GHC will start applying case-of-case transformation to
get rid of all this nesting. Let’s focus on two most internal case
s for
simplicity:
case (
case (x < 0) of
True -> True
False -> (x >= width)
) of
True -> True
False -> (y < 0)
Performing case-of-case transformation on them gives:
case (x > 0) of -- nested case is floated out
True ->
case True of -- outer case is pushed into this branch...
True -> True
False -> (y < 0)
False ->
case (x >= width) of -- ...and into this branch
True -> True
False -> (y < 0)
Looking at first nested case
we see that case-of-known-constructor
transformation can be applied:
case (x < 0) of
True -> True -- case-of-known-constructor eliminated
-- case expression in this branch
False ->
case (x >= width) of
True -> True
False -> (y < 0)
Now let’s put these case
s back into our expression:
case (
case (
case (x < 0) of
True -> True
False ->
case (x >= width) of
True -> True
False -> (y < 0)
) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
Now we only have two case
s nested as scrutinees of other case
. Applying
case-of-case one more time will get rid of the first nesting:
case (
case (x < 0) of
True -> -- we can use case-of-known-constructor here
case True of
True -> True
False -> (y >= height)
False ->
case ( -- these nested cases weren't here before!
case (x >= width) of
True -> True
False -> (y < 0)
) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
Hey, that’s something new here! We eliminated nested case
s in one place only
to introduce them in another. But we know what to do with nested case
s - use
case-of-case of course. Let’s apply it to the second branch and
case-of-known-constructor to the first one:
case (
case (x < 0) of
True -> True -- case-of-known-constructor used here
False ->
case (x >= width) of -- case-of-case used here
True ->
case True of -- what about this case?
True -> True
False -> (y >= height)
False ->
case (y < 0) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
We just got another chance to perform case-of-known-constructor:
case (
case (x < 0) of
True -> True
False ->
case (x >= width) of
True -> True -- case-of-known-constructor
False ->
case (y < 0) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
We have on more nested case
to eliminate. Let’s hit it:
case (x < 0) of
True ->
case True of
False -> e1
True -> e2
False ->
case (
case (x >= width) of
True -> True
False ->
case (y < 0) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
Do you see how expressions e1
and e2
got duplicated? Let’s apply
case-of-known-constructor in the first branch and case-of-case +
case-of-known-constructor in the second one:
case (x < 0) of
True -> e2
False ->
case (x >= width) of
True -> e2
False ->
case (
case (y < 0) of
True -> True
False -> (y >= height)
) of
False -> e1
True -> e2
One more case-of-case:
case (x < 0) of
True -> e2
False ->
case (x >= width) of
True -> e2
False ->
case (y < 0) of
True ->
case True of
False -> e1
True -> e2
False ->
case (y >= height) of
False -> e1
True -> e2
And one more case-of-known-constructor:
case (x < 0) of
True -> e2
False ->
case (x >= width) of
True -> e2
False ->
case (y < 0) of
True -> e2
False ->
case (y >= height) of
False -> e1
True -> e2
And we’re done! We arrived at the same expression that GHC compiled. Wasn’t that simple?
Summary
This should give you an idea of how GHC’s core-to-core transformations work. I’ve only shown you two of them - case-of-case and case-of-known-constructor - but there are many more. If you’re interested in learning others take a look at paper by Simon Peyton Jones and Andre Satnos “A transformation-based optimiser for Haskell”. If you want to learn more details than the paper provides see Andre Santos’ PhD thesis “Compilation by Transformation in Non-Strict Functional Languages”. You can also take a look at a discussion at GHC ticket 6135.