Posted on 17/08/2013

GHC gets new primops

I finished 7th week of my MSR internship in Cambridge, with 5 more weeks to go. I’m working on Cmm optimizations, but so far my project has been moving rather slowly and I’m not yet sure if the final results will be beneficial enough to be merged into HEAD. But I finally finished something I’ve been working on for the last couple of months - patches for new comparison PrimOps in GHC (Trac ticket #6135) have been merged on Wednesday. I created a wiki page which gives detailed information about motivation behind this change, discusses some implementation details and presents preliminary benchmarking results. This post is aimed at people who want to learn more about how the previous and current implementation of comparisons work.

So what is a Primitive Operation - PrimOp for short - in GHC? The developer wiki defines primops as “functions that cannot be implemented in Haskell, and are provided natively by GHC”. In other words these are special functions, which are not defined in the standard libraries - because these libraries are written in Haskell and, as we just said, PrimOps cannot be expressed in Haskell - but instead are built into the compiler. Whenever GHC encounters a PrimOp invocation in the source code it magically generates machine code for that PrimOp. There’s a lot of PrimOps in GHC (their definitions are here) and my task was to modify the ones responsible for comparing unboxed primitive values like Int# or Double#. Up till now all these comparisons returned a Bool indicating whether a given equality or ordering relation holds between the two parameters. This was determined by comparing them using a machine instruction like cmp, which set flags in flag register of the processor. Since the result of a primop was supposed to be a Bool (boxed, lifted value) and not some flags, we had to use another instruction ((From SETcc family.)) to examine flags set by cmp and based on them set a value of a general purpose register to 0# (representing False) or 1# (representing True). This gives us an Int#, so still not a Bool, but fear not! GHC has a tagToEnum# primop, which takes an unboxed integer and produces a value of an enumeration ((Algebraic data type with nullary constructors)). Calling tagToEnum# 0# means “create value represented by the first value constructor” (in case of Bool this is False), while calling tagToEnum# 1# means “create value represented by the second value constructor” (in case of Bool this is True) ((Note that tagToEnum# has type Int# -> a, which means it is a polymorphic primop - quite an unusual thing)). So we have our Bool value. Now we must examine whether it is True or False and act accordingly. Since Bool is both boxed and lifted, its values are represented by closures. Bool is a bit of a special case, because closures for its value constructors are allocated statically in the data area of a compiled program and not dynamically on the heap. Normally we would have to inspect a returned closure, but GHC has two optimisations which kick in here. One of them is pointer tagging, which allows us to determine a value constructor in a closure without dereferencing the pointer. This is done using pointer tagging. But we don’t even use pointer tagging in this case, because GHC is even smarter: it optimizes away closures generated by comparison primops and just uses an unboxed integer value generated by machine instructions doing comparison. This requires a special case in the code generator which is Not A Good Thing.

That’s how things were up till now. The key idea behind my change is that code generator no longer generates implicit call to tagToEnum#. Instead, the new primops return an Int# (either 0# or 1#) and whenever we really want a Bool we have to call tagToEnum# explicitly in the Haskell source code.

There is one thing that could be improved here. We still need that special case in the code generator to avoid generating Bool closures and inspecting results by checking pointer tags in situations where we make an explicit call to tagToEnum#. But Simon already has an idea how to resolve that problem. We just need to add some extra simplifier transformations that will optimize away tagToEnum# at the Core level, although I’m not yet sure if I’ll have enough time to implement that during my internship.

Back