The GHC Commentary - Cunning Prelude Code

GHC's uses a many optimsations and GHC specific techniques (unboxed values, RULES pragmas, and so on) to make the heavily used Prelude code as fast as possible.

Par, seq, and lazy

In GHC.Conc you will dinf

  pseq a b = a `seq` lazy b

What's this "lazy" thing. Well, pseq is a seq for a parallel setting. We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict in b, then b might be evaluated before a, which is all wrong.

Solution: wrap the 'b' in a call to GHC.Base.lazy. This function is just the identity function, except that it's put into the built-in environment in MkId.lhs. That is, the MkId.lhs defn over-rides the inlining and strictness information that comes in from GHC.Base.hi. And that makes lazy look lazy, and have no inlining. So the strictness analyser gets no traction.

In the worker/wrapper phase, after strictness analysis, lazy is "manually" inlined (see WorkWrap.lhs), so we get all the efficiency back.

This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an Int#. Now there is no seq# operator at all.

fold/build

There is a lot of magic in PrelBase.lhs - among other things, the RULES pragmas implementing the fold/build optimisation. The code for map is a good example for how it all works. In the prelude code for version 5.03 it reads as follows:

map :: (a -> b) -> [a] -> [b]
map _ []     = []
map f (x:xs) = f x : map f xs

-- Note eta expanded
mapFB ::  (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
{-# INLINE [0] mapFB #-}
mapFB c f x ys = c (f x) ys

{-# RULES
"map"	    [~1] forall f xs.	map f xs		= build (\c n -> foldr (mapFB c f) n xs)
"mapList"   [1]  forall f.	foldr (mapFB (:) f) []	= map f
"mapFB"	    forall c f g.	mapFB (mapFB c f) g	= mapFB c (f.g) 
  #-}

Up to (but not including) phase 1, we use the "map" rule to rewrite all saturated applications of map with its build/fold form, hoping for fusion to happen. In phase 1 and 0, we switch off that rule, inline build, and switch on the "mapList" rule, which rewrites the foldr/mapFB thing back into plain map.

It's important that these two rules aren't both active at once (along with build's unfolding) else we'd get an infinite loop in the rules. Hence the activation control using explicit phase numbers.

The "mapFB" rule optimises compositions of map.

The mechanism as described above is new in 5.03 since January 2002, where the [~N] syntax for phase number annotations at rules was introduced. Before that the whole arrangement was more complicated, as the corresponding prelude code for version 4.08.1 shows:

map :: (a -> b) -> [a] -> [b]
map = mapList

-- Note eta expanded
mapFB ::  (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
mapFB c f x ys = c (f x) ys

mapList :: (a -> b) -> [a] -> [b]
mapList _ []     = []
mapList f (x:xs) = f x : mapList f xs

{-# RULES
"map"	  forall f xs.  map f xs	       = build (\c n -> foldr (mapFB c f) n xs)
"mapFB"	  forall c f g. mapFB (mapFB c f) g    = mapFB c (f.g) 
"mapList" forall f.	foldr (mapFB (:) f) [] = mapList f
 #-}

This code is structured as it is, because the "map" rule first breaks the map open, which exposes it to the various foldr/build rules, and if no foldr/build rule matches, the "mapList" rule closes it again in a later phase of optimisation - after build was inlined. As a consequence, the whole thing depends a bit on the timing of the various optimsations (the map might be closed again before any of the foldr/build rules fires). To make the timing deterministic, build gets a {-# INLINE 2 build #-} pragma, which delays build's inlining, and thus, the closing of the map. [NB: Phase numbering was forward at that time.]

Last modified: Mon Feb 11 20:00:49 EST 2002