The GHC Commentary - Template Haskell

The Template Haskell (TH) extension to GHC adds a meta-programming facility in which all meta-level code is executed at compile time. The design of this extension is detailed in "Template Meta-programming for Haskell", Tim Sheard and Simon Peyton Jones, ACM SIGPLAN 2002 Haskell Workshop, 2002. However, some of the details changed after the paper was published.

Meta Sugar

The extra syntax of TH (quasi-quote brackets, splices, and reification) is handled in the module DsMeta. In particular, the function dsBracket desugars the four types of quasi-quote brackets ([|...|], [p|...|], [d|...|], and [t|...|]) and dsReify desugars the three types of reification operations (reifyType, reifyDecl, and reifyFixity).

Desugaring of Quasi-Quote Brackets

A term in quasi-quote brackets needs to be translated into Core code that, when executed, yields a representation of that term in the form of the abstract syntax trees defined in Language.Haskell.TH.Syntax. Within DsMeta, this is achieved by four functions corresponding to the four types of quasi-quote brackets: repE (for [|...|]), repP (for [p|...|]), repTy (for [t|...|]), and repTopDs (for [d|...|]). All four of these functions receive as an argument the GHC-internal Haskell AST of the syntactic form that they quote (i.e., arguments of type HsExpr.HsExpr Name, HsPat.HsPat Name, HsType.HsType Name, and HsDecls.HsGroup Name, respectively).
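As a rough illustration of the trees that such quotations denote, the sketch below quotes one term with each bracket form and runs the resulting computation in IO. It assumes the present-day template-haskell library (Language.Haskell.TH together with the TemplateHaskellQuotes extension), whose API differs in detail from the 2002 implementation described here. The names quotedExp, quotedPat, quotedType, quotedDecs, and showQuoted are made up for this example.

```haskell
{-# LANGUAGE TemplateHaskellQuotes #-}
import Language.Haskell.TH

-- [| ... |] quotes an expression.
quotedExp :: Q Exp
quotedExp = [| not True |]

-- [p| ... |] quotes a pattern.
quotedPat :: Q Pat
quotedPat = [p| (True, _) |]

-- [t| ... |] quotes a type.
quotedType :: Q Type
quotedType = [t| Maybe Int |]

-- [d| ... |] quotes a list of top-level declarations.
quotedDecs :: Q [Dec]
quotedDecs = [d| data T = MkT |]

-- Run a quotation in IO and render the syntax tree it produces.
-- (Hypothetical helper, not part of the library.)
showQuoted :: Show a => Q a -> IO String
showQuoted q = fmap show (runQ q)
```

Evaluating showQuoted quotedExp in GHCi prints the expression tree as a value of the library's Exp type; the exact constructor layout depends on the library version.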

To increase the static type safety in DsMeta, the functions constructing representations do not just return plain values of type CoreSyn.CoreExpr; instead, DsMeta introduces a parametrised type Core whose dummy type parameter indicates the source-level type of the value computed by the corresponding Core expression. All construction of Core fragments in DsMeta is performed by smart constructors whose type signatures use the dummy type parameter to constrain the contexts in which they are applicable. For example, a function that builds a Core expression that evaluates to a TH type representation, which has type Language.Haskell.TH.Syntax.Type, would return a value of type

Core Language.Haskell.TH.Syntax.Type
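The idea can be sketched with a phantom type parameter. The following is a self-contained toy model, not the DsMeta code: CoreExpr is a three-constructor stand-in for CoreSyn.CoreExpr, THType and THExp are empty marker types standing in for the TH syntax types, and every function name as well as the library names "conT", "appT", and "sigE" are assumptions made for illustration.

```haskell
-- Toy stand-in for untyped Core expressions.
data CoreExpr
  = Var String
  | Lit String
  | App CoreExpr CoreExpr
  deriving (Eq, Show)

-- Marker types standing in for the TH representation types.
data THType
data THExp

-- The parameter 'a' is a phantom: it records the source-level type of the
-- value that the wrapped Core expression computes, with no runtime content.
newtype Core a = MkCore CoreExpr

-- Forget the phantom type and expose the untyped expression.
unCore :: Core a -> CoreExpr
unCore (MkCore e) = e

-- Smart constructors: the signatures restrict how fragments can be combined.
stringLit :: String -> Core String
stringLit s = MkCore (Lit s)

namedTyCon :: Core String -> Core THType
namedTyCon (MkCore s) = MkCore (App (Var "conT") s)

tyApp :: Core THType -> Core THType -> Core THType
tyApp (MkCore f) (MkCore x) = MkCore (App (App (Var "appT") f) x)

sigExp :: Core THExp -> Core THType -> Core THExp
sigExp (MkCore e) (MkCore t) = MkCore (App (App (Var "sigE") e) t)

-- Core code that would evaluate to a representation of the type Maybe Int.
maybeInt :: Core THType
maybeInt = tyApp (namedTyCon (stringLit "Maybe")) (namedTyCon (stringLit "Int"))
```

In this model, an ill-formed combination such as tyApp (stringLit "Maybe") maybeInt is rejected by the type checker, because a Core String is not a Core THType, even though both wrap the same untyped CoreExpr.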

Desugaring of Reification Operators

The TH paper introduces four reification operators: reifyType, reifyDecl, reifyFixity, and reifyLocn. Of these, currently (as of 9 Nov 2002), only the first two are implemented.

The operator reifyType receives the name of a function or data constructor as its argument and yields a representation of this entity's type in the form of a value of type TH.Syntax.Type. Similarly, reifyDecl receives the name of a type and yields a representation of the type's declaration as a value of type TH.Syntax.Decl. The name of the reified entity is mapped to the GHC-internal representation of the entity by using the function lookupOcc on the name.

Representing Binding Forms

Care needs to be taken when constructing TH representations of Haskell terms that include binding forms, such as lambda abstractions or let bindings. To avoid name clashes, fresh names need to be generated for all defined identifiers. This is achieved via the routine DsMeta.mkGenSym, which, given a Name, produces a Name / Id pair (of type GenSymBind) that associates the given Name with a Core identifier that at runtime will be bound to a string that contains the fresh name. Notice the two-level nature of this arrangement. It is necessary, as the Core code that constructs the Haskell term representation may be executed multiple times at runtime and it must be ensured that different names are generated in each run.
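The runtime half of this arrangement can be observed with the present-day template-haskell library, which is assumed here and is not the 2002 API. In this sketch, mkIdentity plays the role of generated code that builds the representation of a lambda abstraction, obtaining its binder from the library function newName each time it runs. The helper names mkIdentity, lambdaBinder, and bindersDiffer are invented for the example.

```haskell
import Language.Haskell.TH

-- Build the representation of \x -> x, generating the binder when run.
mkIdentity :: Q Exp
mkIdentity = do
  x <- newName "x"
  pure (LamE [VarP x] (VarE x))

-- Extract the binder of a single-argument lambda, if that is what we have.
lambdaBinder :: Exp -> Maybe Name
lambdaBinder (LamE [VarP n] _) = Just n
lambdaBinder _ = Nothing

-- Run the same generator twice and report whether the binders differ.
bindersDiffer :: IO Bool
bindersDiffer = do
  e1 <- runQ mkIdentity
  e2 <- runQ mkIdentity
  pure (lambdaBinder e1 /= lambdaBinder e2)
```

Both runs produce a binder whose base name is "x", yet the two Name values compare as different, which is the property the two-level scheme is meant to guarantee.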

Such fresh bindings need to be entered into the meta environment (of type DsMonad.DsMetaEnv), which is part of the state (of type DsMonad.DsEnv) maintained in the desugarer monad (of type DsMonad.DsM). This is done using the function DsMeta.addBinds, which extends the current environment by a list of GenSymBinds and executes a subcomputation in this extended environment. Names can be looked up in the meta environment by way of the functions DsMeta.lookupOcc and DsMeta.lookupBinder; more details about the difference between these two functions can be found in the next subsection.
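The scoping discipline described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the DsMonad or DsMeta code: names and identifiers are plain strings, the environment is an association list, and a desugarer computation is modelled as a function of that environment. The identifiers carry a Model suffix to mark them as stand-ins for the real functions.

```haskell
type SrcName    = String                 -- a source-level name
type CoreId     = String                 -- the Core identifier holding its fresh name
type GenSymBind = (SrcName, CoreId)
type MetaEnv    = [GenSymBind]

-- A computation that may consult the meta environment.
type M a = MetaEnv -> a

-- Run a subcomputation in an environment extended by the given bindings.
-- New bindings are placed in front, so they shadow older ones.
addBindsModel :: [GenSymBind] -> M a -> M a
addBindsModel binds body = \env -> body (binds ++ env)

-- Look up the Core identifier associated with a binder, if there is one.
lookupBinderModel :: SrcName -> M (Maybe CoreId)
lookupBinderModel name = \env -> lookup name env
```

Because the extension is passed only to the subcomputation, the bindings go out of scope automatically once it returns; no explicit removal step is needed.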

NB: DsMeta uses mkGenSym only when representing terms that may be embedded into a context where names can be shadowed. For example, a lambda abstraction embedded into an expression can potentially shadow names defined in the context it is being embedded into. In contrast, this can never be the case for top-level declarations, such as data type declarations; hence, the type variables that a parametric data type declaration abstracts over are not gensym'ed. As a result, variables in defining positions are handled differently depending on the syntactic construct in which they appear.

Binders Versus Occurrences

Name lookups in the meta environment of the desugarer use two functions with slightly different behaviour, namely DsMeta.lookupOcc and DsMeta.lookupBinder. The module DsMeta contains the following explanation of the difference between these functions:

When we desugar [d| data T = MkT |]
we want to get
	Data "T" [] [Con "MkT" []] []
and *not*
	Data "Foo:T" [] [Con "Foo:MkT" []] []
That is, the new data decl should fit into whatever new module it is
asked to fit in.   We do *not* clone, though; no need for this:
	Data "T79" ....

But if we see this:
	data T = MkT 
	foo = reifyDecl T

then we must desugar to
	foo = Data "Foo:T" [] [Con "Foo:MkT" []] []

So in repTopDs we bring the binders into scope with mkGenSyms and addBinds,
but in dsReify we do not.  And we use lookupOcc, rather than lookupBinder
in repTyClD and repC.

This implies that lookupOcc, when it does not find the name in the meta environment, uses the function DsMeta.globalVar to construct the original name of the entity (cf. the TH paper for more details regarding original names). This name has the form <module>:<name> and uniquely identifies the entity in the whole program. It is in scope regardless of whether the user name of the same entity is in scope (i.e., the entity may be defined in a different module without being explicitly imported). NB: Incidentally, the current implementation of this mechanism makes it possible to break any abstraction barrier.
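The fallback behaviour can be sketched as follows. This is a toy model with assumed types, not the DsMeta implementation: names are strings, the meta environment is an association list, the defining module is passed in explicitly, and NameRep and lookupOccModel are hypothetical names.

```haskell
type SrcName = String
type CoreId  = String
type MetaEnv = [(SrcName, CoreId)]

-- How an occurrence ends up being represented in the generated code.
data NameRep
  = BoundTo CoreId        -- refers to a binder brought into scope by the quotation
  | OriginalName String   -- a global entity, written <module>:<name>
  deriving (Eq, Show)

-- Prefer a binding from the meta environment; otherwise fall back to the
-- original name built from the module that defines the entity.
lookupOccModel :: String -> MetaEnv -> SrcName -> NameRep
lookupOccModel definingModule env name =
  case lookup name env of
    Just ident -> BoundTo ident
    Nothing    -> OriginalName (definingModule ++ ":" ++ name)
```

With an empty environment, lookupOccModel "Foo" [] "T" yields OriginalName "Foo:T", matching the reifyDecl case in the quoted comment; once T has been entered into the environment, the lookup returns the bound identifier instead.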

Known-key Names for Template Haskell

During the construction of representations, the desugarer needs to use a large number of functions defined in the library Language.Haskell.TH.Syntax. The names of these functions need to be made available to the compiler in the way outlined in "Primitives and the Prelude". Unfortunately, any change to PrelNames triggers a significant amount of recompilation. Hence, the names needed for TH are defined in DsMeta instead (at the end of the module). All library functions needed by TH are contained in the name set DsMeta.templateHaskellNames.

Last modified: Wed Nov 13 18:01:48 EST 2002