The GHC Commentary - Primitives and the Prelude

One of the trickiest aspects of GHC is the delicate interplay between what knowledge is baked into the compiler, and what knowledge it gets by reading the interface files of library modules. In general, the less that is baked in, the better.

Most of what the compiler has to have wired in about primitives and prelude definitions is in fptools/ghc/compiler/prelude/.

GHC recognises these main classes of baked-in-ness:
Primitive types.
Primitive types cannot be defined in Haskell, and are utterly baked into the compiler. They are notionally defined in the fictional module GHC.Prim. The TyCons for these types are all defined in module TysPrim; for example,
  intPrimTyCon :: TyCon 
  intPrimTyCon = ....
Examples: Int#, Float#, Addr#, State#.

Wired-in types.
Wired-in types can be defined in Haskell, and indeed are (many are defined in GHC.Base). However, it's very convenient for GHC to be able to use the type constructor for (say) Int without looking it up in any environment. So module TysWiredIn contains many definitions like this one:
  intTyCon :: TyCon
  intTyCon = ....

  intDataCon :: DataCon 
  intDataCon = ....
However, since a TyCon value contains the entire type definition inside it, it follows that the complete definition of Int is thereby baked into the compiler.

Nevertheless, the library module GHC.Base still contains a definition for Int just so that its info table etc get generated somewhere. Chaos will result if the wired-in definition in TysWiredIn differs from that in GHC.Base.

The rule is that only very simple types should be wired in (for example, Ratio is not, and IO is certainly not). No class is wired in: classes are just too complicated.

Examples: Int, Float, List, tuples.

Known-key things.
GHC knows of the existence of many, many other types, classes and values. But all it knows is their Name. Remember, a Name includes a unique key that identifies the thing, plus its defining module and occurrence name (see The truth about Names). Knowing a Name, therefore, GHC can run off to the interface file for the module and find out everything else it might need.

Most of these known-key names are defined in module PrelNames; a further swathe concerning Template Haskell are defined in DsMeta. The allocation of unique keys is done manually; chaotic things happen if you make a mistake here, which is why they are all together.

All the Names from all the above categories are used to initialise the global name cache, which maps (module,occurrence-name) pairs to the globally-unique Name for that thing. (See HscMain.initOrigNames.)

The next sections elaborate these three classes a bit.

Primitives (module TysPrim)

Some types and functions have to be hardwired into the compiler as they are atomic; all other code is essentially built around this primitive functionality. This includes basic arithmetic types, such as integers, and their elementary operations as well as pointer types. Primitive types and functions often receive special treatment in the code generator, which means that these entities have to be explicitly represented in the compiler. Moreover, many of these types receive some explicit treatment in the runtime system, and so, there is some further information about primitives in the RTS section of this document.

The module TysPrim exports a list of all primitive type constructors as primTyCons :: [TyCon]. All of these type constructors (of type TyCon) are also exported as intPrimTyCon, stablePtrPrimTyCon, and so on. In addition, for each nullary type constructor the corresponding type (of type Type) is also exported; for example, we have intPrimTy :: Type. For all other type constructors, a function is exported that constructs the type obtained by applying the type constructors to an argument type (of type Type); for example, we have mkStablePtrPrimTy :: Type -> Type.

As it is inconvenient to identify type that receive a special treatment by the code generator by looking at their name, the module PrimRep exports a data type PrimRep, which lists all machine-manipulable implementation types. The module also exports a set of query functions on PrimRep that define properties, such as a type's byte size or whether a primitive type is a pointer type. Moreover, the function TysPrim.primRepTyCon :: PrimRep -> TyCon converts PrimRep values into the corresponding type constructor.

Wired in types (module TysWiredIn)

In addition to entities that are primitive, as the compiler has to treat them specially in the backend, there is a set of types, functions, etc. that the Haskell language definition flags as essential to the language by placing them into the special module Prelude that is implicitly imported into each Haskell module. For some of these entities it suffices to define them (by standard Haskell definitions) in a Prelude module and ensuring that this module is treated specially by being always imported .

However, there is a set of entities (such as, for example, the list type and the corresponding data constructors) that have an inbetween status: They are not truly primitive (lists, for example, can easily be defined by a data declaration), but the compiler has to have extra knowledge about them, as they are associated with some particular features of the language (in the case of lists, there is special syntax, such as list comprehensions, associated with the type). Another example, for a special kind of entity are type classes that can be used in a deriving clause. All types that are not-primitive, but about which the compiler nonetheless has to have some extra knowledge are defined in the module TysWiredIn.

All wired in type constructors are contained in wiredInTyCons :: [TyCon]. In addition to that list, TysWiredIn exports variables bound to representations of all listed type constructors and their data constructors. So, for example, we have listTyCon together with nilDataCon and consDataCon. There are also convenience functions, such as mkListTy and mkTupleTy, which construct compound types.

Known-key names (module PrelNames)

All names of types, functions, etc. known to the compiler are defined in PrelNames. This includes the names of types and functions exported from TysWiredIn, but also others. In particular, this module also fixes the names of all prelude modules; i.e., of the modules whose name starts with Prel, which GHC's library uses to bring some structure into the quite large number of Prelude definitions.

PrelNames.knownKeyNames :: [Name] contains all names known to the compiler, but the elements of the list are also exported individually as variables, such as floatTyConName (having the lexeme Float) and floatDataConName (having the lexeme F#). For each of these names, PrelNames derfines a unique key with a definition, such as

floatPrimTyConKey = mkPreludeTyConUnique 11

that is, all unique keys for known prelude names are hardcoded into PrelNames (and uniqueness has to be manually ensured in that module). To simplify matching the types of important groups of type constructors, PrelNames also exports lists, such as numericTyKeys (keys of all numeric types), that contain the unique keys of all names in that group. In addition, derivable type classes and their structure is defined by derivableClassKeys and related definitions.

In addition to names that have unique keys, PrelNames also defines a set of names without uniqueness information. These names end on the suffix _RDR and are of type RdrName (an example, is times_RDR, which represents the lexeme *). The names are used in locations where they pass through the renamer anyway (e.g., special constructors encountered by the parser, such as [], and code generated from deriving clauses), which will take care of adding uniqueness information.

Gathering it all together (module PrelInfo)

The module PrelInfo in some sense ties all the above together and provides a reasonably restricted interface to these definition to the rest of the compiler. However, from what I have seen, this doesn't quite work out and the earlier mentioned modules are directly imported in many places.

Last modified: Tue Dec 11 17:54:07 EST 2001