[[project @ 2005-05-04 15:19:37 by chak] chak**20050504151937 Updated and extended the section about the renamer. ] { hunk ./ghc/docs/comm/index.html 9 -
Name
. The Name
type is pervasive in GHC, and
+ is defined in basicTypes/Name.lhs
. Here is what a Name
+ looks like, though it is private to the Name module.
+
+ +++data Name = Name { + n_sort :: NameSort, -- What sort of name it is + n_occ :: !OccName, -- Its occurrence name + n_uniq :: Unique, -- Its identity + n_loc :: !SrcLoc -- Definition site + }+
n_sort
field says what sort of name this is: see
+ NameSort below.
+ n_occ
field gives the "occurrence name" of the
+ Name; see
+ OccName below.
+ n_uniq
field allows fast tests for equality of
+ Names.
+ n_loc
field gives some indication of where the
+ name was bound.
+ NameSort
of a Name
+ There are four flavours of Name
:
+
+++data NameSort + = External Module (Maybe Name) + -- (Just parent) => this Name is a subordinate name of 'parent' + -- e.g. data constructor of a data type, method of a class + -- Nothing => not a subordinate + + | WiredIn Module (Maybe Name) TyThing BuiltInSyntax + -- A variant of External, for wired-in things hunk ./ghc/docs/comm/the-beast/names.html 52 + | Internal -- A user-defined Id or TyVar + -- defined in the module being compiled hunk ./ghc/docs/comm/the-beast/names.html 55 -Every entity (type constructor, class, identifier, type variable) has -aName
. TheName
type is pervasive in GHC, -and is defined inbasicTypes/Name.lhs
. Here is what a Name looks like, -though it is private to the Name module. -- data Name = Name { - n_sort :: NameSort, -- What sort of name it is - n_occ :: !OccName, -- Its occurrence name - n_uniq :: Unique, -- Its identity - n_loc :: !SrcLoc -- Definition site - } -- --
- -- The
n_sort
field says what sort of name this is: see -NameSort below. -- The
n_occ
field gives the "occurrence name" of the Name; see -OccName below. -- The
n_uniq
field allows fast tests for equality of Names. -- The
n_loc
field gives some indication of where the name was bound. -The
- -There are three flavours ofNameSort
of aName
Name
: -- data NameSort - = External Module - | Internal - | System -- --
- - -- Here are the sorts of Name an entity can have: -
-
- -- Class, TyCon: External. -
- Id: External, Internal, or System. -
- TyVar: Internal, or System. -
- An
ExternalName
has a globally-unique -(module name,occurrence name) pair, namely the -original name of the entity, -describing where the thing was originally defined. So for example, -if we have -- module M where - f = e1 - g = e2 - - module A where - import qualified M as Q - import M - a = Q.f + g --then the RdrNames for "a", "Q.f" and "g" get replaced (by the Renamer) -by the Names "A.a", "M.f", and "M.g" respectively. - -- An
InternalName
-has only an occurrence name. Distinct InternalNames may have the same occurrence -name; use the Unique to distinguish them. - -
- An
ExternalName
has a unique that never changes. It is never -cloned. This is important, because the simplifier invents new names pretty freely, -but we don't want to lose the connnection with the type environment (constructed earlier). -AnInternalName
name can be cloned freely. - -- Before CoreTidy: the Ids that were defined at top level -in the original source program get
ExternalNames
, whereas extra -top-level bindings generated (say) by the type checker getInternalNames
. -This distinction is occasionally useful for filtering diagnostic output; e.g. -for -ddump-types. - -- After CoreTidy: An Id with an
ExternalName
will generate symbols that -appear as external symbols in the object file. An Id with anInternalName
-cannot be referenced from outside the module, and so generates a local symbol in -the object file. The CoreTidy pass makes the decision about which names should -be External and which Internal. - -- A
System
name is for the most part the same as an -Internal
. Indeed, the differences are purely cosmetic: --
-- Internal names usually come from some name the -user wrote, whereas a System name has an OccName like "a", or "t". Usually -there are masses of System names with the same OccName but different uniques, -whereas typically there are only a handful of distince Internal names with the same -OccName. -
- -Another difference is that when unifying the type checker tries to -unify away type variables with System names, leaving ones with Internal names -(to improve error messages). -
Occurrence names:
- -AnOccName
OccName
is more-or-less just a string, like "foo" or "Tree", -giving the (unqualified) name of an entity. + | System -- A system-defined Id or TyVar. Typically the + -- OccName is very uninformative (like 's') +
External
name has a globally-unique
+ (module name, occurrence name) pair, namely the
+ original name of the entity,
+ describing where the thing was originally defined. So for example,
+ if we have
+ +++module M where + f = e1 + g = e2 hunk ./ghc/docs/comm/the-beast/names.html 77 -Well, not quite just a string, because in Haskell a name like "C" could mean a type -constructor or data constructor, depending on context. So GHC defines a type -OccName (defined in basicTypes/OccName.lhs) that is a pair of -a FastString and a NameSpace indicating which name space the -name is drawn from: -+- data OccName = OccName NameSpace EncodedFS --The EncodedFS is a synonym for FastString indicating that the -string is Z-encoded. (Details in OccName.lhs.) Z-encoding encodes -funny characters like '%' and '$' into alphabetic characters, like "zp" and "zd", -so that they can be used in object-file symbol tables without confusing linkers -and suchlike. +module A where + import qualified M as Q + import M + a = Q.f + g
+ then the RdrNames for "a", "Q.f" and "g" get replaced (by the + Renamer) by the Names "A.a", "M.f", and "M.g" respectively. +
+InternalName
+ has only an occurrence name. Distinct InternalNames may have the same
+ occurrence name; use the Unique to distinguish them.
+ ExternalName
has a unique that never changes. It
+ is never cloned. This is important, because the simplifier invents
+ new names pretty freely, but we don't want to lose the connnection
+ with the type environment (constructed earlier). An
+ InternalName
name can be cloned freely.
+ ExternalNames
,
+ whereas extra top-level bindings generated (say) by the type checker
+ get InternalNames
. q This distinction is occasionally
+ useful for filtering diagnostic output; e.g. for -ddump-types.
+ ExternalName
will generate symbols that
+ appear as external symbols in the object file. An Id with an
+ InternalName
cannot be referenced from outside the
+ module, and so generates a local symbol in the object file. The
+ CoreTidy pass makes the decision about which names should be External
+ and which Internal.
+ System
name is for the most part the same as an
+ Internal
. Indeed, the differences are purely cosmetic:
+ -The name spaces are: -
OccName
+ An OccName
is more-or-less just a string, like "foo" or
+ "Tree", giving the (unqualified) name of an entity.
+
+ Well, not quite just a string, because in Haskell a name like "C" could + mean a type constructor or data constructor, depending on context. So + GHC defines a type OccName (defined in + basicTypes/OccName.lhs) that is a pair of a FastString + and a NameSpace indicating which name space the name is drawn + from: +
+++data OccName = OccName NameSpace EncodedFS+
+ The EncodedFS is a synonym for FastString indicating + that the string is Z-encoded. (Details in OccName.lhs.) + Z-encoding encodes funny characters like '%' and '$' into alphabetic + characters, like "zp" and "zd", so that they can be used in object-file + symbol tables without confusing linkers and suchlike. +
++ The name spaces are: +
+
+ The main externally used functions of the renamer are provided by the
+ module rename/RnSource.lhs
. In particular, we have
+
+++rnSrcDecls :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name) +rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name] +rnSplice :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars)+
+ All of which execute in the renamer monad RnM
. The first
+ function, rnSrcDecls
renames a binding group; the second,
+ rnTyClDecls
renames a list of (toplevel) type and class
+ declarations; and the third, rnSplice
renames a Template
+ Haskell splice. As the types indicate, the main task of the renamer is
+ to convert converts all the RdrNames to Names, which includes a number of
+ well-formedness checks (no duplicate declarations, all names are in
+ scope, and so on). In addition, the renamer performs other, not
+ strictly name-related, well-formedness checks, which includes checking
+ that the appropriate flags have been supplied whenever language
+ extensions are used in the source.
+
+ A RdrName.RdrName is pretty much just a string (for an + unqualified name like "f") or a pair of strings (for a + qualified name like "M.f"): +
++++data RdrName + = Unqual OccName + -- Used for ordinary, unqualified occurrences hunk ./ghc/docs/comm/the-beast/renamer.html 55 -(This section is, like most of the Commentary, rather incomplete.) -+-The renamer sits between the parser and the typechecker. -Roughly speaking, It has the type: -
- HsModule RdrName -> HsModule Name --That is, it converts all the RdrNames to Names. + | Qual Module OccName + -- A qualified name written by the user in + -- *source* code. The module isn't necessarily + -- the module where the thing is defined; + -- just the one from which it is imported hunk ./ghc/docs/comm/the-beast/renamer.html 61 -RdrNames
+ | Orig Module OccName + -- An original name; the module is the *defining* module. + -- This is used when GHC generates code that will be fed + -- into the renamer (e.g. from deriving clauses), but where + -- we want to say "Use Prelude.map dammit". + + | Exact Name + -- We know exactly the Name. This is used + -- (a) when the parser parses built-in syntax like "[]" + -- and "(,)", but wants a RdrName from it + -- (b) when converting names to the RdrNames in IfaceTypes + -- Here an Exact RdrName always contains an External Name + -- (Internal Names are converted to simple Unquals) + -- (c) by Template Haskell, when TH has generated a unique name
+ The OccName type is described in The + truth about names. +
hunk ./ghc/docs/comm/the-beast/renamer.html 81 -A RdrNames is pretty much just a string (for an unqualified name -like "f") or a pair of strings (for a qualified name like "M.f"): -- data RdrName = RdrName Qual OccName - - data Qual = Unqual - - | Qual ModuleName -- A qualified name written by the user in source code - -- The module isn't necessarily the module where - -- the thing is defined; just the one from which it - -- is imported - - | Orig ModuleName -- This is an *original* name; the module is the place - -- where the thing was defined --The OccName type is described in "The truth about names". -
-The OrigName variant is used internally; it allows GHC to speak of RdrNames -that refer to the original name of the thing. +
+ Due to the tight integration of the renamer with the typechecker, both + use the same monad in recent versions of GHC. So, we have +
++++type RnM a = TcRn a -- Historical +type TcM a = TcRn a -- Historical+
+ with the combined monad defined as +
++++type TcRn a = TcRnIf TcGblEnv TcLclEnv a +type TcRnIf a b c = IOEnv (Env a b) c hunk ./ghc/docs/comm/the-beast/renamer.html 99 +data Env gbl lcl -- Changes as we move into an expression + = Env { + env_top :: HscEnv, -- Top-level stuff that never changes + -- Includes all info about imported things hunk ./ghc/docs/comm/the-beast/renamer.html 104 -Rebindable syntax
+ env_us :: TcRef UniqSupply, -- Unique supply for local varibles hunk ./ghc/docs/comm/the-beast/renamer.html 106 -In Haskell when one writes "3" one gets "fromInteger 3", where -"fromInteger" comes from the Prelude (regardless of whether the -Prelude is in scope). If you want to completely redefine numbers, -that becomes inconvenient. So GHC lets you say -"-fno-implicit-prelude"; in that case, the "fromInteger" comes from -whatever is in scope. (This is documented in the User Guide.) --This feature is implemented as follows (I always forget). -
-
+ env_gbl :: gbl, -- Info about things defined at the top level + -- of the module being compiled + + env_lcl :: lcl -- Nested stuff; changes as we go into + -- an expression + } +- Four HsSyn constructs (NegApp, NPlusKPat, HsIntegral, HsFractional) -contain a Name (i.e. it is not parameterised). -
- When the parser builds these constructs, it puts in the built-in Prelude -Name (e.g. PrelNum.fromInteger). -
- When the renamer encounters these constructs, it calls RnEnv.lookupSyntaxName. -This checks for -fno-implicit-prelude; if not, it just returns the same Name; -otherwise it takes the occurrence name of the Name, turns it into an unqualified RdrName, and looks -it up in the environment. The returned name is plugged back into the construct. -
- The typechecker uses the Name to generate the appropriate typing constraints. -
+ the details of the global environment type TcGblEnv
and
+ local environment type TcLclEnv
are also defined in the
+ module typecheck/TcRnTypes.lhs
. The monad
+ IOEnv
is defined in utils/IOEnv.hs
and extends
+ the vanilla IO
monad with an additional state parameter
+ env
that is treated as in a reader monad. (Side effecting
+ operations, such as updating the unique supply, are done with
+ TcRef
s, which are simply a synonym for IORef
s.)
+
+ As anticipated by the variants Orig
and Exact
+ of RdrName
some names should not change during renaming,
+ whereas others need to be turned into unique names. In this context,
+ the two functions RnEnv.newTopSrcBinder
and
+ RnEnv.newLocals
are important:
+
+++newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name +newLocalsRn :: [Located RdrName] -> RnM [Name]+
+ The two functions introduces new toplevel and new local names,
+ respectively, where the first two arguments to
+ newTopSrcBinder
determine the currently compiled module and
+ the parent construct of the newly defined name. Both functions create
+ new names only for RdrName
s that are neither exact nor
+ original.
+
+ A global RdrName
environment
+ RdrName.GlobalRdrEnv
is a map from OccName
s to
+ lists of qualified names. More precisely, the latter are
+ Name
s with an associated Provenance
:
+
+++data Provenance + = LocalDef -- Defined locally + Module + + | Imported -- Imported + [ImportSpec] -- INVARIANT: non-empty + Bool -- True iff the thing was named *explicitly* + -- in *any* of the import specs rather than being + -- imported as part of a group; + -- e.g. + -- import B + -- import C( T(..) ) + -- Here, everything imported by B, and the constructors of T + -- are not named explicitly; only T is named explicitly. + -- This info is used when warning of unused names.+
+ The part of the global RdrName
environment for a module
+ that contains the local definitions is created by the function
+ RnNames.importsFromLocalDecls
, which also computes a data
+ structure recording all imported declarations in the form of a value of
+ type TcRnTypes.ImportAvails
.
+
+ The function importsFromLocalDecls
, in turn, makes use of
+ RnNames.getLocalDeclBinders :: Module -> HsGroup RdrName -> RnM
+ [AvailInfo]
to extract all declared names from a binding group,
+ where HscTypes.AvailInfo
is essentially a collection of
+ Name
s; i.e., getLocalDeclBinders
, on the fly,
+ generates Name
s from the RdrName
s of all
+ top-level binders of the module represented by the HsGroup
+ RdrName
argument.
+
+ It is important to note that all this happens before the renamer
+ actually descends into the toplevel bindings of a module. In other
+ words, before TcRnDriver.rnTopSrcDecls
performs the
+ renaming of a module by way of RnSource.rnSrcDecls
, it uses
+ importsFromLocalDecls
to set up the global
+ RdrName
environment, which contains Name
s for
+ all imported and all locally defined toplevel binders. Hence,
+ when the helpers of rnSrcDecls
come across the
+ defining occurences of a toplevel RdrName
, they
+ don't rename it by generating a new name, but they simply look up its
+ name in the global RdrName
environment.
+
+ In Haskell when one writes "3" one gets "fromInteger 3", where + "fromInteger" comes from the Prelude (regardless of whether the + Prelude is in scope). If you want to completely redefine numbers, + that becomes inconvenient. So GHC lets you say + "-fno-implicit-prelude"; in that case, the "fromInteger" comes from + whatever is in scope. (This is documented in the User Guide.) +
++ This feature is implemented as follows (I always forget). +
HsExpr.SyntaxExpr
. Moreover, the association list
+ HsExpr.SyntaxTable
is set up by the renamer to map
+ rebindable names to the value they are bound to.
+ HsExpr.NegApp
, HsPat.NPat
,
+ HsPat.NPlusKPat
, HsLit.HsIntegral
, and
+ HsLit.HsFractional
) and
+ two constructs related to code>do expressions
+ (HsExpr.BindStmt
and
+ HsExpr.ExprStmt
) have rebindable syntax.
+ hunk ./ghc/docs/comm/the-beast/renamer.html 244 -Last modified: Tue Nov 13 14:11:35 EST 2001 +Last modified: Wed May 4 17:16:15 EST 2005 hunk ./ghc/docs/comm/the-beast/renamer.html 250 + }