[[project @ 2001-10-18 15:57:06 by simonpj] simonpj**20011018155707 Add stuff about variables ] { addfile ./ghc/docs/comm/the-beast/vars.html hunk ./ghc/docs/comm/index.html 58 +
  • The Real Story about Variables, Ids, TyVars, and the like hunk ./ghc/docs/comm/the-beast/vars.html 1 + + + + + The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like + + + +

    The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like

    +

    + + +

    Variables

    + +The Var type, defined in basicTypes/Var.lhs, +represents variables, both term variables and type variables: +
    +    data Var
    +      = Var {
    +	    varName    :: Name,
    +	    realUnique :: FastInt,
    +	    varType    :: Type,
    +	    varDetails :: VarDetails,
    +	    varInfo    :: IdInfo		
    +	}
    +
    + +

    +It's often fantastically convenient to have term variables and type variables +share a single data type. For example, +

    +  exprFreeVars :: CoreExpr -> VarSet
    +
    +If there were two types, we'd need to return two sets. Simiarly, big lambdas and +little lambdas use the same constructor in Core, which is extremely convenient. +

    +We define a couple of type synonyms: +

    +  type Id    = Var  -- Term variables
    +  type TyVar = Var  -- Type variables
    +
    +just to help us document the occasions when we are expecting only term variables, +or only type variables. + +

    The VarDetails field

    + +The VarDetails field tells what kind of variable this is: +
    +data VarDetails
    +  = LocalId 		-- Used for locally-defined Ids (see NOTE below)
    +	LocalIdDetails
    +
    +  | GlobalId 		-- Used for imported Ids, dict selectors etc
    +	GlobalIdDetails
    +
    +  | TyVar
    +  | MutTyVar (IORef (Maybe Type)) 	-- Used during unification;
    +	     Bool			-- True <=> this is a type signature variable, which
    +					--	    should not be unified with a non-tyvar type
    +
    + +

    Type variables (TyVar)

    + +The TyVar case is self-explanatory. The +MutTyVar case is used only during type checking. Then a +tupe variable can be unified, using an imperative update, with a type, +and that is what the IORef is for. The Bool +field records whether the type variable arose from a type signature, +in which case it should not be unified with a type (only with another +type variable). +

    +For a long time I tried to keep mutable Vars statically type-distinct +from immutable Vars, but I've finally given up. It's just too painful. +After type checking there are no MutTyVars left, but there's no static check +of that fact. + +

    Term variables (Id)

    + +A term variable (of type Id) is represented either by a +LocalId or a GlobalId: +

    +A GlobalId is +

    + +

    +A LocalId is: +

    +

    +The key thing about LocalIds is that the free-variable finder +typically treats them as candidate free variables. That is, it ignores +GlobalIds such as imported constants, data contructors, etc. +

    +An important invariant is this: All the bindings in the module +being compiled (whether top level or not) are LocalIds +until the CoreTidy phase. In the CoreTidy phase, all +externally-visible top-level bindings are made into GlobalIds. This +is the point when a LocalId becomes "frozen" and becomes +a fixed, immutable GlobalId. +

    +(A binding is "externally-visible" if it is exported, or +mentioned in the unfolding of an externally-visible Id. An +externally-visible Id may not have an unfolding, either because it is +too big, or because it is the loop-breaker of a recursive group.) + +

    Global Ids and implicit Ids

    + +GlobalIds are further categorised by their GlobalIdDetails. +This type is defined in basicTypes/IdInfo, because it mentions other +structured types like DataCon. Unfortunately it is *used* in Var.lhs +so there's a hi-boot knot to get it there. Anyway, here's the declaration: +
    +data GlobalIdDetails
    +  = NotGlobalId			-- Used as a convenient extra return value 
    +                                -- from globalIdDetails
    +
    +  | VanillaGlobal		-- Imported from elsewhere
    +
    +  | PrimOpId PrimOp		-- The Id for a primitive operator
    +  | FCallId ForeignCall		-- The Id for a foreign call
    +
    +  -- These next ones are all "implicit Ids"
    +  | RecordSelId FieldLabel	-- The Id for a record selector
    +  | DataConId DataCon		-- The Id for a data constructor *worker*
    +  | DataConWrapId DataCon	-- The Id for a data constructor *wrapper*
    +				-- [the only reasons we need to know is so that
    +				--  a) we can  suppress printing a definition in the interface file
    +				--  b) when typechecking a pattern we can get from the
    +				--     Id back to the data con]
    +
    +The GlobalIdDetails allows us to go from the Id for +a record selector, say, to its field name; or the Id for a primitive +operator to the PrimOp itself. +

    +Certain GlobalIds are called "implicit" Ids. An implicit +Id is derived by implication from some other declaration. So a record selector is +derived from its data type declaration, for example. An implicit Ids is always +a GlobalId. For most of the compilation, the implicit Ids are just +that: implicit. If you do -ddump-simpl you won't see their definition. (That's +why it's true to say that until CoreTidy all Ids in this compilation unit are +LocalIds.) But at CorePrep, a binding is added for each implicit Id defined in +this module, so that the code generator will generate code for the (curried) function. +

    +Implicit Ids carry their unfolding inside them, of course, so they may well have +been inlined much earlier; but we generate the curried top-level defn just in +case its ever needed. + +

    LocalIds

    + +The LocalIdDetails gives more info about a LocalId: +
    +data LocalIdDetails 
    +  = NotExported	-- Not exported
    +  | Exported	-- Exported
    +  | SpecPragma	-- Not exported, but not to be discarded either
    +		-- It's unclean that this is so deeply built in
    +
    +From this we can tell whether the LocalId is exported, and that +tells us whether we can drop an unused binding as dead code. +

    +The SpecPragma thing is a HACK. Suppose you write a SPECIALIZE pragma: +

    +   foo :: Num a => a -> a
    +   {-# SPECIALIZE foo :: Int -> Int #-}
    +   foo = ...
    +
    +The type checker generates a dummy call to foo at the right types: +
    +   $dummy = foo Int dNumInt
    +
    +The Id $dummy is marked SpecPragma. Its role is to hang +onto that call to foo so that the specialiser can see it, but there +are no calls to $dummy. +The simplifier is careful not to discard SpecPragma Ids, so that it +reaches the specialiser. The specialiser processes the right hand side of a SpecPragma Id +to find calls to overloaded functions, and then discards the SpecPragma Id. +So SpecPragma behaves a like Exported, at least until the specialiser. + + +

    Global and Local Names

    + +Notice that whether an Id is a LocalId or GlobalId is +not the same as whether the Id has a Local or Global Name: + +The significance of Global vs Local names is this: + + + + +Last modified: Wed Aug 8 19:23:01 EST 2001 + + + + }