This document describes
the transformation that the compiler does to implement type classes.
Note: the transformation described here will soon be replaced by a design
documented in runtime/mercury_typeclass_info.h.
Tranformation of code using type classes
Every predicate which has a typeclass constraint is given an extra
argument for every constraint in the predicate's type declaration.
The argument is the "dictionary", or "typeclass_info" for the typeclass.
The dictionary contains pointers to each of the class methods.
Representation of a typeclass_info:
The typeclass_info is represented in two parts (the typeclass_info
itself, and a base_typeclass_info), in a similar fashion to the
type_info being represented in two parts (the type_info and the
type_ctor_info).
The base_typeclass_info contains:
- the sum of the number of constraints on the instance decl.
and the number of unconstrained type variables
from the head of the instance decl. (`n1')
- the number of constraints on the instance decl. (`n2')
- the number of constraints on the typeclass decl. (`n3')
- the number of parameters (type variables) from
the typeclass decl. (`n4')
- the number of methods from the typeclass decl. (`n5')
- pointer to method #1
- ...
- pointer to method #n5
The typeclass_info contains:
- a pointer to the base typeclass info
- type info for unconstrained type var #1 from the instance decl
- ...
- type info for unconstrained type var #(n1-n2) from the
instance decl
- typeclass info #1 for constraint on instance decl
- ...
- typeclass info #n2 for constraint on instance decl
- typeclass info for superclass #1
- ...
- typeclass info for superclass #n3
- type info #1
- ...
- type info #n4
The base_typeclass_info is produced statically, and there is one for
each instance declaration. For each constraint on the instance
declaration, the corresponding typeclass_info is stored in the second
part.
eg. for the following program:
:- typeclass foo(T) where [...].
:- instance foo(int) where [...].
:- instance foo(list(T)) <= foo(T) where [...].
The typeclass_info for foo(int) is:
- The base_typeclass_info:
- 0 (there are no unconstrained type variables and no constraints)
- 0 (there are no constraints on the instance decl)
- 0 (there are no constraints on the typeclass decl)
- 1 (this is a single-parameter type class)
- n5 (the number of methods)
- pointer to method #1
- ...
- pointer to method #n5
- The typeclass_info:
- a pointer to the base typeclass info
- type_info for int
The typeclass_info for foo(list(T)) is:
- The base_typeclass_info:
- 1 (no unconstrained tvars, 1 constraint on the instance decl)
- 1 (there is 1 constraint on the instance decl)
- 0 (there are no constraints on the typeclass decl)
- 1 (this is a single-parameter type class)
- n5 (the number of methods)
- pointer to method #1
- ...
- pointer to method #n5
- The typeclass_info contains:
- a pointer to the base typeclass info
- typeclass info for foo(T)
- type_info for list(T)
If the "T" for the list is known, the whole typeclass_info will be static
data. When we do not know until runtime, the typeclass_info is constructed
dynamically.
Example of transformation
Take the following code as an example (assuming the declarations above),
ignoring the requirement for super-homogeneous form for clarity:
:- pred p(T1) <= foo(T1).
:- pred q(T2, T3) <= foo(T2), bar(T3).
:- pred r(T4, T5) <= foo(T4).
p(X) :- q([X], 0), r(1, 0).
We add an extra argument for each type class constraint, and one
argument for each unconstrained type variable.
:- pred p(typeclass_info(foo(T1)), T1).
:- pred q(typeclass_info(foo(T2)), typeclass_info(bar(T3)), T2, T3).
:- pred r(typeclass_info(foo(T4)), type_info(T5), T4, T5).
We transform the body of p to this:
p(TypeClassInfoT1, X) :-
BaseTypeClassInfoT2 = base_typeclass_info(
1,
1,
0,
1,
n5, (ie. the number of methods)
...
... (The methods for the foo class from the list
... instance)
...
),
TypeClassInfoT2 = typeclass_info(
BaseTypeClassInfoT2,
TypeClassInfoT1,
<type_info for list(T1)>),
BaseTypeClassInfoT3 = base_typeclass_info(
0,
0,
0, (presuming bar has no superclasses)
1,
...
... (The methods for the bar class from the int
... instance)
...
),
TypeClassInfoT3 = typeclass_info(
BaseTypeClassInfoT3,
<type_info for int>),
q(TypeClassInfoT2, TypeClassInfoT3, [X], 0),
BaseTypeClassInfoT4 = baseclass_type_info(
0,
0,
0,
1,
...
... (The methods for the foo class from the int
... instance)
...
),
TypeClassInfoT4 = typeclass_info(
BaseTypeClassInfoT4,
<type_info for int>),
r(TypeClassInfoT4, <type_info for int>, X, 0).
Detecting duplicate instance declarations
We would like to catch duplicate instance declarations
(those that declare the same vector of possibly unground types
to be members of the same typeclass)
as early as possible.
Since duplicate declarations can occur in different modules,
the earliest practical time is link time.
We would therefore like to generate a name for the global variable
that holds the base_typeclass_info of an instance declaration
that depends only on the identity of the typeclass
and on the instance declaration's vector of argument types.
For the C backends, this is what we actually do.
As a result, duplicate instance declarations will result in a link error
for a multiply defined symbol if linking is done statically.
(With dynamic linking, multiply defined symbols
don't seem to cause any warnings or errors on the platforms we use,
unless both definitions occur in the same shared library
or both occur in the main program.)
Note that the names of the global variables
do in fact have module names in them,
but they are the names of the modules that declare the type class
and that declare the type constructors occuring in the argument types.
The name of the module that contains the instance declaration
need not be among these names.
For the IL and Java backends, the data structures we generate
must all be module qualified with the name of the module which generates them.
If two modules contain duplicate instance declarations,
we cannot catch that fact at link time.
We could catch them at runtime,
by having each module register its base_typeclass_infos
at module initialization time,
and detecting duplicate registrations.
However, we currently have no such mechanism in place.
Last update was $Date: 2004-03-01 04:24:15 $ by $Author: zs $@cs.mu.oz.au.