Compiler-Base
libraryThe Compiler-Base
library exports the
basic data structures used by the compiler. It also provides a number
of miscellaneous utility modules. Every other library relies on
Compiler-Base
library.
Several modules in Compiler-Base
library appear
to contain assorted utility routines used elsewhere in the
compiler.
Common
This module imports a number of standard modules and
re-exports some or all of their constants. Essentially, this
module is used to specify a local Dylan dialect for use by
the rest of the compiler. It includes
Dylan
, some of
Extensions
, a few names from
Table-Extensions
and several of the
I/O-related modules.
Utils
This appears to be the "grab bag" module. It holds a mix of completely unrelated minor classes and functions.
OD-Format
ModuleThe OD-Format
module was designed by Rob
MacLachlan and implements the Object
Description Format, a binary format for
storing persistent objects. This is used by the
*.lib.du files emitted by d2c,
though it might be potentially useful elsewhere.
The file gd/src/d2c/compiler/base/od-format.dylan contains an extensive discussion the module's goals and data format.
The basic requirement of the "Object Description Format" (ODF) is to allow an open-ended variety of objects to be dumped into a file, where:
instances may refer to each other in an arbitrary graph structure, and
some of the references may be to objects that are defined in distinct object description units (files, etc.)
In order to make the design interesting, the ODF attempts to satisfy various incompatible requirements:
To support efficient parsing and unparsing on a variety of architectures, but also to potentially serve as a cross-platform interchange format.
To work well on sequential media (like a socket), but also to potentially support random-access demand-loading of objects when possible (e.g. on files.)
We should consider making this into a standalone library at some point in the future.
<data-unit-name> | [Constant] |
Things that can be the name of a data unit.
Value
<symbol>
Description
<data-unit-name>
is a<symbol>
because is must be comparible with\==
.
$library-summary-unit-type | [Constant] |
A constant for
begin-dumping
Value
0
Description
This is currently the only type of OD format allowed. To build another format (e.g. DOOD), a new exported constant must be added to
OD-Format
and the file extension must be added to$unit-type-strings
(an internal constant toOD-Format
).
begin-dumping | [Function] |
Returns a dump state used by other OD functions.
Synopsis
begin-dumping (name, type, #key where) => (dump-state)
Parameters
name An instance of <data-unit-name>
. The thing to dump.type An instance of <integer>
. The "unit type code"where:
An instance of false-or(<string>)
. Dump file name. Defaults toname.<type translation>.du
.
Return Values
dump-state An instance of <dump-state>
.
Description
Creates a state (which acts very much like a buffer) for other dumping functions to use to build the dump file in memory. The
end-dumping
function actually writes the built information to file.
end-dumping | [Function] |
Flushes the dump state to file.
Synopsis
end-dumping (state) => ()
Parameters
state An instance of <dump-state>
.
Return Values
None.
Description
Actually output the file.
dump-od | [Open Generic] |
Write an object to the dump buffer.
Synopsis
dump-od (object, buffer) => ()
Parameters
object An instance of <object>
. The object to dump.buffer An instance of <dump-buffer>
. The in-memory buffer into which the object should be dumped.
Return Values
None.
Description
This routine is overloaded to dump individual types objects. Methods for primitive Dylan types can be found in the Section called The
Dylan-Dump
Module.
<dump-state> | [Class] |
Holder of the in-progress object dump.
Superclasses
<dump-buffer>
Initialization Keywords
None.
Description
Contains all information about the current object-dump in progress. It is illegal to use a dump-state after it has been passed to
end-dumping
. All instances of<dump-state>
are created by calls tobegin-dumping
.
<dispatcher> | [Class] |
Binary to in-memory instance transformer.
Superclasses
<object>
Initialization Keywords
initialize-from:
An instance of <object>
. Unless#f
, initialize instance's dispatch information from this dispatcher. Defaults to*default-dispatcher*
.
Description
A dispatcher is an object that tells the loader how to convert binary data into Dylan objects.
*default-dispatcher* | [Variable] |
Default dispatcher used for initialization.
Type
<dispatcher>
Description
Dylan-dump
loads this variable with dispatching information for several commonly-used Dylan types (see the Section called TheDylan-Dump
Module).
*data-unit-search-path* | [Variable] |
Paths to du files.
Type
<sequence>
Value
#[]
Description
This is used by
find-data-unit
to locate the du file to load.
<location-hint> | [Constant] |
A hint of where to find a data unit.
Value
<byte-string>
Description
This is currently an absolute(?) pathname.
<word> | [Constant] |
A portable, full-sized <integer>
.
Value
<integer>
Description
It seems mindy has a limited size for
<integer>
, so, for mindy,<word>
is a<general-integer>
to allow for proper bit manipulations that do not clobber the value via overflow.
find-data-unit | [Method] |
Locates then loads a du file.
Synopsis
find-data-unit (name, type, #key location-hint, check-hash, dispatcher) => (res)
Parameters
name An instance of <data-unit-name>
.type An instance of <integer>
. (Currently only$library-summary-unit-type
allowed).location-hint:
An instance of false-or(<location-hint>)
. File name for the dump file; used if the data unit cannot be found in the current*data-unit-search-path*
.check-hash:
An instance of false-or(<word>)
. Can be used to detect version mismatches between data units (used internally asexpected-hash
)dispatcher:
An instance of <dispatcher>
. Provides the dispatcher to be used for the loading process. Defaults to*default-dispatcher*
.
Return Values
res An instance of <data-unit>
. Contains the resulting transformed objects from the du file.
Description
Find-data-unit, when given a data unit name and type, returns the corresponding data unit. This is the main entry for causing loading to happen. Also used recursively.
To write a dumper for a class, one needs to think of a
name for the class (usually just take the <>'s off the
class name), register the name with
register-object-id
, and define a dumper. If
your data structure contains any object sharing or circular
usages, also see the Section called Circular Usage and the Section called Object Sharing. For
automatic generation of object dumpers and loaders, see
the Section called Autodump.
register-object-id | [Method] |
Associates a number for a type for the OD dumping functions.
Synopsis
register-object-id (name, id) => ()
Parameters
name An instance of <symbol>
. The name of the type.id An instance of <integer>
. The associated integral value.
Return Values
None.
Description
Equate the
name
with theid
. This function is not exported; rather, allregister-object-id
calls must occur in theOD-Format
module.Of course, this requirement makes it really tough for extending
OD-Format
for user defined types ...
add-od-loader | [Function] |
Register a method for reading a dumped object back into memory.
Synopsis
add-od-loader (dispatcher, name, func) => ()
Parameters
dispatcher An instance of <dispatcher>
. When loading .lib.du files, this is the*default-dispatcher*
name An instance of <symbol>
. The method will be activated when the loader encounters the type registered withregister-object-id
in the du file.func An instance of <function>
. The loader method.
Return Values
None.
Description
This function is the "other half" of
dump-od
. It is called to register a function for reloading dumped objects, which has the signature:method(state :: <load-state>) => res :: <object-type-read-in>If you use
add-od-loader
, you will need to define a corresponding method ondump-od
(see the Section called Dumping Objects).
add-make-dumper | [Method] |
Convenience method to add both a loader and dumper.
Synopsis
add-make-dumper (name, dispatcher, obj-class, slots, #key load-external, dumper-only, load-side-effect, dump-side-effect) => ()
Parameters
name An instance of <symbol>
. Type name.dispatcher An instance of <dispatcher>
. Where the loader and dumper are being added.obj-class An instance of <class>
. Must be the direct class of the instances we are to dump (and of course, must also be instantiable via make.)slots An instance of <list>
. The slots' descriptions of the class.load-external:
An instance of <boolean>
.dumper-only:
An instance of <boolean>
. If#t
, do not create the loader.load-side-effect:
An instance of false-or(<function>)
.dump-side-effect:
An instance of false-or(<function>)
.
Return Values
None.
Description
A convenient way to define both a loader and a dumper for objects of class
obj-class
.Slots
is a list which is in this format:#(slot-accessor-function-1, init-keyword-1, setter-function-1, slot-accessor-function-2, init-keyword-2, setter-function-2, ...)Note that the stuff is implicitly grouped in clumps of 3. Either of the init-keyword or the setter may be
#f
, but not both. The "make dumper" requests backpatching when it encounters an unresolved forward reference (due to circularities). Any slot which might have a forward reference must have a setter function (it can also have an init-keyword, which will be used if possible).A "make dumper" may be used when an object can be reconstructed from some accessor values by reinstatiating the object with
make
and backpatching some slots. This is pretty general, since the accessor and setter function are not required to be slot accessors and setters.
In the dump files, objects contain the following
parts: a header, raw data, sub-objects, and an
end-of-object marker. The header is a single machine word
which contains the class of the object. (Actually, it
contains the object-id, which is found by registry look-up.)
The header also indicates if there is any raw data and
sub-objects. The raw data section is zero or more machine
words which only have meaning to the object's loader. If
raw data is present, it is preceded by a byte-count (which
is a single machine word). If there are sub-objects, they
come next. The last object in a sub-object list is a
magical $end-object
, which denotes both the
end of the sub-object list and the last word of the current
object.
For more detail on the binary representation and on issues such as endian-ness and machine word size, see the Section called Binary Representation.
When an object is loaded, the loading code reads in the header. The object class id is extracted from the header. The dispatcher then provides a load function based on the class id, which reads the remainder of the data associated with the object and returns the reconstructed object.
When adding a method to dump-od
use the
methods described here to push out the objects correctly.
dump-definition-header | [Method] |
Dump an object definition header.
Synopsis
dump-definition-header (name, buf, #key subobjects, raw-data) => ()
Parameters
name An instance of <symbol>
.buf An instance of <dump-buffer>
. Probably best to use the<dump-state>
instance passed intodump-od
subobjects:
An instance of <object>
.#t
means this object has subobjects Defaults to#f
.raw-data:
An instance of <object>
. The Raw-Format field in an object-definition header encodes information needed to byte/word format translation, and in the degenerate case, indicates there is no raw data at all. Allowable format codes are$odf-no-raw-data-format
,$odf-byte-raw-data-format
,$odf-16bit-raw-data-format
,$odf-32bit-raw-data-format
,$odf-64bit-raw-data-format
,$odf-untranslatable-raw-data-format
, and$odf-word-raw-data-format
(which is 32 or 64 as appropriate, used for<word>
in format pseudo-objects). Defaults to$odf-no-raw-data-format
.
Return Values
None.
Description
The first dump call in
dump-op
for simple objects, this function dumps the object's header to the buffer. If one is dumping compound objects, first save the position of the buf before calling this method with some code like this:define method dump-od(obj :: <ratio>, buf :: <dump-state>) => (); let start-pos = buf.current-pos; dump-definition-header(#"ratio", buf, subobjects: #t); dump-od(obj.numerator, buf); dump-od(obj.denominator, buf); dump-end-entry(start-pos, buf); end method dump-od;This is necessary so that
dump-end-entry
(see the next documented method) writes out the terminal correctly.
dump-end-entry | [Method] |
Dump an "end of object" marker. This is only necessary if there are subobjects.
Synopsis
dump-end-entry (start-posn, buf) => ()
Parameters
start-posn An instance of <integer>
. Where the object began.buf An instance of <dump-buffer>
. Usually the<dump-state>
instance passed intodump-od
.
Return Values
None.
Description
Before dumping the compound object, be sure to save the buffer's current position (as in the sample code for
dump-definition-header
) because this method needs that information.
dump-simple-object | [Method] |
Dump an object that has only a constant number of subobjects (no raw data).
Synopsis
dump-simple-object (name, buf, #rest slots) => ()
Parameters
name An instance of <symbol>
. The instance's class name.buf An instance of <dump-buffer>
. Usually the<dump-state>
instance passed intodump-od
.slots Instances of <object>
. The values of the slots of this object.
Return Values
None.
Description
This method provides a simple way to dump objects of known size, so using this method, writing
dump-od
for<ratio>
would be:define method dump-od(obj :: <ratio>, buf :: <dump-state>) => (); dump-simple-object(#"ratio", buf, obj.numerator, obj.denominator); end method dump-od;
dump-word | [Method] |
Dump an integer into the buffer as one word.
Synopsis
dump-word (obj, buf) => ()
Parameters
obj An instance of <word>
. The integer to dump.buf An instance of <dump-buffer>
.
Return Values
None.
Description
This dumps an
<integer>
as part of of an object. If you want to dump an<integer>
as an<integer>
, use thedump-od
on<integer>
in the Section called TheDylan-Dump
Module.
dump-raw-data | [Method] |
Dumps a collection into the buffer.
Synopsis
dump-raw-data (obj, bsize, buf) => ()
Parameters
obj An instance of <collection>
. The collection to dump.bsize An instance of <integer>
. The number of elements in the collection.buf An instance of <dump-buffer>
.
Return Values
None.
Description
This take a collection, obj, and dumps its contents to buf. This method is usually called in connection with dumping a complex object, for example,
dump-od
for<byte-string>
in the Section called TheDylan-Dump
Module dumps strings thusly:define method dump-od(obj :: <byte-string>, buf :: <dump-buffer>) => (); dump-definition-header(#"byte-string", buf, raw-data: $odf-byte-raw-data-format); dump-raw-data(obj, obj.size, buf); end method;
The basic concept of writing a loader is to read in the
information you wrote out with the method on
dump-od
. This process (writing loaders) is a
bit different than writing dumpers, though. Here, you provide
a method to add-od-loader
(described in the Section called Defining Simple Dumpers and Loaders)
that is activated by a <dispatcher>
instance
when it encounters the object in the dump file to load.
A loader on <byte-string>
(the corresponding dump-od
method is found in
the Section called Functions for Writing Dumpers) would
look like this:
add-od-loader(*default-dispatcher*, #"byte-string", method (state :: <load-state>) => res :: <byte-string>; let next = state.od-next; let bsize = buffer-word(state.od-buffer, next); state.od-next := next + $word-bytes; let res = make(<byte-string>, size: bsize); load-raw-data(res, bsize, state); res; end method );
(In this case, the call to add-od-loader
is made at the top-level in
dylan-dump.dylan.) The functions used to
build the dispatch method are described below.
<load-state> | [Class] |
A load-state is an object that describes the current state of the loading process. Among other things, it contains an input <stream> and the buffer to the stream.
Superclasses
<object>
Initialization Keywords
stream:
An instance of <stream>
. The stream we're reading from. Only used for refilling the buffer. This slot is accessed via theod-stream
method.buffer:
An instance of <buffer>
. State of the stream buffer, which we hold for the duration of the load. It is accessed via theod-buffer
method.next:
An instance of <buffer-index>
. The index of the state's stream; it is accessed via theod-next
method and can be set (changed) using theod-next-setter
method.end:
An instance of <buffer-index>
. The end-index for the buffer and stream. This slot is accessed via theod-end
method.position-offset:
An instance of <integer>
. When added to the od-next pointer, this offset yields the byte offset of our current position from the data-unit start. This must be updated whenever we refill the buffer.dispatcher:
An instance of <dispatcher>
.
Description
A
<load-state>
object is passed into the dispatching method to reconstruct a dumped object (so, you do not need tomake
a<load-state>
instance). Use that object in conjuction with the functions described below.
$end-object | [Constant] |
An end-of-subobjects marker.
Type
<end-object>
Description
A magical object which is always the last sub-object in an object.
load-object-dispatch | [Method] |
Read an object header and invoke the appropriate loader. Return the object that the loader returns.
Synopsis
load-object-dispatch (state) => (res)
Parameters
state An instance of <load-state>
.
Return Values
res An instance of <object>
. The reconstructed object.
Description
Start loading some objects from a load-state. Dispatches to an appropriate loader method depending on the dispatcher and the entry type. Returns the object.
For example, to load a
<ratio>
instance, build a dispatching method like this:add-od-loader(*default-dispatcher*, #"ratio", method (state :: <load-state>) => res :: <ratio>; let npart = load-object-dispatch(state); let dpart = load-object-dispatch(state); assert-end-object(state); ratio(npart, dpart); end method );
load-raw-data | [Method] |
Loads data into a parameter.
Synopsis
load-raw-data (res, elsize, state) => (nnext)
Parameters
res An instance of <byte-string>
. A pre-allocated buffer where the raw data gets loaded.elsize An instance of <integer>
. Size of the buffer (or the maximum bytes to load).state An instance of <load-state>
.
Return Values
nnext An instance of <buffer-index>
. The position in the<load-state>
's buffer after loading the raw data.
Description
Loads raw data into res. It is up to the dispatching method to interpret that loaded raw data in order to reconstruct the object's state.
Usually a call to
load-raw-data
is to load a string, so reconstruction is easy (see the example at the top of this section). One thing of note is that this method is not functional: res is not returned, but is a destructively modified parameter.
load-subobjects-vector | [Method] |
Returns a vector of an object's subobjects.
Synopsis
load-subobjects-vector (state, size-hint) => (res)
Parameters
state An instance of <load-state>
.size-hint An instance of false-or(<integer>)
. If a number, then the number of subobjects MUST match.
Return Values
res An instance of <simple-object-vector>
. The subobjects of the object.
Description
Loads the dumped subobjects into a vector. Note that the subobjects may be unresolved when loaded with this function ... it is the responsibility of the user of this function to resolve those objects (see the Section called Circular Usage for more information on functions for object resolution such as
obj-resolved?
,actual-obj
, andrequest-backpatch
).So, getting a
<vector>
from this function still requires programming effort, e.g.:add-od-loader(*default-dispatcher*, #"simple-object-vector", method (state :: <load-state>) => res :: <simple-object-vector>; let contents = load-subobjects-vector(state, #f); let rsize = contents.size; let res = make(<simple-object-vector>, size: rsize); for (i from 0 below rsize) let el = contents[i]; if (el.obj-resolved?) res[i] := el.actual-obj; else res[i] := el; request-backpatch(el, method (actual) res[i] := actual end); end; end; res; end method );Fortunately, the
Dylan-dump
module provides this dispatching method. See the Section called TheDylan-Dump
Module.
load-sole-subobject | [Method] |
Loads the only subobject of this object.
Synopsis
load-sole-subobject (state) => (res)
Parameters
state An instance of <load-state>
.
Return Values
res An instance of <object>
. The resolved subobject.
Description
Load a one-slot object. The value is returned, and we check that there was in fact only one subobject.
assert-end-object | [Method] |
Reads from the buffer, and signals an error if the next thing read isn't an end-of-object marker.
Synopsis
assert-end-object (state) => ()
Parameters
state An instance of <load-state>
.
Return Values
None.
Description
Ensure the subobject list ends with the end-of-object token.
To deal with circular object usage graphs, several mechanisms are employed. Objects can be dumped as forward references and defined later. The loader for the object containing the forward reference has the option of delaying object creation until the forward referenced object is loaded, or requesting that the object be backpatched when the forward referenced object is finally loaded.
The following descriptions TBD:
new-local-id
label-next-object
dump-local-reference
<forward-ref>
obj-resolved?
actual-obj
request-backpatch
resolve-forward-ref
By default, it is assumed that the object being dumped has no object sharing, i.e., for all sub-objects the ref-count is one. For any object that has a ref-count greater than one, that object must be a sub-class of <identity-preserving-mixin>. (If it is ok to dump the object multiple times and load it back as separate objects, then this superclass may be omitted.)
The following descriptions TBD:
<identity-preserving-mixin>
maybe-dump-reference
To allow efficiency as well as potential interchange, objects are described by a word sequence which vaguely resembles the in-core representation. The word size is implementation defined, and dumper/loader methods dump/load data as words having the native byte ordering. The format is defined so that any necessary byte-swapping and word-size adjustment can be done as a pre-pass filter to the actual loader. The format is described in detail at the top of od-format.dylan.
The implication of the above is that data-unit files
produced from OD-Format
are not portable.
These examples show the bits for some literals, assuming with 32bit word size and big-endian order (which, by the way i386 is not big-endian), and assigning convenient values for the object class-IDs. See the actual code for the real definitions and class IDs.
For example, we could describe a byte-string like this:
0: <Header? = #b1, Etype = #b00, Subobjects? = #b0, Raw-Format = #x1, Class-ID = byte-string-odf-id> 1: string byte-count 2[ceiling(byte-count, word-bytes)]: <string-chars: *>
Assuming byte-string-$odf-id
is
#x666
, the actual bits for "Foo"
would be:
#x81000666 #x00000003 #x466F6F00
A list has subobjects, but no raw data, so it would look like:
0: <Header? = #b1, Etype = #b00, Subobjects? = #b1, Raw-Format = #x0, Raw-Size = #x00, Class-ID = list-odf-id> 1[N]: <header of subobject 0, subobject data> ... M: <Header? = #b1, Etype = #b01, Start-Offset: * = M>
If list-odf-id
were #x667
,
then #("Foo", "Bar")
would be:
0: #x90000667 1: #x81000666 2: #x00000003 4: #x466F6F00 5: #x81000666 6: #x00000003 9: #x42617200 7: #xA0000007
This demonstrates the nesting of objects, and shows how the end is marked. Note that:
"Foo"
is represented by exactly
the same bits here (it is position-independent), and
that
The entire sequence is position-independent, since the end-entry has a relative offset, and that
The basic subobject protocol doesn't say in advance how many subobjects there's going to be; you have to wait for the end header. This could be avoided for e.g. vectors by considering subobject 0 to be the length.
Dylan-Dump
ModuleThis module provides ODF dumping and loading routines for all of the primitive types.
Compile-Time-Values
ModuleThis module exports <ct-value>
, which
represents a value known at compile time. It also supplies
<literal>
and a number of subclasses to represent
literal strings, integers, lists and other primitive types.
Other compile time values can be found elsewhere. Literal
<type>
objects are handled by classes in the
CType
module. Literal <class>
objects are
handled by classes in the Classes
module. Literal <function>
objects are
handled by classes in the Compile
Time Functions
module.
Source
Moduled2c was designed to fetch source records from several sources, including flat text files and eventaully code databases. To permit this flexibility, d2c uses a very abstract interface for representing locations in source code.
The <source-location>
class is an open,
abstract class with no slots. All instances must provide a method
for describe-source-location
.
The source-location
generic takes an arbitrary
object as an argument and returns a source location. The mix-in
class <source-location-mixin>
provides the
init-keyword source-location:
and a method on
source-location
which returns the value of the keyword
or <unknown-source-location>
.
The <source-file>
class reads an entire
input file into memory and provides access to the contents. It's
tightly-coupled to the class
<file-source-location>
, which represents a
location within a source file. Separating these two classes is not
feasible. Among other things, this code generates the attractive,
multiline displays of exactly where an error occured.
Tokens
ModuleThis module provides classes and constants for representing tokens and syntax tables.
Header
ModuleDylan source files have RFC 822 headers, similar to those used by Internet e-mail and many other protocols. This module provides support for parsing headers at the top of a file and extracting the values of keywords. White space at the begining and ends of lines is removed automatically.
Platform
ModuleThis module contains code to parse platforms.descr, which contains descriptions of the various host platforms supported by d2c. A description of each of the parameters appears in platforms.descr itself.
Errors
ModuleThis module exports a number of classes and functions related to compiler errors and warnings.
Signature-Interface
and
Signature
ModulesThe Signature-Interface
module creates a
group of names which are actually implemented in the
Signature
module. These represent the formal
parameter and result lists of a Dylan function.
Names
ModuleWhen reporting errors, d2c often needs to provide names for various functions, anonymous local methods and other things found in Dylan programs. This module knows about several different kinds of names and how they should be displayed to the user.
It does not appear that these classes are used in as part of the compiler's own computations. A quick look suggests that this is merely interface and debugging code.
Definitions
ModuleI'm not quite sure what this module does, so take this description with a grain of salt. Better yet, figure out what this module does and fix this documentation.
The top-of-file comment for this module reads:
Abstract class that all definitions inherit from. A definition is the compilers handle on some run-time value. In addition to the obvious things, definitions exist for things like the type of a variable (when it isn't a compile-time constant).
It appears that named, top-level forms each get a definition. From the comment, it sounds as if the following code produces two definitions:
define variable foo :: run-time-expression() = #f;
One definition refers to the variable itself, and the other refers to the type value computed at initialization time. Is this a correct assumption?
Variables
ModuleThis module is in change of libraries, modules and
namespaces. It maintains the global table of libraries, processes
the various clauses in define library
and define
module
, and keeps track of where variable definitions
live. The magic Dylan-User
modules get set up
here as well.
Each <variable>
object also provides slots
for the associated definition, transformers and constant
evaluator. It would appear that this module defines some of the
more important data structures in the compiler.
Policy
ModuleThe Policy
represents an
optimization policy: a set of parameters
used for making tradeoffs between speed, safety and code size. A
quick inspection of the code shows that a
<policy>
object will not survive the dumping and
loading processes intact. Do these actually get used?
CType
ModuleDoes ctype
stand for "constant
type", "compile-time type", or something else
entirely?
This module defines the compiler's type model. This is used
by the Flow
module to represent the types of expressions, and by the
Compiler-Optimize
library during several phases of optimization.
Most of these types can be represented as Dylan objects.
Such types also inherit from <ct-value>
, so they
can be represented as compile-time values (see the Section called The Compile-Time-Values
Module).
All ctypes inherit from
<values-ctype>
. Single-value types (e.g., the
types of variables) are represented by subclasses of
<ctype>
. Multiple-value types (e.g., the return
values of functions) are represented as instances of
<multi-value-ctype>
.
Various subclasses of <ctype>
represent class types, union types and all the standard (and
non-standard) limited types supported by d2c. The class
<unknown-ctype>
represents a type we won't know
until runtime. The function empty-type
returns a
type union with no members. No object can ever be an instance of
this type. (Empty types are used extensively when computing the
intersections, unions and differences of types.)
Class types are defined in the Classes
module.
<multi-value-ctype>
is a bit tricky. It
represents zero or more required values of known types, zero or
more optional values of known types, and an optional
#rest
list—which may also be typed. (See the
source for more information.) The function
wild-ctype
returns a multi-value type representing
zero or more values of any type.
This module supports type intersections, type unions and type differences, primarily for use by the optimizer. All of the operations may be imprecise; if so, the second return value will typically be false.
The intersection of two types is described as:
Return as restrictive a type as we can discover that is no more restrictive than the intersection of Type1 and Type2. The second value is true if the result is exact. At worst, we arbitrarily return one of the arguments as the first value (trying not to return an unknown type).
The union of two types is described as:
Find a type which includes both types. The result is an unknown type if either of the arguments are unknown; otherwise the result is precise.
The difference of two types is described as:
Return our best guess at the type that describes all objects that are in Type1 but not in Type2. If we can't precisely determine this type, then return something more inclusive than it, but never more inclusive than Type1.
Other things in this module include subtype tests, type
extent computation and type specifiers. The various
"-dispatch
" functions have nothing to do
with the compiler's support for generic functions—they're
just hooks for extending various generic functions defined in this
module.
Transformers
ModuleTransformers are functions that are attached to function
definitions. They serve as a sort of procedural macro in the
optimizer. For instance, the do() function is transformed into
an ordinary loop construct so the optimizer can bang on it
some more. See the cheese
module for usage
of this.
Representation
ModuleThis module provides a very abstract interface for choosing
data representations in generated code. It currently appears to be
implemented by the C-Representation
module.
Classes
ModuleThis module contains the guts of the class system, which is
implemented with classes as a subtype of
<ctype>
. This module also computes class
precedence lists, unique class IDs and slot layouts.
Type-Dump
ModuleThis is an implementation module which contains ODF support
for various subclasses of <ctype>
.
C-Representation
ModuleThis modules knows about the different C data types. It also knows about heap representations, direct representations and general representations.
Compile-Time-Functions
ModuleFrom the top-of-file comment:
A ct-function is a kind of compile-time value used to represent a function. ct-functions contain various linkage-related information needed to call the function, but don't reference the FER for the function (e.g. the <function-literal>.) This information is used both by the backend and by the heap builder.
This is also where Peter Housel added the support for callback functions. This module includes support for general entries and generic entries, which are presumably the type-checked and dispatched entry points for functions, respectively.
Flow
ModuleThese classes define the flow model
used by the compiler. Together with the classes defined in the
Front
module, they make up the intermediate
representation used to communicate between the
Compiler-Convert
library, the Compiler-Front
library, the Compiler-Optimize
library and the
Compiler-CBack
library.
Unfortunately, the front-end and the back-end of the compiler
use a slightly different version of the intermediate
representation. To confuse matters further, both versions are
called the "front-end representation" (see the Section called The Builder-Interface
Module for more
details).
The actual flow model is fairly straightforward. Data-flow classes are defined in data-flow.dylan, and control-flow classes are defined in control-flow.dylan.
The data-flow classes represent variables, constants, operations and assignments. They use a generalization of the traditional three-address form. Expressions are broken down into a series of step-by-step assignments to temporary variables. The right-hand side of an assignment may contain a constant, a variable or a single operation (nesting is not allowed). The operation may take any number of arguments, each of which must be another variable or a constant.
When the flow classes are first created, a particular variable may be assigned to more than once. The optimizer, however, converts all variables to single static assignment (SSA) form. In SSA form, each variable is given a value at creation time and never modified. This restriction simplifies many optimization algorithms considerably.
Prior to SSA conversion, variables are represented by the
class <intial-variable>
. Each individual
assignment to a variable is represented as an
<initial-definition>
. After SSA conversion,
variables are represented by the class
<ssa-variable>
.
The control-flow model represents groups of assignments as
regions. The simplest kind of region is
<simple-region>
, which represents a
linear series of assignments. A
<compound-region>
holds several regions
which are executed in sequence. The various subclasses of
<join-region>
represent control constructs with
non-linear flow patterns. An <exit-region>
transfers control to an enclosing block; various subclasses are
used to return from functions, exit loops and perform other kinds
of funky control flow.
This part of the compiler is well-commented, so you may want
to read the source for further details. You'll also want to look at
the classes defined in Front
module, most of which extend the control- and data-flow
model in various useful ways.
The classes <join-operation>
,
<join-assignment>
and
<join-region>
are currently ignored by the rest
of the compiler. In each case the word "join" refers
to the Phi function used in traditional SSA form. These classes
will be used if anybody ever fixes the optimizer's SSA
conversion.
Prev | Home | Next |
d2c Internals Guide | Up | The Compiler-Parser
library |