The d2c compiler was developed by the Gwydion Project at CMU. This chapter explains the structure of d2c and gives a brief overview of how the various components interact.
d2c compiles one library at a time. Each library contains one or more modules. A module may be implemented by one or more files. Each file contains one or more source records, the logical unit of Dylan compilation.
By compiling many source records at once, d2c produces more efficient code at the cost of using more memory. In addition to processing an entire library's worth of source records, d2c also performs interlibrary optimizations by looking at certain information from previously compiled libraries.
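The containment described above (library, module, file, source record) can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names, and example data are hypothetical and are not taken from d2c's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceRecord:
    """One logical unit of compilation within a file (hypothetical model)."""
    text: str


@dataclass
class SourceFile:
    """A file contributing one or more source records to a module."""
    name: str
    records: List[SourceRecord] = field(default_factory=list)


@dataclass
class Module:
    """A module, implemented by one or more files."""
    name: str
    files: List[SourceFile] = field(default_factory=list)


@dataclass
class Library:
    """A library, the unit handed to the compiler in one run."""
    name: str
    modules: List[Module] = field(default_factory=list)

    def all_source_records(self) -> List[SourceRecord]:
        """Collect every source record in the library, in declaration order."""
        return [
            record
            for module in self.modules
            for source_file in module.files
            for record in source_file.records
        ]


# Hypothetical example data, not taken from any real Dylan library.
example = Library(
    name="example-library",
    modules=[
        Module(
            name="example-module",
            files=[
                SourceFile("first.dylan", [SourceRecord("record 1"), SourceRecord("record 2")]),
                SourceFile("second.dylan", [SourceRecord("record 3")]),
            ],
        )
    ],
)
```

Flattening the whole library into a single list of records mirrors the idea that the compiler sees an entire library's worth of source records in one run, which is what makes whole-library optimization possible and also what drives up memory use.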
In its current configuration, d2c is not well-suited to developing new Dylan code. It is, however, well-suited to building production versions of libraries. Obvious future projects include reducing the granularity of compilation and implementing a selectively-incremental compiler that can modify the object code of a running program. The basic architecture was designed with these projects in mind.
The actual compilation process proceeds roughly as follows:
Load any libraries used by the current library. The compiler begins by reading in the *.lib.du file for the Dylan library. As other libraries are used, it loads the appropriate *.lib.du files. These contain most of the compiler data structures from the original versions of those libraries.
Parse and macro-expand each of the input files. The parser is implemented by the Compiler-Parser library. As each top-level form gets parsed, it is passed to process-top-level-form, which is implemented by the Compiler-Convert library. At the end of this stage, the input files have been completely parsed and macro-expanded.
Finalize the top-level forms. Top-level forms may contain forward dependencies within a module. In most cases, these dependencies aren't necessitated by the design of the Dylan language, but they are needed for efficient compilation. We need to study these dependencies if we want to make incremental compilation work. The implementation of finalize-top-level-form lives in the Compiler-Convert library.
Process the class hierarchy. Optimizing Dylan code requires careful attention to the object model. Unfortunately, d2c uses algorithms which require the entire class hierarchy to be computed before link time. (The biggest problem here appears to be assigning unique IDs to individual classes.) The code to analyze the class hierarchy lives in the Classes module of the Compiler-Base library.
Begin processing top-level definitions individually. Up until this point, d2c has performed each step on the entire library before continuing. The next several steps, however, are completed for each top-level form before moving on to the next top-level form. This process is controlled by compile-1-tlf in the Compiler-Main library.
Convert the parse tree into the front-end representation. For various unfortunate reasons, d2c uses the term front-end representation (FER) for what other compilers call the intermediate representation. The conversion process is controlled by methods on convert-top-level-form provided by the Compiler-Convert library. These take the parsed representation of a top-level form and convert it to the FER using the Builder-Interface module.
Optimize the FER of each top-level form. The compiler performs two kinds of optimizations: simplifications and required optimizations. The simplifications should produce output equivalent to their input. The required optimizations, however, clean up certain artifacts in the code and insert type checks. It is an error to pass a top-level form to the back end before performing the required optimizations. See the section called "The Compiler-Optimize Library" for more details.
Emit C code for the FER of each top-level form. The Compiler-CBack library emits the actual C code required to implement each top-level form.
Dump the local and global heaps. Each library has a local heap which gets linked into the library itself. This contains as much static data as possible. However, not all data can safely live here; some must live in the executable application. The per-library heap is called the local heap and the per-application heap is called the global heap. The latter is only dumped if the current library will become an executable application. Both dumpers are implemented by the Heap module.
Save a *.lib.du file for the current library if necessary. If d2c is not compiling an executable application, it dumps the data structures used by this library into a *.lib.du file. If another Dylan library includes this library, the dumped data structures will be used to perform inter-library optimizations. The dumped data is stored as persistent objects in Object Description Format (ODF). The dumping process is controlled by the Compiler-Main library. ODF is implemented by the OD-Format module.
Generate a Makefile and run the C compiler. The compilation driver in the Compiler-Main library creates a Makefile and C source files as needed during the compilation process. Once all the necessary code has been generated and the heaps are built, d2c invokes GNU make on the Makefile.
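The ordering of the steps above can be summarized in a short sketch. It only records which phase would run when; the function name, the phase strings, and the `is_executable` flag are hypothetical and do not correspond to actual d2c entry points. The point it illustrates is the split between phases that run over the whole library and phases that run to completion for one top-level form at a time.

```python
from typing import List


def compilation_trace(top_level_forms: List[str], is_executable: bool) -> List[str]:
    """Return the order of phases for one library (illustrative only)."""
    trace: List[str] = []

    # Whole-library phases: each finishes for every form before the next begins.
    trace.append("load used libraries")
    for form in top_level_forms:
        trace.append(f"parse {form}")
    for form in top_level_forms:
        trace.append(f"finalize {form}")
    trace.append("process class hierarchy")

    # Per-form phases: all three finish for one form before the next form starts.
    for form in top_level_forms:
        trace.append(f"convert {form} to FER")
        trace.append(f"optimize {form}")
        trace.append(f"emit C for {form}")

    # Heaps: the local heap is always dumped, the global heap only for executables.
    trace.append("dump local heap")
    if is_executable:
        trace.append("dump global heap")
    else:
        # Non-executable libraries save their data for later inter-library use.
        trace.append("save *.lib.du")

    trace.append("run make")
    return trace


if __name__ == "__main__":
    for phase in compilation_trace(["form-a", "form-b"], is_executable=False):
        print(phase)
```

In this sketch, every form is parsed and finalized before any form is converted, while conversion, optimization, and C emission are interleaved form by form.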