SML# Document Version 4.0.0
6 Setting up SML# programming environment

6.2 Bootstrapping the SML# compiler

This section outlines the structure of the SML# compiler and the method for building (bootstrapping) it. You do not need to understand this section to install and use the SML# compiler, but it may help you understand the various messages printed while SML# is being compiled, as well as the structure of a compiler in general.

The SML# system consists of a single compiler that performs separate compilation. Its interactive mode is realized by a top-level loop that performs the following steps:

  1. Compile the user input using the current static environment as its interface.

  2. Link the object file with the current system to generate a shared executable file.

  3. Dynamically load the shared executable in the current system, and call its entry point.

SML# is written in SML#, C, and C++. In addition, the following tools and libraries are used when compiling the SML# compiler.

  • ml-lex, ml-yacc: a lexical analyzer generator and a parser generator.

  • SMLFormat: a printer generator.

  • The Standard ML Basis Library.

All of them are written in Standard ML.

The SML# compiler compiles each SML# source file (source.sml; SML# is a superset of Standard ML) into a standard object file of the system (source.o). To generate an executable file, the compiled files are then linked by the standard linker (ld on Unix-family OSes), which is invoked through a C/C++ compiler driver command (gcc or clang). So, to build the SML# compiler, it is sufficient to have a C/C++ compiler and an SML# compiler. Of course, when the SML# compiler is built for the first time, no SML# compiler is available. The standard procedure for solving this bootstrap problem is as follows.

  1. Compile the SML# runtime library, which is written in C/C++, and archive it as a static link library.

  2. Obtain a pre-compiled LLVM IR source file of a minimal SML# compiler, minismlsharp, that is sufficient for compiling all the source files used in the SML# compiler. Such pre-compiled files are typically generated by an older version of the SML# compiler.

  3. On the system where the target SML# compiler is to be installed, assemble the minismlsharp LLVM IR file, link it with the runtime library, and create the minismlsharp command.

  4. Using this minismlsharp command, compile all of the tools and libraries, and the full-featured SML# compiler. This procedure is roughly as follows.

    (a) Compile the Basis Library.

    (b) Compile and link the smllex command.

    (c) Compile and link the ML-yacc library and the smlyacc command.

    (d) Generate the parser source code with the smllex and smlyacc commands.

    (e) Compile and link the smlformat command.

    (f) Generate the printer source code with the smlformat command.

    (g) Compile all of the libraries bundled with the SML# distribution.

    (h) Compile and link the smlsharp command.

  5. Install the following files into the specified destination directories.

    • The static link library of the runtime library.

    • Interface files, object files, and signature files of the libraries bundled with the SML# distribution.

    • The smllex, smlyacc, smlformat, and smlsharp commands.
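The ordering constraints among these steps can be made explicit with a small dependency graph. The following Python sketch is illustrative only: the step labels and the edges between them are one reading of the list above (they are assumptions, not rules extracted from the actual SML# Makefile), and the standard-library graphlib module merely stands in for the ordering work that make performs.

```python
from graphlib import TopologicalSorter

# Hypothetical step labels; each maps to the set of steps assumed to precede it.
# The edges are an interpretation of the list above, not the real build rules.
BOOTSTRAP_DEPS = {
    "runtime": set(),
    "minismlsharp": {"runtime"},
    "basis": {"minismlsharp"},
    "smllex": {"basis"},
    "smlyacc": {"basis"},
    "parser_src": {"smllex", "smlyacc"},
    "smlformat": {"basis"},
    "printer_src": {"smlformat"},
    "libraries": {"basis"},
    "smlsharp": {"parser_src", "printer_src", "libraries"},
}


def bootstrap_order(deps=BOOTSTRAP_DEPS):
    """Return one valid build order that respects every edge in deps."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step in bootstrap_order():
        print(step)
```

Several orders satisfy these constraints. Under the assumed edges, the only fixed points are that the runtime library comes first and the smlsharp command comes last.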

As outlined above, there are complex dependencies among the source files and commands. Furthermore, the processing of some of these files depends on the underlying OS. This is a typical situation in the development of a large system. One well-established method for managing such dependencies is to use a configure script generated by GNU Autoconf together with the make command.

The SML# compiler compiles each source file according to its interface file, which describes the set of files required by the source file. The SML# compiler can also generate a list of the files on which each source file depends, in the Makefile format that the make command can process. It does so when given one of the following switches.

  1. smlsharp -MM smlFile. The compiler generates, in the Makefile format, the dependencies needed to compile the source file smlFile.

  2. smlsharp -MMl smiFile. The compiler assumes that the file smiFile specifies the top-level system, and generates the list of necessary object files in the Makefile format.

  3. smlsharp -MMm smiFile. The compiler assumes that the file smiFile specifies the top-level system, and generates a Makefile that builds the executable file of the entire program.

In the SML# project, we use this functionality of SML# to write a Makefile that performs the complicated sequence of compilation and linking steps described above. Invoking the make command with this Makefile recompiles only the files necessary to build the SML# compiler.
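To illustrate why dependency information in the Makefile format allows only the necessary files to be recompiled, here is a minimal Python sketch of a staleness check. It is not how make or SML# is implemented. The rule text, file names, and timestamps are hypothetical, the exact output of the smlsharp switches is not reproduced here, and the parser handles only simple single-line rules of the form "target: prerequisites".

```python
def parse_rules(text):
    """Parse simple 'target: prereq ...' lines into {target: [prereqs]}."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        rules[target.strip()] = prereqs.split()
    return rules


def stale_targets(rules, mtimes):
    """Return the set of targets that must be rebuilt.

    A target is stale if it is missing from mtimes, if any prerequisite is
    newer than it, or if any prerequisite is itself a stale target.
    """
    cache = {}

    def is_stale(target):
        if target in cache:
            return cache[target]
        cache[target] = False  # guards against cyclic rules
        t = mtimes.get(target)
        stale = t is None
        for p in rules[target]:
            if p in rules and is_stale(p):
                stale = True
            elif p in mtimes and t is not None and mtimes[p] > t:
                stale = True
        cache[target] = stale
        return stale

    return {t for t in rules if is_stale(t)}


# Hypothetical rules in the style of compiler-generated dependencies.
RULES = parse_rules("""
main.o: main.sml main.smi util.smi
util.o: util.sml util.smi
main: main.o util.o
""")

# Hypothetical modification times; util.sml was edited after util.o was built.
MTIMES = {
    "main.sml": 1, "main.smi": 1, "util.smi": 1, "util.sml": 9,
    "main.o": 5, "util.o": 5, "main": 6,
}

if __name__ == "__main__":
    print(sorted(stale_targets(RULES, MTIMES)))
```

With this sample data, only util.o and the final executable main are reported as stale; main.o is left alone because none of its prerequisites changed after it was built.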