SML# Document Version 4.0.0
30 A parser generator smlyacc and smllex

30.4 The structure of a smllex input file

An input file of smllex has the following general structure.

user declarations
%%
LEX declarations
%%
descriptions of regular expressions and their attributes

  1.

    The user declaration section

    The SML# code in the user declaration section defines the types used in lexical analysis and any other user-level code used by the attribute specifications in the regular expression section.

    The mandatory specifications are the following.

       type lexresult = ...
       fun eof () = ...
    
    • lexresult type. The type of the value returned by the lexer when it accepts a regular expression. When smllex is used with smlyacc, this is usually the token type defined in the YaccInputFile.grm file.

    • eof function. This function is called when the lexical analyzer detects the end of file. It usually returns the value (of type lexresult) that represents the end-of-file token.
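
    As a minimal sketch, the user declaration section of a small lexer could look like the following. The token datatype and its constructors (NUM, PLUS, EOF) are hypothetical names chosen for illustration. When smllex is used together with smlyacc, lexresult would normally refer to the token type generated from the .grm file instead.

       (* hypothetical token type for a tiny arithmetic language *)
       datatype token = NUM of int | PLUS | EOF

       (* mandatory: the type of the values returned by the lexer *)
       type lexresult = token

       (* mandatory: called at end of input; returns the end-of-file token *)
       fun eof () = EOF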

  2.

    The LEX declaration section

    This section specifies the directives that control smllex and the auxiliary definitions used to write regular expressions in the regular expression section. The syntax of auxiliary definitions is the same as in the standard LEX system.

    smllex directives include the following.

    • %structure declaration. smllex generates the lexical analyzer as a structure. This declaration specifies the name of that structure, as follows.

          %structure MLLex
      
    • %arg declaration. This specifies an extra argument to be passed to the lexical analyzer. It can be omitted if no extra argument is needed.

    • %full declaration. This specifies that the generated lexer handles 8-bit characters.
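
    Put together, a LEX declaration section using these directives might read as follows. This is only a sketch: it assumes that smllex accepts the directive and definition syntax described in mllex.pdf, in which %arg takes a parenthesized pattern terminated by a semicolon. The argument pattern (fileName : string) and the auxiliary names digit and ws are hypothetical.

       %structure MLLex
       %arg (fileName : string);
       %full
       digit=[0-9];
       ws=[\ \t\n];

    Each auxiliary definition has the form name=regexp; and can be referenced as {name} in the regular expression section.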

  3.

    The regular expression definition section

    This section defines the set of regular expressions to be accepted.
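
    To illustrate how the three sections fit together, here is a sketch of a complete input file for a tiny language of integers and "+". It assumes the ML-Lex rule syntax regexp => (action); and the ML-Lex conventions that yytext holds the matched text and that lex() restarts the lexer. The token datatype and the names digit and ws are hypothetical.

       (* user declarations: hypothetical token type plus the two mandatory definitions *)
       datatype token = NUM of int | PLUS | EOF
       type lexresult = token
       fun eof () = EOF
       %%
       %structure MLLex
       digit=[0-9];
       ws=[\ \t\n];
       %%
       {ws}+    => ((* skip white space and scan the next token *) lex());
       {digit}+ => ((* convert the matched digits to an int *) NUM (valOf (Int.fromString yytext)));
       "+"      => (PLUS);
       .        => ((* illustrative policy: silently skip any other character *) lex());

    Every action is an SML# expression of type lexresult, which is why the rules that discard input call lex() instead of returning a token themselves.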

For the details of the specification, consult the document src/ml-lex/doc/mllex.pdf in the SML# source code distribution.