SML# Document Version 4.0.0
22 SQL Expressions and Commands

22.4 SQL Value Expressions

An SQL value expression is an expression that occurs in an SQL command and evaluates to a value when the SQL command evaluates. An SML#’s SQL value expression constructs a fragment of the SQL value expression as its value. The SQL value expression constructed is almost literally identical to the SML#’s counterpart and evaluated on the database server. For example, the SML#’s SQL value expression

_sql(1 + #employee.salary)

is evaluated to the SQL value expression

1 + employee.salary

as a fragment of an SQL command to be sent to a database server.

If an SQL value expression contains a subexpression that can be evaluated by SML#, the subexpression evaluates to a value and the value is embedded in an SQL command to be sent to a database server. For example,

_sql(1 + 2 + #employee.salary)

is evaluated to the SQL value expression

3 + employee.salary

of a fragment of an SQL command.

The following set of expressions sqlexp, which includes a subset of value expressions of the standard SQL, is available in SML#:

  • SQL value expressions (top-level)
    sqlexp ::= sqlinfexp  | not sqlexp SQL’s logical nagation  | sqlexp and sqlexp SQL’s logical conjunction  | sqlexp or sqlexp SQL’s logical disjunction

  • SQL infix expressions
    sqlinfexp ::= sqlcastexp  | sqlinfexp vid sqlinfexp infix expresssions

  • SQL type cast expressions
    sqlcastexp ::= sqlappexp  | (vid) sqlcastexp type cast

  • SQL function application expressions
    sqlappexp ::= sqlatexp  | vid sqlatexp function applications  | sqlappexp sqlatexp function applications  | sqlappexp is (not)? sqlis SQL’s IS predicates sqlis ::= null | true | false | unknown

  • SQL atomic expressions
    sqlatexp ::= scon constant literals  | true SML#’s true literal  | false SML#’s false literal  | null SQL’s NULL literal  | #lab.lab Reference to a column in a named relation  | #.lab Reference to a column in an unnamed relation  | vid SML#’s variable reference  | op longvid SML#’s variable reference  | (sqlexp, , sqlexp) SML#’s tuples  | ((_sql)? sqlselect) SQL’s SELECT subqueries  | (_sql)? exists ((_sql)? sqlselect) SQL’s EXISTS subqueries  | ((_sql)? sqlcommand)  | ((_sql)? sqlclause)  | (sqlexp)  | ( ... exp ) embedded SQL value expressions

As seen in the definition, some SQL value expressions may begin with the keyword _sql. This is allowed just for the inner one of a nested SQL value expressions to be written in the same manner of the outermost one. These _sql keywords are simply ignored.

The type of an SQL value expression sqlexp is determined under the type τ of a set of tables and w of the database connection identification, both of which are given in the static context. The type of every subexpression in an SQL value expression is given under the same τ and w. In what follows, we write that the type of an expression e is (T,w)τ if e has type τ under τ and w. If τ and w do not need to be described, we simply write that the type of e is τ.

22.4.1 Expressions evaluated by SML#

An SQL value expression satisfying the following inductive condition is evaluated by SML# and their values are embedded in the SQL command to be sent to the database server.

  1. 1.

    A constant expression scon is evaluated by SML#.

  2. 2.

    Variable expressions vid and op longvid are evaluated by SML#.

  3. 3.

    A tuple (sqlexp1, , sqlexpn) is evaluated by SML# if sqlexpi is evaluated by SML# for any i.

  4. 4.

    A function application vid sqlatexp is evaluated by SML# if sqlatexp is evaluated by SML#.

  5. 5.

    An infix expression sqlinfexp1 vid sqlinfexp2 is evaluated by SML# if sqlinfexpi is evaluated by SML# for any i.

  6. 6.

    An SML#’s function application sqlappexp sqlatexp is evaluated by SML#. A syntax error occurs if either sqlappexp or sqlatexp is not evaluated by SML#.

For example, consider the following SQL query:

val q = _sql db => select SOME 1 + sum(#a.b) from #db.a group by #a.c;

The subexpression SOME 1 is evaluated by SML# and its value is embedded in the SQL query fragment (if SOME is not defined, it causes an undefined variable error). sum(#a.b) is not evaluated by SML# and therefore embedded in the query without any modification. The SQL query that is sent to the server when executing q is the following:

SELECT 1 + SUM(a.b) AS "1" FROM a GROUP BY a.c

The type of expressions that are evalutated by SML# must be one of the SQL basic types defined in Subsection 22.1.1.

Each subexpression evaluated by SML# in an SQL value expression is evaluated immediately at the time when the SQL value expression is evaluated. The evaluation order of such subexpressions are also same as SML# expressions. An SQL value expression is expansive, i.e, avoided to have polymorphic types, if it contains a function application in a subexpression evaluated by SML#. Since SQL queries are polymorphic in almost cases, SQL value expressions including function applicaions evaluated by SML# often cause “value restriction” warnings, as seen in the following example:

# _sql(1 + 2);
none:~1.~1-~1.~1 Warning:
(type inference 065) dummy type variable(s) are introduced due to value
restriction in: it
val it = _ : (?X1 -> int, ?X0) SQL.exp

To use an SQL expression polymorphically or to put it at the program toplevel, enclose it with fn () => to avoid the warning. For example, in contrast to the above, the following example does not cause the warning:

# fn () => _sql(1 + 2);
val it = fn : [’a, ’b. unit -> (’a -> int, ’b) SQL.exp]

22.4.2 SQL constant expressions

Any constant expression in an SQL value expression is evaluated by SML#. The SML#’s syntax is used to write SQL constants; therefore, they looks differently from the standard SQL. In particular, string literals are totally different than the standard one.

Note that neither the true nor false has SQL.bool3 type. They denotes a first-class boolean value and therefore they are distinguished from boolean expressions. See Section 22.1.2 for the distinction between boolean expressions and values.

The UNKNOWN literal defined in SQL99 feature ID T031 is not provided in SML#. This is due to the fact that PostgreSQL, the only implementation of the T031 feature, does not have the UNKNOWN litreral, and even if another implementation of T031 would appear in the future, UNKNOWN can be substituted by NULL without changing its meaning (this interchangable use of UNKNOWN and NULL has been criticized since it is not compatible with the SQL core).

22.4.3 SQL identifier expressions

All the vid and op longvid references to the SML# variables. A longvid with one or more structure identifiers must begin with the op keyword.

In the SQL value expressions, the infixity of some identifiers are declared as follows:

infix 7 %
infix 5 like ||
nonfix mod

The type and value of an identifier expression is varied depending on whether or not the identifier occurs in an expression evaluated by SML#. If an identifier occurs in an expression evaluted by SML#, its type and value are those of the identifier bound in the current context of the SML# program. Otherwise, the type and value of an identifier are those of the identifier bound in the SQL.Op structure.

For example, the two + identifiers occured in the expression

_sql(1 + 2 + #a.b)

have different meanings. The first + is the + bound in the current SML# environment, and the second + is SQL.Op.+.

An identifier occuring in an expression that is not evaluated by SML#, except for vid of (vid) sqlappexp 22.4.5), is embedded in the SQL query after translating it by the following rules:

  • All lower case alphabets are replaced with upper case alphabets.

The value of an identifier defined in the SQL.Op structure is referred only when the toy program of the SQL query is executed (see Subsection 22.8.3). Its type is used for typing SQL value expressions. For example, the type of _sql(1 + #a.b) in SML# can be determined by looking for the type of SQL.Op.+.

22.4.4 SQL function applications and infix expressions

SQL function application and infix expressions are parsed in the same way as those of SML#, except for builtin logical expressions. The associatibity of infix expressions is decided by SML#’s infix declarations. Thus, SML#’s SQL infix operators may have different associatibity than the standard SQL in accordance with the use of the infix declarations.

Unlike the standard SQL, the syntax of SML#’s function application expressions does not require the function arguments to be parenthesized, but requires at least one delimiter or space between the function and arguemnts. However, particularly in an SQL expressions, it is suggested to use parenthesizes for the arguments and omit the space between the function and arguments so that it looks much closer to the standard SQL. This is completely allowed in the grammar of Standard ML. For example, an aggregate function can be used in a SQL query like as follows:

fn db => _sql select avg(#e.age)
              from #db.employee as e
              group by ()

The subexpression avg(#e.age) is a function application expression that applies the SQL.Op.avg function to the argument #e.age.

Each of the SQL infix operators and functions, except for builtin logical operators, is provided as an SML# library function defined in the SQL.Op structure. The list of SQL functions and operators available in expressions that are not evaluated by SML# is shown in Section 22.9.2.

An SQL function application expression and SQL infix expression is typechecked in the same way as those of SML#.

22.4.5 Type cast expressions

In a SML#’s SQL value expression, the expression consisting of a single identifier surrounded by parenthesizes has a special meaning. By putting an identifier surrounded by parenthesizes in front of an expression, the value of the expression is applied to the function identified by the identifier. The identifier and its surrounding parenthesizes do not appear in the SQL query to be sent to the database server. An identifier surrounded by parenthesizes must be defined in the SQL.Op structure. In what follows, an expression prefixed with an identifier surrounded by parenthesizes is referred to as a type cast expression.

A type cast expression corresponds to implicit type cast in SQL, which is not supported by SML#’s type system. One of its typical usage is to deal with the option and numeric types. As described in Section 22.1, SML# distinguish values being possibly NULL from non-NULL values by using option type. In contrast, the standard SQL allows NULL in any type. Due to this difference, the SML# compiler reports a type error even for a query type-correct in the standard SQL. Similarly, SML# does not allow implicit numeric type conversion, which the standard SQL allows. For example, consider the following that queries people younger than the average in each deparment:

fn db => _sql select #e.department as department, #e.name as name
              from #db.empoloyee as e
              where #e.age < (select avg(#t.age)
                              from #db.employee as t
                              where #t.department = #e.department
                              group by ())

This expression is typechecked, but the type of age column of employee table is infered as SQL.numeric option. Thus, this query cannot be executed if the age column is of int. The source of this inference is that the result of the aggregate function avg is compared with #e.age. The result type of avg is SQL.numeric option. The comparison operator forces the two expressions to have the same type. Therefore, the type of #e.age is SQL.numeric option. In the standard SQL, age may have an arbitrary numeric type since its type is implicitly casted to NUMERIC.

For the above example, inserting Num function to the left side of the comparison operator allows age to have an arbitrary numeric type.

fn db => _sql select #e.department as department, #e.name as name
              from #db.empoloyee as e
              where (Num)#e.age < (select avg(#t.age)
                                   from #db.employee as t
                                   where #t.department = #e.department
                                   group by ())

This (Num) is ignored when an SQL query is composed; therefore, regardless of the existence of (Num), the SQL query to be sent to the database server is not changed. The (Num) is evaluated statically only for the typechecking of the SQL value expression.

For the set of identifiers provided for type cast expressions, see Subsection 22.9.1.

22.4.6 SQL logical expressions

The following logical operators are built in the syntax of expressions:

  • not sqlexp

  • sqlexp1 and sqlexp2

  • sqlexp1 or sqlexp2

  • sqlexp is (not)? sqlis

The and operator has an additional restriction on its syntax: to avoid conflict with other SML# syntax, and may occur only if it is parenthesized. For example, the following causes a parse error:

# fn () => _sql where #t.c >= 10 and #t.c <= 20;
(interactive):1.31-1.35 Error: Syntax error: deleting AND HASH

By surrounding the expression including the and operator with parenthesizes, the parse error is avoided, as seen in the following:

# fn () => _sql where (#t.c >= 10 and #t.c <= 20);
val it = fn : [’a#t: ’b, ’b#c: int, ’c.
               unit -> (’a list -> ’a list, ’c) SQL.whr]

The type of a logical expression is SQL.bool3 if its all subexpressions is of the SQL.bool3 type. Similarly, all SQL comparison operators provided as SML# library functions return a query of the SQL.bool3 type.

As described in Section 22.1.2, to avoid confusion on dealing with boolean values in the standard SQL, SML# distingishes the types of boolean expressions and boolean values. Note that boolean value expressions, such as a reference to a column of the BOOLEAN type and the boolean value literals like true, is not allowed to be a boolean expression. The following example causes a type error:

# fn () => _sql(true is false);
(interactive):1.14-1.26 Error:
(type inference 016) operator and operand don’t agree
operator domain: SQL.bool3
operand: ’HBP::bool

In contrast, the following does not cause type error

# fn () => _sql(#t.c = true is false);
val it = fn : [’a, ’b. unit -> (’a -> SQL.bool3, ’b) SQL.exp]

because #t.c = true is an boolean expression whereas true itself denotes an boolean value.

22.4.7 SQL column reference expressions

To reference a certain column of a certain table, write #lab1.lab2, where lab1 and lab2 are the names of the table and column, respectively. The type of a column reference is that of the column lab2 of the table lab1 as specified in the type of table sets given by the static context.

Unlike the standard SQL, which sometimes allows to omit the table name in a column reference, SML# requires the table name explicit in all the column reference expressions. In addition, column references must be prefixed with # to avoid the conflict with other syntax.

Also unlike the standard SQL, table names and column names in SML# are case-sensitive. For example,

SELECT Employee.name FROM EMPLOYEE

is a valid query in the standard SQL, but in SML#,

_sql db => select #Employee.name from #db.EMPLOYEE

causes an type error because the table name referenced is not found. In the SQL value expression to be sent to a server, the table and column names appears in the same case as the SML#’s counterpart.

Sometimes in SQL, a column in an anonymous table is referenced. A typical example of this situation is the ORDER BY clause that refers to a column computed by the SELECT clause. In SML#, such kind of column references is written as #.lab. Its type is the type of the column lab. For example, the SQL query

SELECT e.name AS name, e.age + 1 AS nextAge
FROM employee AS e
ORDER BY nextAge

is written in SML# as follows:

fn db => _sql select #e.name as name, #e.age + 1 as nextAge
              from employee as e
              order by #.nextAge

22.4.8 SQL Subqueries

The following two kinds of subqueries are allowed in SML#:

  • SELECT subqueries: ( sqlselect )

  • EXIST subqueries: exists ( sqlselect )

An additional _sql keyword may appear just before the first keyword of a subquery so that it looks like other SML#’s SQL expressions. These _sqls are simply ignored.

A SELECT subquery ( sqlselect ) must return a 1-column and 1-row result. The type system of SML# enforces the 1-column property of a subquery and therefore it is checked at compile time. The type of sqlselect must be

({1: τ} list, w) SQL.query

in SML#. If it is satisfied, the type of the subquery expression (sqlselect) is (T,w)τ for some τ. The number of rows in the result of the subquery is checked at runtime by a database server. If the subquery returns zero or more than one rows, the SQL.Exec exception is raised.

For an EXIST subquery exists ( sqlselect ), if sqlselect is of type

(τ list, w) SQL.query,

then the entire expression is of type (τ,w)SQL.bool3 for some τ.

22.4.9 Embedded SQL value expressions

To obtain a first-class SML# object of an SQL value expression sqlexp, write an expression _sql(sqlexp). The type of _sql(sqlexp) is (τ -> τ, w) SQL.exp if sqlexp is of type (τ,w)τ. For example, the type of

val q = _sql(1 + #employee.salary)

is

val q : [ ’a#{employee: ’b}, ’b#{salary: int}, ’w. (’a -> int, ’w) SQL.exp ].

To embed a fragment of SQL value expression obtained in such a way in another SQL value expression, use the (...exp) expression. For example, the following expression is constructed by embedding q in it:

_sql((...q) > 10)

This expression denotes the following SQL query:

1 + employee.salary > 10

The type of (...exp) is (τ,w)τ if the type of exp is (τ -> τ, w) SQL.exp. If τ is inconsistent with the type of tables referenced by the expression where the exp is embedded in, a type error occurs. For example, the following expression causes a type error:

_sql((...q) > 10 and #employee.salary = "abc")

The exp in the (...exp) notation is not an SQL value expression but an SML# expression. In exp, SQL keywords are not keywords but ordinary identifiers as well as ordinary SML# expressions.

Although an embedded SQL value expresssion constitutes a constant expression in the resulting SQL value expression, the constant expression is not evaluated by SML#. Whether or not an SQL value expression is evaluated by SML# is determined only by the static syntacitic structure of the expression. For example, the following function nat n does not return the value n, but returns the SQL value expression 1 + 1 + + 1 where the number of 1s is n

# fun nat 1 = _sql(1)
    | nat n = _sql(1 + (...nat (n - 1)));
val nat = fn : [’a, ’b. int -> (’a -> int, ’d) SQL.exp]
# SQL.expToString (nat 5);
val it = "(1 + (1 + (1 + (1 + 1))))" : string