In programming language formalizations, it is generally useful to distinguish static semantics and dynamic semantics.
- The static semantics defines the constraints that must be satisfied by a program at compile time: for example, every operator is applied to operands of the right types, every referenced variable exists, and so on. This may involve not only type checking, but also type inference.
- The dynamic semantics defines how a program executes. It can be thought of as a high-level interpreter for the programming language.
Both static and dynamic semantics are defined over the abstract syntax trees (ASTs): the static semantics checks constraints on the various parts of an AST, possibly inferring additional parts of it (e.g. missing types); the dynamic semantics goes through each part of the AST, manipulating the computation state according to the language constructs encountered.
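To make the shape of the two semantics concrete, here is a minimal sketch in Haskell over a hypothetical three-type fragment (not Leo's actual formalization; the `Type` and `Expr` definitions are simplifications introduced only for illustration). The static semantics is a function over the AST that checks the constraints and computes a type, given a static environment mapping variables to their declared types.

```haskell
-- Hypothetical fragment: a few types and expressions (not Leo's actual AST).
data Type = U8 | U16 | Field
  deriving (Eq, Show)

data Expr
  = Lit Type Integer   -- a literal with its declared type
  | Var String         -- a variable reference
  | Add Expr Expr      -- the + operator
  deriving Show

-- Static semantics: check constraints over the AST and compute a type,
-- given a static environment mapping variables to their declared types.
typeOf :: [(String, Type)] -> Expr -> Either String Type
typeOf _   (Lit t _) = Right t
typeOf env (Var x)   = maybe (Left ("undefined variable " ++ x)) Right (lookup x env)
typeOf env (Add a b) = do
  ta <- typeOf env a
  tb <- typeOf env b
  if ta == tb then Right ta else Left "operands of + have different types"
```

The dynamic counterpart, defined over the same `Expr` type, is sketched further below.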
It is generally convenient to define a ‘defensive’ dynamic semantics, i.e. one where the aforementioned interpreter checks that all the preconditions of an operation are satisfied before performing the operation. These preconditions include, in particular, the fact that the operands of an operation have the right types. For example, in Leo, before executing `+`, the interpreter checks that the two values are both integers of the same type, or both field elements, or both group elements. As another example, also in Leo, upon encountering a variable, the interpreter checks that the variable is in the current computation state, which means that a definition of the variable has been previously executed (in the current scope). Note that, in order to check that values have the expected types, the values manipulated by the interpreter carry information about their types: a value is not just the integer 3, but the integer 3 of type `u8`, for instance. Here ‘operation’ does not mean just the arithmetic unary and binary operators, but every manipulation of values performed by the program, e.g. accessing an array.
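Continuing the hypothetical sketch above (again not Leo's formalization), the defensive interpreter can be pictured as an evaluator over the same `Expr` type, where values carry their types and the `+` case re-checks the operands before computing.

```haskell
-- Values in the defensive interpreter carry their types:
-- a value is not just 3, but 3 of type u8.
data Value
  = IntVal Type Integer   -- an integer together with its integer type
  | FieldVal Integer      -- a field element (modular reduction omitted in this sketch)
  deriving Show

data DynErr
  = TypeError String
  | UndefinedVariable String
  deriving Show

-- Defensive dynamic semantics: before performing an operation,
-- check that its preconditions (here, operand types) are satisfied.
eval :: [(String, Value)] -> Expr -> Either DynErr Value
eval _   (Lit Field n) = Right (FieldVal n)
eval _   (Lit t     n) = Right (IntVal t n)
eval env (Var x)       = maybe (Left (UndefinedVariable x)) Right (lookup x env)
eval env (Add a b)     = do
  va <- eval env a
  vb <- eval env b
  case (va, vb) of
    (IntVal ta x, IntVal tb y)
      | ta == tb -> Right (IntVal ta (x + y))   -- same integer type (overflow ignored here)
    (FieldVal x, FieldVal y) -> Right (FieldVal (x + y))
    _ -> Left (TypeError "+ applied to operands of incompatible types")
```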
When the interpreter encounters an unsatisfied precondition (e.g. when trying to apply `+` to an address and an array), it stops with an error. There are different kinds of errors: type errors, undefined variables/functions, divisions by zero, array indices out of bounds, etc.
The static semantics is designed to rule out some of the above errors, such as type errors, and also, at least in certain cases in Leo, division by zero and similar errors (when constant propagation can be performed). How can we know whether the static semantics fulfills this goal? We can formally prove a theorem asserting that if the program satisfies all the static semantic constraints, then the dynamic semantic interpreter will never return certain kinds of errors when it executes the program, for all possible inputs to the program. Other errors, such as division by zero when constant propagation cannot be performed, cannot be ruled out by the static semantics.
Proving this theorem about static and dynamic semantics is commonly done in programming language formalization. It provides an important validation of the design of the language and of the consistency between static and dynamic semantics; more precisely, it establishes the soundness of the static semantics with respect to the dynamic semantics.
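In terms of the hypothetical sketch above, the shape of such a soundness theorem can be illustrated as the following property, where `gamma` is the static environment, `sigma` is the run-time state, and `agrees` is an assumed helper saying that `sigma` assigns each variable in `gamma` a value of its declared type. This is only an illustration of the form of the statement, not the actual Leo theorem.

```haskell
-- Shape of the soundness theorem over the sketch above: if the static
-- semantics accepts an expression, then evaluating it in any run-time
-- state that agrees with the static environment yields no dynamic error.
soundness :: [(String, Type)] -> [(String, Value)] -> Expr -> Bool
soundness gamma sigma e =
  case typeOf gamma e of
    Left _ -> True                           -- statically rejected: nothing to show
    Right _
      | not (agrees gamma sigma) -> True     -- only states matching gamma are relevant
      | otherwise ->
          case eval sigma e of
            Left _  -> False                 -- ruled out by the theorem
            Right _ -> True
  where
    -- the run-time state assigns each statically known variable
    -- a value of its declared type
    agrees g s =
      map fst g == map fst s
        && and (zipWith matches (map snd g) (map snd s))
    matches U8    (IntVal U8 _)  = True
    matches U16   (IntVal U16 _) = True
    matches Field (FieldVal _)   = True
    matches _     _              = False
```

In a mechanized formalization this property would be stated and proved once and for all (e.g. by induction on the structure of expressions), rather than checked by testing.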
The practical significance of this theorem is that an actual interpreter or compiler (i.e. not the high-level defensive interpreter that formally defines the dynamic semantics) can safely assume that, at run time, all the properties checked statically (according to the static semantics) hold, and so they do not need to be checked explicitly before performing an operation. In particular, this includes the fact that all values have the right types for the operations applied to them. This means that values do not need to carry type information at run time (as they do in the high-level defensive interpreter). For instance, going back to the previous example, the value 3 alone can be used, without explicit information about whether it is a `u8`, a `u16`, or a field element.
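To illustrate this last point on the same hypothetical fragment, a non-defensive evaluator can manipulate plain integers, with no type tags and no run-time checks; the soundness theorem is what makes the omitted checks unreachable (wrap-around and field arithmetic are again ignored in this sketch).

```haskell
-- Non-defensive evaluation over untagged values: the run-time state maps
-- variables to plain integers, with no type information attached.
evalErased :: [(String, Integer)] -> Expr -> Integer
evalErased _   (Lit _ n) = n        -- the value is just 3, not "3 of type u8"
evalErased env (Var x)   =
  case lookup x env of
    Just n  -> n
    Nothing -> error "unreachable: ruled out by the static semantics"
evalErased env (Add a b) =
  evalErased env a + evalErased env b   -- no type check performed here
```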