The grammar productions are defined in EBNF, a common notation for grammars that is described elsewhere. Please be aware that the grammar files use their own "dialect" of EBNF, explained below.
Each production definition consists of a production name and a set of production alternatives. Each alternative in turn contains references to tokens or other productions. Self-referencing is also accepted. A simple example of a production definition is shown in the figure below.
Prod = "TokenString" OtherProd | TOKEN_NAME OtherProd ;
Figure 1. A simple example production. This production contains two alternatives illustrating all the valid ways of referring to a token or a production.
In the example above, "=" is used for separating the production name from the definition, instead of the standard EBNF ":=". Also, the definition must end with a ";" character. The production name follows the same restrictions as the token name, i.e. it may only contain characters from the set [a-zA-Z0-9_].
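As a further illustration of the self-referencing mentioned earlier, the sketch below shows a production that refers to itself in its first alternative. The names Sum and NUMBER are hypothetical and chosen only for this example; NUMBER is assumed to be a token defined elsewhere in the grammar file.

```
Sum = NUMBER "+" Sum | NUMBER ;
```

Here the first alternative consumes a number and a "+" token before referring back to Sum, while the second alternative ends the recursion.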
The grammar files also allow some constructs that are not part of standard BNF or EBNF. In particular, this includes grouping with the "{", "}", "[", and "]" characters. See the figure below for an example. Normal grouping with "(" and ")" is of course also allowed.
Prod = "1"? | "2"* | "3"+ | {"4"} | ["5"] ;
Figure 2. An example of the accepted forms of grouping and repetition. The "?", "*", and "+" characters are allowed after productions, tokens, or normal parentheses. The "{...}" grouping is shorthand for "(...)*", and the "[...]" grouping is shorthand for "(...)?".
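The equivalences described in the caption can be sketched with pairs of productions that should accept the same input. All names here (ArgsA, ArgsB, SignedA, SignedB, Arg, NUMBER) are hypothetical and not taken from any real grammar; Arg is assumed to be a production and NUMBER a token defined elsewhere.

```
ArgsA = Arg {"," Arg} ;
ArgsB = Arg ("," Arg)* ;
SignedA = ["-"] NUMBER ;
SignedB = ("-")? NUMBER ;
```

Each production keeps at least one mandatory element, so the optional or repeated group never makes up the whole alternative.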
The EBNF dialect used in Grammatica currently does not allow null (i.e. empty) production alternatives. Instead, the same result can be obtained by making all references to the production optional with the [...] construct or similar.
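A minimal sketch of this workaround follows, using hypothetical names (Call, Args, Arg, IDENTIFIER) that are assumptions made for illustration. Rather than giving Args an empty alternative, as in the disallowed form Args = Arg {"," Arg} | ; the reference to Args inside Call is made optional:

```
Call = IDENTIFIER "(" [Args] ")" ;
Args = Arg {"," Arg} ;
```

With this arrangement Args always matches at least one Arg, and an empty argument list is handled at the point of reference instead.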