Appendix A
BNF
General Notes
Dylan syntax can be parsed with an LALR(1) grammar.
This appendix uses some special notation to make the presentation of the grammar more readable.
- The opt suffix means that the preceding item is optional.
- A trailing ellipsis (…) is used in two different ways to signal possible repetition.
  - If there is only one item on the line preceding the ellipsis, the item may appear one or more times.
  - If more than one item precedes the ellipsis, the last of these items is designated a separator; the rest may appear one or more times, with the separator appearing after each occurrence but the last. (When only one item appears, the separator does not appear.)
- Identifiers for grammar rules are written with uppercase letters when the identifier is used in the phrase grammar but defined in the lexical grammar.
- The grammar does not use distinct identifiers for grammar rules that differ only in alphabetic case.
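The two uses of the trailing ellipsis can be illustrated with a small matcher. The following Python is only a sketch and is not part of the Dylan grammar: the name match_repetition and the representation of tokens as a list of strings are assumptions made for illustration, and only the simplest case (one repeated item, with or without a separator) is handled.

```python
def match_repetition(tokens, item, separator=None):
    """Count the leading tokens that match `item ...` or `item separator ...`.

    Hypothetical helper: returns 0 when not even one item is present.
    """
    if not tokens or tokens[0] != item:
        return 0  # the item must appear at least once
    pos = 1
    while True:
        if separator is None:
            # Single-item form: the item simply repeats.
            if pos < len(tokens) and tokens[pos] == item:
                pos += 1
                continue
        else:
            # Separator form: a separator counts only when another
            # item follows it, so it never appears after the last item.
            if (pos + 1 < len(tokens)
                    and tokens[pos] == separator
                    and tokens[pos + 1] == item):
                pos += 2
                continue
        return pos
```

With this sketch, a lone item matches the separator form without any separator being consumed, and a dangling separator after the last item is left unmatched.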
In the following grammar, some tokens are used multiple ways. For example, the hyphen, -, is punctuation, a unary operator, and a binary operator; also, method is a begin-word and a define-body-word. In some parsing implementations such multiple meanings of a token may not be possible. However, this is just an implementation issue, since the meaning of the grammar is clear. The word method is used as punctuation in local-methods and method-definition; since method is not a core reserved word, this typically has to be implemented by accepting any macro-name and checking semantically that the word used is method.
The grammar as presented is not obviously LALR(1), since the required changes would tend to obscure the readability for human beings (especially in macro definitions and case-body). The grammar can be made LALR(1) through well-known standard transformations implemented by most parser generators.
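The technique mentioned above for method as punctuation, accepting any macro-name and then checking the word semantically, might look roughly like the following. This Python is a minimal sketch under stated assumptions: tokens are represented as (kind, text) pairs, and the names expect_method_word and SemanticError are hypothetical, not taken from any Dylan implementation.

```python
class SemanticError(Exception):
    """Hypothetical error raised when the semantic check fails."""


def expect_method_word(token):
    """Accept any macro-name syntactically, then require the word method."""
    kind, text = token
    if kind != "macro-name":
        # Syntactic step: the parser rule only admits macro-name tokens.
        raise SemanticError("expected a macro-name, got %s" % kind)
    if text.lower() != "method":
        # Semantic step: case is not significant, so compare case-folded.
        raise SemanticError("expected the word method, got %r" % text)
    return text
```

Splitting the check this way keeps the parser tables free of a special case for method, at the cost of reporting the error slightly later than a dedicated token would.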
Lexical Notes
In the lexical grammar, the various elements that come together to form a single token on the right-hand sides of rules must not be separated by whitespace, so that the end result will be a single token. This is in contrast to the phrase grammar, where each element is already a complete token or a series of complete tokens.
Arbitrary whitespace is permitted between tokens, but it is required only as necessary to separate tokens that might otherwise blend together.
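The whitespace rule can be seen in a toy tokenizer. The Python below is a sketch only and does not implement the Dylan lexical grammar: the three token classes (names that may contain hyphens, decimal integers, and a handful of one-character punctuation marks) and the name tokenize are simplifying assumptions chosen for illustration.

```python
import re

# Assumed toy token classes: name, integer, single-character punctuation.
TOKEN = re.compile(r"\s*(?:([A-Za-z][A-Za-z0-9-]*)|([0-9]+)|([(),;+*=-]))")


def tokenize(text):
    """Split text into tokens, skipping any whitespace between them."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(m.group(m.lastindex))  # the one group that matched
        pos = m.end()
    return tokens
```

In this sketch, f(x,y) and f ( x , y ) produce the same six tokens, so the whitespace is optional there. By contrast, a-b is read as one name while a - b is read as three tokens, which is a case where whitespace is needed to keep tokens from blending together.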
Case is not significant except within character and string literals. The grammars do not show this, since each rule is written in one case or the other, but it holds throughout.
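One way a lexer might honor this rule is to fold the case of every token except character and string literals. The Python below is a minimal sketch: the name fold_case is hypothetical, and it assumes a token is a literal exactly when its text begins with a single or double quote, which ignores any other literal forms.

```python
def fold_case(token_text):
    """Lowercase a token unless it is a character or string literal."""
    if token_text[:1] in ('"', "'"):
        # Literals keep their case: "Hello" and "hello" are different.
        return token_text
    return token_text.lower()
```

After folding, DEFINE and define compare equal, while the contents of quoted literals are left untouched.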