Menhir is an LR(1) parser generator for OCaml: it compiles LR(1) grammars down to OCaml code.
Menhir replaces ocamlyacc. Legacy grammars can be compiled by Menhir, with a few caveats, described in the reference manual (HTML; PDF).
Menhir is available through opam, OCaml's package manager.
Type opam install menhir.
Menhir's source code is hosted in this repository (releases; changes).
There is a mailing list for announcements of new releases and discussion of problems, bugs, feature requests, and so on. Only subscribers can post.
Menhir has been designed and implemented by François Pottier and Yann Régis-Gianas.
Menhir has many features that make it superior to the traditional yacc-style parser generators that many people are familiar with.
The modifiers ?, +, and * are sugar for options, nonempty lists, and arbitrary lists.
Parameterized definitions are expanded away in a straightforward way.
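As a rough illustration of parameterized definitions and the ?, +, * sugar, here is a minimal grammar sketch. The file name, the token names, and the nonterminal bracketed are hypothetical and chosen only for this example; separated_list is assumed to come from Menhir's standard library, as described in the reference manual.

```ocaml
(* list_demo.mly: a hypothetical grammar, for illustration only. *)
%token <int> INT
%token COMMA LBRACKET RBRACKET SEMI EOF
%start <int list list> main
%%

(* row* is sugar for list(row): zero or more rows. *)
main:
  rows = row* EOF
    { rows }

(* A user-defined parameterized nonterminal: X can be instantiated
   with any symbol, including another parameterized application. *)
bracketed(X):
  LBRACKET x = X RBRACKET
    { x }

(* SEMI? is sugar for option(SEMI): the trailing semicolon is optional. *)
row:
  xs = bracketed(separated_list(COMMA, INT)) SEMI?
    { xs }
```

Running menhir on such a file should expand each application, such as bracketed(separated_list(COMMA, INT)), into an ordinary nonterminal before the LR(1) automaton is built.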
The %inline keyword indicates that
a nonterminal symbol should be replaced with its definition at every
use site. This offers a second macro-expansion mechanism.
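The sketch below shows one common use of %inline, under the assumption of a small arithmetic grammar whose token and nonterminal names are invented for this example. Without %inline, the production for binary operations would contain no terminal symbol, so the %left declarations could not assign it a precedence and shift/reduce conflicts would remain. With %inline, binop is replaced by its three alternatives, and each resulting production carries its own operator token.

```ocaml
(* inline_demo.mly: a hypothetical grammar, for illustration only. *)
%token <int> INT
%token PLUS MINUS TIMES EOF
%left PLUS MINUS   /* lowest precedence */
%left TIMES        /* highest precedence */
%start <int> main
%%

main:
  e = expr EOF
    { e }

expr:
  | i = INT
    { i }
  | e1 = expr op = binop e2 = expr
    { op e1 e2 }

(* Expanded at its use site: expr gets one production per operator. *)
%inline binop:
  | PLUS  { ( + ) }
  | MINUS { ( - ) }
  | TIMES { ( * ) }
```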
Together, these expansion mechanisms help write concise and elegant
grammars, while avoiding LR(1) conflicts. In other words, they extend
Menhir's expressive power far beyond LR(1),
while retaining the attractive features of LR(1):
determinism, performance, guaranteed unambiguity.
In --table mode only, Menhir supports incremental parsing.
This means that the state of the parser can be saved at any point (at no
cost) and that parsing can later be resumed from a saved state.
Furthermore, Menhir offers an inspection API which allows the
parser's current state and stack to be examined by the user. This opens
the door to a variety of advanced uses, including error explanation, error
recovery, context-dependent lexical analysis, and so on.
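The following driver is a minimal sketch of the incremental interface, not a definitive implementation. It assumes a module Parser generated by Menhir with --table from a grammar whose start symbol main has type int, and an ocamllex-generated module Lexer exposing Lexer.token; both module names are hypothetical. The constructors and functions used (InputNeeded, Shifting, AboutToReduce, HandlingError, Accepted, Rejected, offer, resume) are those documented for MenhirLib's incremental API in the reference manual; the program must be linked with the menhirLib library.

```ocaml
(* driver.ml: assumes hypothetical generated modules Parser and Lexer. *)
module I = Parser.MenhirInterpreter

(* Drive the parser one checkpoint at a time. A checkpoint is an
   immutable value, so keeping a reference to one saves the state. *)
let rec run lexbuf (checkpoint : int I.checkpoint) : (int, string) result =
  match checkpoint with
  | I.InputNeeded _env ->
      (* The parser requests a token: supply it with its positions. *)
      let token = Lexer.token lexbuf in
      let startp = lexbuf.Lexing.lex_start_p
      and endp = lexbuf.Lexing.lex_curr_p in
      run lexbuf (I.offer checkpoint (token, startp, endp))
  | I.Shifting _ | I.AboutToReduce _ ->
      (* Intermediate steps: let the parser continue. *)
      run lexbuf (I.resume checkpoint)
  | I.HandlingError _env ->
      (* A syntax error was detected. The environment could be examined
         here through the inspection API to build a better message. *)
      let pos = lexbuf.Lexing.lex_start_p in
      Error (Printf.sprintf "syntax error at line %d" pos.Lexing.pos_lnum)
  | I.Accepted v -> Ok v
  | I.Rejected -> Error "input rejected"

(* Entry point: build the initial checkpoint and run the loop. *)
let parse_string (s : string) : (int, string) result =
  let lexbuf = Lexing.from_string s in
  run lexbuf (Parser.Incremental.main lexbuf.Lexing.lex_curr_p)
```

Because the loop is written by the user, it can stop at any InputNeeded checkpoint, store that checkpoint, and later call I.offer on it again, which is what makes resuming from a saved state possible.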
Instead of referring to semantic values as $1, $2, etc.,
Menhir allows semantic values to be explicitly named. In fact,
Menhir now has fairly nice syntax for describing grammars.
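To make the difference concrete, here is a small grammar sketch using named semantic values. The token and nonterminal names are hypothetical; the comment shows what the same action would look like in the positional ocamlyacc style.

```ocaml
(* names_demo.mly: a hypothetical grammar, for illustration only. *)
%token <int> INT
%token PLUS LPAREN RPAREN EOF
%left PLUS
%start <int> main
%%

main:
  e = expr EOF
    { e }

expr:
  | i = INT
    { i }
  | LPAREN e = expr RPAREN
    { e }
  /* Positional style would write the action below as $1 + $3. */
  | lhs = expr PLUS rhs = expr
    { lhs + rhs }
```

Named values keep an action correct when symbols are later added to or removed from a production, whereas positional references would have to be renumbered by hand.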