Pest and ordered choices

I’m probably miles behind the rest of you here, but I ran into an interesting problem trying to express the ixml grammar in Pest [1].

The | (vertical bar) operator in Pest is an ordered choice [2]. It short-circuits: alternatives are tried left to right, and as soon as one matches, the whole expression succeeds without trying the rest. This causes trouble for rules like:

        -term: factor;
               option;
               repeat0;
               repeat1.

Expressed as-is in Pest, that becomes:

        term = _{ factor | option | repeat0 | repeat1 }

Now parse something like the right-hand side of:
        Ixml: s, prolog? s .

The ‘prolog’ nonterminal gets picked up as a plain ‘factor’ every time, so the ‘option’ alternative (the one that references the literal ‘?’) is never tried. I confirmed that changing the order of the alternatives fixes this immediate issue, but there are more complicated instances of this in the grammar, particularly between terminals and nonterminals (tmark and mark share prefixes).
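
To make the short-circuit concrete, here is a minimal hand-rolled sketch in plain Rust. It does not use the pest crate, and every name in it (name, factor, option, choice, term_as_written, term_reordered) is an illustrative stand-in, with factor simplified to a bare name. It only mimics how an ordered choice commits to the first alternative that matches.

```rust
// Hand-rolled sketch of PEG-style ordered choice. This is NOT the pest API;
// all function names are hypothetical and the grammar is heavily simplified.

/// Matches a run of ASCII letters; returns the number of bytes consumed.
fn name(input: &str) -> Option<usize> {
    let n = input.bytes().take_while(|b| b.is_ascii_alphabetic()).count();
    if n > 0 { Some(n) } else { None }
}

/// factor: simplified here to a bare name (assumption for illustration).
fn factor(input: &str) -> Option<usize> {
    name(input)
}

/// option: a factor immediately followed by a literal '?'.
fn option(input: &str) -> Option<usize> {
    let n = factor(input)?;
    if input[n..].starts_with('?') { Some(n + 1) } else { None }
}

/// Ordered choice: try alternatives left to right and commit to the first
/// one that matches; later alternatives are never consulted.
fn choice(input: &str, alternatives: &[fn(&str) -> Option<usize>]) -> Option<usize> {
    alternatives.iter().find_map(|alt| alt(input))
}

/// Mirrors `factor | option`: factor wins before option is ever tried.
fn term_as_written(input: &str) -> Option<usize> {
    choice(input, &[factor, option])
}

/// Mirrors `option | factor`: the longer alternative gets the first try.
fn term_reordered(input: &str) -> Option<usize> {
    choice(input, &[option, factor])
}
```

On the input "prolog? s", term_as_written consumes only the six bytes of "prolog" and leaves the ‘?’ behind for whatever comes next, while term_reordered consumes all seven bytes. On "prolog s", the reordered version still falls back to factor, which is why putting the longer alternative first helps in this simple case.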

This seems like it makes Pest unsuitable for this implementation, though I need to sleep on it before any final decisions.

[1] https://pest.rs/
[2] https://pest.rs/book/grammars/syntax.html#ordered-choice

Received on Wednesday, 6 July 2022 03:10:43 UTC