Re: Pest and ordered choices

Thank you for this.

I haven't used PEGs myself (because it seems so hard to predict from the
grammar what language it defines - perhaps I am just tripped up by the
resemblance to conventional grammars), so I am a novice.

But it occurs to me to wonder: given an LL(1) grammar, would a PEG
parser and a conventional parser be guaranteed to recognize the same
set of sentences?

If so, then would an LL(1) version of the ixml grammar be of interest to
potential users of PEG parsers?

If PEG parsers are in fact unsuitable for recognizing arbitrary
context-free grammars, then perhaps an LL(1) grammar for ixml would not
in fact be helpful, since it would get you past one brick wall only to
leave you looking at another.

But if others would also be interested in an LL(1) grammar for ixml, it
might spur me to get one done.

Michael


M Joel Dubinko <micah@dubinko.info> writes:

> In case anyone else wants to play around with this, I put this very rough draft up on GitHub. [1]
>
> Thanks to a nice WebAssembly hack, you can play with it in a browser without having to configure any Rust environment. Just go to https://pest.rs and paste in the grammar from ixml.pest into the “Grammar” box, and a sample
> grammar in the Input box.
>
> I’ve been using a greatly simplified input, but feel free to put in all of ixml.ixml. :)
>
> With
>     ixml: s, prolog?, s .
> Or
>     ixml: s, prolog?, rule++RS, s.
>
> In particular, play around with the order of the choices in the ‘term’ rule, as mentioned previously.
>
> It’s possible to get into a state where the parser says “expected comment” — which I am still figuring out. 
>
> j
>
> [1] https://github.com/mdubinko/hackles/blob/main/src/ixml.pest
>
>  On Jul 5, 2022, at 11:10 PM, M Joel Dubinko <micah@dubinko.info> wrote:
>
>  I’m probably miles behind the rest of you here, but I ran into an interesting problem trying to express the ixml grammar in Pest [1].
>
>  The | (vertical bar) operator in Pest is an ordered choice [2]. It short-circuits: it tries each option left to right and, upon finding a match, immediately succeeds out of the whole expression. This matters for
>  rules like:
>
>          -term: factor;
>                 option;
>                 repeat0;
>                 repeat1.
>  If expressed as-is in Pest, like this:
>          term = _{ factor | option | repeat0 | repeat1 }
>
>  Against a rule like the right-hand side of
>          ixml: s, prolog?, s .
>
>  The ‘prolog’ nonterminal will get picked up as a plain ‘factor’ every time, short-circuiting out the ‘option’ path (where the literal ‘?’ is referenced). I confirmed that changing the order of terms fixes this immediate issue, but there are
>  more complicated instances of this in the grammar, particularly between terminal and nonterminal (tmark and mark share prefixes).
>
>  This seems like it makes Pest unsuitable for this implementation, though I need to sleep on it before any final decisions.
>
>  [1] https://pest.rs
>  [2] https://pest.rs/book/grammars/syntax.html#ordered-choice
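
The short-circuiting described in the quoted message can be reproduced
without Pest. What follows is only a minimal Python sketch of ordered
choice, not Pest's implementation. The helper names (factor, option,
ordered_choice, matches_whole) are hypothetical, and 'factor' is
simplified here to a run of ASCII letters rather than the real ixml
definition.

```python
def factor(text, pos):
    """Simplified stand-in for 'factor': one or more ASCII letters.
    Returns the position after the match, or None on failure."""
    end = pos
    while end < len(text) and text[end].isascii() and text[end].isalpha():
        end += 1
    return end if end > pos else None


def option(text, pos):
    """Simplified stand-in for 'option': a factor followed by '?'."""
    end = factor(text, pos)
    if end is not None and end < len(text) and text[end] == "?":
        return end + 1
    return None


def ordered_choice(*alternatives):
    """PEG-style choice: commit to the first alternative that matches
    and never come back to try the later ones."""
    def parse(text, pos):
        for alternative in alternatives:
            end = alternative(text, pos)
            if end is not None:
                return end
        return None
    return parse


def matches_whole(parser, text):
    """True if the parser consumes the entire input."""
    return parser(text, 0) == len(text)


# Same order as the rule in the quoted message: factor comes first.
term_as_written = ordered_choice(factor, option)
# Reordered so that the longer alternative is tried first.
term_reordered = ordered_choice(option, factor)

print(matches_whole(term_as_written, "prolog?"))  # False
print(matches_whole(term_reordered, "prolog?"))   # True
print(matches_whole(term_reordered, "prolog"))    # True
```

With factor first, the choice succeeds after consuming "prolog" and
leaves the "?" unread, so the enclosing rule fails. Putting option
first lets both inputs through, because option fails cleanly on a bare
name and the choice then falls back to factor.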


-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Wednesday, 6 July 2022 16:06:47 UTC