Re: Grammars are not Regexps

Nice challenge; thank you.

To avoid the problem of having a grammar that accepts all the test cases
because it accepts everything, I suggest also the following negative
test cases:  a grammar for Pascal-style comments should reject all of
these:

(
(*
(**
(***
(**a
(**a*
(**aa
(*a
(*a*
(*aa
a
)
*)
**)

I'll post my initial solution in a separate answer, together with a
description of my test results and a corrected solution.

Michael


Steven Pemberton writes:

> I was working on some grammars this week, and tripped up a couple of
> times, getting ambiguous grammars, I think because I was thinking in
> regexp terms, and grammars are not regexps.
> The place where this was obvious was with repetition. Regexps are
> greedy, grammars aren't. So "x"* with a regexp will match a maximal
> number of x's, but not necessarily with a grammar.
>
>  I reduced it to a small example in order to work on it more, and I
>  thought maybe some of you would like to try it out too, before I give
> an example answer. I found it quite instructional.
>
> It's Pascal-style comments (XML comments would work similarly),
> comments surrounded by (* and *).
>
> Here is a test set:
>
> (**)
> (***)
> (****)
> (*****)
> (*abc*)
> (**abc*)
> (*abc**)
> (*abc*abc*)
> (**abc*abc*)
> (*abc**abc*)
> (*abc*abc**)
> (*abc* )(*abc*)
> (*abc*)(*abc*)
>
> (The last line contains two comments; the line before has a space
> before the first closing bracket).
>
> It was trickier than I expected at first, because as I said, although
> I was parsing the comments OK, the results were ambiguous.
>
> Steven


-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Thursday, 3 February 2022 23:59:01 UTC