Grammars are not Regexps

I was working on some grammars this week, and tripped up a couple of times, 
getting ambiguous grammars, I think because I was thinking in regexp terms, 
and grammars are not regexps.
The place where this was obvious was with repetition. Regexps are greedy, 
grammars aren't. So "x"* with a regexp will match a maximal number of x's, 
but not necessarily with a grammar.

  I reduced it to a small example in order to work on it more, and I 
thought maybe some of you would like to try it out too, before I give an 
example answer. I found it quite instructional.

It's Pascal-style comments (XML comments would work similarly), comments 
surrounded by (* and *).

Here is a test set:

(**)
(***)
(****)
(*****)
(*abc*)
(**abc*)
(*abc**)
(*abc*abc*)
(**abc*abc*)
(*abc**abc*)
(*abc*abc**)
(*abc* )(*abc*)
(*abc*)(*abc*)

(The last line contains two comments; the line before has a space before 
the first closing bracket).

It was trickier than I expected at first, because as I said, although I was 
parsing the comments OK, the results were ambiguous.

Steven

Received on Thursday, 3 February 2022 21:06:25 UTC