- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Thu, 03 Feb 2022 16:58:43 -0700
- To: Steven Pemberton <steven.pemberton@cwi.nl>
- Cc: public-ixml@w3.org
Nice challenge; thank you. To avoid the problem of having a grammar that accepts all the test cases because it accepts everything, I suggest also the following negative test cases: a grammar for Pascal-style comments should reject all of these: ( (* (** (*** (**a (**a* (**aa (*a (*a* (*aa a ) *) **) I'll post my initial solution in a separate answer, together with a description of my test results and a corrected solution. Michael Steven Pemberton writes: > I was working on some grammars this week, and tripped up a couple of > times, getting ambiguous grammars, I think because I was thinking in > regexp terms, and grammars are not regexps. > The place where this was obvious was with repetition. Regexps are > greedy, grammars aren't. So "x"* with a regexp will match a maximal > number of x's, but not necessarily with a grammar. > > I reduced it to a small example in order to work on it more, and I > thought maybe some of you would like to try it out too, before I give > an example answer. I found it quite instructional. > > It's Pascal-style comments (XML comments would work similarly), > comments surrounded by (* and *). > > Here is a test set: > > (**) > (***) > (****) > (*****) > (*abc*) > (**abc*) > (*abc**) > (*abc*abc*) > (**abc*abc*) > (*abc**abc*) > (*abc*abc**) > (*abc* )(*abc*) > (*abc*)(*abc*) > > (The last line contains two comments; the line before has a space > before the first closing bracket). > > It was trickier than I expected at first, because as I said, although > I was parsing the comments OK, the results were ambiguous. > > Steven -- C. M. Sperberg-McQueen Black Mesa Technologies LLC http://blackmesatech.com
Received on Thursday, 3 February 2022 23:59:01 UTC