- From: Dave Pawson <dave.pawson@gmail.com>
- Date: Thu, 16 Dec 2021 11:48:51 +0000
- To: Norm Tovey-Walsh <norm@saxonica.com>
- Cc: ixml <public-ixml@w3.org>
Norm, you're using a*"#" where the spec uses "a"*"#" (if that makes any
difference to your explanation)?

It grated on me reading it as a regex! As regex I think it would be
"a"*"#"* to allow 0 or more of either or both? but I'm no longer sure
that is intended? I think it is.

Assuming quotes, is (a#)* or [a#]* any clearer Tom?

regards

On Thu, 16 Dec 2021 at 11:23, Norm Tovey-Walsh <norm@saxonica.com> wrote:
>
> > I'm confused. To my thinking "a"*"#" would match any number of 'a'
> > characters followed by one hash character.
> >
> > To me, a#a a#a#a appears wrong. Is zero or more 'left associative' if
> > that's the right expression? How does the a# repetition match?
> >
> > Same applies to the zero or one example.
>
> That took me a while to get my head around too. I think it’s just part
> of the definition of repeat0 and repeat1. Note that these are different:
>
>   a*"#"
>
> and
>
>   a*,"#"
>
> The former can be distinguished as “factor * sep” and has the semantics
> that the spec describes for repeat0. The latter is “a*” followed by “#”.
>
> Later, “hints for implementors” observes that
>
>   a*"#"
>
> can be rewritten. For example:
>
>   X: a*"#" .
>
> can be rewritten as
>
>   X: Y .
>   -Y: a+"#" ; .
>
> which can be further rewritten as
>
>   X: Y .
>   -Y: Z ; .
>   -Z: a, ("#", a)*
>
> And on we go:
>
>   X: Y .
>   -Y: Z ; .
>   -Z: a, A .
>   -A: "#", a, A ; .
>
> I find “the alternative that matches nothing” in the grammar quite hard
> to read. I almost wish we had a special symbol for it, like ∅. Then we’d
> have:
>
>   X: Y .
>   -Y: Z ; ∅ .
>   -Z: a, A .
>   -A: "#", a, A ; ∅ .
>
> Unless I’ve got something wrong along the way, of course.
>
> All of this has turned out to be an interesting challenge for me to
> implement because I don’t think the PEP parser can be made to accept
> “nothing” as an alternative. Instead, I think (I think!) I’ve modified
> it so that it understands an optional terminal or nonterminal. So I’ll
> end up rewriting it something like this:
>
>   X: Y .
>   -Y: Z? .
>   -Z: a, A .
>   -A: ("#", a, A)? .
>
> Where the optionality is addressed in the parser when candidate edges
> are selected.
>
> But now we’re into about four levels of manual rewriting so I make no
> claims of correctness!
>
> Be seeing you,
> norm
>
> --
> Norm Tovey-Walsh
> Saxonica

--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
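
For anyone else tripping over the same distinction, here is a rough Python sketch of the two readings discussed in this thread. It is only an illustration, not taken from the spec or from Norm's parser: the function names are made up, and "a" is assumed to be the literal character 'a' (in the grammar it may be a nonterminal). The first function follows the rewritten form Z: a, ("#", a)* with an empty alternative; the second treats a*,"#" as an ordinary regex.

```python
import re


def matches_repeat0_sep(text: str, item: str = "a", sep: str = "#") -> bool:
    """Hypothetical check for a*"#": zero or more items separated by sep."""
    # The alternative that matches nothing: zero occurrences.
    if text == "":
        return True
    # Z: item, (sep, item)*
    if not text.startswith(item):
        return False
    pos = len(item)
    unit = sep + item
    # Consume as many (sep, item) pairs as are present.
    while text.startswith(unit, pos):
        pos += len(unit)
    # The whole input must have been consumed.
    return pos == len(text)


def matches_star_then_sep(text: str) -> bool:
    """Hypothetical check for a*,"#": any number of 'a' followed by one '#'."""
    return re.fullmatch(r"a*#", text) is not None


if __name__ == "__main__":
    for sample in ["", "a", "a#a", "a#a#a", "a#", "aaa#", "#"]:
        print(repr(sample), matches_repeat0_sep(sample), matches_star_then_sep(sample))
```

Under these assumptions, a#a and a#a#a are accepted by the separator reading only, while aaa# is accepted by the "a* then #" reading only.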
Received on Thursday, 16 December 2021 11:49:15 UTC