Re: repetition

Norm, you're writing a*"#" where the spec writes "a"*"#". Does that
make any difference to your explanation?

It grated on me, reading it as a regex!
As a regex I think it would be "a"*"#"*, to allow zero or more of
either or both? But I'm no longer sure that is what's intended. I
think it is.
Assuming quotes, is (a#)* or [a#]* any clearer, Tom?
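
For what it's worth, here is a minimal sketch in Python that compares the
regex readings floated in this thread against Norm's reading of repeat0
with a separator (zero or more a's with "#" between them). It assumes
"a" is the literal character a; the names SEPARATED, CANDIDATES, SAMPLES,
accepts and disagreements are made up for illustration. It is a quick
check, not a statement of what the spec means.

```python
import re

# Assumed reading (from Norm's explanation): a*"#" is zero or more
# a's separated by "#", so "", "a", "a#a", "a#a#a" and so on.
SEPARATED = re.compile(r"(?:a(?:#a)*)?")

# Regex readings suggested in this thread, for comparison only.
CANDIDATES = {
    "a*#": re.compile(r"a*#"),        # any number of a's, then one hash
    "a*#*": re.compile(r"a*#*"),      # a's, then hashes
    "(a#)*": re.compile(r"(?:a#)*"),  # repeated "a#" pairs
    "[a#]*": re.compile(r"[a#]*"),    # any mix of a and hash
}

SAMPLES = ["", "a", "a#a", "a#a#a", "a#", "#", "aa"]


def accepts(pattern, text):
    """True if the pattern matches the whole of text."""
    return pattern.fullmatch(text) is not None


def disagreements(pattern):
    """Samples on which pattern and the separator reading differ."""
    return [s for s in SAMPLES if accepts(pattern, s) != accepts(SEPARATED, s)]


if __name__ == "__main__":
    for name, pattern in CANDIDATES.items():
        print(name, "differs on", disagreements(pattern))
```

If that reading is right, none of the four candidates agrees with it on
every sample, so the closest regex would be the (a(#a)*)? form instead.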

regards



On Thu, 16 Dec 2021 at 11:23, Norm Tovey-Walsh <norm@saxonica.com> wrote:
>
> > I'm confused. To my thinking "a"*"#"  would match any number of 'a'
> > characters followed by one hash character.
> >
> > To me, a#a a#a#a appears wrong. Is zero or more 'left associative' if
> > that's the right expression? How does the a# repetition match?
> >
> > Same applies to the zero or one example.
>
> That took me a while to get my head around too. I think it’s just part
> of the definition of repeat0 and repeat1. Note that these are different:
>
>   a*"#"
>
> and
>
>   a*,"#"
>
> The former can be distinguished as “factor * sep” and has the semantics
> that the spec describes for repeat0. The latter is “a*” followed by “#”.
>
> Later, “hints for implementors” observes that
>
>   a*"#"
>
> can be rewritten. For example:
>
> X: a*"#" .
>
> can be rewritten as
>
> X: Y .
> -Y: a+"#" ; .
>
> which can be further rewritten as
>
> X: Y .
> -Y: Z ; .
> -Z: a, ("#", a)* .
>
> And on we go:
>
> X: Y .
> -Y: Z ; .
> -Z: a, A .
> -A: "#", a, A ; .
>
> I find “the alternative that matches nothing” in the grammar quite hard
> to read. I almost wish we had a special symbol for it, like ∅. Then we’d
> have:
>
> X: Y .
> -Y: Z ; ∅ .
> -Z: a, A .
> -A: "#", a, A ; ∅ .
>
> Unless I’ve got something wrong along the way, of course.
>
> All of this has turned out to be an interesting challenge for me to
> implement because I don’t think the PEP parser can be made to accept
> “nothing” as an alternative. Instead, I think (I think!) I’ve modified
> it so that it understands an optional terminal or nonterminal. So I’ll
> end up rewriting it something like this:
>
> X: Y .
> -Y: Z? .
> -Z: a, A .
> -A: ("#", a, A)? .
>
> Where the optionality is addressed in the parser when candidate edges
> are selected.
>
> But now we’re into about four levels of manual rewriting so I make no
> claims of correctness!
>
>                                         Be seeing you,
>                                           norm
>
> --
> Norm Tovey-Walsh
> Saxonica
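
The last rewrite quoted above (X: Y. -Y: Z?. -Z: a, A. -A: ("#", a, A)?.)
can be sketched as a small hand-written recognizer, one function per
rule. This assumes "a" is the literal character a, and the function
names are hypothetical. It is only a way to try the rewritten grammar
on a few strings, not how any ixml implementation actually works.

```python
def parse_A(text, pos):
    """A: ("#", a, A)?  Returns the position after the optional tail."""
    if text.startswith("#a", pos):
        return parse_A(text, pos + 2)
    return pos  # the empty alternative: consume nothing


def parse_Z(text, pos):
    """Z: a, A  Returns the end position, or None if there is no leading a."""
    if text.startswith("a", pos):
        return parse_A(text, pos + 1)
    return None


def parse_Y(text, pos):
    """Y: Z?  Falls back to matching nothing when Z does not apply."""
    end = parse_Z(text, pos)
    return pos if end is None else end


def matches_X(text):
    """X: Y  Accept only if Y consumes the whole input."""
    return parse_Y(text, 0) == len(text)


if __name__ == "__main__":
    for sample in ["", "a", "a#a", "a#a#a", "a#", "#", "aa"]:
        print(repr(sample), matches_X(sample))
```

Under those assumptions it accepts "", "a", "a#a" and "a#a#a" and rejects
"a#", "#" and "aa", which is consistent with the separator reading of
a*"#".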



-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.

Received on Thursday, 16 December 2021 11:49:15 UTC