- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Thu, 26 May 2022 12:27:51 +0000
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "Norm Tovey-Walsh" <norm@saxonica.com>
- Cc: public-ixml@w3.org
On Tuesday 24 May 2022 16:36:46 (+02:00), C. M. Sperberg-McQueen wrote:
>
> Norm Tovey-Walsh writes:
>
> >> By the way, note that the following is now legal ixml:
> >>
> >> values: value+++",".
>
> > What does that mean?
>
> For what it's worth, I take it to mean
>
> <repeat1>
> <nonterminal name="value"/>
> <sep>
> <insertion string=","/>
> </sep>
> </repeat1>
This is exactly right. An insertion matches zero characters of the input, but
adds characters to the output. For instance,

    data: ~[]+++" ".

would insert a space between each pair of adjacent characters:

    abc => <data>a b c</data>

Similarly,

    values: [Nd]+++", ".

would insert a comma and a space between each pair of adjacent digits:

    123 => <values>1, 2, 3</values>
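
To make the effect concrete, here is a minimal Python sketch that imitates the output of these two rules. It is not an ixml processor. The function names are made up for illustration, each input character is treated as one matched item, and no XML escaping is attempted.

```python
def separated_with_insertion(items, inserted):
    """Imitate the output of `item++(+"inserted")`.

    The separator is an insertion: it consumes nothing from the input,
    so the matched items are simply adjacent in the input, and the
    inserted string shows up between them only in the output.
    """
    if not items:
        # `++` is one-or-more, so an empty match is not allowed.
        raise ValueError("++ requires at least one item")
    return inserted.join(items)


def serialize(name, text, inserted):
    """Wrap the result in an element named after the rule (hypothetical helper)."""
    body = separated_with_insertion(list(text), inserted)
    return f"<{name}>{body}</{name}>"


print(serialize("data", "abc", " "))      # <data>a b c</data>
print(serialize("values", "123", ", "))   # <values>1, 2, 3</values>
```

The join is the whole trick: because the separator matches the empty string, the input carries no trace of it.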
Steven
> If it is, then there is no separator, and the grammar of which this
> fragment is part will work best for values with fixed length or values
> which somehow can be parsed without delimiters. If every value must
> begin with a letter and end with a digit, then a1bc23def456 can be
> uniquely parsed without delimiters, right?
>
> For things like integers or decimal numbers, this grammar would make
> sense perhaps in a stress test checking how well the processor deals
> with finite, but fast-growing, ambiguity. Given
>
> value = ['0'-'9']+.
>
> and the input '12345', I'll get one parse with five values, one parse
> with one value, four each with two or four values, and seven with three
> values. So seventeen overall.
>
> (Hmm. This is not Pascal's Triangle. But I'm sure there is a
> formula for how many ways there are to partition a sequence of length n
> into k contiguous subsequences. I just can't remember.)
>
>
> Michael
>
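
On the counting question in the quoted text: a brute-force enumeration is a quick way to check the tally. The sketch below (Python, with illustrative function names) lists every way to cut a string into contiguous non-empty pieces and counts them by number of pieces. Choosing k-1 cut points out of the n-1 gaps between characters gives C(n-1, k-1) ways, so for '12345' the counts come out as 1, 4, 6, 4, 1 for one to five values. That is sixteen parses rather than seventeen, and it is a row of Pascal's triangle after all.

```python
from collections import Counter
from itertools import combinations
from math import comb


def splits(text):
    """Yield every division of text into contiguous non-empty pieces."""
    n = len(text)
    for k in range(n):  # k = number of cut points, 0 .. n-1
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [text[a:b] for a, b in zip(bounds, bounds[1:])]


def tally(text):
    """Count the divisions of text, keyed by number of pieces."""
    return Counter(len(pieces) for pieces in splits(text))


counts = tally("12345")
for k in sorted(counts):
    # Third column is the closed form C(n-1, k-1) with n = 5.
    print(k, counts[k], comb(4, k - 1))
print("total:", sum(counts.values()))
```

The total over all k is 2^(n-1), so the ambiguity doubles with each extra digit of input.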
--
Received on Thursday, 26 May 2022 12:28:21 UTC