Re: Permit (greedy) conflicting wildcards

On 15 Mar 2007, at 08:22 , Pete Cordell wrote:

> Just reading David Orchard's document "Guide to Versioning XML
> Languages using XML Schema 1.1"
> (http://www.w3.org/TR/2006/WD-xmlschema-guide2versioning-20060928/).

Thank you!  Readers are what that document (like all our documents)
most needs.

> It says that under the current interpretation of XSD 1.1 the  
> following (slightly simplified from David's document) is illegal  
> due to the minOccurs="0" of middle name allowing the two adjacent  
> wildcards to conflict:
> ...
> It then says that new wording in XSD 1.1 has been added to make the
> following legal:
>
...
>
> Replacing the xs:anys with xs:element declarations, UPAC wise I don't
> think the following would be legal:
>
>    <xs:sequence>
>      <xs:element name="given" type="xs:string"/>
>      <xs:element name="any" minOccurs="0" maxOccurs="unbounded"/>
>      <xs:sequence minOccurs="0">
>          <xs:element name="middle" type="xs:string" />
>          <xs:element name="any" minOccurs="0" maxOccurs="unbounded"/>
>      </xs:sequence>
>      <xs:element name="family" type="xs:string"/>
>    </xs:sequence>

Like Noah Mendelsohn, I'm having trouble seeing the problem
here.  An 'any' element which follows a 'middle' element
matches only the second 'any' particle, and any other 'any' element
matches only the first.  You can't get to the second
'any' particle except by having a 'middle' element, because
within the optional group, a 'middle' element is required
unconditionally.
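
The argument above can be sketched as a small state machine.  This is
only an illustration in Python, not an XSD validator: the state names
and the particle labels any#1 and any#2 are hypothetical, and the
table is hand-built for this one content model, treating 'any' as an
ordinary element name as in the quoted example.

```python
# Hand-built transition table for the quoted content model:
#   given, any*, (middle, any*)?, family
# Each (state, element name) pair maps to exactly one
# (particle, next state), so no element is ever claimed by two particles.
TRANSITIONS = {
    ("start", "given"): ("given", "after_given"),
    ("after_given", "any"): ("any#1", "after_given"),
    ("after_given", "middle"): ("middle", "after_middle"),
    ("after_given", "family"): ("family", "done"),
    ("after_middle", "any"): ("any#2", "after_middle"),
    ("after_middle", "family"): ("family", "done"),
}


def attribute(children):
    """Return the particle label matched by each child element name."""
    state = "start"
    matched = []
    for name in children:
        if (state, name) not in TRANSITIONS:
            raise ValueError(f"unexpected <{name}> in state {state}")
        particle, state = TRANSITIONS[(state, name)]
        matched.append(particle)
    if state != "done":
        raise ValueError("content ended before <family>")
    return matched
```

Calling attribute(["given", "any", "middle", "any", "family"]) labels
the first 'any' as any#1 and the one after 'middle' as any#2; the
second 'any' particle is reachable only through 'middle'.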

> So I don't see why the second example should be considered any more
> intrinsically legitimate than the first example.

On that, at least, we're agreed.  The two examples you give
have the same legitimacy.

> As the second example seems a bit of a fudge, and is non-intuitive and
> messy, I propose that the rules be changed to make the first
> example legal.
> Basically a wild card should be allowed to be greedy and gobble up
> anything until it encounters something that does not match the wild
> card spec, or is an immediately accessible element name on the path
> following the wildcard.

Speaking for myself, I too would like the first example
to be legal.  I'd like to get rid of the UPA constraint
entirely.  If it's essential that the PSVI be entirely
deterministic, as some would prefer, then the requirement
could more usefully be a requirement that the element-appinfo
mapping be deterministic:  different appinfo on locally
declared elements is currently the only way UPA affects
the PSVI -- and that's only because the 1.0 spec made an
error in formulating the Element Declarations Consistent
rule, which it now appears to be too late to fix.

Requiring a greedy match would be one way to resolve
particle competition, agreed, although surely there
will be situations in which some schema authors want
a non-greedy match.  The difficulty is that finite state
automata and regular expressions with greedy
matching seem not to behave the same way FSAs and
regexes without greedy matching behave. Are the same
closure properties proven?  Do the same theorems hold?

The bitter experience of having SGML require deterministic
regexes, and then discovering somewhat later that
they do NOT have the same closure properties as normal
regexes, has made me rather cautious.
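
To make the particle competition concrete, here is a rough sketch
using Python's re module as a stand-in.  It is an assumption-laden
analogy, not XSD semantics: Python's engine backtracks, and the helper
name split_runs is hypothetical.  It shows only that greedy and
non-greedy rules can accept the same input while attributing it to
different sub-expressions; it says nothing about closure properties.

```python
import re


def split_runs(pattern, text):
    """Return the captured groups if the whole text matches, else None."""
    m = re.fullmatch(pattern, text)
    return None if m is None else m.groups()


# Two adjacent starred sub-expressions compete for the same input.
greedy = split_runs(r"(a*)(a*)", "aaa")  # first group takes everything
lazy = split_runs(r"(a*?)(a*)", "aaa")   # first group takes nothing
```

Both calls succeed on "aaa", but greedy is ('aaa', '') and lazy is
('', 'aaa'): the same string is accepted with a different attribution.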


> I would even say, if someone wants to do:
>
>    <xs:sequence>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>    </xs:sequence>
>
> they should be allowed to do it and it wouldn't be an error.  Although
> helpful tools might care to issue a warning that they're wasting  
> their time!

+1.  But on this, I repeat, I do not speak for the WG.

--C. M. Sperberg-McQueen

Received on Friday, 16 March 2007 01:37:35 UTC