
Re: Permit (greedy) conflicting wildcards

From: Pete Cordell <petexmldev@tech-know-ware.com>
Date: Mon, 19 Mar 2007 12:28:47 -0000
Message-ID: <000a01c76a22$2da10c70$c900a8c0@Codalogic>
To: <xmlschema-dev@w3.org>

My current understanding of which element information items a wildcard
accepts in XSD 1.1 is that the wildcard can consume whatever does not result
in a UPA (Unique Particle Attribution) conflict, with the proviso that the
element wins if there is an element/wildcard conflict.

UPA assessment is often quite hard to get right, and people have reported on
various lists differences between tools in their assessment of UPA
constraints.  Also, UPA assessment can be time-consuming, and I believe some
tools have an option to switch it off as a result.  Hence it seems unwise
to make what a wildcard can accept dependent on UPA assessment.

More importantly, if you had a schema of the form:

    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:any namespace="##any" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="middle" type="xs:string"/>
      <xs:any namespace="##any" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>

then I think under the current rules, the following input would be valid:

<given>abc</given>
<middle>abs</middle>
<given>ag</given>
<family>jhjh</family>

where the second given is consumed by the wildcard after <middle>.

I've also seen something in XSD1.1 to suggest that the following would be 
valid:

<given>abc</given>
<given>ag</given>
<middle>abs</middle>
<family>jhjh</family>

It seems wrong to me to allow these to be valid instances.
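
To make the reading that admits both instances concrete, here is a minimal
Python sketch of it.  It is only an illustration under stated assumptions,
not the XSD 1.1 validation algorithm: the content model is hard-coded as a
flat list, each wildcard is 0..unbounded, and "elements win" is modelled as
a one-step lookahead at the element declaration that follows the wildcard.
The names MODEL and accepts are made up for the example.

```python
# Hypothetical flat model of the sequence: given, any*, middle, any*, family.
MODEL = [
    ("element", "given"),
    ("wildcard",),
    ("element", "middle"),
    ("wildcard",),
    ("element", "family"),
]


def accepts(children, model=MODEL):
    """Return True if the child element names match the model.

    Assumption: a wildcard takes any child unless the element declaration
    immediately after it has the same name, in which case the element wins.
    """
    pos = 0
    for name in children:
        while True:
            if pos >= len(model):
                return False  # children left over after the model is exhausted
            step = model[pos]
            if step[0] == "element":
                if step[1] != name:
                    return False
                pos += 1
                break
            # Current step is a wildcard: look one step ahead.
            following = model[pos + 1] if pos + 1 < len(model) else None
            if (following is not None and following[0] == "element"
                    and following[1] == name):
                pos += 1  # element wins; leave the wildcard and retry
                continue
            break  # wildcard consumes this child and stays active
    # Any steps left over must be optional, i.e. wildcards.
    return all(step[0] == "wildcard" for step in model[pos:])


print(accepts(["given", "middle", "given", "family"]))  # True
print(accepts(["given", "given", "middle", "family"]))  # True
print(accepts(["given", "family"]))                     # False
```

Under this model both instances above come out valid, because the stray
<given> is simply swallowed by whichever wildcard is active at that point.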

It would be more sensible, and easier to implement, if no wildcard in a
group were permitted to match any of the named elements (or elements
named in their substitution groups), irrespective of where the wildcard
appears in relation to the other elements.

You could say that each wildcard implicitly has a notQName member which is
augmented with the QNames of each of the elements in the particle (plus
those in any non-element particles' particles, recursively) and any members
of said elements' substitution groups.
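
A rough Python sketch of this proposal follows.  It is an illustration
only: particles are modelled as plain tuples, substitution groups as a
dictionary from head name to member names, and the matcher handles only a
flat sequence with 0..unbounded wildcards.  The names declared_names and
valid_under_proposal are invented for the example.

```python
def declared_names(particle, substitutions):
    """Collect the names a wildcard would implicitly exclude.

    Recurses into nested groups and adds substitution group members.
    """
    kind = particle[0]
    if kind == "element":
        name = particle[1]
        return {name} | set(substitutions.get(name, ()))
    if kind == "group":
        names = set()
        for child in particle[1]:
            names |= declared_names(child, substitutions)
        return names
    return set()  # a wildcard declares no names


def valid_under_proposal(children, sequence, substitutions=None):
    """Match child names against a flat sequence (simplifying assumption)."""
    substitutions = substitutions or {}
    excluded = declared_names(sequence, substitutions)  # implicit notQName
    steps = sequence[1]
    pos = 0
    for name in children:
        # A wildcard may not take an excluded name, so step past it.
        while (pos < len(steps) and steps[pos][0] == "wildcard"
               and name in excluded):
            pos += 1
        if pos == len(steps):
            return False
        step = steps[pos]
        if step[0] == "wildcard":
            continue  # wildcard consumes the child and stays active
        if step[0] != "element":
            raise NotImplementedError("nested groups are not matched here")
        if name != step[1] and name not in substitutions.get(step[1], ()):
            return False
        pos += 1
    return all(step[0] == "wildcard" for step in steps[pos:])


SEQ = ("group", [
    ("element", "given"), ("wildcard",),
    ("element", "middle"), ("wildcard",),
    ("element", "family"),
])

print(sorted(declared_names(SEQ, {})))  # ['family', 'given', 'middle']
print(valid_under_proposal(["given", "middle", "given", "family"], SEQ))  # False
print(valid_under_proposal(["given", "given", "middle", "family"], SEQ))  # False
print(valid_under_proposal(["given", "x", "middle", "y", "family"], SEQ))  # True
```

With the implicit exclusion in place, neither of the two instances earlier
in this message is accepted, while genuinely foreign elements still are.
No UPA analysis is needed to decide what a wildcard may consume.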

Any comments?

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx/
http://www.codalogic.com/lmx/
=============================================

----- Original Message ----- 
From: "Pete Cordell" <petexmldev@tech-know-ware.com>
To: <xmlschema-dev@w3.org>
Sent: Thursday, March 15, 2007 2:22 PM
Subject: Permit (greedy) conflicting wildcards


>
> Just reading David Orchard's document "Guide to Versioning XML Languages
> using XML Schema 1.1"
> (http://www.w3.org/TR/2006/WD-xmlschema-guide2versioning-20060928/).
>
> It says that under the current interpretation of XSD 1.1 the following 
> (slightly simplified from David's document) is illegal due to the 
> minOccurs="0" of middle name allowing the two adjacent wildcards to 
> conflict:
>
>    <xs:sequence>
>      <xs:element name="given" type="xs:string"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:element name="middle" type="xs:string" minOccurs="0"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:element name="family" type="xs:string"/>
>    </xs:sequence>
>
> It then says that new wording in XSD1.1 has been added to make the
> following legal:
>
>    <xs:sequence>
>      <xs:element name="given" type="xs:string"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:sequence minOccurs="0">
>          <xs:element name="middle" type="xs:string" />
>          <xs:any namespace="##any" processContents="lax"
>                minOccurs="0" maxOccurs="unbounded"/>
>      </xs:sequence>
>      <xs:element name="family" type="xs:string"/>
>    </xs:sequence>
>
> Replacing the xs:anys with xs:element declarations, UPAC wise I don't
> think the following would be legal:
>
>    <xs:sequence>
>      <xs:element name="given" type="xs:string"/>
>      <xs:element name="any" minOccurs="0" maxOccurs="unbounded"/>
>      <xs:sequence minOccurs="0">
>          <xs:element name="middle" type="xs:string" />
>          <xs:element name="any" minOccurs="0" maxOccurs="unbounded"/>
>      </xs:sequence>
>      <xs:element name="family" type="xs:string"/>
>    </xs:sequence>
>
> So I don't see why the second example should be considered any more
> intrinsically legitimate than the first example.
>
> As the second example seems a bit of a fudge, and is non-intuitive and
> messy, I propose that the rules be changed to make the first example
> legal.  Basically a wildcard should be allowed to be greedy and gobble up
> anything until it encounters something that does not match the wildcard
> spec, or is an immediately accessible element name on the path following
> the wildcard.
>
> I would even say, if someone wants to do:
>
>    <xs:sequence>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>      <xs:any namespace="##any" processContents="lax"
>              minOccurs="0" maxOccurs="unbounded"/>
>    </xs:sequence>
>
> they should be allowed to do it and it wouldn't be an error.  Although
> helpful tools might care to issue a warning that they're wasting their
> time!
>
> Cheers,
>
> Pete.
> --
> =============================================
> Pete Cordell
> Tech-Know-Ware Ltd
> for XML to C++ data binding visit
> http://www.tech-know-ware.com/lmx/
> http://www.codalogic.com/lmx/
> =============================================
Received on Monday, 19 March 2007 12:29:12 UTC
