Re: Permit (greedy) conflicting wildcards from Pete Cordell on 2007-03-20 (xmlschema-dev@w3.org from March 2007)

From: Pete Cordell <petexmldev@tech-know-ware.com>
Date: Tue, 20 Mar 2007 19:58:54 -0000
To: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
Cc: <noah_mendelsohn@us.ibm.com>, <xmlschema-dev@w3.org>
Message-ID: <01b601c76b2a$364887f0$c900a8c0@Codalogic>
Original Message From: "C. M. Sperberg-McQueen" <cmsmcq@...>
> On 20 Mar 2007, at 05:53 , Pete Cordell wrote:
>
>> Original Message From: <noah_mendelsohn@us.ibm.com>
>>
>> This seems an eminently sensible justification to me, and I would 
>> imagine what many people would expect.  What is the justification for
>> the currently specified set of rules?
>
> Here's one possible motivation:  when you write a
> wildcard that matches an element named 'given', the
> wildcard matches an element with that name.  If you
> didn't want the wildcards in your content model to
> match elements in your target namespace, why did you
> not write a wildcard that didn't match them?

Well, the spec already says that a wildcard doesn't match just any element,
because a named element wins any conflict between a wildcard and an element.
If the aim were to insist that xs:any really means any element, as its
definition suggests, then UPA conflicts could be avoided by requiring that
the snippet mentioned earlier be written as:

    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:any namespace="##any" processContents="lax"
              notQName="middle"
              minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="middle" type="xs:string"/>
      <xs:any namespace="##any" processContents="lax"
              notQName="family"
              minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>

But the spec has conceded that this is not particularly helpful and has
included some default rules about what an xs:any can really match.  I think
those default rules should be more helpful and give a less surprising result
by saying that a wildcard can't match anything with the same name as an
explicitly named element (or a member of its substitution group).

If the user really does want specifically named items in the position of 
wildcards, then they can do as Noah suggested.

>> To me, I think many people would be surprised that the rules allowed the
>> example instance above to be valid.  When doing language design, "No
>> surprises" seems like a good mantra.
>
> Me, I think I'd be surprised if a wildcard which is
> written to match any element at all were to fail to
> match some element.  "Say what you mean" is also
> a good rule.

But if you naively looked at the following segment I think you'd ask "why 
doesn't that wildcard consume my middle element?"

       <xs:any namespace="##any" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
       <xs:element name="middle" type="xs:string"/>

So the spec has already moved away from "Say what you mean," and you're
going to have to look at the spec or a reference book to find out why.
What's necessary is to decide on a sensible set of rules, and I don't think
the spec is there yet.

BTW - under the current XSD 1.1 rules, if I had:

    <xs:sequence>
      <xs:element name="value" type="xs:int"/>
      <xs:any namespace="##any" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>

would the following instance be valid?

    <value>12</value>
    <value>twelve</value>

I think it would, because the wildcard does not check the second element's
content against the type of the named element (is that correct?).  So I
think this is a further surprising result of the existing rules, since one
would expect similarly named elements to have similarly typed values.  Using
the simpler rule would avoid this.
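
To make the comparison concrete, here is a minimal Python sketch of the two
readings for the value/xs:any content model above.  It is a toy, not a real
validator: the names is_xs_int, is_valid and wildcard_excludes_declared are
made up for illustration, the content model is hard-coded, and it assumes
the lax wildcard finds no global declaration for "value", so wildcard-matched
content goes unchecked.  It models the rules as read in this message, which
may not be what the spec finally says.

```python
import re

# Names explicitly declared in the toy content model:
#   <xs:sequence> value (xs:int), then xs:any* with processContents="lax"
DECLARED_NAMES = {"value"}


def is_xs_int(text):
    """Rough stand-in for xs:int: optional sign, digits, 32-bit range."""
    text = text.strip()
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        return False
    return -2**31 <= int(text) <= 2**31 - 1


def is_valid(children, wildcard_excludes_declared):
    """children is a list of (element_name, text_content) pairs.

    wildcard_excludes_declared=False models the existing rules as read in
    this message; True models the simpler rule, where the wildcard may not
    match a name that is explicitly declared in the same content model.
    """
    if not children:
        return False  # the 'value' element is required
    name, text = children[0]
    if name != "value" or not is_xs_int(text):
        return False
    for name, _text in children[1:]:
        # Assumption: lax wildcard, no global declaration found, so the
        # content of wildcard-matched elements is not type-checked.
        if wildcard_excludes_declared and name in DECLARED_NAMES:
            return False
    return True


instance = [("value", "12"), ("value", "twelve")]
print(is_valid(instance, wildcard_excludes_declared=False))
print(is_valid(instance, wildcard_excludes_declared=True))
```

Under the first reading the second "value" falls through to the wildcard and
is accepted unchecked; under the simpler rule it has nothing to match and the
instance is rejected.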

Cheers,

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx/
http://www.codalogic.com/lmx/
=============================================
Received on Tuesday, 20 March 2007 19:59:45 UTC