Concerns over XML Schema 1.1 wildcarding

Just reading David Orchard's document "Guide to Versioning XML Languages 
using XML Schema 1.1" 
(http://www.w3.org/TR/2006/WD-xmlschema-guide2versioning-20060928/).

The permitted wildcarding seems to have gone from one extreme to another and 
I believe it makes life very difficult for databinding tools.

I think the example that best illustrates the problem is:

   <xs:group name="middle">
      <xs:sequence>
          <xs:element name="middle" type="xs:string" />
          <xs:any namespace="##any" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
   </xs:group>

   <xs:group name="family">
      <xs:sequence>
          <xs:element name="family" type="xs:string"/>
          <xs:any namespace="##any" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
   </xs:group>

   <xs:group name="name">
      <xs:sequence>
          <xs:any namespace="##any" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>

          <xs:group ref="given"/>
          <xs:group ref="middle" minOccurs="0"/>
          <xs:group ref="family"/>
      </xs:sequence>
   </xs:group>

As you no doubt know, under the existing 1.0 Unique Particle Attribution 
constraint (UPAC) rules each particle (let's focus on elements for 
simplicity) can basically be processed along the lines of:

    while( input_name == my_name && count < maxOccurs )
        store_input_as_one_of_mine();

For an xs:any, it would look like:

    while( matches_wildcard_spec( input_name ) && count < maxOccurs )
        store_input_as_one_of_mine();

With the new UPAC rules, this would have to be modified to something like:

    while( matches_wildcard_spec( input_name ) && ! is_a_named_member()
                && count < maxOccurs )
        store_input_as_one_of_mine();
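
To make the difference between the last two loops concrete, here is a 
minimal runnable sketch, not a definitive implementation.  It assumes the 
input has been reduced to a list of element names and that the wildcard is 
##any, so matches_wildcard_spec() is always true.  The names 
consume_wildcard, named_members and exclude_named are made up for 
illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns how many consecutive input names, starting at
// 'pos', a ##any wildcard particle would take for itself.
//   exclude_named == false : the plain 1.0-style wildcard loop
//   exclude_named == true  : the modified loop, where a name declared
//                            elsewhere in the content model is left alone
std::size_t consume_wildcard(const std::vector<std::string>& input,
                             std::size_t pos,
                             std::size_t max_occurs,
                             const std::set<std::string>& named_members,
                             bool exclude_named)
{
    assert(pos <= input.size());  // precondition: cursor is within the input

    std::size_t count = 0;
    while (pos + count < input.size() && count < max_occurs) {
        const std::string& input_name = input[pos + count];
        if (exclude_named && named_members.count(input_name) != 0)
            break;   // is_a_named_member(): stop, the wildcard may not take it
        ++count;     // stands in for store_input_as_one_of_mine()
    }
    return count;
}
```

With the input names x, y, given, family and the named members given, 
middle and family, the first mode takes all four names, whereas the second 
stops after x and y.  The extra argument named_members is the awkward part: 
something has to supply it.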

As we represent groups as C++ classes, this is particularly problematic for 
wildcards that are in groups because the set of named members will depend on 
the variously nested particles in which the group is referenced.  Hence the 
list of named members is no longer a property of the group (and so wholly 
contained by the C++ class that represents it), but something that has to be 
computed for each context from which the group may be referenced.  This is 
a much more complicated situation.
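
As a rough sketch of what this means for a generated class (an illustration 
under simplifying assumptions, not how any particular tool does it): the 
input is again just a list of element names, and MiddleGroup, unmarshal and 
names_from_context are hypothetical names.  The point is only that the set 
of named members has to be passed in by whoever references the group.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using NameSet = std::set<std::string>;

// Hypothetical binding class for the "middle" group in the example schema.
struct MiddleGroup {
    std::string middle;            // the <middle> element (name only here)
    std::vector<std::string> any;  // whatever the trailing xs:any matched

    // Reads from input[pos] onwards and returns the position after the group.
    // names_from_context is NOT a property of the group: the referencing
    // content model has to supply the element names declared around it.
    std::size_t unmarshal(const std::vector<std::string>& input,
                          std::size_t pos,
                          const NameSet& names_from_context)
    {
        assert(pos <= input.size());  // precondition: cursor is within input

        if (pos < input.size() && input[pos] == "middle")
            middle = input[pos++];

        // Trailing wildcard: take names until one belongs to the context.
        while (pos < input.size() &&
               names_from_context.count(input[pos]) == 0)
            any.push_back(input[pos++]);

        return pos;
    }
};
```

Given the names middle, ext, family, a reference from the "name" group 
(context: given, middle, family) leaves family for the next particle, 
whereas the same class referenced from a context that declares no family 
element would put family into its wildcard content.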

On the other hand, any number of wildcards appearing before an end tag is no 
problem.  Isn't it sufficient for extensibility purposes to restrict 
wildcards that permit the target namespace to only appear at the end of the 
composite?

Cheers,

Pete.

(By comparison, consider the case (applicable to both XSD 1.0 and XSD 1.1) 
in which a parser is processing an input stream and its cursor position 
(for want of something to call it) is immediately prior to the reference to 
the group.  Here the parser has to decide whether to enter the group or not. 
But in this case all of the information it needs to make this decision is a 
property of the group, and hence of the class, and so it is much easier to 
implement.)
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx/
http://www.codalogic.com/lmx/
=============================================

Received on Thursday, 15 March 2007 14:56:13 UTC