
RE: key-keyref constraints: conflicts in node-table

From: Michael Kay <mike@saxonica.com>
Date: Mon, 26 Sep 2005 20:21:07 +0100
To: "'Kasimier Buchcik'" <K.Buchcik@4commerce.de>
Cc: "'XML-SCHEMA'" <xmlschema-dev@w3.org>
Message-ID: <E1EJyXN-0008TR-Re@maggie.w3.org>

> 
> A strategy which would follow this assumption could look like
> (but only semantically, since this is not streamable):
> 
> 1. Build the sum of all qualified node sets of all scope elements
>   of the referenced IDC key/unique definition in the
>   descendant-or-self axis, starting with a scope element
>   of the keyref.
> 
> 2. Remove all nodes with identical key-sequences from this set.
> 
> 3. Find a match for the key-sequence of a qualified node of the keyref
>   in this set.
> 
> 4. If no match was found then we get an error, since either
>   no matching key-sequence existed, or a duplicate existed and
>   was removed.
> 
> Does this make sense?

I think this is essentially what Saxon is doing today (except that I found a
bug in the handling of <xs:selector xpath="."/> which is proving very hard
to fix without stopping everything else from working!). I assume you mean in
(2) that if you find 2 duplicates, you remove both. (It's nice however to
give different errors for "no match" and "more than one match".) But what
worries me is that the way the rule is phrased, removal of duplicates
depends on whether they bubbled up from a descendant, or were found at the
top level. I can't see any reason why anyone would want to specify it that
way, but that's not necessarily a good way of trying to work out what this
absurdly complex spec actually says.
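
As a rough illustration only (this is not Saxon's code), steps 2 to 4 of the
strategy, with separate outcomes for "no match" and "more than one match",
might be sketched in Python as below. The function name, the tuple
representation of a key-sequence, and the result labels are all assumptions
made for this example. Step 1, collecting the key-sequences from the scope
element and its descendants, is assumed to have been done by the caller.

```python
from collections import Counter


def resolve_keyref(key_sequences, ref_sequence):
    """Classify one keyref lookup against the collected key-sequences.

    key_sequences: iterable of tuples, one per qualified node of the
    referenced key/unique definition (the "sum" built in step 1).
    ref_sequence: the key-sequence (a tuple) of one keyref node.
    Returns "ok", "no-match" or "ambiguous" (hypothetical labels).
    """
    counts = Counter(key_sequences)
    occurrences = counts[ref_sequence]  # Counter yields 0 for unseen keys
    if occurrences == 0:
        return "no-match"      # nothing in the set carries this key-sequence
    if occurrences > 1:
        return "ambiguous"     # duplicates: step 2 would have removed them all
    return "ok"
```

Counting occurrences instead of physically deleting duplicates keeps enough
information to tell the two failure cases apart, which is the point of the
remark about giving different errors.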

Michael Kay

> 
> A streaming implementation would "bubble" up the node-table entries.
> There's a nice posting from Jeni Tennison about IDCs at:
> http://lists.w3.org/Archives/Public/xmlschema-dev/2001Nov/0070.html.
> 
> 
> However, after rereading the spec pieces you mentioned, I'm not sure
> anymore if my interpretation was correct :-( So, teachers, see my hand
> rising.
> 
> A test case for the lazy ones. The results of the following case
> differ from processor to processor (just play with the commented-out
> DEFINITIONs of the instance).
> 
> key.xsd
> -------
> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
> 
>   <xsd:element name="SECTION">
>     <xsd:complexType>
>       <xsd:sequence>
>         <xsd:element ref="SECTION" minOccurs="0"/>
>         <xsd:element name="DEFINITION" minOccurs="0" maxOccurs="5">
>           <xsd:complexType>
>             <xsd:attribute name="term" type="xsd:string"/>
>           </xsd:complexType>
>         </xsd:element>
>         <xsd:element name="TERMREF" type="xsd:string"
>           minOccurs="0" maxOccurs="5"/>
>       </xsd:sequence>
>     </xsd:complexType>
> 
>     <xsd:key name="defKey">
>       <xsd:selector xpath="DEFINITION"/>
>       <xsd:field xpath="@term"/>
>     </xsd:key>
> 
>     <xsd:keyref name="termRef" refer="defKey">
>       <xsd:selector xpath="TERMREF"/>
>       <xsd:field xpath="."/>
>     </xsd:keyref>
>   </xsd:element>
> </xsd:schema>
> 
> key.xml
> -------
> <SECTION xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>   xsi:noNamespaceSchemaLocation="key.xsd">
>   <SECTION>
>     <SECTION>
>       <!--DEFINITION term="zappa"/-->
>     </SECTION>
>     <DEFINITION term="zappa"/>
>   </SECTION>
>   <!--DEFINITION term="zappa"/-->
>   <TERMREF>zappa</TERMREF>
> </SECTION>
> 
> Just one result:
> Xerces-J is not happy with this instance, but gets happy if we
> uncomment the first (in document order) DEFINITION and
> comment-out the other DEFINITIONs. Dunno what's happening here.
> 
> I noticed that I have to lay hands on Libxml2's implementation
> anyway, as its "bubbling" mechanism seems to interfere with
> evaluation of uniqueness of IDC keys in such recursive
> structures:
> If we uncomment only the first two DEFINITIONs, I get:
> 
> Element 'DEFINITION': Duplicate key-sequence ['zappa'] in
> key identity-constraint 'defKey'.
> 
> This is due to: when the 3rd SECTION is finished, it bubbles
> up its node-table to the 2nd SECTION, thus we have 1 entry for
> "zappa" there. Now we hit the 2nd DEFINITION, which wants to
> add its "zappa" in the node-table of the 2nd SECTION as well,
> and we get a uniqueness violation. It seems I cannot use the
> node-table for evaluation of uniqueness that easily. Pity.
> 
> Regards,
> 
> Kasimier
> 
> 
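The bubbling problem described in the quoted message can be reproduced with a
toy model. The Python sketch below is not Libxml2's implementation; the
Section class and both function names are invented for illustration. It
contrasts a single shared node-table, where a bubbled-up entry collides with a
local one, with a variant that checks uniqueness only among an element's own
qualified nodes. How conflicts between local and bubbled-up entries should
affect keyref resolution is the open question in this thread and is
deliberately left out.

```python
class Section:
    """Toy stand-in for a SECTION element (hypothetical model)."""

    def __init__(self, definitions=(), children=()):
        self.definitions = list(definitions)  # @term values of DEFINITION children
        self.children = list(children)        # nested SECTION elements


def naive_node_table(section):
    """One shared table: bubbled-up entries take part in the uniqueness check."""
    table = set()
    for child in section.children:
        table |= naive_node_table(child)      # child finishes first and bubbles up
    for term in section.definitions:
        if term in table:
            raise ValueError(f"Duplicate key-sequence [{term!r}]")
        table.add(term)
    return table


def split_node_table(section):
    """Uniqueness is checked only against this element's own qualified nodes."""
    local = set()
    for term in section.definitions:
        if term in local:
            raise ValueError(f"Duplicate key-sequence [{term!r}]")
        local.add(term)
    bubbled = set()
    for child in section.children:
        bubbled |= split_node_table(child)
    # Conflict handling between local and bubbled entries is not modelled here.
    return local | bubbled
```

With the first two DEFINITIONs of key.xml uncommented, the model is
Section(children=[Section(["zappa"], [Section(["zappa"])])]). The naive
version raises the duplicate error on it, while the split version does not.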
Received on Monday, 26 September 2005 19:21:34 GMT
