Re: CR-49: XPath subset: Use subset not full XPath for Key and KeyRef from Andy Clark on 2001-02-16 (www-xml-schema-comments@w3.org from January to March 2001)

From: Andy Clark <andyclar@us.ibm.com>
Date: Fri, 16 Feb 2001 14:44:10 +0900
To: Jim Trezzo <jim.trezzo@oracle.com>
Cc: www-xml-schema-comments@w3.org, "Trezzo,Jim" <JTREZZO@US.ORACLE.COM>
Message-ID: <OF0E967050.7348EF94-ON872569F5.001A8EED@LocalDomain>

Jim,

The subset of XPath proposed for XML Schema by the working group does not
go far enough in limiting the allowed expressions. In fact, further
thoughts on the problem make me wonder whether using XPath at all for
identity constraints in Schema is a good idea. The following problems
highlight some of my concerns.

Problem 1: Qualifying Elements and Attributes

I asked the general XML Schema community for a clarification on this issue
on the xmlschema-dev mailing list but received no response. Perhaps the
www-xml-schema-comments readers can provide one.

The problem involves an inconsistency in how element names are referenced
when the grammar specifies a target namespace. When a target namespace is
specified, all references to elements must be fully qualified. For example:

  <schema xmlns='...' targetNamespace='NS' xmlns:a='NS'>
    <element name='foo'>
      <...>
        <element ref='a:bar'/>
      </...>
    </element>
    <element name='bar'>
      <.../>
    </element>
  </schema>

However, an example of using identity constraints in the primer (report.xsd
in section 5) does *not* qualify the XPaths used when defining selectors
and fields. For example:

  <schema xmlns='...' targetNamespace='NS' xmlns:a='NS'>
    <element name='foo'>
      <.../>
      <unique>
        <selector xpath='bar'/>
        <field xpath='@baz'/>
      </unique>
    </element>
    <.../>
  </schema>

Is this inconsistency intentional? In other words, when the grammar has a
target namespace, should we assume that unqualified steps in the XPaths are
in the target namespace? What about when the content models are a mix of
target namespaces?
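
To make the question concrete, here is a minimal sketch of how a plain XPath
evaluator treats the two spellings. It assumes a hypothetical instance
document in which foo and bar are both in the namespace 'NS', and it uses
Python's xml.etree.ElementTree, which implements only a limited XPath subset
and is not a schema processor. In XPath 1.0 an unprefixed name step selects
names in no namespace, so the unqualified selector matches nothing here
unless the schema processor applies some extra rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: foo and bar are both in namespace 'NS'.
INSTANCE = "<a:foo xmlns:a='NS'><a:bar baz='1'/><a:bar baz='2'/></a:foo>"
foo = ET.fromstring(INSTANCE)

# Unqualified step, as written in the primer-style selector xpath='bar'.
# Without a default-namespace rule this only matches no-namespace elements.
unqualified = foo.findall("bar")

# Qualified step, with the prefix bound to the target namespace.
qualified = foo.findall("a:bar", {"a": "NS"})

print(len(unqualified), len(qualified))
```

If unqualified steps were meant to be read as being in the target namespace,
a processor would have to rewrite 'bar' into the qualified form before
evaluation, and that rule would need to be stated.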

Problem 2: Ambiguous Element Step

Allowing the use of the descendant:: axis in XPaths used for selectors and
fields raises two problems. The first relates to the ability of
implementations to support identity constraints in an efficient manner.
Without this axis, implementation of identity constraints is both
straightforward and efficient (even for streaming XML such as SAX).
However, allowing the descendant axis complicates the implementation for
serial processors.

The second problem relates to the fact that field values must be compared
in the value space based on the attribute/element's datatype. However, the
descendant axis introduces ambiguity. For example:

  <element name='foo'>
    <...>
      <element name='bar'>
         <...>
           <element name='bar'/>
         </...>
      </element>
    </...>
    <unique>
      <selector xpath='.//bar'/>
      <field xpath='@baz'/>
    </unique>
  </element>

Which bar element does the selector match? Both bar elements could declare
a baz attribute with different types.
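
The ambiguity can be seen on a small instance. The sketch below assumes a
hypothetical document in which the outer bar carries a decimal-looking baz
and the inner bar carries a string-looking baz; the attribute values are
invented for illustration, and ElementTree is used only as a plain XPath
matcher, not as a schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: a bar nested inside another bar, each with a baz
# attribute that its own declaration might type differently.
INSTANCE = "<foo><bar baz='1.0'><bar baz='abc'/></bar></foo>"
foo = ET.fromstring(INSTANCE)

# The selector './/bar' reaches both the outer and the inner bar element.
selected = foo.findall(".//bar")

# The field '@baz' then yields one value per selected element, but the two
# values may belong to different value spaces.
values = [bar.get("baz") for bar in selected]
print(values)
```

Both elements land in the same target node set, so the processor would have
to compare field values whose datatypes come from two different
declarations.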

Problem 3: Anonymous Element Names

This is perhaps the most troubling because XPath is not capable of
distinguishing between globally declared elements and anonymous elements in
a target namespace where the form default is unqualified.

-AndyC

Received on Friday, 16 February 2001 00:47:15 UTC