Re: CR-49: XPath subset: Use subset not full XPath for Key and KeyRef from Andy Clark on 2001-02-20 (www-xml-schema-comments@w3.org from January to March 2001)

From: Andy Clark <andyclar@us.ibm.com>
Date: Tue, 20 Feb 2001 12:20:06 +0900
To: ht@cogsci.ed.ac.uk (Henry S. Thompson)
Cc: Jim Trezzo <jim.trezzo@oracle.com>, www-xml-schema-comments@w3.org, "Trezzo,Jim" <JTREZZO@US.ORACLE.COM>
Message-ID: <OF8000BE90.A339B941-ON872569F9.000DE967@LocalDomain>

Henry,

>> Allowing the use of the descendant:: axis in XPaths used for selectors
>> and fields has several problems. The first relates to the ability of
>> implementations to support identity constraints in an efficient manner.
>> Without this axis, implementation of identity constraints is both
>> straightforward and efficient (even for streaming XML such as SAX).
>> However, allowing the descendant axis complicates the implementation for
>> serial processors.
>
> I don't understand this point.  Implementing the .//name is trivial in
> streaming mode, and corresponds exactly to what you already have to do
> to implement ID/IDREF.

I guess the real problem is the terseness of the XPath subset
description. It's unclear from the text whether the following
would be allowed:

  x//y/z
  x//y//z

In other words, is the descendant axis only allowed to be
used before the last element step? If so, then I agree that
the implementation is simplified.
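
To make the question concrete, here is a rough sketch of how a serial processor could track such paths. It is only an illustration under my own assumptions: the names parse_path, PathMatcher and find_matches are hypothetical, the outermost element stands in for the scope of the constraint, and predicates and attributes are ignored. It keeps one set of "steps matched so far" per open element, so both forms can be handled without buffering; "x//y//z" just leaves more states alive at once.

```python
import re
import xml.sax


def parse_path(path):
    """Split a path such as 'x//y/z' into (axis, name) steps."""
    steps, axis = [], "child"
    for token in re.split(r"(//|/)", path):
        if token == "//":
            axis = "descendant"
        elif token == "/":
            axis = "child"
        elif token and token != ".":
            steps.append((axis, token))
            axis = "child"
    return steps


class PathMatcher(xml.sax.ContentHandler):
    """Streaming matcher: one set of partial-match states per open element."""

    def __init__(self, path):
        super().__init__()
        self.steps = parse_path(path)
        self.stack = []    # stack[-1] = states valid for children of the open element
        self.names = []    # names of the currently open elements
        self.matches = []  # slash-joined names leading to each selected element

    def startElement(self, name, attrs):
        self.names.append(name)
        if not self.stack:
            # Assumption: the outermost element is the scope of the constraint.
            self.stack.append({0})
            return
        states = set()
        for i in self.stack[-1]:
            if i == len(self.steps):
                continue  # a completed match does not extend further
            axis, step_name = self.steps[i]
            if step_name in (name, "*"):
                states.add(i + 1)  # this element satisfies step i
            if axis == "descendant":
                states.add(i)      # step i may still match deeper down
        if len(self.steps) in states:
            self.matches.append("/".join(self.names))
        self.stack.append(states)

    def endElement(self, name):
        self.names.pop()
        self.stack.pop()


def find_matches(path, xml_bytes):
    handler = PathMatcher(path)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


DOC = b"<r><x><y><z/></y><q><y><z/><w><z/></w></y></q></x><y><z/></y></r>"
print(find_matches("x//y/z", DOC))
print(find_matches("x//y//z", DOC))
```

The cost here is bounded by the number of steps per open element, so my concern is less about feasibility and more about what the subset text actually permits.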

>> The second problem relates to the fact that field values must be
>> compared in the value space based on the attribute/element's datatype.
>> However, the descendant axis introduces ambiguity. For example:
>> [...]
>
> It matches both, that's the whole point of having a scoped selector.
> And what's the problem, anyway?  You've asserted by the above that
> 'bar' elements are unique wrt their 'baz' attribute's value.  So you
> keep a table of the 'bar' elements' 'baz' attribute values, and throw an
> error if you hit one twice.  Seems straightforward to me.

But they have different types and it's unclear which type
should be used for the value space comparison. How can you
keep a collection of values of different types and still
perform the correct comparison to ensure uniqueness? I'm
sorry to keep belaboring this point but it doesn't seem to
make sense to me.
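
A small sketch of what I mean, with made-up values: suppose two 'bar' elements carry baz="1.0" and baz="1". Whether they collide depends entirely on which value space is used for the comparison. The helper is_unique is hypothetical, and Python's Decimal and str stand in for the schema's decimal and string types.

```python
from decimal import Decimal


def is_unique(lexical_values, to_value):
    """Return True if no two values are equal once mapped into a value space."""
    values = [to_value(v) for v in lexical_values]
    return len(set(values)) == len(values)


# Hypothetical lexical values of the 'baz' attribute on two 'bar' elements.
baz_values = ["1.0", "1"]

unique_as_decimal = is_unique(baz_values, Decimal)  # compared as numbers
unique_as_string = is_unique(baz_values, str)       # compared as strings

print(unique_as_decimal, unique_as_string)
```

Compared as decimals the two values are the same; compared as strings they differ. If one 'bar' declares 'baz' as decimal and the other as string, I don't see which of the two answers the table is supposed to give.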

And I have other questions regarding the subset...

Regarding field examples:

1) Why is there a sample field specified as "ancestor::x/@"
   when text at the bottom of the subset states directly
   that the verbose form is not allowed and no reverse
   axes are allowed? Does the subset intend to allow
   people to specify fields outside of the element scope
   by using the ancestor axis?
2) If ancestor is allowed, why not ".." as long as it is
   followed only by "@" or "x/@"?
3) If ancestor is allowed, then we have another ambiguity
   because there may be multiple elements on the ancestor
   axis that match the path. Are all accepted (even if
   they have different types)? Is only the first matched
   value selected?
4) Why aren't predicates allowed for fields?

Regarding selector examples:

5) Did the subset intentionally leave out "." as a valid
   selector expression?
6) Why the explicit comment to "note only 1 level" for
   the "x/y" example?
7) Is there a good reason that "[y]" is allowed? The
   implementation is required to buffer the document in
   order to support this. Take the following example:

   selector: x[y]
   field: a
   field: b

   <x>
    <a>1</a>
    <b>2</b>
    <y/>
   </x>

   The implementation doesn't know that the values of
   these particular fields should be stored until it
   has seen the entire subtree. It gets more complicated
   if the descendant axis is used in the fields which
   may match multiple elements.
8) Why must "[@y]" appear only at the far right end? It
   seems that this is easy to implement at any child
   step.
9) Is there any plan to expand the allowed predicates
   to include things like "[4]", "[position()<4]",
   "[@a='hello']", etc?
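
To illustrate the buffering concern in question 7, here is a minimal sketch of a SAX-style handler for the selector x[y] with fields a and b. The class name PredicateSelector and the select helper are hypothetical, and I assume x elements do not nest and that the fields are direct children. The point is that the field values have to be held in a pending buffer and can only be committed or discarded at the end tag of x.

```python
import xml.sax


class PredicateSelector(xml.sax.ContentHandler):
    """Collects (a, b) for each x that has a child y, deciding only at </x>."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.x_depth = None  # depth of the currently open x, if any
        self.pending = None  # buffered field values for that x
        self.saw_y = False
        self.text = []
        self.rows = []       # committed (a, b) pairs

    def startElement(self, name, attrs):
        self.depth += 1
        self.text = []
        if name == "x" and self.x_depth is None:
            self.x_depth, self.pending, self.saw_y = self.depth, {}, False
        elif self.x_depth is not None and self.depth == self.x_depth + 1:
            if name == "y":
                self.saw_y = True  # predicate satisfied, possibly after the fields

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        if self.x_depth is not None and self.depth == self.x_depth + 1:
            if name in ("a", "b"):
                # Cannot store in the real table yet: y may not have been seen.
                self.pending[name] = "".join(self.text).strip()
        elif self.depth == self.x_depth:
            if self.saw_y:
                self.rows.append((self.pending.get("a"), self.pending.get("b")))
            self.x_depth, self.pending = None, None
        self.depth -= 1


def select(xml_bytes):
    handler = PredicateSelector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.rows


SAMPLE = b"<r><x><a>1</a><b>2</b><y/></x><x><a>3</a><b>4</b></x></r>"
print(select(SAMPLE))
```

For fixed child fields the buffer is small, but with the descendant axis in the fields it would have to hold every candidate value in the subtree of x.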

In general:

10) Is there any explicit limit on the length of the XPath
    expressions used for selectors and fields? It seems not,
    but there are places where an explicit length is
    mentioned (e.g. the "note only 1 level" comment).

Is there a newer subset description available? If so, please
send that to me so that I can direct my questions to the most
recent document.

-AndyC

Received on Monday, 19 February 2001 22:20:19 UTC