Comment on XPath from Rick Jelliffe on 1999-10-12 (www-xpath-comments@w3.org from October to December 1999)

From: Rick Jelliffe <ricko@gate.sinica.edu.tw>
Date: Tue, 12 Oct 1999 18:33:21 +0800
To: <www-xpath-comments@w3.org>
Message-ID: <001201bf149d$34b59680$8d066d8c@sinica.edu.tw>

XPath should allow addressing of everything in the Information Set of a
validated document. Of course, there is legitimate discussion over what
constitutes "information set", but for the purposes of this comment it
includes everything needed to create a local graph of the document and all
identifiers associated as notations or unparsed external entities to nodes.

I am a little surprised that XPath is not ID-aware to any great extent:
    1) there is no ID axis
    2) when there is no DTD, there can be no IDs (section 5.2.1).

The first, I imagine, is because there could be cycles in the node list. Are
there other reasons?  This could be overcome by specifying that no node can
appear twice in a node-list using an ID axis.  Such an axis would be
nice because it allows exploration of graphs rather than just trees.
Building this tree bias into XPath without need will unduly constrain
specifications that use XPath.
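
To illustrate the restriction suggested above (no node may appear twice in
the node-list), here is a minimal sketch of how an ID axis might be
evaluated. It is not part of XPath or of any existing API: the function
name id_axis, the attribute names "id" and "refs", and the sample document
are all hypothetical, and the ID/IDREFS typing that a DTD would normally
supply is simply assumed.

```python
from collections import deque
import xml.etree.ElementTree as ET

# Hypothetical sample: "id" stands in for an ID attribute and "refs" for an
# IDREFS attribute. No DTD is read, so this typing is an assumption.
DOC = """
<doc>
  <item id="a" refs="b c"/>
  <item id="b" refs="c"/>
  <item id="c" refs="a"/>
  <item id="d"/>
</doc>
"""

def id_axis(root, start_id, id_attr="id", ref_attr="refs"):
    """Return IDs reachable from start_id by following references.

    Each node appears at most once, so a cycle such as a -> b -> c -> a
    terminates instead of looping forever.
    """
    by_id = {el.get(id_attr): el for el in root.iter()
             if el.get(id_attr) is not None}
    seen = {start_id}          # the context node itself is never re-entered
    order = []                 # reachable IDs, in breadth-first order
    queue = deque([start_id])
    while queue:
        current = by_id.get(queue.popleft())
        if current is None:
            continue
        for target_id in current.get(ref_attr, "").split():
            if target_id in by_id and target_id not in seen:
                seen.add(target_id)
                order.append(target_id)
                queue.append(target_id)
    return order

root = ET.fromstring(DOC)
reachable_from_a = id_axis(root, "a")
```

The "seen" set is what turns a potentially cyclic graph walk into a finite
node-list; the breadth-first order is only one possible choice of ordering.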

I suppose the second constraint is because of uncertainty until XLink and
XML Schemas arrive. But I wonder if XPath needs to refer to DTDs at all in
this regard. There may be the feeling that IDs and IDREFs will disappear,
but I don't think XPath should assume that to be true.

(To digress, I think that core XML or xml:link should have a pre-defined
attribute xml:id to allow linking independent of any schema. It should be a
markup issue, not a schema issue.)
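
As a rough sketch of what schema-independent identification could look
like, the following indexes elements by an xml:id attribute without
consulting any DTD. The attribute is the one proposed above, treated here
as an assumption; the helper name index_by_xml_id and the sample document
are hypothetical. The "xml" prefix is bound to a fixed namespace by the
Namespaces in XML recommendation, so no declaration is needed.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is predeclared; ElementTree exposes it in {uri}local form.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample with no DTD and no schema of any kind.
DOC = '<doc><sec xml:id="intro"/><sec xml:id="body"/><sec/></doc>'

def index_by_xml_id(root):
    """Map each xml:id value to its element, using markup alone."""
    return {el.get(XML_ID): el for el in root.iter()
            if el.get(XML_ID) is not None}

index = index_by_xml_id(ET.fromstring(DOC))
```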

On a similar tack, I would have liked to have seen paths from entity
attributes to identifiers, from PI targets to notation identifiers, and from
notation attributes to notation identifiers. Given that DOM 2 allows these
as an "extended interface" I think XPath should allow addressing them.

My recommendations are:

1) The restriction in 5.2.1 should be replaced by a more general warning that
will not require a rewrite when XML Schemas become available.

2) There should be an ID axis. This of course has interesting implications
for XLink.

3) There should also be an "identifier" axis (or perhaps axes for "entity"
and "notation") for attributes and PIs, with the same general restriction as
IDs have (i.e., that some schema such as a DTD somehow must have supplied
the information).

XPath should not implement judgements about which parts of XML are useful or
proper or desirable or have a future. Unless all structures in an XML
document are supported by all W3C specs, those structures will be
impoverished to the detriment of users.  (In the light of the recent HTML
debacle, I think each Working Group has to take extra care to justify that
they are not attempting to rework another WG's specification by stealth.)

Rick Jelliffe
Computing Centre
Academia Sinica (W3C Member)

Received on Tuesday, 12 October 1999 06:37:33 UTC