XPath 1.0 change proposal from James Clark on 2013-03-14 (www-xpath-comments@w3.org from January to March 2013)

From: James Clark <jjc@jclark.com>
Date: Thu, 14 Mar 2013 20:52:13 +0700
To: www-xpath-comments@w3.org
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Message-ID: <CANz3_EYxh7tcbBdQBFjGGNNy6Uj4h30_ntN2XUEdk417xf7veA@mail.gmail.com>

Michael kindly pointed me to his change proposal for XPath 1.0, which he
tells me the XSLT WG is planning to consider at its next meeting, and invited
me to send my comments to this list.

Although I appreciate Michael's work on formalizing the XPath 1.0 data
model, I do not think that at this stage a major rewrite of the XPath 1.0
data model is a good idea.  I would suggest that, after nearly 14 years, an
extremely conservative policy should be adopted towards changes: changes
should be made only when there is a genuine error that is manifested in
discrepancies between implementations or inconsistencies between
implementations and the spec.

The change proposal claims that it was a goal of XPath 1.0 for the data
model to be defined without dependencies on XML 1.0.  I find this claim
bizarre given that XML 1.0 is referenced normatively and the data
model definition is full of references to XML 1.0. The change proposal
seems to be claiming that XPath 1.0 is full of bugs in need of correction
because it does not meet a goal that it never had.

The change proposal also claims that it is a goal of XPath 1.0 that the
data model be defined formally.  This is clearly not the case.  XPath 1.0
does not make the slightest attempt to be formal.  Rather it aims to be
succinct and readily understandable.  The level of formality in the data
model definition is similar to that of the rest of the spec and of
companion specs (XML 1.0, XML Namespaces, XSLT 1.0).  It is also virtually
impossible to be really rigorous about the construction of the data model
from the XML document, without specifying this in the XML spec itself: for
each syntax production the XML spec would need to explain how the
corresponding data model was constructed.

I am also not convinced that in many cases the proposed wording changes are
in fact improvements.  If the WG does decide to go ahead with this change,
I can make some more detailed comments.  But for the moment, I would just
mention a couple of points.

XPath 1.0 does not constrain the root node to have exactly one element
child. In the case where the data model is constructed from an XML
document, there will of course be exactly one child.  But in other cases
(e.g. querying into a DOM DocumentFragment) it would be unhelpful to impose
such a restriction.  (XPath 1.0 is generally fairly loose -- for example,
it does not define conformance -- so as to provide maximum flexibility to
referencing specs.)
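
As a rough illustration of the DocumentFragment case, here is a minimal
sketch using Python's standard xml.dom.minidom.  It is not taken from any
spec; the element names "a" and "b" are made up for the example, and it
only builds the tree rather than evaluating an XPath expression.  It shows
a fragment that would serve as the root of a query while having two
element children.

```python
from xml.dom import Node
from xml.dom.minidom import Document

# Build a DocumentFragment holding two sibling elements.
# The names "a" and "b" are hypothetical, chosen only for illustration.
doc = Document()
fragment = doc.createDocumentFragment()
for name in ("a", "b"):
    fragment.appendChild(doc.createElement(name))

# If the fragment is treated as the root of the data model, the root has
# two element children, which an "exactly one element child" rule would forbid.
element_children = [
    node.tagName
    for node in fragment.childNodes
    if node.nodeType == Node.ELEMENT_NODE
]
print(element_children)
```

An XPath implementation layered on a DOM would presumably want to accept
such a tree as-is rather than reject it.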

The reason the spec uses terminology like "There is an element node for
every element" instead of referencing particular productions is
entity expansion.  For example, given

<!DOCTYPE doc [
<!ENTITY e "<x>foo</x>">
]>
<doc>&e;&e;</doc>

I am comfortable with saying (somewhat vaguely) that there are three
elements.  I am much less comfortable saying that there are three
occurrences of the "element" production (in fact, I would say it is clear
that there are only two occurrences of the "element" production).
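
The difference can be checked with a small sketch in Python, assuming the
standard xml.etree.ElementTree parser (which expands internal entities
declared in the internal subset).  The regular expression is only a crude
stand-in for counting where the "element" production is matched in the
source text; it is not a real grammar-level count.

```python
import re
import xml.etree.ElementTree as ET

SOURCE = """<!DOCTYPE doc [
<!ENTITY e "<x>foo</x>">
]>
<doc>&e;&e;</doc>"""

# Parse the document; each &e; reference is expanded into its own <x> element.
root = ET.fromstring(SOURCE)

# Element nodes in the resulting tree: doc, plus one x per entity reference.
element_names = [el.tag for el in root.iter()]
print(len(element_names), element_names)

# Crude proxy for occurrences of the "element" production in the source text:
# start-tags that literally appear, i.e. <x> once (inside the entity
# declaration) and <doc> once.
start_tags_in_source = re.findall(r"<([A-Za-z][\w.-]*)", SOURCE)
print(len(start_tags_in_source), start_tags_in_source)
```

Under these assumptions the tree contains three element nodes, while only
two start-tags appear in the text.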

James

Received on Thursday, 14 March 2013 13:53:00 UTC