Re: XPath 1.0 change proposal from James Clark on 2013-03-16 (www-xpath-comments@w3.org from January to March 2013)

From: James Clark <jjc@jclark.com>
Date: Sat, 16 Mar 2013 13:23:46 +0700
To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Cc: www-xpath-comments@w3.org
Message-ID: <CANz3_Eah=wpYw9aOfUu_dukPzepWq7TQiRrtJHiVTW=ocY-N6w@mail.gmail.com>
On Fri, Mar 15, 2013 at 10:26 PM, C. M. Sperberg-McQueen <
cmsmcq@blackmesatech.com> wrote:
...

> You seem to be establishing the principle that nodes share properties with
> the corresponding constructs in the XML document (however one might
> choose to define them) if and only if the definition of the data model
> explicitly mentions those properties.
>

Nodes (a concept defined in XPath) have precisely the properties that XPath
says they do. XPath specifies these properties in many cases by referencing
the XML and XML Namespaces Recommendations.


> On this reading, the normative reference to the XML spec seems to have
> no function.
>

You've lost me. The normative reference is fundamental: the data model
section is specifying how to construct a data model instance from a
well-formed XML document, which is defined in the XML spec.  It also relies
upon it for, amongst other things, the definition of document order.


> And this principle cuts away the ground underneath every argument thus
> far brought forward for the claim that the parent and sibling relations
> should be thought of as acyclic, even though the text does not say so.
>

XPath tells you, when you construct the node tree from a well-formed XML
document, which nodes are parents/siblings of which other nodes.  That is a
completely sufficient specification.  It will in fact be the case that,
when you do so, the parent/sibling relation so defined will be
acyclic, but there is absolutely no need for XPath to say so.  If it would
ease your concerns to add a sentence saying a node will never be a
descendant of itself, I would have no problems with that.

>
> > 7 "The namespace nodes are defined to occur before the attribute
> > nodes."  Contradicts the normative statement of document order.
> >
> > This is giving you the definition of document order for attribute nodes.
>
> So - no textual demarcation between the sentences that are (on your
> reading) merely fleshing out / repeating the normative statement of
> document order, and this one, which modifies it by contradicting it?
>
> Well, bad drafting is not a criminal offense.  It can happen to the best.
>

The drafting here could definitely be improved.  If the WG thinks this
drafting is so bad that it will cause real confusion, I would suggest the
following minimal-impact change to deal with it:

There is an ordering, *document order*, defined on all the nodes in the
document. For nodes other than attribute and namespace nodes, this order
corresponds to the order in which the first character of the XML
representation of the node occurs in the XML representation of the document
after expansion of general entities.


> > 10 "Nodes never share children: if one node is not the same node as
> > another node, then none of the children of the one node will be the
> > same node as any of the children of another node."  Follows from
> > assumption (a).
> >
> > That is addressing a misinterpretation that could arise because of
> > general entity expansions.
>
> But it fails to address the problem of distinctiveness adequately -- it
> only addresses the case where the entity references occur directly
> within different parents.
>

I think I agree with you here (yeah!).  If I have understood you correctly,
given the following XML document:

<!DOCTYPE doc [
<!ENTITY e "<p/>">
]>
<doc>&e;&e;</doc>

your point is that it is unclear whether the first and second child nodes
of the "doc" element are distinct.  I would address this by adding a
sentence (following the quoted sentence above) along the lines of:

"The identity of a child node is determined by its position amongst its
siblings: the i-th child node is the same node as the j-th child node if
and only if i is equal to j."
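
As a rough illustration of the proposed sentence, the sketch below parses
the document above with Python's xml.etree.ElementTree and compares the
children of "doc" by position.  ElementTree is an assumption made purely
for illustration: it is not an XPath implementation, and its element
objects only approximate XPath nodes.  The helper name same_node is
hypothetical.

```python
import xml.etree.ElementTree as ET

# The example document from the message: one entity, referenced twice.
DOC = """<!DOCTYPE doc [
<!ENTITY e "<p/>">
]>
<doc>&e;&e;</doc>"""

root = ET.fromstring(DOC)
children = list(root)  # child elements of "doc", in order


def same_node(i, j):
    """Compare the i-th and j-th children (1-based) by object identity."""
    return children[i - 1] is children[j - 1]


for i in range(1, len(children) + 1):
    for j in range(1, len(children) + 1):
        # Under the proposed rule, identity holds exactly when i == j.
        print(i, j, same_node(i, j))
```

In this sketch each expansion of the entity yields its own element object,
which is the behaviour the added sentence would make explicit.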

> > 11 "Every node other than the root node has exactly one parent, which
> > is either an element node or the root node."  Follows from XML 1.0
> > (assuming the usual usage of the word "parent" in XML contexts).
> >
> >
> > This is giving a precise definition of the term parent, which is
> > crucial for XPath.
>
> No, not precise at all.  It is (on the usual reading of the spec) crucial
> for XPath that the parent relation be acyclic.  Nothing here says so,
> implies it, or even entails it.
>
>

The data model section tells you, when you construct the node tree from a
well-formed XML document, which nodes are parents of which other nodes.
When you so construct the node tree, the parent relation will always be
acyclic.  It is an extensional, not an intensional, definition.  If you
disagree, please give an example of a well-formed XML document for which
it is unclear in the constructed node tree which nodes are parents of which
other nodes, or in which the relationship is cyclic (i.e. an element is an
ancestor of itself).
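
One way to look for such a counterexample mechanically is sketched below.
It assumes Python's xml.etree.ElementTree as a stand-in for the
construction of the node tree; ElementTree models only element nodes here
(no root node, text nodes or attribute nodes), so this is an approximation
of the XPath data model, not an implementation of it.  The sample document
and the helper names parent_map and is_own_ancestor are hypothetical.

```python
import xml.etree.ElementTree as ET


def parent_map(root):
    """Record, for each element, the parent fixed by tree construction."""
    return {child: parent for parent in root.iter() for child in parent}


def is_own_ancestor(node, parents):
    """Follow the parent chain upward; True only if it returns to `node`."""
    seen = set()
    current = parents.get(node)
    while current is not None and current not in seen:
        if current is node:
            return True
        seen.add(current)
        current = parents.get(current)
    return False


# Hypothetical sample document; substitute any well-formed document.
root = ET.fromstring("<doc><a><b/></a><c/></doc>")
parents = parent_map(root)

# Elements that are ancestors of themselves; expected to be empty.
cyclic = [el.tag for el in root.iter() if is_own_ancestor(el, parents)]
print(cyclic)
```

The parent relation here is given purely by enumeration (the dictionary),
which is the sense in which the definition is extensional.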


> > I hope I have convinced you that the data model section is intended to
> > do nothing more than
> >
> > - explain how to construct the instance of the data model from an XML
> >   document
> > - define for such instances various key terms (parent, child, document
> >   order, expanded-name, string-value etc) which are used in the rest of
> >   the spec
>
> There are several problems here.
>
> First, the intent of the WG or the original editors can be reconstructed
> (when and to the extent that it can be reconstructed) by appeal to
> contemporary documents or other historical evidence.  Nothing in your mail
> speaks directly to the question of intent, and any statement about the
> intent of the text is a non sequitur.
>
> Second, you seem to be falling victim to the intentional fallacy, an
> elementary error on textual interpretation which is common enough.  The
> conscious or unconscious intent of the authors of a text can be of
> historical interest, but it does not determine what the text means, if for
> no other reason than that humans do not always succeed in doing what they
> intend to do.
>

OK, so I should have said "does" instead of "intended to do".  I believe I
have demonstrated that my reading is completely consistent with all the
text in the data model section.  You have demonstrated that your reading is
not.  Normal principles of textual interpretation should therefore favour
my reading.

> It's passably clear that this discussion is not going to persuade either
> of us to change our minds and that it's unlikely to provide any
> illumination to any third parties.


Regrettably I do not seem to have been able to convince you of anything.
However, you have convinced me that there is a defect related to node
identity which should be corrected.  You have also identified a couple of
places where it would be perfectly reasonable for the WG to decide to add a
clarifying phrase or sentence.

> I believe that the XPath 1.0 spec has a number of simple errors, which
> are easily fixed;


The word count of the new text you are proposing to add to Section 5 of
XPath is approximately 50% of the word count of the current text of Section
5. It is a substantive piece of original work adding to XPath a
formalization of the data model that makes it independent of the XML 1.0
and XML Namespaces Recommendations.  This is an intricate, complex piece of
work, which you have developed over a number of years.  It is not simple,
and it goes way beyond anything that could be described as fixing an error.
Furthermore, your formalization is not the only possible one: there are
other completely different ways to do this formalization (for example, by
viewing the tree as a map from arrays of node positions to node
properties).  In my view it would be a major abuse of the W3C process to
get this new, substantial, original work into a Recommendation status
document by treating it as an erratum, thereby bypassing the extensive
review and consideration that the W3C process would normally apply to such
a piece of work.
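
To make the alternative mentioned above a little more concrete, here is a
minimal sketch of a tree viewed as a map from arrays of node positions to
node properties.  It is only one guess at what such a formalization might
look like, not a proposal.  It assumes Python's xml.etree.ElementTree as
the parser, covers element nodes only, and counts positions among element
children (XPath would also count text, comment and processing-instruction
children).  The name position_map and the sample document are
hypothetical.

```python
import xml.etree.ElementTree as ET


def position_map(root):
    """Map each element's path of 1-based child positions to its properties."""
    nodes = {}

    def visit(element, path):
        nodes[path] = {"name": element.tag}
        for i, child in enumerate(element, start=1):
            visit(child, path + (i,))

    visit(root, ())  # the document element gets the empty path
    return nodes


nodes = position_map(ET.fromstring("<doc><a><b/></a><c/></doc>"))

# Tuples compare lexicographically, so sorting the keys yields document
# order for these element nodes.
document_order = [nodes[path]["name"] for path in sorted(nodes)]
print(document_order)


def parent_of(path):
    """The parent's path is the child's path minus its last position."""
    return path[:-1] if path else None
```

In this representation two nodes are the same exactly when their position
arrays are equal, and a parent's array is always strictly shorter than its
child's, so no node can be its own ancestor.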

> you deny that there are errors and assert at the same time that
> fixing them would involve a much more extensive revision.
>

I do not deny that there are errors.  I do say that your proposed revision
is a much more invasive, risky one than is necessary to fix those errors.

Recommendations are not academic papers.  They serve practical goals: to
allow implementors to independently create interoperable implementations,
and to allow users to predict how those implementations will behave.  XPath
1.0 is a very mature Recommendation with an extensive implementation track
record.  I therefore believe a cautious, conservative approach to changes
should be adopted.  Proposed changes should be separated out into small
(changing a sentence or a phrase or two) changes, where each is precisely
targeted to fix a specific point where there is something genuinely unclear
to implementors or users.  Each such change should ideally have a
concrete test case associated with it.  I have given several examples above
of the kinds of changes that I think might be appropriate at this stage.

James
Received on Saturday, 16 March 2013 06:24:35 UTC