Re: XPath 1.0 change proposal from C. M. Sperberg-McQueen on 2013-03-16 (www-xpath-comments@w3.org from January to March 2013)

From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
Date: Sat, 16 Mar 2013 07:18:39 -0600
To: James Clark <jjc@jclark.com>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, www-xpath-comments@w3.org
Message-Id: <B18AF907-DC96-42D9-B017-1EE567FFF118@blackmesatech.com>

On Mar 16, 2013, at 12:23 AM, James Clark wrote:

> 
> 
> Nodes (a concept defined in XPath) have precisely the properties that XPath says they do. XPath specifies these properties in many cases by referencing the XML and XML Namespaces Recommendations.

On your reading, do the normative clauses of XML and Namespaces
govern all instances of the data model or only those which are
created by parsing a namespace-well-formed XML document?

If the former, how do you reconcile that with your claim that an
instance of the data model can have more than one outermost
element dominated by the same document node?

If the latter, then are there any rules that in your view prevent
an instance of the data model from having cycles in the parent
or next-sibling relations?

> On this reading, the normative reference to the XML spec seems to have
> no function.
> 
> You've lost me. The normative reference is fundamental: the data model section is specifying how to construct a data model instance from a well-formed XML document, which is defined in the XML spec.  It also relies upon it for, amongst other things, the definition of document order. 

Thank you; you have at least clarified that you read the statement about
document order matching that of the first character of the representation
of a node as a normative statement and not as a restatement of things 
said normatively elsewhere.

>  
> XPath tells you, when you construct the node tree from a well-formed XML document, which nodes are parents/siblings of which other nodes.  

Where? How?

XPath tells the reader that the data model instance has one node for every 
element in the XML document, one comment node for every comment in the 
XML document, one text node for every sequence of adjacent data characters,
etc.  

Where does it tell the reader how to identify the parent of a node?  There
is, to be sure, a definition of parenthood in the XML spec, and perhaps the
use of the term 'parent' is assumed to be sufficient to constitute a reference
to that definition, but that definition applies only to elements; nothing in the 
XML spec defines the notion of parent for attributes, comments, or character 
data.

Moreover, in telling the reader that the set of element nodes has the same
cardinality as the set of elements in the XML document (and similarly for
other node types), the XPath spec does not in fact tell the reader how many 
nodes there are to be in the XPath data model instance.  Instead, it assumes 
-- wrongly -- that that information follows from the XML spec.  But the XML 
spec has no normative statements that rely on elements being able or unable 
to appear more than once in the document, and so no need for a general 
account of element, comment, or data character identity or distinctness,
or of element sets, comment sets, sets of data characters, or of the
cardinality of those undefined sets.  Having no need for such an account, 
it does not provide one.  XPath on the other hand does have normative 
statements that rely on node identity and distinctness, so it needs to 
provide some well grounded account of the matter.  It can do so by
defining the data model without logical dependencies on XML; on your reading
it does not do so, but relies on the notion of element identity as defined in the 
XML spec to determine element-node identity (and similarly for the other
node types), thus botching its task.


> That is a completely sufficient specification.  It will in fact be the case that, when you do so, that the parent/sibling relation so defined will be acyclic, but there is absolutely no need for XPath to say.  

First, that is true for the parent relation as defined in the XML spec.
It is not guaranteed by the XML spec for the sibling relation.

Second, you seem here to want to require that XPath explicitly restrict its 
operation to XML documents in the serialized form defined in the XML 
specification, but I don't think anyone associated with the creation of the 
spec, let alone any reader, has ever thought that XPath imposed such 
a restriction. Instead, the XSL working group immediately developed and
promulgated the view that XSLT operates on trees as defined in the
XPath spec, and not necessarily on trees created by parsing an XML
document.

For XPath data model instances not created from a parsed serialization
of an XML document, where is the acyclic nature of the parent and
sibling relations specified?

XPath nodes have only those properties assigned by the XPath
spec, you know.  On your reading, it assigns certain properties to
data model instances created from XML documents -- but where
does it describe the properties of data model instances created in
other ways?

If I have understood your recent emails correctly, you hold that when 
a data model instance is created from an XML document, then the
root node is guaranteed to have exactly one element-node child, but
when a data model instance is created by other means, there is no
such guarantee.

By analogy, I suppose one can infer that when a data model instance
is created from an XML document, then the parent relation (assuming
that the parent relation of XPath mirrors the parent relation of XML
for elements) will always be acyclic, but when the data model instance
is created by other means, there is no such guarantee.  

So logically speaking, you seem to be taking the position that the
parent relation in XPath data model instances is not guaranteed to 
be acyclic.  (The statements in 2.2 that assume acyclicity are thus
perhaps to be taken as errors.)
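
To make the point concrete, here is a minimal sketch in Python. The
Node class and the two checking functions are hypothetical names
invented for illustration; nothing here is defined by XPath. The sketch
builds a data model instance by hand, without parsing any XML document.
In it, every node other than the root has exactly one parent, which is
an element node or the root node, and yet the parent relation contains
a cycle.

```python
class Node:
    """Illustrative node with a kind ("root" or "element") and one parent."""

    def __init__(self, kind, name="", parent=None):
        self.kind = kind
        self.name = name
        self.parent = parent


def satisfies_parent_sentence(nodes):
    """True if every non-root node has a parent that is an element or the root.

    Each node holds a single parent reference, so "exactly one parent"
    reduces to "the reference is not None".
    """
    for node in nodes:
        if node.kind == "root":
            continue
        if node.parent is None:
            return False
        if node.parent.kind not in ("root", "element"):
            return False
    return True


def parent_relation_is_acyclic(nodes):
    """True if following parent links from any node never revisits a node."""
    for start in nodes:
        seen = set()
        node = start
        while node is not None:
            if id(node) in seen:
                return False
            seen.add(id(node))
            node = node.parent
    return True


# A hand-built instance: a and b are each other's parent.
root = Node("root")
a = Node("element", "a")
b = Node("element", "b")
a.parent = b
b.parent = a
hand_built = [root, a, b]

# An instance shaped like the result of parsing <doc><child/></doc>.
parsed_root = Node("root")
doc = Node("element", "doc", parent=parsed_root)
child = Node("element", "child", parent=doc)
parsed_like = [parsed_root, doc, child]

print(satisfies_parent_sentence(hand_built))    # sentence holds
print(parent_relation_is_acyclic(hand_built))   # but the relation is cyclic
print(parent_relation_is_acyclic(parsed_like))  # the parsed-like tree is fine
```

The hand-built instance passes the first check and fails the second,
which is the gap at issue: the sentence about parents constrains each
node locally and so cannot, on its own, rule out a cycle.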

This logical problem goes away, of course, if XPath requires that
data model instances be such that they could in principle have been
created from an XML document, though I haven't seen such a statement
in the spec.  But that can't be your view, given that you don't believe
data model instances are required to have a single outermost element.
(That in turn suggests that the sentence in section 2 reading 
"/ selects the document root (which is always the parent of the 
document element)" is wrong to use the singular, and should read
"... (which is always the parent of the document element, or
elements)".)


> If it would ease your concerns to add a sentence saying a node will never be a descendant of itself, I would have no problems with that.

That would help, which is why my change proposal includes such a statement.
But I think "will" is the wrong modal verb.  Since it does not follow from any 
normative statement elsewhere in the specification, this sentence needs
to be formulated in a clearly normative way, not in a way that suggests it
is a redundant restatement of normative statements elsewhere.

> 
> 
...

> 
> > 11 "Every node other than the root node has exactly one parent, which
> > is either an element node or the root node."  Follows from XML 1.0
> > (assuming the usual usage of the word "parent" in XML contexts).
> >
> >
> > This is giving a precise definition of the term parent, which is crucial for XPath.
> 
> No, not precise at all.  It is (on the usual reading of the spec) crucial for
> XPath that the parent relation be acyclic.  Nothing here says so, implies
> it, or even entails it.
> 
> The data model section tells you, when you construct the node tree from a well-formed XML document, which nodes are parents of which other nodes.

Can you point to the sentence you believe tells the reader which nodes
are parents of which other nodes?

> ... I do not seem to have been able to convince you of anything.

Quite correct.


-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************
Received on Saturday, 16 March 2013 13:19:05 UTC