Re: XPath 1.0 change proposal

On Fri, Mar 15, 2013 at 12:55 AM, C. M. Sperberg-McQueen <
cmsmcq@blackmesatech.com> wrote:

> the following sentences seem to be either unnecessary repetitions
> of simple facts that follow from the 1:1 relation between nodes in the
> data model instance and constructs in the XML spec, or contradictions
> of things normatively stated either in the XML spec, the namespaces
> spec, or elsewhere in XPath 1.0 (especially assumption (c)).
>
...

>
> 1 "Some types of nodes also have an expanded-name." Follows from XML
> 1.0 + Namespaces 1.0.


XML Namespaces says nothing about the concept of "node".  The fact that XML
Namespaces says that elements have expanded-names doesn't necessarily imply
that XPath 1.0 element nodes have expanded-names.  A key aspect of the data
model is the selection of available information that it chooses to expose.


>

> 2 "an expanded-name ... is a pair consisting of a local part and a
> namespace URI." Follows from Namespaces 1.0.
>

XML Namespaces uses the term namespace name. XPath chooses to use the term
namespace-uri.


> 3 "The namespace URI specified in the XML document can be a URI
> reference as defined in [RFC2396];" (original text), or "A namespace
> name specified in a namespace declaration in an XML document is a URI
> reference as defined in [RFC2396];" (erratum). Follows from Namespaces
> 1.0.
>
>
Ditto.


> 4 "this means it can have a fragment identifier and can be relative."
> (original text), or "this implies it can have a fragment identifier
> and can be relative." (erratum). Follows from RFC 2396 (but clearly
> labeled as such, so it really doesn't count in this enumeration).
>
> 5 "Element nodes occur before their children."  Follows from XML 1.0
> (together with the immediately preceding normative definition of
> document order)
>

Let's look at this in context:


> There is an ordering, *document order*, defined on all the nodes in the
> document corresponding to the order in which the first character of the XML
> representation of each node occurs in the XML representation of the
> document after expansion of general entities. Thus, the root node will be
> the first node. Element nodes occur before their children. Thus, document
> order orders element nodes in order of the occurrence of their start-tag in
> the XML (after expansion of entities). The attribute nodes and namespace
> nodes of an element occur before the children of the element. The namespace
> nodes are defined to occur before the attribute nodes. The relative order
> of namespace nodes is implementation-dependent. The relative order of
> attribute nodes is implementation-dependent.


The sentences following the first "Thus" are fleshing out the definition of
document order given in the first sentence.
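
To make the "order of start-tags" reading concrete, here is a minimal sketch in Python. It uses the standard library's xml.etree.ElementTree, which is not the XPath 1.0 data model (it has no root node, and attributes and namespaces are not nodes), so it only illustrates the element part of document order. The sample document is made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, chosen only for illustration.
XML = "<a><b><c/></b><d/></a>"

root = ET.fromstring(XML)

# Element.iter() walks the tree depth-first, visiting each element
# before its children.
document_order = [elem.tag for elem in root.iter()]

# Offset of each element's start-tag in the source text, for comparison.
start_tag_offsets = {tag: XML.index("<" + tag) for tag in document_order}

print(document_order)
print(start_tag_offsets)
```

Sorting the tags by start-tag offset gives the same sequence as the traversal, which is what "document order orders element nodes in order of the occurrence of their start-tag" amounts to for elements.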


> 6 "The attribute nodes and namespace nodes of an element occur before
> the children of the element."  Follows from XML 1.0 (together with the
> normative definition of document grammar).
>

Ditto.


>
> 7 "The namespace nodes are defined to occur before the attribute
> nodes."  Contradicts the normative statement of document order.
>

This is giving you the definition of document order for attribute nodes.


>
> 8 "The relative order of namespace nodes is implementation-dependent."
> Contradicts the normative statement of document order.
>

As is this.

>
> 9 "The relative order of attribute nodes is implementation-dependent."
> Contradicts the normative statement of document order.
>

As is this.

If you genuinely find this confusing, I suggest adding the words "as
further explained in the rest of this paragraph"  at the end of the first
sentence.


>
> 10 "Nodes never share children: if one node is not the same node as
> another node, then none of the children of the one node will be the
> same node as any of the children of another node."  Follows from
> assumption (a).
>

That is addressing a misinterpretation that could arise because of general
entity expansions.
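
A small sketch of the situation I have in mind, again in Python with the standard library's xml.etree.ElementTree standing in (loosely) for an XPath implementation. The document and the entity name "e" are invented for the example, and it assumes the underlying parser expands internal general entities, as the expat-based parser does.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal entity "e" is referenced twice.
XML = """<!DOCTYPE r [
  <!ENTITY e "<x/>">
]>
<r><a>&e;</a><b>&e;</b></r>"""

root = ET.fromstring(XML)

# Each reference is expanded separately, so each parent gets its own child.
x_in_a = root.find("a")[0]
x_in_b = root.find("b")[0]

print(x_in_a.tag, x_in_b.tag, x_in_a is x_in_b)
```

The two x elements come from the same entity declaration, but they are distinct nodes with distinct parents, and that is the point the sentence is making.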



> 11 "Every node other than the root node has exactly one parent, which
> is either an element node or the root node."  Follows from XML 1.0
> (assuming the usual usage of the word "parent" in XML contexts).
>
>
This is giving a precise definition of the term parent, which is crucial
for XPath.



> 12 "A root node or an element node is the parent of each of its child
> nodes."  (Ditto.)


Ditto.


>

> 13 "The element node for the document element is a child of the root
> node."  Follows from XML 1.0.
>

Ditto: definition of child.  XML 1.0 defines parent/child only for elements.

>
> 14 "The root node also has as children processing instruction and
> comment nodes for processing instructions and comments that occur in
> the prolog and after the end of the document element."  Follows from
> XML 1.0.
>

Ditto: definition of child.
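
As a rough illustration of items 13 and 14, here is a sketch using Python's xml.dom.minidom. The DOM is a different model from the XPath 1.0 data model, so treating the DOM Document node as the analogue of the XPath root node is an assumption made only for this example. The document itself is invented.

```python
from xml.dom import Node, minidom

# Hypothetical document with a comment in the prolog and a processing
# instruction after the end of the document element.
XML = "<!-- prolog comment --><doc><item/></doc><?after done?>"

document = minidom.parseString(XML)

# The DOM Document node stands in here for the XPath root node.
root_child_types = [node.nodeType for node in document.childNodes]
parents_are_root = all(
    node.parentNode is document for node in document.childNodes
)

print(root_child_types, parents_are_root)
```

The comment, the document element and the processing instruction all appear as children of the top-level node, and each reports that node as its parent.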


>
> 15 "The root node does not have an expanded-name."  Follows from XML
> 1.0 + Namespaces 1.0.
>

XML Namespaces says nothing about nodes.

>
> At this point, this exercise is costing me more tedium than I have
> patience for, so I am going to stop.


Good, I don't think it's advancing your case.


>  I will leave the rest of section 5
> as an exercise for the reader.
>
> you seem to be arguing that it's
> shoddy work never intended to be correct or to make good on the
> implications of the term "data model", and that any fix would
> constitute a major renovation.
>

I hope I have convinced you that the data model section is intended to do
nothing more than

- explain how to construct the instance of the data model from an XML
document
- define for such instances various key terms (parent, child, document
order, expanded-name, string-value etc.) which are used in the rest of the
spec

I do not accept that the fact that it does no more than this makes it
"sloppy".  In any case, it has been approved as a W3C Recommendation in
this form.

As I understand it, you are looking for something more than this: a
self-contained definition of the data model that completely specifies all
the constraints that the data model must satisfy in order to be useable
with XPath.  I do not deny that this would be a nice thing to have and
would be more satisfying as a data model definition.  However, I think it's
way beyond what is appropriate for an erratum (especially after this period
of time), and would require a major rewrite (going beyond even what you are
now proposing) to be completely satisfactory.  For example, in your current
draft I think the separation between the definition of the data model
itself and the mapping from XML to the data model is not nearly as clean as
it could be.

I also do not think the absence of what you are looking for is a practical
problem with XPath.  XPath is designed to be a component that is referenced
by other standards.  The referencing standard has to define a whole bunch
of stuff to be able to use XPath, including conformance and how the context
is set up.  If a standard wants to apply XPath to something other than XML
documents, it needs to define how an instance of the XPath data model is
constructed from the structures that the standard deals with.  It's up to
that standard to do so in a way that ensures XPath does not break.  The
exact constraints that such XPath data model instances satisfy are dependent
on that referencing standard.  For example, if the DOM standard were to
reference XPath and allow XPath to be used to query a DocumentFragment in
the obvious way, then in that case the root node of the constructed XPath
data model would not necessarily satisfy the constraint of the root node
having a single element child.
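
To sketch the DocumentFragment case, here is a short Python example using xml.dom.minidom. No standard actually defines this mapping; treating the fragment as the XPath root node is the hypothetical "obvious way" mentioned above, and the element names are made up.

```python
from xml.dom import Node, minidom

# Build a fragment holding two sibling elements and no single top element.
document = minidom.Document()
fragment = document.createDocumentFragment()
fragment.appendChild(document.createElement("first"))
fragment.appendChild(document.createElement("second"))

# If the fragment were mapped to the XPath root node, these would be
# the root node's element children.
element_children = [
    node for node in fragment.childNodes if node.nodeType == Node.ELEMENT_NODE
]

print(len(element_children))
```

A root node built this way has two element children, so any constraint of the form "the root node has exactly one element child" would have to come from the referencing standard rather than from XPath itself.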

In summary, although XPath 1.0 could have been written in many different
ways, and some of those ways might well be superior in some respects to how
it was in fact written, I do not believe that this change proposal has
identified a defect in XPath 1.0 that is in need of fixing at this stage.

James

Received on Friday, 15 March 2013 05:37:09 UTC