Re: Single Node XPath Evaluation Is Ambiguous from Ray Whitmer on 2002-11-01 (www-dom@w3.org from October to December 2002)

From: Ray Whitmer <rayw@netscape.com>
Date: Fri, 01 Nov 2002 04:12:54 -0800
To: www-dom@w3.org
Message-ID: <3DC26FC6.3050904@netscape.com>
Dominic Chambers wrote:

>The DOM Level 3 XPath module allows single nodes to be selected,
>where the returned node can be either the first node in document
>order, or any node from the result set. The ability to select
>the first node in document order is certainly useful, but the
>second definition is definately not.
>  
>
I believe you missed common valid use cases.

If I search for a node that I know occurs once in a document (because of 
control over the application that created it, validity checking, or 
whatever), it is not necessary to require that it is the first node that 
occurs in document order of the result set, which may potentially be 
quite complex and require significantly more processing to produce.  

If the implementation is free to return the first node it finds that 
matches the criteria, rather than guaranteeing that a 
potentially-complex expression does not return in later processing some 
other node that preceeds this one in document order, then it may be able 
to avoid significant processing.  This may vary greatly depending upon 
the implementation and the particular expression being evaluated.

Let me try to give more-concrete examples.

1.  Let's say I have a simple xpath expression that looks for a 
particular node in two different places.  If due to constraints on the 
document, I know either that a. The item does not exist at both places, 
or b. the information derived from either location is equivalent, then I 
can stop searching if I find the item at the first location.  If, on the 
other hand, the implementation is required to return the first node 
sorted in document order, then the implementation will be obligated to 
look for the information in both places.

2.  Let's say that I am trying to make a simple implementation of this 
API without building the logic to know whether a particular expression 
may naturally produce nodes out of order.  In this case, if I receive a 
request that requires the first node, sorted in document order, then I 
will require the expression to completely process to make sure it did 
not produce another node that preceeded the first one produced.  The 
implementation may well not be sophisticated enough to tell the 
difference between a simple expression that could have guaranteed order 
by the way it did the processing and a complex one that did not.  If I 
know that the application does not care, for a variety of reasons 
including, that the document is valid and contains only one, the pieces 
of information are equivalent, etc. then I can stop when I find the 
first one and return it.

In my experience, if I am looking for a single piece of information, 
finding two different pieces of information would present me with a huge 
headache of deciding which piece of information to use.  This is why I 
would rely on a model where there cannot be contradictory information in 
the document.  Once I have guaranteed this through validation or some 
other means, I want my XPath evaluation to just look until it finds the 
one piece of information and not care whether there might be another 
piece of information that precedes it in document order.  This is why 
this use case seems more common to me than the case where I want to do 
the extra processing that may be required to also ensure that it is the 
first node found within the document.

I do not doubt that others may have use cases that they consider more 
common where they want to do the extra processing required, perhaps to 
completely process the expression in a simple implementation, to 
guarantee that the node they found was the first one.  But just because 
XSLT forces the first result to be returned does not mean that that is 
all that should be made available to DOM applications.  If I am writing 
a DOM application, I will use XPath if I know it is both efficient and 
convenient.  But if I think it may search the entire document just 
because of the ordering constraint for something that a 
more-conventional search would terminate as soon as it found the first 
one, then I will hand-code it rather than using XPath API.  The point is 
to make it so that even simple implementations can be expected to 
terminate instantaneously when they find a matching result rather than 
worrying about order.

Clearly the single result that produces an object not in any order would 
not be used in an XSLT implementation.  Also, the simple examples you 
cited would not typically be used in a DOM application except 
occasionally parametrically, because simpler accessors exist to achieve 
those results.

Ray Whitmer
rayw@netscape.com
Received on Friday, 1 November 2002 07:13:53 UTC