Re: Document Object Model (DOM) Level 3 XPath Specification from Ray Whitmer on 2001-07-01 (www-dom@w3.org from July to September 2001)

From: Ray Whitmer <rayw@netscape.com>
Date: Sun, 01 Jul 2001 06:57:42 -0700
To: Bob Foster <bob.foster@webgain.com>
CC: www-dom@w3.org
Message-ID: <3B3F2C56.7020906@netscape.com>

Bob Foster wrote:

>Well, you stumped me back. It took two days to decide that maybe you mean
>that text processing within an XPath expression is compatible with XPath 1.0
>and it is only Text subclass nodes returned by evaluateAsXXX() that get this
>special treatment.
>
>Is this what you mean?
>
Yes.

>Agreed it's a good idea and not always an option. But sometimes it is an
>option.
>
>Function correctly is different than function compatibly. DOM XPath will not
>function compatibly on an unnormalized tree. It would be nice to think it
>would function compatibly on a "completely normalized" tree.
>
I agree.

>>A big part of the use of XPath in DOM is to find nodes in the document
>>so that they can be manipulated and later written out...
>>
>
>Yes, that is primarily how we use it in our implementation, and we have the
>same kind of shortcut.
>
>There should still be an evaluateAsObject(), for reasons enumerated
>previously in this list.
>
I agree that there might be a use, or even a need, for evaluateAsObject
to support XPath 2.0.  I have not yet concluded that it is needed,
because it is not clear to me how the arbitrary schema types of XPath
2.0 would be mapped to arbitrary object types in an arbitrary language
binding.  Apparently the types in 2.0 are no longer determined by
expression syntax, but only by which value spaces the actual resulting
values happen to fall in (see section 3.4 of "XQuery 1.0 and XPath 2.0
Data Model" at http://www.w3.org/TR), so a value is likely to be of
multiple types simultaneously.

I am still rereading the XPath 2.0 data model to try to figure
everything out, but it seems to me that the caller, again, might need
to specify the type to return, and that only a few types, especially
String, should be supported no matter how many types occur in the
expressions, because the types of XPath 2.0 do not seem designed to
map 1:1 to a typed language's objects or primitives.
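
To illustrate what "the caller specifies the type to return" could look
like, here is a minimal Python sketch.  Everything in it is hypothetical:
the name evaluate_as, the "string"/"boolean" type names, and the use of
Python bool/float/str as stand-ins for already-computed XPath values are
assumptions made for illustration, not part of any DOM draft.  The
conversions loosely follow the XPath 1.0 string() and boolean() rules
and are simplified (very small or very large non-integer numbers are
not formatted the way XPath requires).

```python
import math


def to_string(value):
    """Convert a stand-in XPath value to a string (simplified XPath 1.0 rules)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        if value == int(value):
            # Integer-valued numbers are written without a decimal point.
            return str(int(value))
        return repr(value)
    return value  # already a string


def to_boolean(value):
    """Convert a stand-in XPath value to a boolean (XPath 1.0 rules)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, float):
        return not (value == 0 or math.isnan(value))
    return len(value) > 0  # strings are true when non-empty


# Only a small, fixed set of result types is supported, whatever types
# occurred inside the expression itself.
CONVERTERS = {"string": to_string, "boolean": to_boolean}


def evaluate_as(value, result_type):
    """Return value converted to the type the caller asked for."""
    try:
        convert = CONVERTERS[result_type]
    except KeyError:
        raise TypeError(f"unsupported result type: {result_type!r}") from None
    return convert(value)
```

With this shape the binding never has to guess which of several
possible types a result "really" has; for example,
evaluate_as(3.0, "string") yields "3" and evaluate_as("", "boolean")
yields False, while an unsupported type name is rejected up front.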

>Along the lines of very clear statements, the document says evaluateAsNode()
>returns the "first node of the resulting set". What set? Node-sets are
>unordered. The first node in document order? The first node in position
>order? Undefined? Implementation-defined?
>
This is a significant point to clarify.  As described, it would match
the first node in the orderless set, I believe.  This method currently
serves the case where the application wants one node that matches the
expression, but does not care which one if there are several.  There is
a bit of confusion between XPath's node set, which has no order, and
ActiveNodeSet, which has an implementation-defined order.  It might be
less confusing to just say that it returns one node of the resulting
set instead of saying it returns the first.

It has already been suggested by others that some applications would
like an exception if there is more than one match.  Or, as you point
out, the application might want a guarantee of which item is returned
when there are multiple matches.  Either of these constraints seems to
impact performance because it requires a more complete evaluation of
the expression, either to know that the node is the only match or to
know that it is the first among all sorted results.  This would still
mean less overhead and more convenience than the general case, since no
ActiveNodeSet is returned, but the question is whether and how these
other cases should be accommodated, either via flags that modify the
behavior of the method or in some other way.  Implementations might
minimize this difference by guaranteeing for certain expressions that
results will always be computed in document order, so perhaps we should
not be looking so hard at the implementation right now, but I believe,
based upon my implementation experience, that it might be quite
difficult to deal with this efficiently.
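
The cost difference between the three behaviors discussed above can be
sketched in a few lines of Python.  This is only an illustration under
stated assumptions: matches are modeled as a lazy iterator that yields
nodes in an implementation-defined order, nodes are plain strings, and
the function names, the TooManyMatches exception, and the position
callback are all hypothetical, not taken from the DOM draft.

```python
from itertools import islice


class TooManyMatches(Exception):
    """Hypothetical error for the 'exactly one match' policy."""


def any_match(matches):
    """Return one matching node, whichever the evaluator yields first."""
    return next(iter(matches), None)


def only_match(matches):
    """Return the single match; raise if a second match exists."""
    found = list(islice(matches, 2))  # must look for a second match
    if len(found) > 1:
        raise TooManyMatches("expression matched more than one node")
    return found[0] if found else None


def first_in_document_order(matches, position):
    """Return the match that comes earliest in document order."""
    # Every match has to be produced before the minimum is known.
    return min(matches, key=position, default=None)


def counting(items, counter):
    """Yield items while recording how many were actually pulled."""
    for item in items:
        counter[0] += 1
        yield item


# Document order is a, b, c; the evaluator happens to yield c, a, b.
document_order = {"a": 0, "b": 1, "c": 2}
evaluator_order = ["c", "a", "b"]

pulled_any = [0]
one = any_match(counting(evaluator_order, pulled_any))

pulled_first = [0]
first = first_in_document_order(
    counting(evaluator_order, pulled_first), document_order.get
)
```

Here any_match stops after pulling a single node and returns "c", while
first_in_document_order has to pull all three matches before it can
return "a".  only_match sits in between: it can stop as soon as a second
match appears, but when the match really is unique it cannot know that
until the evaluator is exhausted.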

Ray Whitmer
rayw@netscape.com
Received on Sunday, 1 July 2001 09:53:26 UTC