Re: Document Object Model (DOM) Level 3 XPath Specification from Curt Arnold on 2001-06-28 (www-dom@w3.org from April to June 2001)

From: Curt Arnold <carnold@houston.rr.com>
Date: Wed, 27 Jun 2001 21:04:30 -0500
To: <www-dom@w3.org>
Message-ID: <001701c0ff76$ab4714c0$7600a8c0@CurtMicron>
Ray Whitmer (>) and Bob Foster (>>) wrote:

> further feedback).  We are talking about the actual DOM nodes returned
> from an xpath expression, and what to do when XPath says there should be
> a single node, having discarded information, but DOM has fragmented it
> across multiple nodes.  Answer: in case of fragmentation, return just
> the first node.

That doesn't sound good.  If I have text content, I don't want to get
partial information if the implementation doesn't expand entity references.
I would think the result should be an object that exposes the appropriate
interfaces (say CharacterData and Node) but represents the underlying node
list.  That pseudo-node could be used to update the content or remove the
entire set.  If I set the value of the pseudo-node, then all the existing
text and entity references would be replaced.
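
A minimal sketch of such a pseudo-node, assuming Python's xml.dom.minidom
as the DOM.  The TextRun class and its members are hypothetical names made
up for illustration and are not part of the proposal.  Entity references
are left out because the sketch never builds any.

```python
from xml.dom import minidom, Node

# Node types treated as one logical text node in this sketch.
_TEXTUAL = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


class TextRun:
    """Hypothetical pseudo-node standing in for a run of adjacent text nodes."""

    def __init__(self, first):
        # Collect `first` and every textual sibling that directly follows it.
        self.nodes = [first]
        nxt = first.nextSibling
        while nxt is not None and nxt.nodeType in _TEXTUAL:
            self.nodes.append(nxt)
            nxt = nxt.nextSibling

    @property
    def data(self):
        # The value of the whole run, not just its first fragment.
        return "".join(n.data for n in self.nodes)

    @data.setter
    def data(self, value):
        # Replace the whole run with a single Text node holding the new value.
        parent = self.nodes[0].parentNode
        new = parent.ownerDocument.createTextNode(value)
        parent.insertBefore(new, self.nodes[0])
        for n in self.nodes:
            parent.removeChild(n)
        self.nodes = [new]

    def remove(self):
        # Remove the entire run from the tree.
        for n in self.nodes:
            n.parentNode.removeChild(n)
        self.nodes = []


# Build <p> with three fragments: Text, CDATASection, Text.
doc = minidom.Document()
para = doc.createElement("p")
doc.appendChild(para)
para.appendChild(doc.createTextNode("a < "))
para.appendChild(doc.createCDATASection("b & c"))
para.appendChild(doc.createTextNode(" > d"))

run = TextRun(para.firstChild)
full_value = run.data        # concatenation of all three fragments
run.data = "replaced"        # the three fragments become one Text node
```

Returning only the first node here would expose "a < " and silently drop
the other two fragments, which is the partial information I am worried
about.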

> >3) If you are going to add a method to the DOM, it would be far better to
> >introduce a variant of normalize() that coalesces adjacent text (that is,
> >Text and CDATA) nodes exactly as described in the XPath specification.
> >
> While such a normalize function is a good idea, it is not sufficient in
> cases where stripping out all CDATASections and EntityReferences is not
> an option.  It is also questionable whether XPath inquiries should only
> function correctly on a completely normalized tree.

If you were only interested in getting the value, you could already do
that by wrapping your XPath query in string().  Unfortunately, that would
not give you an object that you could use to modify the tree or set a
listener on.

> While enumerating all types is impossible for XPath 2.0, I still think
> there is no reason to in the common cases of String, integer, or boolean
> to force users to muck with untyped object returns and native coercion,
> when well-defined system primitives can be supported.
>
> For XPath 2.0, we should probably consider adding:
>
> Object evaluateAs(<type>, ...)

I think a generic evaluateAs() is necessary; however, I don't think it is
necessary or desirable to pass the type as an argument, since XPath 2.0
should have the Schema datatype equivalents of string(), number(), etc.
The evaluateAsString, evaluateAsNumber, etc. are useful and probably
should be kept.
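
To show the shape I have in mind, here is a rough sketch in Python with
xml.etree.ElementTree standing in for the XPath evaluator.  That stand-in
is an assumption: ElementTree supports only a small XPath subset, and the
function names below are hypothetical, not the proposed DOM interface.
The coercions only approximate string() and number().

```python
import math
import xml.etree.ElementTree as ET


def evaluate(root, path):
    """Generic call: an untyped list of matches the caller must coerce."""
    return root.findall(path)


def evaluate_as_string(root, path):
    """Roughly string(): all text of the first match, or "" for no match."""
    nodes = evaluate(root, path)
    return "".join(nodes[0].itertext()) if nodes else ""


def evaluate_as_number(root, path):
    """Roughly number(): the string value as a float, or NaN if unparseable."""
    try:
        return float(evaluate_as_string(root, path))
    except ValueError:
        return math.nan


root = ET.fromstring("<order><qty>3</qty><note>rush <b>now</b></note></order>")
qty = evaluate_as_number(root, "qty")        # typed result, no caller coercion
note = evaluate_as_string(root, "note")      # text of <note> and its children
missing = evaluate_as_number(root, "price")  # no match, so NaN
```

The typed wrappers carry no type argument; each one fixes its own
conversion, which is the convenience worth keeping.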

> >6) Some responses seem to think that the Node-returning variant is meant
> >as a hint to XPath that at most one node need be returned. If this is the
> >sly intention ("There is nothing to stop an XPath implementor from taking
> >advantage...") it should be made explicit. I agree that this is a common
> >case and a useful optimization (you can slap a /.[1] at the end of any
> >node locator, but you can't stop most XPath implementations from grinding
> >out and testing all n nodes). It just shouldn't sneak in the back door.
> >
> The call is not a "hint", but a very clear statement that one node is
> requested.  The implementation may or may not be "sly" about only
> computing the first node, just as it may be sly enough to evaluate
> ActiveNodeSets incrementally as the caller requests additional nodes, in
> which case this method wasn't needed to avoid computation.  There are
> quite a few gains, and some of the best ones have to do with not
> returning a NodeSet, and are just as true of evaluateAsString, for
> example.  It is not clear to me that we should spend a lot of time in
> the specification discussing these implementation optimizations which
> may or may not be available.  Implementations will choose how sly they
> should be.

This is really more an optimization external to the XPath evaluator.  Even
if the processor grinds through the entire document, using evaluateAsNode
saves the construction and interface negotiations for the NodeList and
probably a couple of calls.

I'm definitely in favor of keeping this one.
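
The caller-side difference can be sketched with Python's
xml.etree.ElementTree, used here only as an analogy (an assumption; it is
not the DOM XPath interface): findall() plays the NodeList-returning call
and find() plays evaluateAsNode.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<list><item>one</item><item>two</item><item>three</item></list>"
)

# NodeList-style call: the full result list is built and handed back,
# then the caller checks it and indexes into it.
all_items = root.findall("item")
first_via_list = all_items[0] if all_items else None

# Single-node call: the first match (or None) comes back directly,
# with no intermediate list for the caller to unpack.
first_direct = root.find("item")

same_node = first_direct is first_via_list
```

Both paths end at the same element; the single-node call just gets there
without the list and the extra calls on it.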

> >7) [Interface ActiveNodeSet] For simplicity and concurrency reasons,
> >ActiveNodeSet should be eliminated entirely in favor of StaticNodeSet.
> >Without explicit synchronization of access to the DOM, the useful
> >lifetime of an ActiveNodeSet cannot be determined. It is possible a
> >returned instance might already be invalid.

I mentioned that I think NodeSetIterator and NodeSet are better names for
ActiveNodeSet and StaticNodeSet.  ActiveNodeSet behaves like a fail-fast
iterator on a collection.  StaticNodeSet behaves like a common ancestor to
NodeList and NamedNodeMap.
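
A small Python sketch of the two behaviors, under the assumption that the
tree keeps a modification counter.  The Tree class, its version field, and
node_set() are hypothetical stand-ins, not anything in the draft.

```python
class Tree:
    """Hypothetical stand-in for a DOM tree that counts its own mutations."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.version = 0

    def append(self, node):
        self.nodes.append(node)
        self.version += 1


class NodeSetIterator:
    """ActiveNodeSet-like: walks the live tree, fails fast once it changes."""

    def __init__(self, tree):
        self._tree = tree
        self._expected = tree.version  # version seen when iteration began
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Refuse to continue if the tree changed since the iterator was made.
        if self._tree.version != self._expected:
            raise RuntimeError("tree modified during iteration")
        if self._index >= len(self._tree.nodes):
            raise StopIteration
        node = self._tree.nodes[self._index]
        self._index += 1
        return node


def node_set(tree):
    """StaticNodeSet-like: a snapshot that later mutations cannot disturb."""
    return tuple(tree.nodes)


tree = Tree(["a", "b", "c"])
snapshot = node_set(tree)
active = NodeSetIterator(tree)
first = next(active)
tree.append("d")             # mutate the tree mid-iteration
try:
    next(active)
    failed_fast = False
except RuntimeError:
    failed_fast = True
```

The iterator reports the change on its next use instead of returning
stale nodes, while the snapshot still holds the three original entries.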
Received on Wednesday, 27 June 2001 22:03:48 UTC