Iteration functionality request for XPath 2.0

Hi,

I'd like to discuss functionality that is (from my stand-point) missing in
XPath 1.0 and it's still not described in the requirements for XPath 2.0.
This issue might have been discussed already, but I couldn't find anything
about it in archives. This functionality is about context
parameters/functions like "position" and "last".

Consider the sample below:
<xsl:for-each select="child">
.....
.....
</xsl:for-each>

Sometimes it's very important to know if current node is the last one in the
node-set. There're two basic ways to do it in XPath 1.0:
**** FIRST APPROACH: using "position" and "last":
	<xsl:if test="position() = last()">
		....
	</xsl:if>

Drawbacks:
	If node-set implementation is iterative, i.e. new node is fetched
only when for-each instruction goes to a new cycle and whole list of nodes
is not cached in node-set object, then the first call to last() function
will go through all nodes to find out their quantity. This will repeat the
node-set fetch procedure twice or will lead to the caching of the nodes
list. Both ways will impact performance and/or memory consumption. If
node-set is quite big then actuall work of "for-each" cycle may be
drastically delayed.
I think iterative approach for retrieving node-sets is quite popular now. At
least, I'm pretty sure it's used in Xalan 2.X implementation. More over,
today many developers try to use implementations of node-set iterators for
processing external data. Querying external sources for the second time
might be too expensive.

**** SECOND APPROACH: using node set expression:
	<xsl:if test="not(following-sibling::child)">
		....
	</xsl:if>

Drawbacks:
	In this case another node-set expression has to be evaluated in each
cycle. This will cause a performance impact. In addition, "if" expression is
very dependant on the "for-each" expression. This worsens code's readability
and maintenance. This is even worse if you'd put this code inside of
template which is called by xsl:apply-templates instruction, i.e. when it's
not clear which node-set expression was used to find the current node.

In my point of view, this could be easily solved by extending list of
node-set/context functions with one more function, something like "is-last"
(another function "is-first" could also be added for symmetry, eventhough
it'd be redundant). This function would return boolean true if current node
is the last node in the node-set.
Such a function wouldn't contradict the existing system, it'd be an addition
to the existing "position" and "last". Implementations that don't use
iterators, i.e. retrieve whole list of nodes, can easily implement it as:
	is-last = (position == last)

>From the performance perspective this function should be very good: in the
worst-case-scenario it will lead to the caching of two nodes: current and
next.


Thanks,
Dimitry E. Voytenko

Received on Friday, 20 September 2002 03:22:33 UTC