Re: [XPath] About fn:last()

Although it's true that last() has significant performance implications,
consider even simple XPaths like /a/b[/a/c] or /a[b = /a/c].  Because you
can express cross-products, XPath queries can easily be O(N^2) or worse
unless you use indexing techniques (as Michael Kay replied)

>> '/e1[predicate1]/e2[predicate2]/e3[predicate3]/e4[fn:last()]',if
>> the resulting sequence is very long(e.g. 1,000,000
>> nodes) and the data is stored in a file,in the worst
>> case,we'll have to use 1,000,000 IO operations to get
>> the value of fn:last() and another 1,000,000 IO
>> operations to iterate to the last node.

If the data is stored in a file, then you need to parse the entire file
anyway to ensure that it's valid XML and to ensure that you've found all
matching nodes.

However, this particular query (if the elided predicates don't get in the
way) shouldn't require processing the file twice.  Just remember the last
matching e4 element within each matching e3, and when the e3 close is
reached, emit the last e4 you remembered.  In general, predicates of the
form [last()] (and possibly [last() - k] for constant k) should be
optimized.

Obviously there are plenty of other queries, like
   //a[last() > position() * b]
that will perform badly in most implementations no matter what kinds of
performance optimizations you make.  That's just the nature of the game.


Cheers,

Michael Brundage
xquery@comcast.net
Author, XQuery: The XML Query Language (Addison-Wesley, 2004)
Co-author, Professional XML Databases (Wrox Press, 2000)

Received on Sunday, 6 June 2004 19:08:49 UTC