Re: DOM Level 3 XPath: editorial, use case analysis, and counterproposal

From: Curt Arnold <carnold@houston.rr.com>
Date: Thu, 12 Jul 2001 22:14:10 -0500
Message-ID: <003701c10b49$e288d160$7600a8c0@CurtMicron>
To: <www-dom@w3.org>
I guess part of my thought process is to load every feature that I could
possibly envision into the design to see if things start looking weird.  In
this case, continually adding flags and parameters to the
evaluateAsNodeSet method started looking pretty nasty.  It became apparent
that XPathExpression was the right place for optimization hints.

> Timeout: Since our API is specifically operating against a DOM, I'd say
> that's not really an XPath issue. Timeout has to be dealt with for _any_
> remote DOM operation. I don't think base XPath implements document()...
> and of course any extension function plugged in (if we permit that) has to
> deal with what happens if the function blows up for various reasons.

Even if the DOM is in local memory, a poorly designed XPath expression (or
an intentional denial-of-service attempt) or a really large document might
result in unacceptable evaluation times.  However, what time period would be
intolerable would depend on the application and might even vary query by
query within an application.

If the query was being processed immediately and on the same thread,
implementing time out behavior could be as simple as:

for each node
    check node against query
    if elapsed time > timeout throw exception
next node
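
In Java, the loop above might look roughly like the sketch below.  The
names TimedEvaluator and QueryTimeoutException are made up for illustration
and are not part of any DOM draft, and the clock is passed in only so the
behavior can be exercised without waiting.  The elapsed time is checked once
per node, so this is a soft limit at best.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical exception type; not part of any DOM draft. */
class QueryTimeoutException extends RuntimeException {
    QueryTimeoutException(String message) { super(message); }
}

/** Hypothetical helper showing a per-node timeout check. */
class TimedEvaluator {
    /**
     * Returns the candidates accepted by matches, in input order.
     * The clock is read after each node, so a single slow node can
     * still overrun the limit before the exception is thrown.
     */
    static <N> List<N> evaluate(List<N> candidates, Predicate<N> matches,
                                long timeoutMillis, LongSupplier clockMillis) {
        long start = clockMillis.getAsLong();
        List<N> result = new ArrayList<>();
        for (N node : candidates) {
            // check node against query
            if (matches.test(node)) {
                result.add(node);
            }
            // if elapsed time > timeout, throw exception
            if (clockMillis.getAsLong() - start > timeoutMillis) {
                throw new QueryTimeoutException(
                        "query exceeded " + timeoutMillis + " ms");
            }
        }
        return result;
    }
}
```

A caller would normally pass System::currentTimeMillis as the clock and
could choose a different timeout for each query.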

Timeout behavior would definitely have to be carefully worded.  You
couldn't guarantee a hard timeout, since the elapsed time is only checked
between nodes.


> Limiting the return set: The ideal solution would be if the query itself
> was incremental and only computed results as you need them. But as someone
> who's involved in an XPath implementation, I can attest that this is not
> easy to implement for a complicated path. We're trying, but sometimes we've
> had to compute the whole darned thing before you can return any part of it,
> due to ordering requirements. I'm not sure that limiting the return set,
> per se, actually opens any significant opportunities for optimization.

Being able to relax the ordering requirements seems like a key enabler for
lazy evaluation (and for unlocking the value of a return set limit).

If "first node" is replaced by "any node" in the definition of
evaluateAsNode, then it could be lazily evaluated, and that would be
reasonable since that method would only be called when there is an
expectation of only one node matching the pattern.  That behavior would seem
to be a special case of the result set limit combined with disabled
ordering.  I think ordering should be preserved by default, but there
probably should be an option on XPathExpression for when you really don't
care about the order.
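
To make the lazy case concrete, here is a rough Java sketch.  SimpleNode
and LazyMatches are made-up stand-ins (not DOM interfaces), and the
predicate stands in for a compiled XPath expression.  Because no ordering is
promised, the walker can hand back each match as soon as it finds one and
stop once the limit is reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Minimal stand-in for a DOM node; hypothetical, for illustration only. */
class SimpleNode {
    final String name;
    final List<SimpleNode> children = new ArrayList<>();
    SimpleNode(String name) { this.name = name; }
    SimpleNode add(SimpleNode child) { children.add(child); return this; }
}

/** Walks a subtree on demand and yields matches one at a time. */
class LazyMatches implements Iterator<SimpleNode> {
    private final Deque<SimpleNode> pending = new ArrayDeque<>();
    private final Predicate<SimpleNode> matches;
    private SimpleNode next;
    int visited = 0; // exposed only to show how much of the tree was walked

    LazyMatches(SimpleNode root, Predicate<SimpleNode> matches) {
        this.matches = matches;
        pending.push(root);
    }

    @Override public boolean hasNext() {
        while (next == null && !pending.isEmpty()) {
            SimpleNode node = pending.pop();
            visited++;
            // No ordering is promised, so children go on the stack as they come.
            for (SimpleNode child : node.children) pending.push(child);
            if (matches.test(node)) next = node;
        }
        return next != null;
    }

    @Override public SimpleNode next() {
        if (!hasNext()) throw new NoSuchElementException();
        SimpleNode result = next;
        next = null;
        return result;
    }

    /** Collects at most limit matches, then stops walking. */
    static List<SimpleNode> take(LazyMatches it, int limit) {
        List<SimpleNode> out = new ArrayList<>();
        while (out.size() < limit && it.hasNext()) out.add(it.next());
        return out;
    }
}
```

An "any node" query is then just take(..., 1): the walk ends at the first
match rather than covering the whole document.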

The more I think about it, the more XSLT-style sorting looks like an
essential feature.  In the previous messages, I had an XPathSortCriteria
interface patterned after the XSLT sort options.  It is probably better to
make those properties of XPathExpression and eliminate the distinct
interface.
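
As a rough picture of what that could look like, here is a Java sketch of
an expression object carrying the hints from this thread.  Every name in it
(XPathExpressionSketch, SortKey, the property names, the "0 means off"
convention) is an assumption of mine for illustration and does not come from
the DOM Level 3 XPath draft; the sort key fields loosely follow the select,
order and data-type attributes of xsl:sort.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** One sort key, loosely modeled on xsl:sort (hypothetical class). */
class SortKey {
    final String select;     // expression computing the key, like xsl:sort select
    final boolean ascending; // like xsl:sort order
    final boolean numeric;   // like xsl:sort data-type="number"

    SortKey(String select, boolean ascending, boolean numeric) {
        this.select = select;
        this.ascending = ascending;
        this.numeric = numeric;
    }
}

/** Hypothetical expression object carrying the hints discussed in this thread. */
class XPathExpressionSketch {
    private final String expression;
    private boolean ordered = true;   // document order preserved by default
    private int resultLimit = 0;      // 0 means "no limit" in this sketch
    private long timeoutMillis = 0;   // 0 means "no timeout" in this sketch
    private final List<SortKey> sortKeys = new ArrayList<>();

    XPathExpressionSketch(String expression) { this.expression = expression; }

    String getExpression() { return expression; }
    boolean isOrdered() { return ordered; }
    void setOrdered(boolean ordered) { this.ordered = ordered; }
    int getResultLimit() { return resultLimit; }
    void setResultLimit(int resultLimit) { this.resultLimit = resultLimit; }
    long getTimeoutMillis() { return timeoutMillis; }
    void setTimeoutMillis(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }
    void addSortKey(SortKey key) { sortKeys.add(key); }
    List<SortKey> getSortKeys() { return Collections.unmodifiableList(sortKeys); }
}
```

With this shape the evaluate methods keep a small signature, and an
implementation is free to ignore any hint it cannot honor.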
Received on Thursday, 12 July 2001 23:13:57 GMT