Re: New Version of XPath Filter from Aleksey Sanin on 2002-04-08 (w3c-ietf-xmldsig@w3.org from April to June 2002)

From: Aleksey Sanin <aleksey@aleksey.com>
Date: Mon, 08 Apr 2002 12:02:38 -0700
To: John Boyer <JBoyer@PureEdge.com>
CC: merlin <merlin@baltimore.ie>, reagle@w3.org, w3c-ietf-xmldsig@w3.org
Message-ID: <3CB1E94E.40508@aleksey.com>
I need to think about profiling this scenarios. Probably you are right 
about performance.
However, personally I don't think that it's a good idea to write general 
standard with
performance as a top priority. My expirience shows that performance 
improvements are
usually very complicated, not simple to explain and very difficult to 
maintain.

Calculating S' is one additional step and this step adds complexety to 
the proposed scheme.
I like the initial idea: describe XPath transform in terms of sets 
operations. It really simplifies
understanding and makes everything clear. I suggest you to take a look 
at this from a "user"
point of view: "I've constructed XPath expression but transform added 
some nodes to it. Why???"
And remember that users never read docs :)

And I think you agree with me that your solution for the second scenario 
I described (when only
node "b" is required) has worse performance then a solution w/o S'.

Aleksey.

John Boyer wrote:

>Hi Aleksey,
> 
>What I mean by example is example code that performs more slowly.  As
>far as we know, an XPath expression evaluator will never select an
>entire subtree faster than the following two steps:
> 
>1) evaluate the expression that generates a subtree root
>2) manually execute code to form S'
> 
>Whether the Xpath evaluator selects all nodes in the subtree or whether
>the Xpath filter v2.0 does it seems immaterial.  S' is being created by
>*something*.  The question is, does the Xpath evaluator operate slower
>or faster?  Experience has so far told us that it will be slower.  This
>could be mistaken since our experience is based on Xpath filter 1.0,
>which has to do some extra processing every time it evaluates the XPath
>expression for each node.  However, the current belief is that having
>the XPath evaluator select subtrees will still be slower, so a
>counterexample in the form of a code profile would be needed for us to
>change course.
> 
>Also, I believe a method for handling the last case you mention (select
>nodes then deselect their descendants) can be easily done using a pair
>of filters (intersect to isolate subtree b, then subtract subtrees c and
>d).
> 
>Cheers,
>John Boyer 
> 
> 
>
>-----Original Message-----
>From: Aleksey Sanin [mailto:aleksey@aleksey.com]
>Sent: Monday, April 08, 2002 11:28 AM
>To: John Boyer
>Cc: merlin; reagle@w3.org; w3c-ietf-xmldsig@w3.org
>Subject: Re: New Version of XPath Filter
>
>
>Yes, I have an example. Suppose we have following document:
>    <a>
>        <b>
>            <c/>       
>            <d/>
>        </b>
>    </a>
>If the XPath expression selects node "b" (nodes set S = { node "b" }
>then the 
>current proposal  requires application to create new nodes set S' which
>will include 
>nodes "b", "c" and "d". This is an additional step. I don't see any
>reasons why the 
>author of XPath expression will not include these nodes him or herself. 
>
>Also this additional step limits the XPath filter functionality because
>one can have 
>a requirement to operate with nodes set having *only* node "b" but not
>nodes "c" and "d"
>
>Aleksey.
>
>
>
>John Boyer wrote:
>
>
>Hi Aleksey,
>
>All XPath implementations operate over actual nodes and not subtree
>roots.  At any point in time, the resultant node-set of an Xpath
>expression is interpreted by the application that consumes Xpath (the
>host language, so to speak).  For example, XSLT typically uses recursion
>to process the nodes in a node-set, so the nodes are interpreted as
>subtree roots.
>
>In essence, a node-set containing nodes that we choose to interpret as
>subtree roots is still a node-set containing nodes.
>
>Therefore, I do not see the basis in fact for your claim that we will
>sometimes lose efficiency.  Is this just what you suspect will happen or
>do you have an actual implementation that is harmed by this approach?
>
>Thanks,
>John Boyer
>
>-----Original Message-----
>From: Aleksey Sanin [ mailto:aleksey@aleksey.com]
>Sent: Monday, April 08, 2002 10:46
>
> AM
>To: merlin
>Cc:  reagle@w3.org;  w3c-ietf-xmldsig@w3.org
>Subject: Re: New Version of XPath Filter
>
>
>
>
>2. We choose to perform expansion from nodes to node trees outside
>the XPath processor to maximize the possible execution speed. It
>is much faster to evaluate and expand //Foo than to evaluate
>//Foo//self::node(). Remember, the only goal of this transform is
>speed; it doesn't provide any new capability.
>
>
>I don't think that there is no new functionality at all. For example, 
>uninon provides new ways
>to apply some transforms to a part of the document and add more nodes 
>later. Also some XPath
>implementations operates on the actual nodes sets (no sub-trees!) and by
>
>this construction
>S' is an additional and expensive operation!
>
>Aleksey.
>
>
>
Received on Monday, 8 April 2002 15:03:36 UTC