RE: New Version of XPath Filter from John Boyer on 2002-04-08 (w3c-ietf-xmldsig@w3.org from April to June 2002)

From: John Boyer <JBoyer@PureEdge.com>
Date: Mon, 8 Apr 2002 11:49:28 -0700
To: <aleksey@aleksey.com>
Cc: "merlin" <merlin@baltimore.ie>, <reagle@w3.org>, <w3c-ietf-xmldsig@w3.org>
Message-ID: <7874BFCCD289A645B5CE3935769F0B52328546@tigger.PureEdge.com>
Hi Aleksey,
 
By "example" I mean example code that performs more slowly.  As
far as we know, an XPath expression evaluator will never select an
entire subtree faster than the following two steps:
 
1) evaluate the expression that generates a subtree root
2) manually execute code to form S'
 
Whether the XPath evaluator selects all nodes in the subtree or whether
the XPath filter v2.0 does it seems immaterial.  S' is being created by
*something*.  The question is, does the XPath evaluator operate slower
or faster?  Experience has so far told us that it will be slower.  This
could be mistaken, since our experience is based on XPath filter 1.0,
which has to do some extra processing every time it evaluates the XPath
expression for each node.  However, the current belief is that having
the XPath evaluator select subtrees will still be slower, so a
counterexample in the form of a code profile would be needed for us to
change course.
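
The two steps above can be sketched roughly as follows. This is only an
illustration, not code from any XPath filter implementation: it assumes
Python's standard xml.etree.ElementTree, the helper name expand_subtrees
is hypothetical, and it tracks element nodes only (a real node-set would
also carry text, attribute and namespace nodes).

```python
import xml.etree.ElementTree as ET

# Small sample document used for illustration.
DOC = "<a><b><c/><d/></b></a>"

def expand_subtrees(roots):
    """Step 2: form S' by walking each subtree root in document order."""
    expanded = []
    seen = set()
    for root in roots:
        for node in root.iter():      # the root itself, then its descendants
            if id(node) not in seen:  # skip duplicates when roots are nested
                seen.add(id(node))
                expanded.append(node)
    return expanded

tree = ET.fromstring(DOC)
S = tree.findall(".//b")         # step 1: the expression yields subtree roots
S_prime = expand_subtrees(S)     # step 2: expansion done outside the evaluator
tags = [n.tag for n in S_prime]
print(tags)
```

A sketch like this says nothing about relative speed; that would still
need a profile against an evaluator that selects the subtree itself.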
 
Also, I believe the last case you mention (select nodes, then deselect
their descendants) can easily be handled using a pair of filters
(intersect to isolate subtree b, then subtract subtrees c and d).
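
As a rough sketch of that pair of filters, assuming each filter can be
modelled as a plain set operation over expanded subtrees (the helper
subtree and the variable names are hypothetical, and only element nodes
are tracked):

```python
import xml.etree.ElementTree as ET

def subtree(root):
    """Expand a subtree root into the set of element nodes beneath it."""
    return set(root.iter())

doc = ET.fromstring("<a><b><c/><d/></b></a>")
b = doc.find("b")
c = b.find("c")
d = b.find("d")

all_nodes = set(doc.iter())                 # input node-set: whole document
after_intersect = all_nodes & subtree(b)    # intersect: keep subtree b only
after_subtract = after_intersect - subtree(c) - subtree(d)  # drop c and d

result = sorted(n.tag for n in after_subtract)
print(result)
```

Under these assumptions the node-set that remains holds node "b" alone.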
 
Cheers,
John Boyer 
 
 

-----Original Message-----
From: Aleksey Sanin [mailto:aleksey@aleksey.com]
Sent: Monday, April 08, 2002 11:28 AM
To: John Boyer
Cc: merlin; reagle@w3.org; w3c-ietf-xmldsig@w3.org
Subject: Re: New Version of XPath Filter


Yes, I have an example. Suppose we have the following document:
    <a>
        <b>
            <c/>       
            <d/>
        </b>
    </a>
If the XPath expression selects node "b" (node-set S = { node "b" }),
then the current proposal requires the application to create a new
node-set S' that includes nodes "b", "c" and "d". This is an additional
step. I don't see any reason why the author of the XPath expression
would not include these nodes him or herself.

Also, this additional step limits the XPath filter functionality,
because one can have a requirement to operate on a node-set containing
*only* node "b" but not nodes "c" and "d".

Aleksey.



John Boyer wrote:


Hi Aleksey,

All XPath implementations operate over actual nodes and not subtree
roots.  At any point in time, the resultant node-set of an XPath
expression is interpreted by the application that consumes XPath (the
host language, so to speak).  For example, XSLT typically uses recursion
to process the nodes in a node-set, so the nodes are interpreted as
subtree roots.

In essence, a node-set containing nodes that we choose to interpret as
subtree roots is still a node-set containing nodes.

Therefore, I do not see the basis in fact for your claim that we will
sometimes lose efficiency.  Is this just what you suspect will happen or
do you have an actual implementation that is harmed by this approach?

Thanks,
John Boyer

-----Original Message-----
From: Aleksey Sanin [mailto:aleksey@aleksey.com]
Sent: Monday, April 08, 2002 10:46 AM
To: merlin
Cc: reagle@w3.org; w3c-ietf-xmldsig@w3.org
Subject: Re: New Version of XPath Filter




2. We choose to perform expansion from nodes to node trees outside
the XPath processor to maximize the possible execution speed. It
is much faster to evaluate and expand //Foo than to evaluate
//Foo//self::node(). Remember, the only goal of this transform is
speed; it doesn't provide any new capability.


I don't think it is true that there is no new functionality at all. For
example, union provides new ways to apply some transforms to a part of
the document and add more nodes later. Also, some XPath implementations
operate on actual node-sets (no subtrees!), and with this construction
S' is an additional and expensive operation!

Aleksey.
Received on Monday, 8 April 2002 14:50:13 UTC