RE: History: Question on C14N list of nodes instead of subtrees from John Boyer on 2002-02-04 (w3c-ietf-xmldsig@w3.org from January to March 2002)

From: John Boyer <JBoyer@PureEdge.com>
Date: Mon, 4 Feb 2002 11:52:14 -0800
To: "Eugene Kuznetsov" <eugene@datapower.com>, <reagle@w3.org>, "merlin" <merlin@baltimore.ie>
Cc: <w3c-ietf-xmldsig@w3.org>
Message-ID: <7874BFCCD289A645B5CE3935769F0B52350097@tigger.PureEdge.com>

Hi Eugene,

<eugene>
On the topic of nodes vs. subtrees and XPath directly, XPath itself
is not the problem. It *is* possible to make those operations 
extremely efficient (that's part of what DataPower does). I really
doubt that having a different selection approach with a more 
obscure specification will help either interoperability or efficiency
in the long run -- it's good to keep reusing XPath and XSLT whenever
possible, since at least there are folks working on buidling wirespeed
engines for those specifications. Who's got XPointer optimization
experience? ;-)
</eugene>

XPath is a big part of the problem right now.  Based on the run-time
performance posted by Merlin, the XPath implementations are clearly not
achieving optimal performance profiles.  Otherwise the runtime curves
would look like O(n log n) and not the ~ O(n^1.5 log n) curve we appear
to be seeing.  

I therefore agree with you that optimization of XPath could make a big
improvement. Unfortunately, this is one of those cases where
optimization creates perceived non-interoperability.  If a document
signature takes 1 second to validate on one implementation and 20
seconds under everyone else's implementation, then migrating off the
implementation that achieves the 1 second performance is a practical
impossibility.

As to whether nodes vs. subtrees will make a difference, I think it will
make a very big difference.  Because the slowness of XPath
implementations is a problem, if we run it once to identify some subtree
roots, then we get rid of the really slow part of the implementation.
Once the subtree roots to include/exclude are identified, then custom
DOM calls can be used to expand those subtrees (or the complement of
them in the exclude case).  These custom DOM calls would be easy to
understand and implement efficiently (i.e. there's no way running an
XPath expression on every subtree node will *ever* catch up no matter
how much optimization is done).

John Boyer, Ph.D.
PureEdge Solutions, Inc.

\\ Eugene Kuznetsov
\\ eugene@datapower.com
\\ DataPower Technology, Inc.

Received on Monday, 4 February 2002 14:52:51 UTC