[Fwd: Re: Performance numbers for C14N]

Here is the link to the old XML Signature library which follows the spec 
exactly without any optimizations.

Pratik

-------- Original Message --------
Subject: 	Re: Performance numbers for C14N
Date: 	Tue, 05 May 2009 09:45:57 -0400
From: 	Sean Mullan <Sean.Mullan@Sun.COM>
To: 	Pratik Datta <pratik.datta@oracle.com>
References: 	<49FF5109.9000505@oracle.com>



I think if you tried a very old version of Apache XMLSec you may find a 
test case for algorithm D. We removed the expansion of namespace nodes 
but I can't remember which release exactly. You will have to use the 
Apache XMLSec APIs (not JSR 105) though.

http://santuario.apache.org/dist/java-library/old/

--Sean

Pratik Datta wrote:
> Here are some performance numbers that demonstrate
> 
>  a) how subtree based canonicalization costs almost same as XML 
> serialization
>  b) how nodeset based canonicalization is really bad for performance
> 
> 
> Consider the following four algorithms
> 
>     * Algorithm A : Plain serialization
>     * Algorithm B:  The very efficient subtree based C14N
>     * Algorithm C: The moderately efficient nodeset based C14N, which
>       does not expand out namespace nodes (the one mentioned in
>       exclusive C14n spec, that Thomas pointed out)
>     * Algorithm D: The extremely inefficient nodeset based C14N which
>       expands all namespace nodes. (the one mentioned in inclusive C14N
>       spec)
> 
> 
> Algorithm A, B and C are available in JDK 1.6, and that is what I have 
> used to demonstrate the performance (with permission from Sean Mullan)
> The JDK 1.6 tries to use the subtree based code if possible. To make it 
> use the subtree based algorithm, I just assign an ID to the subtree, and 
> then create a Reference to this ID. But to make it use the nodeset based 
> algorithm, I use the same reference to that ID, but then I add a Xpath 
> Filter transform with an expression 1=1. This expression always 
> evaluates to true, so this is exactly same as signing the subtree.
> 
> 
> We will run these algorithms on these three xml files.
> 
>     * *5k_few_nodes.xml*:  This is a file with very few nodes. There is
>       just one very large text node
>     * *5k_many_nodes.xml*:  This is a file with many nodes, each node is
>       very small
>     * *5k_many_nodes_namespaces.xml*: This is a file with many nodes, it
>       also has many namespace nodes
> 
> 
> Here are the numbers on my machine
> 
> 
> 	Algorithm A
> (serialize)
> 	Algorithm B
> (subtree c14n)
> 	Algorithm C
> (nodeset c14n)
> 	Algorithm D
> (original)
> 5k_few_nodes.xml
> 	3.0ms
> 	4.0ms
> 	6.7ms
> 	
> 5k_many_nodes.xml
> 	4.1ms
> 	4.3ms
> 	21.5ms
> 	
> 5k_many_nodes_namespaces.xml
> 	5.0ms
> 	5.4ms
> 	164ms
> 	
> 
> 
> I need to dig up an implementation for Algorithm D.
> Also I will check in these tests into a location in the CVS
> 
> Pratik
> 
> 

Received on Friday, 6 November 2009 19:29:57 UTC