Re: Performance numbers for C14N

Code checked in.


Readme:
http://www.w3.org/2008/xmlsec/Drafts/performance/c14n-subtree/README.txt

Input files
http://www.w3.org/2008/xmlsec/Drafts/performance/c14n-subtree/5k_few_nodes.xml
http://www.w3.org/2008/xmlsec/Drafts/performance/c14n-subtree/5k_many_nodes.xml
http://www.w3.org/2008/xmlsec/Drafts/performance/c14n-subtree/5k_many_nodes_namespaces.xml

Performance test harness code:
http://www.w3.org/2008/xmlsec/Drafts/performance/c14n-subtree/w3c/xmlsec/TestSigPerf.java
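
For readers who do not want to open the harness, here is a minimal sketch of the Reference setup described in the quoted message below: the same ID reference, once without transforms (subtree path) and once with an XPath Filter transform whose expression is 1=1 (nodeset path). This is not the checked-in TestSigPerf.java. The class name ReferenceSketch, the ID "payload" and the SHA-256 digest are assumptions chosen for illustration, and it uses only the standard javax.xml.crypto.dsig API.

```java
import java.security.GeneralSecurityException;
import java.util.Collections;
import java.util.List;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.spec.XPathFilterParameterSpec;

// Hypothetical helper class, not part of the checked-in harness.
class ReferenceSketch {
    private static final XMLSignatureFactory FAC = XMLSignatureFactory.getInstance("DOM");

    // Same-document reference to an ID with no transforms: this is the
    // setup that should let the implementation use the subtree-based code.
    static Reference subtreeReference(String id) {
        try {
            DigestMethod digest = FAC.newDigestMethod(DigestMethod.SHA256, null); // digest choice is an assumption
            return FAC.newReference("#" + id, digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same reference plus an XPath Filter transform with the expression 1=1.
    // The expression is true for every node, so the selected content is
    // unchanged, but the data now flows through the nodeset-based code.
    static Reference nodesetReference(String id) {
        try {
            DigestMethod digest = FAC.newDigestMethod(DigestMethod.SHA256, null);
            Transform alwaysTrue =
                FAC.newTransform(Transform.XPATH, new XPathFilterParameterSpec("1=1"));
            List<Transform> transforms = Collections.singletonList(alwaysTrue);
            return FAC.newReference("#" + id, digest, transforms, null, null);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("subtree transforms: "
            + subtreeReference("payload").getTransforms().size());
        System.out.println("nodeset transforms: "
            + nodesetReference("payload").getTransforms().size());
    }
}
```

Either Reference can then go into a SignedInfo as usual; the only difference between the two runs is the extra transform.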


Pratik


Pratik Datta wrote:
> Here are some performance numbers that demonstrate
>
>  a) how subtree-based canonicalization costs almost the same as XML
> serialization
>  b) how nodeset-based canonicalization is much slower, especially when
> the document has many namespace nodes
>
>
> Consider the following four algorithms
>
>     * Algorithm A: Plain serialization
>     * Algorithm B: The very efficient subtree-based C14N
>     * Algorithm C: The moderately efficient nodeset-based C14N, which
>       does not expand namespace nodes (the one mentioned in the
>       Exclusive C14N spec, which Thomas pointed out)
>     * Algorithm D: The extremely inefficient nodeset-based C14N, which
>       expands all namespace nodes (the one mentioned in the Inclusive
>       C14N spec)
>
>
> Algorithms A, B and C are available in JDK 1.6, and that is what I have
> used to demonstrate the performance (with permission from Sean Mullan).
> JDK 1.6 uses the subtree-based code whenever possible. To make it use
> the subtree-based algorithm, I just assign an ID to the subtree and
> create a Reference to that ID. To make it use the nodeset-based
> algorithm, I use the same Reference to that ID, but add an XPath Filter
> transform with the expression 1=1. This expression always evaluates to
> true, so the result is exactly the same as signing the subtree.
>
>
> We will run these algorithms on the following three XML files.
>
>     * *5k_few_nodes.xml*: A file with very few nodes. There is just
>       one very large text node.
>     * *5k_many_nodes.xml*: A file with many nodes, each of which is
>       very small.
>     * *5k_many_nodes_namespaces.xml*: A file with many nodes that also
>       has many namespace nodes.
>
>
> Here are the numbers on my machine
>
>
>                               Algorithm A  Algorithm B     Algorithm C     Algorithm D
>                               (serialize)  (subtree c14n)  (nodeset c14n)  (original)
> 5k_few_nodes.xml              3.0 ms       4.0 ms          6.7 ms
> 5k_many_nodes.xml             4.1 ms       4.3 ms          21.5 ms
> 5k_many_nodes_namespaces.xml  5.0 ms       5.4 ms          164 ms
>
>
> I still need to dig up an implementation for Algorithm D.
> I will also check these tests into a location in CVS.
>
> Pratik
>
>

Received on Tuesday, 5 May 2009 22:09:27 UTC