Re: C14N-Hash implementations???

> If instead of building a single canonical
> XML string you walk a DOM and only send substrings to a hash
> accumulator, in the C14N order, you should be able to produce the
> C14N hash of a DOM structure in almost the time it takes to walk that
> structure for printing without canonicalization.
> 
> So, has anyone done that experiment?  If so, how did it perform?

I just tried it with the Python implementation of C14N.  I took a small 
message that was already mostly canonical (1369 before; 1364 after).  
First I called canon, which returns a string, and hashed the result.  
Then I called canon again, passing in an output object whose "write" 
method accumulates into a SHA object.  Repeating each loop 10 times 
showed 0.21 seconds for return-string-then-hash, and 0.20 seconds for 
hash-as-you-go.
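
For anyone who wants to repeat the comparison, here is a rough sketch of 
the two approaches.  It is not the code I ran: it assumes a current 
Python (3.8 or later) and uses the standard library's 
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather 
than the original C14N, and hashlib.sha1 in place of the old SHA object.  
HashWriter, hash_via_string and hash_as_you_go are names made up for 
this example.

```python
import hashlib
from xml.etree.ElementTree import canonicalize


class HashWriter:
    """Hypothetical file-like object: feeds each written chunk to a hash."""

    def __init__(self, hasher):
        self.hasher = hasher

    def write(self, text):
        # canonicalize() hands over str chunks; hash their UTF-8 bytes.
        self.hasher.update(text.encode("utf-8"))


def hash_via_string(xml_text):
    # Approach 1: build the whole canonical string, then hash it.
    canonical = canonicalize(xml_text)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def hash_as_you_go(xml_text):
    # Approach 2: never hold the canonical form; hash chunks as written.
    sha = hashlib.sha1()
    canonicalize(xml_text, out=HashWriter(sha))
    return sha.hexdigest()


# Assumed sample input, deliberately not in canonical form.
SAMPLE = '<doc b="2" a="1"><item>text</item><empty/></doc>'

if __name__ == "__main__":
    print(hash_via_string(SAMPLE) == hash_as_you_go(SAMPLE))
```

Both functions should yield the same digest for the same input; wrapping 
each call in timeit gives a comparison along the lines of the one above.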

This is on a mostly-idle Linux box with 768 MB of memory.

So I think you're right: it's a silly waste to collect the string and 
then hash it.  (I thought this before, in fact.  The C++ C14N/hash code 
that I wrote for our product does some small memory buffering, but 
*never* collects the entire canonical document in memory.)
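
To illustrate the "small memory buffering" idea, a minimal sketch in 
Python (not the C++ code from our product): writes collect in a small 
fixed-size buffer that is handed to the hash whenever it fills, so 
memory use stays bounded regardless of document size.  The class name 
BufferedHashWriter and the 4096-byte default are assumptions made for 
this example.

```python
import hashlib


class BufferedHashWriter:
    """Hypothetical writer: batches small writes before hashing them."""

    def __init__(self, hasher, limit=4096):
        self.hasher = hasher
        self.limit = limit          # assumed flush threshold, in bytes
        self.buffer = bytearray()

    def write(self, text):
        self.buffer += text.encode("utf-8")
        # Hand the batch to the hash once the buffer reaches the limit.
        if len(self.buffer) >= self.limit:
            self.flush()

    def flush(self):
        # Push any pending bytes into the hash and empty the buffer.
        if self.buffer:
            self.hasher.update(bytes(self.buffer))
            self.buffer.clear()

    def hexdigest(self):
        # Flush first so trailing bytes are included in the digest.
        self.flush()
        return self.hasher.hexdigest()
```

The caller has to flush (or call hexdigest, which flushes) after the 
last write, otherwise the tail of the document is left out of the hash.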
	/r$

Received on Friday, 26 July 2002 12:37:19 UTC