- From: Jim Ley <jim@jibbering.com>
- Date: Thu, 21 Mar 2002 16:38:57 -0000
- To: <w3c-wai-er-ig@w3.org>, <www-annotation@w3.org>
Hi,

I've had a quick look at document hashing, comparing how Mozilla 0.9.8 and IE normalise documents. http://jibbering.com/2002/3/documenthash.html contains the source of the script I used.

The algorithm is basically: construct a string of the names of all the nodes in document order, skipping any node whose name contains characters other than a-z (this removes text, comment and non-HTML-namespace nodes). From that string I then calculate two MD5 hashes: one of the whole string, and one of just the string from the body node.

http://jibbering.com/2002/3/hashcompare.html has the results. Of the 9 pages I tested (this was just a quick test; I'll do some more if we can get Amaya to also do the tests), the only one that failed to match on the "body hash" was http://www.ibm.com/. Quite what happened with that one I'm not sure: neither browser found a body element, although viewing the page in a browser you do have one. (I proxy it through jibbering.com _without_ changing any of the src's, so content that document.write JavaScript in an external file would produce isn't generated.)

So it would seem IE and Mozilla have a very similar normalisation of the body.
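
For anyone who wants to try the idea outside a browser, here is a rough sketch of the hashing steps in Python. It is not the script linked above, and several details are assumptions: it uses Python's html.parser rather than a browser DOM (so it does not reproduce IE's or Mozilla's normalisation), it lowercases names before the a-z test, it joins the names with no separator, and it takes the "body hash" to cover the names from the first body element onwards. The names NodeNameCollector and document_hashes are made up for illustration.

```python
import hashlib
import re
from html.parser import HTMLParser

# A name is kept only if it consists entirely of a-z. Read literally,
# this also drops names such as "h1" (digit) and "o:p" (colon).
NAME_OK = re.compile(r"[a-z]+")


class NodeNameCollector(HTMLParser):
    """Collect element names in document order.

    Text and comment nodes never reach handle_starttag, so they are
    skipped by construction (in a DOM they would be "#text" and
    "#comment", which fail the a-z test anyway).
    """

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # html.parser has already lowercased the tag name (assumption:
        # the original comparison is effectively case-insensitive).
        if NAME_OK.fullmatch(tag):
            self.names.append(tag)


def document_hashes(html):
    """Return (whole_hash, body_hash); body_hash is None without a body."""
    collector = NodeNameCollector()
    collector.feed(html)
    collector.close()
    names = collector.names

    # Assumption: names are concatenated with no separator.
    whole_hash = hashlib.md5("".join(names).encode("ascii")).hexdigest()

    body_hash = None
    if "body" in names:
        # Assumption: the body hash covers the body node and everything
        # after it in document order.
        from_body = names[names.index("body"):]
        body_hash = hashlib.md5("".join(from_body).encode("ascii")).hexdigest()

    return whole_hash, body_hash
```

Two pages that differ only in their text and comments should give the same pair of hashes, and a page in which no body element is found gives None for the body hash, which is roughly the situation described for http://www.ibm.com/ above.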
Received on Thursday, 21 March 2002 11:41:26 UTC