- From: Szabó Áron <aron@ik.bme.hu>
- Date: Wed, 5 Oct 2005 10:43:32 +0200
- To: <www-xml-canonicalization-comments@w3.org>
Dear Members,

I'm checking several parsers and C14N canonicalization implementations to provide interoperability between applications. I've noticed inconsistent behavior, so I reread the W3C C14N standard, but I couldn't determine which behavior is correct. Could you please help me interpret the text of the standard?

What exactly does this sentence mean?

"The string value of the node is modified by replacing all ampersands (&) with &amp;, all open angle brackets (<) with &lt;, all quotation mark characters with &quot;, and the whitespace characters #x9, #xA, and #xD, with character references. The character references are written in uppercase hexadecimal with no leading zeroes (for example, #xD is represented by the character reference &#xD;)."
(http://www.w3.org/TR/xml-c14n)

The following example was given as input for parsing and C14N canonicalization:

<doc>
   <e1/>
</doc>

which contains the byte sequence (in hex) "0D 0A 20 20 20" between the two tags.

I got outputs (produced by several applications) that contained, for example:

- "0A 20 20 20" (in this case the escaped "#xD" character is missing)
- "0A 09" (the three "20" bytes have been converted to "09", which is TAB)
- "26 23 78 44 3B 0A 20 20 20" (in which "26 23 78 44 3B" is "&#xD;")

Which is the correct one? Any ideas?

Best regards,
Aron

----------------------------------------------------
Aron Szabo, M. Sc.
Research Associate, Center of Information Technology
Budapest University of Technology and Economics
Received on Wednesday, 5 October 2005 08:43:52 UTC
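
The replacements listed in the sentence quoted in the message can be sketched in Python as follows. This is only an illustration of that one sentence, not a canonicalizer: the function name `escape_per_quoted_rule` is hypothetical, and the sketch assumes the string value reaches this step with its #xD character still present, which is the very point the message asks about.

```python
def escape_per_quoted_rule(value: str) -> str:
    """Apply the replacements named in the quoted sentence (illustrative only)."""
    # Hypothetical helper; the ampersand must be handled first so that the
    # ampersands introduced by later replacements are not escaped again.
    replacements = [
        ("&", "&amp;"),
        ("<", "&lt;"),
        ('"', "&quot;"),
        ("\t", "&#x9;"),  # #x9, uppercase hex, no leading zeroes
        ("\n", "&#xA;"),  # #xA
        ("\r", "&#xD;"),  # #xD
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


def to_hex(text: str) -> str:
    """Show the ASCII bytes of text as space-separated uppercase hex."""
    return text.encode("ascii").hex(" ").upper()


if __name__ == "__main__":
    print(escape_per_quoted_rule("\r\n   "))
    print(to_hex(escape_per_quoted_rule("\r")))
```

Under these assumptions, a lone #xD character becomes the five bytes 26 23 78 44 3B, the same sequence that appears at the start of the third output listed in the message.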