- From: Juha Pääjärvi <juha@firsthop.com>
- Date: Sun, 26 Mar 2000 17:08:56 +0300 (EEST)
- To: w3c-ietf-xmldsig@w3.org
Hi,

There has been some discussion about the canonicalization alternatives in the current XML-signature draft. Basically the comments have been that c14n is good but too complex for some applications, and that there are problems with minimal canonicalization. It has been argued that the minimal one should be removed from the draft. Someone also pointed out that there is a potential security problem in minimal canonicalization.

I have a case for which neither canonicalization method is suitable. This is a real problem in my draft on XML encoding of SPKI certificates, but it can manifest itself in other applications as well.

In my draft the problem is about calculating a hash of a public key. (SPKI defines that issuer and subject can be identified with a hash of a public key, among other alternatives, so this has to be supported for compatibility.) The hash of a public key must be calculated over the XML encoded form of the public key being hashed. In this case, the receiver of a certificate does not get the XML encoded version of the public key; it is assumed to know the public key already and merely to identify the correct key by its hash. So the receiver calculates the hash by first forming the XML encoded form of the public key (an XML element defined as part of the DTD for XML encoded SPKI certificates), then canonicalizing it, and finally calculating a hash over the canonicalized version of the element.

The key issue here is the canonicalization algorithm, which is supposed to remove all possibilities for alternative presentations of the XML encoding of a public key before the data is passed to the hash algorithm. With the current canonicalization options in the XML-signature draft this canonicalization of the public-key element is not possible without leaving room for alternative presentations. The full-fledged XML canonicalization cannot be used because the data to be canonicalized is an element, not a whole document. Minimal canonicalization, on the other hand, is not powerful enough for this case because the XML encoded element to be canonicalized is not transferred to recipients. Minimal canonicalization does not remove line breaks or other white space, and because of this it fails to eliminate the possibility of alternative presentations.

The problem is best understood with the following example. To calculate a hash of his/her public key, the issuer of a certificate must form an XML encoded version of the public key. There are a number of options for encoding the public-key element; here are just two (the base64 encoded contents are truncated in this example):

XML ENCODING OPTION 1:

<public-key>
  <dsa-pubkey>
    <dsa-p>AP1/U4EddRIpUt9KnC7s5Of2EbdSPO9EAMMeP4C2USZpRV...</dsa-p>
    <dsa-q>AJdgUI8VIwvMspK5gqLrhAvwWBz1</dsa-q>
    <dsa-g>APfhoIXWmz3ey7yrXDa4V7l5lK+7+jrqgvlXTAs9B4JnUV...</dsa-g>
    <dsa-y>e45XbCIKlnly8lWIBJi3uX46+fzbYjt6jiApSqoFvvZVtT...</dsa-y>
  </dsa-pubkey>
</public-key>

XML ENCODING OPTION 2:

<public-key><dsa-pubkey><dsa-p>AP1/U4...</dsa-p><dsa-q>AJdgUI...</dsa-q>
<dsa-g>APfhoI...</dsa-g><dsa-y>e45XbC...</dsa-y></dsa-pubkey></public-key>

For these two presentations of the same public key, minimal canonicalization leaves many differences. The recipient of a certificate cannot know what kind of encoding was used by the certificate issuer prior to canonicalization and hash calculation, and thus cannot reproduce a matching hash of the public key even when the correct public key is used.
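To make the problem concrete, here is a rough sketch in Python of how the two encodings above hash differently under minimal canonicalization. All the names here (minimal_c14n, key_hash) are my own, the key contents are truncated, and SHA-1 is used only as an example digest, so the values are purely illustrative:

import hashlib

# The same public key, serialized two different ways (contents truncated).
OPTION_1 = """<public-key>
  <dsa-pubkey>
    <dsa-p>AP1/U4...</dsa-p>
    <dsa-q>AJdgUI...</dsa-q>
    <dsa-g>APfhoI...</dsa-g>
    <dsa-y>e45XbC...</dsa-y>
  </dsa-pubkey>
</public-key>"""

OPTION_2 = ("<public-key><dsa-pubkey><dsa-p>AP1/U4...</dsa-p>"
            "<dsa-q>AJdgUI...</dsa-q><dsa-g>APfhoI...</dsa-g>"
            "<dsa-y>e45XbC...</dsa-y></dsa-pubkey></public-key>")

def minimal_c14n(text):
    # Minimal canonicalization only normalizes line endings and the character
    # encoding; the white space between elements is left exactly as it is.
    return text.replace("\r\n", "\n").replace("\r", "\n").encode("utf-8")

def key_hash(canonical_bytes):
    return hashlib.sha1(canonical_bytes).hexdigest()

# The issuer and the recipient may well pick different serializations of the
# very same key, so their hashes do not match.
print(key_hash(minimal_c14n(OPTION_1)))
print(key_hash(minimal_c14n(OPTION_2)))   # a different value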
There might be applications where a signature is presented for XML data that is not available directly, but must be created from some raw data prior to checking the signature, and where XML c14n cannot be used (or is not desired). This is yet another reason, in addition to those mentioned by other WG members, why minimal canonicalization should be taken out of the draft. But I think there should be a lightweight alternative to XML c14n, because c14n is limited to complete documents, needs a DTD or a schema, and is unnecessarily complicated for many applications.

To conclude: I think it would be beneficial to replace minimal canonicalization with a lightweight canonicalization that would have the following properties:

- Can be applied to elements as well as whole documents
- Does not require a DTD or schema for processing
- Removes the most common sources of variation in XML documents
- Can be performed on a DOM tree and on SAX events

The sources of variation that should be removed are at least the following:

- Character set normalization (to UTF-8, I guess)
- White space (spaces, tabs and line breaks)
- Possibly attribute order (for example, convert to alphabetical order)

Any comments on these canonicalization requirements are welcome. I have not worked the requirements out thoroughly, so it is quite possible that I have missed something.

Regards,
Juha

--
j u h a   p ä ä j ä r v i                   [R&D Engineer]

First Hop Ltd.               Tekniikantie 12    work   +358-9-2517 2332
juha.paajarvi@firsthop.com   FIN-02150 Espoo    mobile +358-40-560 2733
www.firsthop.com             Finland
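P.S. Below is a rough sketch, in Python, of the kind of lightweight canonicalization I have in mind. The function lightweight_c14n and everything in it are my own illustration, not from any draft; it ignores mixed content, comments, namespaces and character references, and is only meant to show that the requirements above can be met for a single element without any DTD or schema.

import hashlib
import xml.etree.ElementTree as ET

def lightweight_c14n(element):
    # Serialize one element so that white space between elements is dropped,
    # attributes come out in alphabetical order and the result is UTF-8.
    # Works on an element, not just a whole document, and needs no DTD.
    parts = []

    def emit(elem):
        attrs = "".join(' %s="%s"' % (name, elem.attrib[name])
                        for name in sorted(elem.attrib))
        parts.append("<%s%s>" % (elem.tag, attrs))
        text = (elem.text or "").strip()
        if text:
            parts.append(text)
        for child in elem:
            emit(child)
            tail = (child.tail or "").strip()
            if tail:
                parts.append(tail)
        parts.append("</%s>" % elem.tag)

    emit(element)
    return "".join(parts).encode("utf-8")

# Two serializations of the same (truncated) key element.
pretty = """<public-key>
  <dsa-pubkey>
    <dsa-p>AP1/U4...</dsa-p>
  </dsa-pubkey>
</public-key>"""
compact = "<public-key><dsa-pubkey><dsa-p>AP1/U4...</dsa-p></dsa-pubkey></public-key>"

# Both now canonicalize, and therefore hash, to the same value.
h1 = hashlib.sha1(lightweight_c14n(ET.fromstring(pretty))).hexdigest()
h2 = hashlib.sha1(lightweight_c14n(ET.fromstring(compact))).hexdigest()
print(h1 == h2)   # True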