- From: Joseph M. Reagle Jr. <reagle@w3.org>
- Date: Wed, 26 May 1999 14:11:02 -0400
- To: "Donald E. Eastlake 3rd" <dee3@us.ibm.com>, <dee3@torque.pothole.com>
- Cc: cmsmcq@uic.edu, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
I brought this up at XML Syntax-WG. The XML Schema WG co-chair (Michael Sperberg-McQueen) committed to pointing this out to the editors and returning a response to you and the w3c-ietf-xmldsig@w3.org list.

At 05:13 PM 5/21/99 -0400, Joseph M. Reagle Jr. wrote:
>
>Not sure who is ultimately responsible for canonicalizing the schema bits, but some thoughts to consider... (Perhaps a brief agenda item for next call?)
>
>Forwarded Text ----
>From: dee3@us.ibm.com
>To: www-xml-schema-comments@w3.org,
>"XML-DSig Workshop" <w3c-xml-sig-ws@w3.org>
>Date: Fri, 21 May 1999 16:50:46 -0400
>Subject: XML Schema and the necessity for canonical representations
>Status:
>
> Having a canonical form of an entity is very important for comparison and digital signature purposes.
>
> XML is sufficiently rich that canonicalization needs to be considered at several levels. For example, the character sets used in two XML documents need to be converted to a standard one if the documents are to be usefully compared for many purposes. There are also canonicalization considerations related to white space, namespace prefixes, etc., which are being considered by the XML Syntax WG. Similarly, I believe that canonicalization of datatype representation must be considered, and the Schema WG seems like the place to do it.
>
> I think the need for datatypes to have a designated canonical lexical form should be fairly clear for comparison purposes. It relieves the comparator of the burden of having to parse every form of every datatype and convert it to a canonical form the comparator has selected.
>
> The need may not be as immediately obvious in the digital signature arena, depending on your mental picture of the "typical" digital signature application. If your picture is very document/object oriented, you might wonder what all the fuss is about, since any lump of bits can be signed and, if faithfully transmitted, this signature can be verified later on the same lump of bits. On the other hand, if you have a transactional/protocol point of view, where pieces of messages are being signed, data is processed and forwarded by intermediate parties, and the signature is verified by later recipients, etc., canonicalization is essential.
>
> I have been involved with too many systems where people thought that all they were doing was verifying signatures on unchanged data being sent through multi-party but faithful transmission channels, only to find that there was some circumstance where a signed object had to be partly or fully re-constituted, or some transmission channel was not as faithful as they thought. As a result, some incredibly stupid thing like capitalization, padding, line ending character sequences, etc., at least temporarily derailed their entire effort as, on a crash basis, they designed and painfully retrofitted canonicalization into their system. Also witness the diddly little lack of canonicalization in the original ASN.1 time and date format: as soon as there was substantial real-world use of this, a new, almost identical, fundamental data type had to be added to ASN.1, with significant disruption and confusion, just to squeeze out the last case of alternative representations of the same date and time.
>
> There is no problem with the Schema Datatypes document providing multiple lexical representations as long as exactly one form is designated as the canonical form.
>
> I believe that the XML Schema Datatypes document should be changed to do this, and perhaps this should be added to the XML Schema requirements document.
>
> Thanks, Donald
>
> End Forwarded Text ----
>
_________________________________________________________
Joseph Reagle Jr.
Policy Analyst       mailto:reagle@w3.org
XML-DSig Co-Chair    http://w3.org/People/Reagle/
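[Illustration] The forwarded argument for designating exactly one canonical lexical form per datatype can be sketched in code. The following is a minimal Python sketch, not anything proposed in the thread: the function names and the particular canonical rule for decimals (no leading "+", no redundant zeros, always one digit after the point) are assumptions chosen only for illustration, and SHA-256 stands in for whatever digest a signature scheme would actually use.

```python
import hashlib
from decimal import Decimal


def canonical_decimal(lexical: str) -> str:
    """Map any accepted decimal spelling to one canonical spelling.

    Hypothetical rule, assumed for this sketch: no leading '+', no
    redundant leading or trailing zeros, at least one fractional digit.
    """
    value = Decimal(lexical.strip())
    if value == 0:
        return "0.0"
    text = format(value, "f")  # fixed-point notation, no exponent
    if "." not in text:
        return text + ".0"
    text = text.rstrip("0")
    if text.endswith("."):
        text += "0"
    return text


def digest(text: str) -> str:
    """Hash the exact characters, as a signature would."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two spellings of the same value, e.g. after an intermediary re-serialized it.
signed_form = "+1.50"
forwarded_form = "1.5"

# Hashing the raw lexical forms: the digests differ although the values agree.
print(digest(signed_form) == digest(forwarded_form))

# Hashing the canonical forms: both parties arrive at the same digest.
print(digest(canonical_decimal(signed_form)) == digest(canonical_decimal(forwarded_form)))
```

With a single designated canonical form, a verifier only needs the canonicalization rule for each datatype, and does not depend on every intermediate party reproducing the signer's exact spelling.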
Received on Wednesday, 26 May 1999 14:11:17 UTC