- From: Donald E. Eastlake 3rd <dee3@torque.pothole.com>
- Date: Wed, 27 Oct 1999 11:36:25 -0400
- To: <w3c-ietf-xmldsig@w3.org>
Hi Phil,

From: "Phillip M Hallam-Baker" <pbaker@verisign.com>
Resent-Date: Mon, 25 Oct 1999 11:13:55 -0400 (EDT)
Resent-Message-Id: <199910251513.LAA04220@www19.w3.org>
To: "Donald E. Eastlake 3rd" <dee3@torque.pothole.com>, <w3c-ietf-xmldsig@w3.org>
Date: Mon, 25 Oct 1999 11:14:55 -0400
Message-ID: <000d01bf1efb$b1f96780$6e07a8c0@pbaker-pc.verisign.com>
In-Reply-To: <199910250307.XAA02021@torque.pothole.com>

>We have been over this many times.

Indeed, canonicalization appears to be a contentious area. I'm sure there will be substantial future discussion on this topic. :-)

>The ability to reconstruct signatures after a document has been parsed and
>regenerated was a dreadful idea for ASN.1, leading to the mistake of DER.

The complexity stems from having multiple surface representations of the same internal data. Thus, with BER (Basic Encoding Rules) or some of the other encoding rule options for ASN.1, you can serialize the internal data in multiple ways, and when the signer and verifier do so differently, as Murphy's Law predicts, the signature breaks. I don't think ASN.1 was designed with digital signatures in mind at all, but this separation between the internal data and the surface representation was a deliberate feature. If it were practical to always carry along a particular surface representation, there would never have been any pressure to design DER (Distinguished Encoding Rules), which provides a unique external serialization of the true internal ASN.1 data.

Protocol uses of XML, such as IOTP, are always assembling new messages out of pieces of old messages, some signed and some not. It is essential that such things be possible without breaking XML signatures. The same is true of protocols of more than trivial complexity that use ASN.1. For example, I have architected a SET implementation and written the code for some of the SET messages. (So my distaste for ASN.1 is grounded in hands-on experience.) In SET you are always tearing apart and decrypting and verifying and putting together and signing and encrypting stuff. It's kind of tedious, but it all works fine as long as there is a canonical representation defined.

>But for DER, ASN.1 would probably have never gained the reputation it has.

I don't think DER had much effect on the reputation of ASN.1, but to the extent that it had an effect, it was to improve its utility and reputation. The canonical representation of ASN.1 provided by DER made interoperable digital signatures over ASN.1 data practical for non-trivial systems, just as XML canonicalization is required to make interoperable digital signatures over XML practical.

>This canonicalization algorithm you keep touting is exactly the type of
>baroque SGML boondoggle that made invention of XML necessary in the first
>place.

I don't think so. There were quite a few motivations for the design of XML, but I don't believe the designers were particularly worried about digital signatures during that design process. As a result, they chose to provide flexible output formatting. This improves readability, which is another goal of XML. As soon as the decision was made to permit multiple surface representations of the same true internal data, the requirement for XML canonicalization, to make XML digital signatures practical, was born. However, with XML, they went even further. Not only are multiple surface representations permitted, but the XML standards require that some surface details, details which are insignificant to determining the true internal data represented, must not be passed to any XML conformant application.
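To put the point concretely, here is a rough sketch (modern Python; the xml.dom.minidom parser, hashlib, and the toy single-element serializer below are purely illustrative assumptions, not anything this WG has specified): two byte-wise different but XML-equivalent serializations of the same element digest differently, while a canonical form of each digests identically.

    # Sketch only: the "canonical form" here is a toy, not any proposed c14n.
    import hashlib
    from xml.dom import minidom

    a = '<Order  currency="USD" id="42"/>'
    b = "<Order id='42' currency=\"USD\" ></Order>"

    def digest(text):
        return hashlib.sha1(text.encode("utf-8")).hexdigest()

    def toy_canonical(text):
        # Toy canonical form for a single element with no content:
        # attributes sorted by name, double quotes, single spaces.
        elem = minidom.parseString(text).documentElement
        attrs = " ".join('%s="%s"' % nv for nv in sorted(elem.attributes.items()))
        return "<%s %s></%s>" % (elem.tagName, attrs, elem.tagName)

    print(digest(a) == digest(b))                                # False
    print(digest(toy_canonical(a)) == digest(toy_canonical(b)))  # True

Whatever canonical form is eventually chosen, it is the existence of an agreed one that lets a signer and a verifier who never see each other's bytes arrive at the same digest.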
>The chances of the canonicalization algorithm being generally implemented
>well enough to be relied upon and actually used for the purpose of parsing
>datastructures into databases and reconstructing them are zero. People had
>difficulty with PKCS#12 which is many orders of magnitude simpler. The fact
>that there is an argument going on on this list as to what the c14n form
>would be is a case in point.

I find your use of the word "databases" odd. Implementation of DER so as to be interoperable does not seem to have been a big problem. I think there will be a number of interoperable open source implementations of XML canonicalization, so I don't see a problem there either. The current W3C canonicalization form would work, as far as I can tell. There are some discussions in this WG of details or options for canonicalization, but most of the arguments seem to be about whether you should canonicalize at all. I think a lot of this is the conflict between the document view, which doesn't want to touch any bits, and the protocol view, which believes that substantial intermediate processing of signed data is the norm.

>The processing overhead required to reassemble the attributes in order is
>significant, particularly when the object in question is a complex data
>structure generated from an XML schema, possibly incorporating several Mb
>of internal attributes.

I do not agree. The number of attributes on an element is usually quite small. But even if there were, say, 100 attributes on an element, which is more than I have ever seen, the effort involved in treating them in a canonical order is trivial. I understand that Jim Clark's XML parser always gives you the attributes in order. The standard XML interfaces, DOM and SAX, parse start tags and the attributes therein; DOM gives them to you as leaves of the tree it provides, while SAX gives you an array of attributes as part of the start tag event. I don't quite know what you are referring to with "several Mb of internal attributes", but I don't see any relevance of the size of the attribute values to the trivial effort of ordering them before you serialize them to feed to your digest function. It is assumed in XML that the start tag and its attributes are more or less one atom that you can hold and process. Large, complex data is normally content, not attribute value. 2 or 3 attributes is very much more likely than 20 or 30. Attribute values of a few tens of characters or less are very VERY much more likely than a megabyte attribute value, but, as I say, the length of the attribute value has almost nothing to do with the trivial effort of attribute ordering.
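To show how small that effort is, here is a rough sketch (modern Python and its standard xml.sax module; the serialization rules are made up for illustration and ignore escaping and other details a real canonicalizer must handle) of a SAX handler that sorts the attributes delivered with each start tag event and feeds the result straight into a digest:

    # Sketch only: the point is that the per-element sort is one line of
    # trivial work, independent of how large the attribute values are.
    import hashlib
    import xml.sax

    class CanonicalDigester(xml.sax.ContentHandler):
        def startDocument(self):
            self.digest = hashlib.sha1()

        def startElement(self, name, attrs):
            # SAX hands over the start tag's attributes as one collection;
            # ordering them by name is the entire extra cost.
            pieces = "".join(' %s="%s"' % (k, attrs.getValue(k))
                             for k in sorted(attrs.getNames()))
            self.digest.update(("<%s%s>" % (name, pieces)).encode("utf-8"))

        def characters(self, content):
            self.digest.update(content.encode("utf-8"))

        def endElement(self, name):
            self.digest.update(("</%s>" % name).encode("utf-8"))

    handler = CanonicalDigester()
    xml.sax.parseString(b'<Msg z="last" a="first">payload</Msg>', handler)
    print(handler.digest.hexdigest())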
>Systems are already being built based on the PKCS#7 detached signature
>format. If you insist on re-opening this issue yet again those systems
>will cease to be prototypes and be embedded production systems by the
>time the specification is complete.

Why do you believe this issue was ever closed? This is not a working group to rubber stamp the binary PKCS#7 format and declare it to be the XML signature format. Nor is it a group to design a new binary signature format that happens to be a character-encoded binary format which looks like XML. The charter of this working group is to specify an actual XML syntax signature format, one consonant with the established XML standards. And it wouldn't hurt if it fit with the existing standard DOM interface and the de facto standard SAX interface either. These standards and interfaces make, for example, original attribute ordering and insignificant white space inside start tags not just insignificant but completely inaccessible to any XML application.

>Needless to say the chances of someone taking a working production system
>and then introducing an unnecessary transformation to sort all the XML
>attributes into alphabetical order before verifying a signature is small.

I suppose there may be proprietary closed production systems that don't use any standard DOM or SAX derived interface and don't care about the XML standards. In my opinion, these are binary or text signature systems, not XML signature systems. If that works for them and they are happy, that's fine, and perhaps they won't change. This doesn't seem to have much to do with specifying an XML syntax signature standard, however.

>If you really MUST support the functionality proposed, canonicalization
>(which has a well defined meaning in computer science) is NOT the way
>to achieve it.
>
>Instead of canonicalization, specify a syntax constraint. This might
>state for example that the signed octets presented were presented with
>null space eliminated, tags fully expanded and attributes sorted into
>alphabetical order.

Yes, you could canonicalize at "print" time rather than at signature and at verification time, throwing away the readability and flexibility of XML. Of course, nothing that parses the result or verifies the signature could use SAX or DOM unless it was willing to re-canonicalize it (in which case it is unclear what you gained by canonicalizing at print time). They would have to use proprietary home-brew processing.

>The only functional difference between this approach and the one you
>propose is that the signature on the syntax constrained text would fail
>if it were to be passed through this 'noisy channel' that you keep
>claiming exists (although I have never known a noisy channel
>gratuitously re-order bytes).

"Noisy channel" is not a particularly good description. I claim that we should assume that much of the time XML will be processed in accordance with the XML 1.0 standard and other standards, and that much of the time the standard DOM and de facto standard SAX interfaces will be used. I think there is substantial evidence beyond my personal claim for the existence and use of SAX, the existence and use of DOM, and the existence and use of the adopted XML standards.

>On the processing side such a syntax constraint would require minimal
>code, a few lines to check that each attribute was presented in order,
>no unexpected whitespace appearing etc.
>
>
>A syntax constraint is acceptable, a c14n requirement is not.

The charter of this WG is to specify a digital signature in XML syntax. The standards for XML are such that, for anything even vaguely like our current XML digital signature syntax, XML canonicalization is required.
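A rough sketch of why (again modern Python; xml.dom.minidom merely stands in for any standards-conforming DOM implementation, and none of this is WG text): even input that already satisfies a byte-level syntax constraint is not reproduced byte for byte once it has been through a standard parse-and-serialize cycle, so a verifier working from DOM or SAX still needs an agreed canonical form to recompute the digest over.

    # Sketch only: shows that the originally signed bytes are not recoverable
    # from the parsed form, not how any particular canonicalization works.
    import hashlib
    from xml.dom import minidom

    signed_bytes = '<Doc A="1" B="2"></Doc>'   # already "syntax constrained"
    roundtrip = minidom.parseString(signed_bytes).documentElement.toxml()

    print(roundtrip)   # '<Doc A="1" B="2"/>' -- not the bytes that were signed
    print(hashlib.sha1(signed_bytes.encode("utf-8")).hexdigest() ==
          hashlib.sha1(roundtrip.encode("utf-8")).hexdigest())   # False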
>If we continue to keep re-opening this issue I don't think consensus
>will be reached. Worse still, I predict that as with ASN.1 in PKIX
>there will be a steady stream of attempts to re-open the syntax issue
>and 'simplify', generally leading to entirely different syntaxes.

I haven't the faintest idea why, in such a contentious area, you think this issue was ever closed in this WG. I would think that if we cannot get to consensus on a fixed canonicalization algorithm for SignedInfo, and cannot even get to consensus on a default canonicalization algorithm for SignedInfo, it will just end up a mandatory-to-specify algorithm. And everyone actually doing XML processing as defined by the XML standards and implemented with DOM or SAX will specify an XML canonicalization, such as the current W3C canonicalization draft.

I don't know what's going on in PKIX, but in any area of active standards development there is normally a constant series of suggestions presented. I can't tell whether you are claiming that adopting an XML canonicalization algorithm would lead to suggestions for change in the syntax of XML or to suggestions for change in the syntax of XML digital signatures, but I don't see how either follows.

> Phill

I'm sorry that you don't like XML and wish it were a different animal than it is. But inherent in the adopted XML standards, and in the standard DOM and SAX interfaces that are used to implement those standards, are both the screening from any XML application of some insignificant aspects of XML input and the declaration that other aspects are insignificant. There is no problem with providing implementers with pre-defined options for using the XML digital signature format for more fragile non-XML signatures, and the option of defining their own canonicalizations/transforms. But, in my opinion, the absence of a required-to-implement XML canonicalization would mean the WG has failed in its charter duty to specify interoperable XML syntax signatures.

Donald

===================================================================
 Donald E. Eastlake 3rd       +1 914-276-2668     dee3@torque.pothole.com
 65 Shindegan Hill Rd, RR#1   +1 914-784-7913(w)  dee3@us.ibm.com
 Carmel, NY 10512 USA