- From: Kevin Regan <kevinr@valicert.com>
- Date: Fri, 23 Jun 2000 13:18:00 -0700 (PDT)
- To: jboyer@PureEdge.com
- Cc: w3c-ietf-xmldsig@w3.org, kevinr@valicert.com
Hi,

Let me preface my comments by saying that I do not consider myself an expert in either XML or XML Signature/C14N. However, I would like to comment on the lack of character normalization in both specifications. Please read this as a plea for clarification and personal edification rather than a disparagement of the specifications.

The C14N spec states:

----------------------------------------------------------------
A.1 No Character Model Normalization

The Unicode standard [Unicode] allows multiple different representations of certain "precomposed characters" (a simple example is ""). Thus two XML documents with content that is equivalent for the purposes of most applications may contain differing character sequences. The W3C has recommended a normalized representation [CharModel]. Prior drafts of Canonical XML used this normalized form. However, most XML 1.0 processors do not perform this normalization. Furthermore, applications that must solve this problem typically perform the character model normalization as character content is created, which would obviate the need for character model normalization during canonicalization. Therefore, character model normalization has been moved out of scope for Canonical XML.
----------------------------------------------------------------

In addition, the XML Signature spec states:

----------------------------------------------------------------
7.0 XML Canonicalization and Syntax Constraint Considerations

* * *

Any canonicalization algorithm should yield output in a specific fixed coded character set. For both the minimal canonicalization defined in this specification, the W3C Canonical XML [XML-C14N], and the 2000 Canonical XML [XML-C14N-a], that coded character set is UTF-8.

* * *

Neither the minimal canonicalization nor the 2000 Canonical XML [XML-C14N-a] algorithms provide character normalization.
We RECOMMEND that signature applications produce XML content in Normalized Form C [NFC] and check that any XML being consumed is in that form as well (if not, signatures may consequently fail to validate).
----------------------------------------------------------------

It seems that the responsibility for creating canonicalizable or signable documents is being pushed to the application creating the XML documents to be signed (as well as the application producing the XML Signature document itself). However, isn't it most likely that producers of XML documents will not have nearly the resources or technical know-how to reasonably perform this character normalization? Will the producers of XML documents even know that their work will be signed at some future date? In addition, doesn't this preclude the signing of XML documents that may have already been created in something other than the "Normalized Form C" format?

Wouldn't it make more sense to put the burden of normalization on the application processing the XML document and producing the signature? It is this application that will be most knowledgeable about the need for character normalization and about the way in which character normalization can be implemented.

The goal of the XML C14N spec seems to be to avoid the additional work (which, admittedly, is not trivial) of performing the character normalization step, pushing it onto the application that actually uses C14N. However, it is the XML Signature "application" that C14N is most meant to support. Therefore, it seems that character normalization must be called for either in C14N or in the XML Signature specification itself.

Currently, the XML Signature spec recommends creating a failure condition when the appropriate normalized form for input is not detected, as well as creating its output in the same normalized form.
Is this less work than simply converting all documents that are being processed into the normalized form before computing the signature? Wouldn't this allow us to eliminate a failure case (and the added complexity given to the producers of XML documents)?

One final question. Is it possible for the processing of an XML document to change the character format? If so, wouldn't this add to the failure case mentioned in the previous paragraph?

It seems that the door is being opened for a major incompatibility and the inability to sign a large number of pre-existing and future XML documents (that will be created without any regard given to character normalization).

Sincerely,
Kevin Regan
kevinr@valicert.com
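
To make the two options discussed above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: SHA-1 over the UTF-8 bytes stands in for the digest step of a signature, no real canonicalization or XML Signature processing is performed, and the function names (digest, is_nfc, normalize_then_digest) are hypothetical rather than taken from either specification. The example character is also my own choice, not the one from the quoted C14N text.

```python
import hashlib
import unicodedata


def digest(text: str) -> str:
    """Stand-in for the signature digest: SHA-1 over the UTF-8 bytes."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def is_nfc(text: str) -> bool:
    """The 'check' option: is the text already in Normalization Form C?"""
    return unicodedata.normalize("NFC", text) == text


def normalize_then_digest(text: str) -> str:
    """The 'convert' option: normalize to NFC, then digest."""
    return digest(unicodedata.normalize("NFC", text))


# Two renderings of the same visible content (hypothetical sample data).
precomposed = "<name>\u00C7</name>"   # C WITH CEDILLA as one code point
decomposed = "<name>C\u0327</name>"   # C followed by COMBINING CEDILLA

# Without normalization the byte streams, and so the digests, differ.
print(digest(precomposed) == digest(decomposed))

# Checking only tells us that the second document would be rejected.
print(is_nfc(precomposed), is_nfc(decomposed))

# Normalizing before digesting makes both documents agree.
print(normalize_then_digest(precomposed) == normalize_then_digest(decomposed))
```

In this sketch the checking approach leaves the decomposed document unsignable as it stands, while normalizing first removes that failure case at the cost of one extra pass over the text.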
Received on Friday, 23 June 2000 16:17:52 UTC