- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 24 Jul 2003 17:58:26 -0400
- To: "John Boyer" <JBoyer@PureEdge.com>, <w3c-ietf-xmldsig@w3.org>
- Cc: <w3c-i18n-ig@w3.org>, <w3c-rdfcore-wg@w3.org>, "Peter F. \" Patel-Schneider" <pfps@research.bell-labs.com>
Hello John, Many thanks for your quick reply. Your proposed tweak would be fine by me. Regards, Martin. At 13:43 03/07/24 -0700, John Boyer wrote: >Hi Martin, > >The wording you gave for the note below seems fine to me, except the last >sentence, which begins "In such cases, users of Canonical XML should...", >should begin "In such cases, users of Canonical XML may...". > >Thanks, >John Boyer, Ph.D. >Senior Product Architect and Research Scientist >PureEdge Solutions Inc. > > >-----Original Message----- >From: Martin Duerst [mailto:duerst@w3.org] >Sent: Thursday, July 24, 2003 1:05 PM >To: w3c-ietf-xmldsig@w3.org >Cc: w3c-i18n-ig@w3.org; w3c-rdfcore-wg@w3.org; Peter F. " >Patel-Schneider >Subject: Request for clarification on Canonical XML > > > >Hello Joseph, dear XML Signature specialists, > >I would like to request a clarification on Canonical XML >(http://www.w3.org/TR/2001/REC-xml-c14n-20010315). > >At http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Terminology, >Canonical XML says: > > >>>> >The canonical form of an XML document is physical representation of the >document produced by the method described in this specification. The >changes are summarized in the following list: > >* The document is encoded in UTF-8 > >>>> > >There are numerous applications (parser testing, digital signatures, >encryption) where it is important to have an actual physical representation >for simple octet comparison or for input to cryptographic algorithms >that usually take octet streams as inputs. > >However, it has recently come to my attention that there are also some >attempts to use Canonical XML (or Exclusive XML Canonicalisation, which >inherits this aspect of its definition from Canonical XML) in other >situations, such as purely abstract modeling and comparison of XML >documents or XML fragments, or for API definitions. > >For abstract modeling, the encoding in UTF-8 irrelevant and confusing. >For API definitions, specifying the encoding is crucial, but it may >be counterproductive to use UTF-8 in a context where UTF-16 is >widely used (e.g. Java, Python). > >Although not appropriate in these cases, it seems to be difficult for >users of Canonical XML to abstract them from UTF-8 where necessary. >I therefore propose to add some clarification. As a first actual >text proposal (may need some additional work), I propose to >add a note at the end of Section 1.1, Terminology: > >Note: Canonical XML is defined here in terms of a physical (octet-based) >representation. This is appropriate for many applications, ranging from >digital signatures to parser testing. However, there are cases where a >physical representation is not needed, and there are cases where another >physical representation is appropriate. As an example, it may be >aproriate to choose UTF-16 rather than UTF-8 as the encoding of an >API in a programming language using UTF-16 to represent Unicode strings, >such as Java or Python. In such cases, users of Canonical XML should >abstract from the physical character encoding if they note this >appropriately. > >I'm sure there is a better way to word this. > > >Regards, Martin. >
Received on Thursday, 24 July 2003 18:08:57 UTC