- From: Tom Gindin <tgindin@us.ibm.com>
- Date: Tue, 28 Nov 2000 19:52:49 -0500
- To: "John Boyer" <jboyer@PureEdge.com>
- Cc: "Martin J. Duerst" <duerst@w3.org>, <w3c-ietf-xmldsig@w3.org>
Is what is meant "... from an encoding which is neither a UCS-n encoding nor a UTF-n encoding"? That would seem to cover UCS-2, UCS-4, UTF-8, and UTF-16 (along with UTF-7 for good measure). If UTF-8 is not included, although the NFC transformation would seem to have no effect on it, just replace "UTF-n" by "UTF-16" in the sentence above.

Tom Gindin

"John Boyer" <jboyer@PureEdge.com>@w3.org on 11/28/2000 05:39:27 PM

Sent by: w3c-ietf-xmldsig-request@w3.org

To: "Martin J. Duerst" <duerst@w3.org>, <w3c-ietf-xmldsig@w3.org>
cc:
Subject: Character Encoding Question

Hi Martin and group,

I received a letter today from Jeff Cochran (JCochran@docutouch.com) regarding a tweak that would appear to be needed regarding c14n and xml signature.

The I18N group asked us to include a sentence along the lines of "REQUIRED to use Normalization Form C [NFC] when converting an XML document to the UCS character domain from a non-Unicode encoding".

Apparently this is not exactly what is meant since UCS-4 character planes outside of the BMP are technically non-Unicode. The point Jeff makes is that he doesn't know whether to apply NFC to UCS data that appears outside of the BMP.

Question: Should the statement be rewritten? If so, how?

Thanks,
John Boyer
Team Leader, Software Development
Distributed Processing and XML
PureEdge Solutions Inc.
Creating Binding E-Commerce
v: 250-479-8334, ext. 143 f: 250-479-3772
1-888-517-2675 http://www.PureEdge.com <http://www.pureedge.com/>

-----Original Message-----
From: w3c-ietf-xmldsig-request@w3.org [mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Martin J. Duerst
Sent: Friday, November 24, 2000 6:17 PM
To: w3c-ietf-xmldsig@w3.org
Cc: lilley@w3.org
Subject: Fwd: I18N problem in XML canonicalisation

Chris Lilley just pointed out the following problem in C14N. I think this at least has to be explained much more clearly in the notes.
>http://www.w3.org/TR/xml-c14n#Example-UTF8
>
>Demonstrates using *two* NCRs for a single UTF-8 character (because it uses
>two bytes in UTF8 !!!

It's not really NCRs. It's a special notation to stand in for byte values.

>I suspect you may have a problem with that..... given that even surrogates
>use a single NCR not two. Also, it's not clear the result is even
>wellformed!

There needs to be a much better note to make very clear that (unlike the other examples) this example is not really intended to be XML and cannot be used directly in a test. It would also be advisable to provide an actual file that contains the real bytes, or to point to it if that's already around.

Regards, Martin.
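
For readers of the archive, the two points raised in this thread can be sketched in Python. This is only an illustration under stated assumptions, not wording or code from the C14N specification. The helper names (is_unicode_encoding, decode_to_ucs, single_ncr, utf8_byte_notation) are hypothetical, the "UTF-n" test simply follows the rewording suggested in the reply above, and U+00A9 and the cp1258 code page are arbitrary sample choices.

```python
import codecs
import unicodedata


def is_unicode_encoding(name: str) -> bool:
    # Assumption: "UTF-n encoding" is approximated by Python's own codec
    # names (utf-7, utf-8, utf-16, utf-32 and their variants).
    return codecs.lookup(name).name.startswith("utf-")


def decode_to_ucs(raw: bytes, encoding: str) -> str:
    # Decode to Unicode text; apply NFC only when the source encoding is
    # not a UTF-n encoding, as the reworded sentence would require.
    text = raw.decode(encoding)
    if is_unicode_encoding(encoding):
        return text
    return unicodedata.normalize("NFC", text)


def single_ncr(ch: str) -> str:
    # A real numeric character reference: one per character (code point).
    return f"&#x{ord(ch):X};"


def utf8_byte_notation(ch: str) -> str:
    # NCR-like notation with one entry per UTF-8 byte. It stands in for
    # byte values and is not meant to be read back as XML.
    return "".join(f"&#x{b:02X};" for b in ch.encode("utf-8"))


if __name__ == "__main__":
    # cp1258 can carry combining accents, so decoding may yield text
    # that is not yet in NFC.
    raw = "e\u0301".encode("cp1258")
    print(ascii(decode_to_ucs(raw, "cp1258")))
    print(single_ncr("\u00a9"), utf8_byte_notation("\u00a9"))
```

Under the reworded sentence, input that is already in a UTF-n encoding is passed through unchanged, which is why decode_to_ucs skips normalization in that case. Whether that is the intended behavior is exactly the question the thread leaves open.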
Received on Tuesday, 28 November 2000 19:53:29 UTC