
Fwd: I18N problem in XML canonicalisation

From: Martin J. Duerst <duerst@w3.org>
Date: Sat, 25 Nov 2000 11:16:30 +0900
Message-Id: <4.2.0.58.J.20001125105623.03bac2f0@sh.w3.mag.keio.ac.jp>
To: w3c-ietf-xmldsig@w3.org
Cc: lilley@w3.org

Chris Lilley just pointed out the following problem
in C14N. I think this at least has to be explained
much more clearly in the notes.

>http://www.w3.org/TR/xml-c14n#Example-UTF8
>
>Demonstrates using *two* NCRs for a single UTF-8 character (because it uses
>two bytes in UTF8)!!!

It's not really NCRs. It's a special notation to stand in for byte values.
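
To make the distinction concrete, here is a minimal Python sketch. It
assumes the character in question is U+00A9 (an illustrative choice) and
uses a hypothetical helper, byte_notation, that writes each UTF-8 byte in
the &#xHH; style. If that notation is fed to a real XML parser, it is read
as two separate characters, not as one.

```python
import xml.etree.ElementTree as ET

# Assumption for illustration: the single character is U+00A9.
char = "\u00a9"
utf8 = char.encode("utf-8")  # one character, two bytes in UTF-8


def byte_notation(data: bytes) -> str:
    """Hypothetical helper: write each byte as &#xHH; (a stand-in for bytes)."""
    return "".join(f"&#x{b:02X};" for b in data)


notation = byte_notation(utf8)

# A real XML parser treats the same text as two genuine NCRs,
# i.e. the two characters U+00C2 and U+00A9, not the original character.
parsed = ET.fromstring(f"<doc>{notation}</doc>").text

print(notation)
print(len(char), len(utf8), len(parsed))
```

So the text of the example cannot be parsed as XML and still mean what the
example intends.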


>I suspect you may have a problem with that..... given that even surrogates
>use a single NCR not two. Also, it's not clear the result is even
>well-formed!
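
The point about surrogates can be checked with a short Python sketch. The
code point U+10000 is just an assumed example of a character outside the
BMP: it is written as one NCR and parsed as one character, although it
takes two 16-bit code units in UTF-16 and four bytes in UTF-8.

```python
import xml.etree.ElementTree as ET

# Assumed example: U+10000, the first code point beyond the BMP.
text = ET.fromstring("<doc>&#x10000;</doc>").text

code_points = len(text)                         # characters seen by the parser
utf16_units = len(text.encode("utf-16-be")) // 2  # 16-bit code units (a surrogate pair)
utf8_bytes = len(text.encode("utf-8"))            # bytes in UTF-8

print(code_points, utf16_units, utf8_bytes)
```

An NCR always names a character, never an encoding unit, whatever the
encoding of the document.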

There needs to be a much clearer note stating that, unlike the other
examples, this example is not really intended to be XML and cannot be
used directly in a test. It would also be advisable to provide an actual
file that contains the real bytes, or to point to one if it already
exists.
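
A file with the real bytes could be produced along these lines. This is
only a sketch: the document content (<doc> with U+00A9) and the file name
c14n-utf8-example.xml are assumptions for illustration, not taken from the
spec or from any existing test suite.

```python
import os
import tempfile

# Assumed content: a <doc> element holding U+00A9, serialized as UTF-8.
real_bytes = "<doc>\u00a9</doc>".encode("utf-8")

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "c14n-utf8-example.xml")
with open(path, "wb") as f:  # binary mode, so the bytes are written unchanged
    f.write(real_bytes)

with open(path, "rb") as f:
    on_disk = f.read()

print(on_disk.hex(" "))
```

A test can then compare its output against the file byte for byte, instead
of against the notation used in the text.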

Regards,    Martin.
Received on Friday, 24 November 2000 22:02:35 GMT
