Re: I18N WG/IG last call comments

Just some comments below:

At 00/05/31 17:18 -0400, Joseph M. Reagle Jr. wrote:
>Martin, this is my attempt to definitively respond to the last of your
>issues.
>
>At 12:27 00/03/25 +0900, Martin J. Duerst wrote:
>  >Character encoding and transcoding
>  >----------------------------------
>  >[Transcoding is the conversion from one character encoding
>  >(charset) to another.]
>  >
>  >- 'minimal' canonicalization is required, but it should be made
>  >   very clear that this does not imply that conversion from all
>  >   'charset's to UTF-8 is required. A set of 'charset's for which
>  >   support is required should be defined exactly, e.g. as UTF-8
>  >   and UTF-16. This is the same for other transforms.
>  >
>  >- There should be a clear and strong warning and an official
>  >   non-guarantee in the spec about conversions from legacy
>  >   'charset's (i.e. encodings not based on the UCS) into encodings
>  >   based on the UCS (i.e. UTF-8, UTF-16,...). The problem is
>  >   that while for most 'charset's, all transcoder implementations
>  >   produce the same result for almost all characters in these
>  >   'charset's, the number of 'charset's where all transcoding
>  >   implementations behave exactly the same is rather limited.
>  >   As an example, in "Shift_JIS", the 'charset' used on Japanese PCs,
>  >   about 10 to 15 special characters are transcoded differently
>  >   on Windows, on the Mac, and by Java.
>  >
>  >- Part of the above warning should be to recommend UTF-8 and UTF-16
>  >   to be used for the original documents, and to recommend to use
>  >   numeric character references (e.g. &#xABCD;) for cases where
>  >   differences are known.
>
>The specification does not contain either of these recommendations yet.
>Absent specific text that could be proposed in our specification, I (as one
>of the editors) don't feel I have the expertise to treat this topic with the
>appropriate breadth and depth (and no proposals have been forthcoming on the
>list) that you have -- beyond what I've already responded to in previous
>emails and what I speak of below.

The Japanese XML Profile Submission now provides some helpful
background material. It wasn't out at the time we made our
last call comments, unfortunately.
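
To make the Shift_JIS point from the last call comments concrete, here
is a minimal Python sketch. It assumes that Python's "shift_jis" codec
(JIS X 0208 tables) and its "cp932" codec (the Microsoft variant) can
stand in for two of the diverging transcoders; the byte pair and the
helper name to_ncr are chosen only for illustration. It also shows how
a numeric character reference pins down the intended character.

```python
# One two-byte sequence from the "Shift_JIS" family: the JIS X 0208
# wave dash. This is an illustrative pick, not an exhaustive list.
raw = b"\x81\x60"

# Assumption: these two Python codecs stand in for two transcoder
# implementations that both claim to handle the same 'charset'.
as_jis = raw.decode("shift_jis")
as_windows = raw.decode("cp932")

print(f"shift_jis -> U+{ord(as_jis):04X}")
print(f"cp932     -> U+{ord(as_windows):04X}")
print("same result:", as_jis == as_windows)


def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper: it relies on the standard "xmlcharrefreplace"
    error handler, which emits decimal references such as &#65374;.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Once written as a reference, the character no longer depends on
# which transcoding table the receiver happens to use.
print(to_ncr(as_windows))
```

The two decodings yield different UCS characters from identical input
bytes, so a signature computed after one conversion would not verify
after the other.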


>Also, I wonder to what extent it is
>optimal to write an NFC verification algorithm (or algorithm ID) within the
>Signature specification. Much of this is independent of Signature and common
>to all XML; we are just one of the first consumers affected by this.

DSig is not the first customer at all; it's just that for
Signatures, different issues have to be considered.
There is no need to write a verification algorithm, because
it's trivially defined by the actual normalization algorithm
(normalize, compare with original, if the same, okay, otherwise bad).
But the I18N WG retracted this request, anyway.
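
The check described above (normalize, compare with the original) can
be sketched in a few lines of Python. This is only an illustration
built on the standard unicodedata module; the function name is_nfc is
made up here and is not taken from any specification.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Normalization Form C.

    Normalize, then compare with the original: equal means okay,
    different means the input was not normalized.
    """
    return unicodedata.normalize("NFC", text) == text


# "e with acute" as one precomposed character is already NFC ...
composed = "\u00e9"
# ... while "e" followed by a combining acute accent is not.
decomposed = "e\u0301"

print(is_nfc(composed))
print(is_nfc(decomposed))
```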



Regards,   Martin.

Received on Tuesday, 27 June 2000 03:54:31 UTC