
Re: Followup on I18N Last Call comments and disposition

From: Joseph M. Reagle Jr. <reagle@w3.org>
Date: Sat, 12 Aug 2000 10:59:43 +0900
Message-Id: <4.2.0.58.J.20000812105933.032d12f0@sh.w3.mag.keio.ac.jp>
To: www-international@w3.org

At 14:26 8/11/2000 +0900, Martin J. Duerst wrote:
  >>Ok. I achieved the symmetry you wanted below. However, it doesn't make
  >>that much sense to me in that we define the REQUIRED algorithms.
  >>Consequently, shouldn't we just say NFC and UTF8/UTF16 are RECOMMENDED for
  >>all, and define 6.5.1 and 6.5.2 appropriately (MANDATORY support/creation
  >>of/for those formats?) Or is it easier to keep it in the general section?
  >
  >Well, whatever you like.
...
  >>Various canonicalization algorithms transcode from a non-Unicode encoding
  >>to Unicode. Where any such algorithm is REQUIRED or RECOMMENDED by this
  >>specification the algorithm MUST perform normalization [NFC]. Otherwise,
  >>normalization is RECOMMENDED. (Note, there can be ambiguities in
  >>converting existing charsets to Unicode, for an example see the XML
  >>Japanese Profile [XML-Japanese] NOTE.)
  >
  >In the above paragraph, I guess it would be best to change 'normalization'
  >to something like 'text normalization' or so, two times.

Ok, I tweaked the text to:

Various canonicalization algorithms require conversion to [UTF-8]. The two
algorithms below understand at least [UTF-8] and [UTF-16] as input
encodings. We RECOMMEND that externally specified algorithms do the same.
Knowledge of other encodings is OPTIONAL.

Various canonicalization algorithms transcode from a non-Unicode encoding to
Unicode. The two algorithms below perform text normalization during
transcoding [NFC]. We RECOMMEND that externally specified canonicalization
algorithms do the same. (Note, there can be ambiguities in converting
existing charsets to Unicode, for an example see the XML Japanese Profile
[XML-Japanese] NOTE.)

and added the following text to the first bullet of minimal:

o converts the character encoding to UTF-8 (without any byte order mark
(BOM)). /+ Implementations MUST understand at least [UTF-8] and [UTF-16] as
input encodings. Non-Unicode to Unicode transcoding MUST perform text
normalization [NFC].+/
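
For illustration only, the bullet above could be sketched in Python roughly
as follows. This is not text from the specification: the function name
to_canonical_utf8, the declared_encoding parameter, and the BOM sniffing are
assumptions made for the example. It accepts UTF-8 and UTF-16 input, applies
NFC only when transcoding from a non-Unicode encoding, and emits UTF-8
without a BOM.

```python
import codecs
import unicodedata
from typing import Optional

# Codec names (as reported by codecs.lookup) treated as Unicode encodings.
UNICODE_CODECS = {"utf-8", "utf-16", "utf-16-le", "utf-16-be"}


def to_canonical_utf8(data: bytes, declared_encoding: Optional[str] = None) -> bytes:
    """Hypothetical helper: convert input bytes to UTF-8 without a BOM."""
    if data.startswith(codecs.BOM_UTF8):
        # UTF-8 input with a BOM: drop the BOM, keep the text as is.
        text = data[len(codecs.BOM_UTF8):].decode("utf-8")
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        text = data.decode("utf-16")
    else:
        # Assumption: fall back to UTF-8 when no encoding is declared.
        encoding = declared_encoding or "utf-8"
        text = data.decode(encoding)
        if codecs.lookup(encoding).name not in UNICODE_CODECS:
            # Non-Unicode to Unicode transcoding: normalize to NFC.
            text = unicodedata.normalize("NFC", text)
    # Python's "utf-8" codec never writes a BOM when encoding.
    return text.encode("utf-8")
```

A legacy encoding such as windows-1258 (Python codec "cp1258") can represent
an accented letter as a base letter followed by a combining mark, which is
the kind of input where the NFC step changes the output.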

_________________________________________________________
Joseph Reagle Jr.
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/People/Reagle/
Received on Saturday, 12 August 2000 03:08:29 GMT
