Re: Accept-Charset support from Erik van der Poel on 1996-12-10 (www-international@w3.org from October to December 1996)

From: Erik van der Poel <erik@netscape.com>
Date: Mon, 09 Dec 1996 19:49:14 -0800
To: Francois Yergeau <yergeau@alis.com>
CC: www-international@w3.org, Klaus Weide <kweide@tezcat.com>
Message-ID: <32ACDDBA.7C01@netscape.com>

> The choice between "UNICODE-1-1-UTF-8" and "UTF-8" has been debated at
> length on the ISO10646 and Unicode lists, with the result that we have now:
> "UTF-8".  The wise implementer, however, would be well advised to support
> the longer tag as an ad hoc alias.

I'm not sure what you mean by "ad hoc alias", but the term "alias" is
used in this context (Internet "charsets") to mean a synonym. Are
"unicode-1-1-utf-8" and "utf-8" synonymous? If so, what is the name of
UTF-8-encoded Unicode 2.0?
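
To make the question concrete, here is a minimal sketch of what "alias" would mean in a charset lookup, assuming the two labels are treated as pure synonyms. The table and the function name are hypothetical and do not come from any particular implementation.

```python
# Hypothetical alias table: each key is a label a client or server might
# send; each value is the canonical charset name it resolves to.
CHARSET_ALIASES = {
    "utf-8": "utf-8",
    "unicode-1-1-utf-8": "utf-8",  # assumption: treated as a pure synonym
}


def canonical_charset(label: str) -> str:
    """Resolve a charset label to its canonical name, ignoring case."""
    key = label.strip().lower()
    if key not in CHARSET_ALIASES:
        raise LookupError(f"unknown charset label: {label!r}")
    return CHARSET_ALIASES[key]
```

Under this reading both labels resolve to the same name, so nothing in the label records which version of Unicode the sender had in mind.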

Unicode 1.1 and 2.0 are not the same. In particular, there was a big
change in the Korean block. The Korean characters in the U+3400 to
U+3D2D range were removed, and they were added again with some others in
the U+AC00 to U+D7A3 range. A future version of the Unicode standard may
re-use the U+3400 to U+3D2D range. If/when that happens, what does
"utf-8" mean?
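
As a rough illustration of the concern, the sketch below encodes the first code point of each range as UTF-8. The constant and function names are only illustrative. The point is that UTF-8 transforms code point numbers into bytes and carries no record of which version of the standard assigned a character to that number.

```python
# First code point of the Korean range removed after Unicode 1.1, and
# first code point of the range where Korean was re-added in Unicode 2.0.
OLD_KOREAN_START = 0x3400
NEW_KOREAN_START = 0xAC00


def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 encoding of one code point as spaced hex bytes."""
    encoded = chr(code_point).encode("utf-8")
    return " ".join(f"{byte:02X}" for byte in encoded)


print(utf8_hex(OLD_KOREAN_START))  # E3 90 80
print(utf8_hex(NEW_KOREAN_START))  # EA B0 80
```

The bytes E3 90 80 always decode to U+3400. Which character U+3400 denotes depends on the version of the standard that the receiver assumes.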

Without rehashing the whole debate that you say already took place on
those other mailing lists (which I didn't follow), could you briefly
explain the future plans for the charset name "utf-8"? I glanced at RFC
2044 but didn't immediately see anything about this.


Erik

Received on Monday, 9 December 1996 22:50:11 UTC