
RE: Fwd: Last Call: UTF-16, an encoding of ISO 10646 to Proposed

From: Harald Tveit Alvestrand <Harald@Alvestrand.no>
Date: Thu, 16 Dec 1999 23:40:13 +0100
To: Kenneth Whistler <kenw@sybase.com>
Cc: ietf-charsets@iana.org, kenw@sybase.com, mark.davis@us.ibm.com
Message-id: <4.2.0.58.19991216233302.045d02b0@dokka.maxware.no>

At 14:20 16.12.99 -0800, Kenneth Whistler wrote:

>I think we may be talking at cross-purposes here. *I* am responsible
>for maintaining the content of http://www.unicode.org/pending/pending.html,
>by the way.
>
>UTC and ISO/JTC1/SC2/WG2 *both* are committed to allowing encoding in
>Planes 0..16.

Oops....apologies; I managed to read your 0..16 as "0".
That's my second stupid error this debate.....my brain must be rotting....


>Plane 2: SIP  ~47000 Han characters (Vertical Extension B) under ballot
>               ~18500 still assignable code points

[haring off on an unrelated topic]
One thing I'm not connected enough to have found out:
Does this large set consist of *new* characters, or are parts of the set
used for rolling back the unifications that created so much tension in and
around the IRG, for people who want that?

.......

>So that big gap in planes 4..13 looms unused and effectively unusable.
>Nobody in the professional character encoding community has any
>candidates to put there that would really count as characters. There
>are various bizarre schemes that could eat numbers, but as long as
>10646 and the Unicode Standard remain *character* encoding standards,
>it is quite likely that those 10 planes will simply be held in
>reserve. This is enough engineering slack on the character encoding
>to last through this upcoming century easily, even if the World
>Congress on Universal Orthography decides to invent and impose a
>new world orthography each and every decade. :-)
>
>That is why the UTC and WG2 just want to formally close the books
>on this. 16 bits turned out not to be enough for everything that
>somebody wanted to encode as characters. But all projections are
>that 21 bits *is* enough, and we can hold the line there. Nobody
>needs 31 bits for character encoding.

I read you. And see no reason to disagree.
A pity that 21 is not a power of 2.....but that's well beyond the ability 
of standards committees to accomplish.
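
For readers who want to check the arithmetic behind "Planes 0..16" and
"21 bits", here is a minimal sketch. It assumes the usual plane size of
65,536 code points and the standard UTF-16 surrogate mapping; the names
PLANE_SIZE, NUM_PLANES, plane_of and to_surrogates are illustrative only
and do not come from any of the drafts under discussion.

```python
# Illustrative arithmetic only; names here are hypothetical helpers.
PLANE_SIZE = 0x10000   # 65,536 code points per plane
NUM_PLANES = 17        # planes 0..16

total_code_points = PLANE_SIZE * NUM_PLANES
max_code_point = total_code_points - 1      # highest value in plane 16


def plane_of(cp: int) -> int:
    """Return the plane number (0..16) that a code point falls in."""
    return cp >> 16


def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a code point from planes 1..16 into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside planes 1..16")
    offset = cp - 0x10000                   # 20-bit offset past plane 0
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


if __name__ == "__main__":
    print(total_code_points)                # number of code points in 17 planes
    print(hex(max_code_point))              # highest encodable value
    print(max_code_point.bit_length())      # bits needed to hold that value
    print([hex(u) for u in to_surrogates(max_code_point)])
```

The highest surrogate pair lands exactly on the last code point of
plane 16, which is why the UTF-16 range and the 0..16 plane limit coincide.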

                         Harald

--
Harald Tveit Alvestrand, EDB Maxware, Norway
Harald.Alvestrand@edb.maxware.no
Received on Friday, 17 December 1999 01:35:16 GMT
