Re: revised "generic syntax" internet draft

Edward Cherlin
Sun, 27 Apr 1997 12:59:01 -0700

Message-Id: <v03007802af895cefc84c@[]>
Date: Sun, 27 Apr 1997 12:59:01 -0700
From: Edward Cherlin
Subject: Re: revised "generic syntax" internet draft

Keld Jørn Simonsen wrote:

>"Martin J. Duerst" writes:
>> > (iv) It is not hard to demonstrate that, in the medium to
>> > long term, there are some requirements for character set
>> > encoding for which Unicode will not suffice and it will be
>> > necessary to go to multi-plane 10646
>> You are not the first or only one to notice this. Unicode
>> currently can encode planes 0 to 16 (for a total of about
>> one million codepoints) by a mechanism called surrogates
>> or UTF-16. Please check your copy of Unicode vol. 2.
>Surely we are not talking Unicode, (an industry standard) but ISO 10646?
>IETF normally specifies ISO standards when available. 10646 is 32 bits.

ISO 10646 specifies Unicode as a 16-bit subset. There is nothing to argue
about here. We will formally specify 10646, but in practice we will use only
Unicode: 10646 defines no characters beyond those in Unicode, and the current
expectation is that it never will, since keeping the two aligned is part of
the current definition of both.

Unicode 1.0 encoded plane 0 of ISO 10646, and Unicode 2.0 can address 17
planes of ISO 10646 through surrogates, somewhat more than a million code
points. The most generous estimate of possible future need is a quarter
million characters.
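
As a rough check of those figures, here is a minimal Python sketch of the
surrogate (UTF-16) arithmetic mentioned in the quoted text. The function and
constant names are made up for illustration and do not come from either
standard; treat it as a sketch of the arithmetic, not a conforming encoder.

```python
# Illustrative sketch of UTF-16 surrogate arithmetic.
# All names here are hypothetical, chosen only for this example.

PLANE_SIZE = 0x10000   # 65,536 code points per plane
NUM_PLANES = 17        # planes 0 through 16

# Total code points reachable: 17 * 65,536
TOTAL_CODE_POINTS = PLANE_SIZE * NUM_PLANES

# Planes 1..16 are reached by pairing 1,024 high and 1,024 low surrogates.
SURROGATE_PAIRS = 0x400 * 0x400


def to_surrogates(cp):
    """Split a code point in planes 1..16 into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside planes 1..16")
    offset = cp - 0x10000              # a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    print(TOTAL_CODE_POINTS)   # 17 planes of 65,536 code points each
    print(SURROGATE_PAIRS)     # code points in planes 1..16
    print([hex(u) for u in to_surrogates(0x10000)])
```

Sixteen planes of 65,536 code points is exactly the 1,024 x 1,024 surrogate
pairs, which together with plane 0 gives the "somewhat more than a million"
total.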

Edward Cherlin
Vice President
NewbieNet, Inc.
Ask. Someone knows.

Everything should be made as simple as possible, __but no simpler__.
Attributed to Albert Einstein