Re: RFC 2279 (UTF-8) to Full Standard

Yes, they have; and it is quoted in the UTF-32 TR. Moreover, it is of
course safest if UTF-8 as defined in the RFC is restricted to U+10FFFF,
since any higher values cannot be converted to UTF-16, and could even
cause security problems if converted incorrectly (e.g. by overlaying
legitimate codes).
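
To make the conversion concern concrete, here is a minimal Python sketch.
The function names are made up for illustration, and naive_utf16_units is
a hypothetical buggy converter, not code from any real library. It shows
why U+10FFFF is the ceiling for UTF-16 and how an unchecked conversion of
a higher value can land on a legitimate code.

```python
MAX_UTF16 = 0x10FFFF  # highest code point reachable through a surrogate pair


def to_utf16_units(cp):
    """Encode one code point as a list of UTF-16 code units, with range checks."""
    if cp < 0 or cp > MAX_UTF16:
        raise ValueError("U+%X cannot be represented in UTF-16" % cp)
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points are not encodable")
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000  # 20-bit value in the range 0..0xFFFFF
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


def naive_utf16_units(cp):
    """Hypothetical buggy converter: no range check, silently masks to 20 bits."""
    if cp < 0x10000:
        return [cp]
    v = (cp - 0x10000) & 0xFFFFF  # bits above the 20th are dropped here
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


if __name__ == "__main__":
    # The last encodable code point uses the last surrogate pair.
    print([hex(u) for u in to_utf16_units(0x10FFFF)])
    # An out-of-range value overlays U+10000 in the unchecked converter.
    print(naive_utf16_units(0x110000) == to_utf16_units(0x10000))
```

With the range check in place the out-of-range value is rejected outright;
without it, U+110000 becomes indistinguishable from U+10000 after
conversion.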

Mark
—————

Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Martin Duerst" <duerst@w3.org>
To: "Kenneth Whistler" <kenw@sybase.com>; <FYergeau@alis.com>
Cc: <ietf-charsets@iana.org>
Sent: Thursday, April 11, 2002 23:29
Subject: RE: RFC 2279 (UTF-8) to Full Standard


> At 18:10 02/04/11 -0700, Kenneth Whistler wrote:
>
> >I agree, even though the Unicode Standard only describes UTF-8
> >out to U+10FFFF. 10646 still gives the full scheme to U-7FFFFFFF,
> >and it will be a while (if ever) before we can change that to
> >deprecate all the 5- and 6-byte values.
>
> I thought ISO had adopted a standing policy on not allocating
> anything beyond U+10FFFF. Ken, do you know the exact status of
> this? Can you tell us?
>
> >So I see no good reason
> >right now to put RFC 2279 out of synch with 10646, particularly
> >if it would slow down a revision of RFC 2279 now.
>
> I think the new document should clearly state that codepoints above
> U+10FFFF cannot be encoded in UTF-16, that the Unicode Consortium
> won't allocate any codepoints above that, that ISO has some relevant
> policy (if they do),... Also, pointing to UTF-32 might be a good
> idea.
> (I just found out that it has been approved for registration, but
> is not yet listed in the relevant file.)
>
>
> Regards,   Martin.
>

Received on Friday, 12 April 2002 10:15:22 UTC