
Re: Registration of new charset CESU-8

From: <toby_phipps@peoplesoft.com>
Date: Thu, 03 Jan 2002 07:26:16 +1000
To: phoffman@imc.org
Cc: ietf-charsets@iana.org
Message-id: <OFF22D537F.9ADC7CF5-ON88256B35.00753027@peoplesoft.com>

Paul Hoffman / IMC wrote:

>The statement "It is not intended nor recommended as an encoding used
>for open information exchange." is underlined in the TR for emphasis.

>Charset labels are used for exchanging information. Thus, CESU-8 is
>*not* a candidate for having a charset label.

The proposal clearly states that CESU-8 is not for common use, both in the
description provided and in the Intended Usage block. It can however be
used between consenting and supporting systems when appropriately tagged as
CESU-8 in a higher-level protocol.  This also is stated in the DUTR.  MIME
types are just one of the higher level protocols for describing the
encodings, although in the case of CESU-8 the registration also states that
it is unsuitable for use as a MIME type.

IANA registration is important to CESU-8 as it will be (is) in systems from
several large software vendors, and IANA registered charsets are frequently
used within products to provide a canonical and vendor-neutral name for a
character set used within code, file descriptions etc.

Nothing in RFC 2278 requires that the character set registered is in common
use or suitable for open common interchange.

Toby.



"Paul Hoffman / IMC" <phoffman@imc.org>
03/01/2002 05:01 AM

To:      toby_phipps@peoplesoft.com, ietf-charsets@iana.org
cc:
Subject: Re: Registration of new charset CESU-8




At 1:48 AM -0800 1/2/02, toby_phipps@peoplesoft.com wrote:
>Published specification(s):
>    Unicode Technical Report #26
>    "Compatibility Encoding Scheme for UTF-16: 8-bit (CESU-8)"
>    http://www.unicode.org/unicode/reports/tr26

The summary in that TR says:

>This document specifies an 8-bit Compatibility Encoding Scheme for
>UTF-16 (CESU) that is intended for internal use within systems
>processing Unicode in order to provide an ASCII-compatible 8-bit
>encoding that is similar to UTF-8 but preserves UTF-16 binary
>collation. It is not intended nor recommended as an encoding used
>for open information exchange. The Unicode Consortium, does not
>encourage the use of CESU-8, but does recognize the existence of
>data in this encoding and supplies this technical report to clearly
>define the format and to distinguish it from UTF-8. This encoding
>does not replace or amend the definition of UTF-8.
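
To make the "similar to UTF-8 but preserves UTF-16 binary collation" property concrete, here is a minimal sketch in Python. It assumes the usual reading of the TR: each UTF-16 code unit, including each half of a surrogate pair, is written with the UTF-8 bit layout. The function name cesu8_encode is a hypothetical helper for illustration; it is not part of the Python standard library and is not taken from the TR.

```python
def cesu8_encode(text: str) -> bytes:
    """Sketch of a CESU-8 encoder (illustrative helper, not a stdlib codec).

    Each 16-bit UTF-16 code unit is encoded on its own with the UTF-8 bit
    layout, so a supplementary character becomes two 3-byte sequences
    instead of the single 4-byte sequence UTF-8 would use.
    """
    out = bytearray()
    utf16 = text.encode("utf-16-be")  # big-endian code units, no BOM
    for i in range(0, len(utf16), 2):
        unit = (utf16[i] << 8) | utf16[i + 1]
        if unit < 0x80:
            out.append(unit)  # ASCII passes through unchanged
        elif unit < 0x800:
            out += bytes((0xC0 | (unit >> 6), 0x80 | (unit & 0x3F)))
        else:
            # Covers the rest of the BMP, including surrogate code units.
            out += bytes((
                0xE0 | (unit >> 12),
                0x80 | ((unit >> 6) & 0x3F),
                0x80 | (unit & 0x3F),
            ))
    return bytes(out)


if __name__ == "__main__":
    samples = ["\ufffd", "\U00010400"]  # one BMP and one supplementary character
    for s in samples:
        print(f"U+{ord(s):04X}",
              "UTF-8:", s.encode("utf-8").hex(" "),
              "CESU-8:", cesu8_encode(s).hex(" "))
    # Byte-wise sort order under each encoding.
    print("UTF-16 order:", [f"U+{ord(s):04X}" for s in sorted(samples, key=lambda s: s.encode("utf-16-be"))])
    print("CESU-8 order:", [f"U+{ord(s):04X}" for s in sorted(samples, key=cesu8_encode)])
    print("UTF-8 order: ", [f"U+{ord(s):04X}" for s in sorted(samples, key=lambda s: s.encode("utf-8"))])
```

For characters in the Basic Multilingual Plane the output is byte-for-byte identical to UTF-8. The two encodings diverge only for supplementary characters, which is also where byte-wise ordering of UTF-8 stops matching byte-wise ordering of UTF-16.
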

The statement "It is not intended nor recommended as an encoding used
for open information exchange." is underlined in the TR for emphasis.

Charset labels are used for exchanging information. Thus, CESU-8 is
*not* a candidate for having a charset label.

--Paul Hoffman, Director
--Internet Mail Consortium
Received on Wednesday, 2 January 2002 16:29:21 GMT
