Re: Registration of new charset BOCU-1



The significant advantages of BOCU-1 over SCSU are:

- MIME compatibility (unless you don't think that is important ;-)
- binary order preservation: this is valuable wherever the binary order
must be maintained, and is not true of SCSU.

Binary order preservation means the following:

If you take any two UTF-8 strings X and Y and compress them with BOCU-1 to
X' and Y', then, comparing byte by byte,
   X < Y if and only if X' < Y'.
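
As a rough illustration of the property (not of BOCU-1 itself), here is a
small Python sketch. Python's standard library has no BOCU-1 codec, so the
checker takes the encoder as a parameter. The function name
preserves_binary_order and the sample strings are made up for this example.
UTF-32BE and UTF-16BE stand in as one encoding that keeps UTF-8 byte order
and one that does not.

```python
from itertools import permutations
from typing import Callable, Iterable

# Hypothetical sample strings; they include a high BMP character (U+FF5E)
# and a supplementary character (U+10000), where UTF-16 order differs.
SAMPLES = ["", "a", "ab", "z", "\u00e9", "\uff5e", "\U00010000"]


def preserves_binary_order(encode: Callable[[str], bytes],
                           samples: Iterable[str]) -> bool:
    """Check that X < Y (as UTF-8 bytes) if and only if X' < Y'
    (as encoded bytes) for every ordered pair of distinct samples."""
    items = list(samples)
    for x, y in permutations(items, 2):
        utf8_less = x.encode("utf-8") < y.encode("utf-8")
        encoded_less = encode(x) < encode(y)
        if utf8_less != encoded_less:
            return False
    return True


if __name__ == "__main__":
    # Stand-in encoders only; neither of these is BOCU-1.
    print("UTF-32BE keeps order:",
          preserves_binary_order(lambda s: s.encode("utf-32-be"), SAMPLES))
    print("UTF-16BE keeps order:",
          preserves_binary_order(lambda s: s.encode("utf-16-be"), SAMPLES))
```

A real BOCU-1 encoder, for instance one wrapped from a library that
implements it, could be passed as the encode argument in the same way.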

Mark
___
mark.davis@us.ibm.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

From:     Harald Tveit Alvestrand <harald@alvestrand.no>
To:       Markus Scherer <markus.scherer@jtcsv.com>, charsets <ietf-charsets@iana.org>
cc:
Subject:  Re: Registration of new charset BOCU-1
Date:     2002.07.24 19:47

--On 23 July 2002 15:17 -0700 Markus Scherer <markus.scherer@jtcsv.com>
wrote:

> BOCU-1 was not created and proposed for registration to then be
> discouraged, but to encourage users to use a Unicode encoding when they
> would otherwise choose a legacy encoding just for its compactness (aside
> from database applications).
>
> As I said before, the SCSU registration has a similar list of features,
> and no one thought it unwise then. It is much easier for someone to
> figure out if a charset is appropriate for some use if one need not
> follow a URL.
>
> I made this argument two weeks ago and there was no response at all, so I
> assumed that this was all acceptable.
>
> I would like to ask again: what do others think?
> What does the approver think?

1) The approver (that's me) agrees with RFC 2278:

3.5.  Usage and Implementation Requirements

   Use of a large number of charsets in a given protocol may hamper
   interoperability. However, the use of a large number of undocumented
   and/or unlabelled charsets hampers interoperability even more.

   A charset should therefore be registered ONLY if it adds significant
   functionality that is valuable to a large community, OR if it
   documents existing practice in a large community. Note that charsets
   registered for the second reason should be explicitly marked as being
   of limited or specialized use and should only be used in Internet
   messages with prior bilateral agreement.

The approver has a hard time seeing that the added value over SCSU and
UTF-8 is enough to be "significant". Are there applications (not toolkits)
today that have committed to converting to use of BOCU-1? Is the Unicode
Consortium considering adding BOCU-1 to its specifications?

2) The approver agrees with the submitter that putting the usability
information into the registration is probably a Good Thing.

                         Harald

Received on Thursday, 25 July 2002 13:40:34 UTC