- From: Mark Davis <mark.davis@us.ibm.com>
- Date: Thu, 25 Jul 2002 10:39:43 -0700
- To: Harald Tveit Alvestrand <harald@alvestrand.no>
- Cc: charsets <ietf-charsets@iana.org>, Markus Scherer <markus.scherer@jtcsv.com>
- Message-id: <OFF2D010CC.AA1538DE-ON88256C01.005E1251@us.ibm.com>
The significant advantages of BOCU-1 over SCSU are:

- MIME compatibility (unless you don't think that is important ;-)
- binary order preservation: this is valuable wherever the binary order must be maintained, and is not true of SCSU.

What binary order preservation means is that: If you take any two UTF-8 strings X and Y, and compress them with BOCU-1 to X' and Y', X < Y if and only if X' < Y'.

Mark
___
mark.davis@us.ibm.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

Harald Tveit Alvestrand <harald@alvestrand.no>
2002.07.24 19:47
To: Markus Scherer <markus.scherer@jtcsv.com>, charsets <ietf-charsets@iana.org>
cc:
Subject: Re: Registration of new charset BOCU-1

--On 23. juli 2002 15:17 -0700 Markus Scherer <markus.scherer@jtcsv.com> wrote:

> BOCU-1 was not created and proposed for registration to then be
> discouraged, but to encourage users to use a Unicode encoding when they
> would otherwise choose a legacy encoding just for its compactness (aside
> from database applications).
>
> As I said before, the SCSU registration has a similar list of features,
> and no one thought it unwise then. It is much easier for someone to
> figure out if a charset is appropriate for some use if one need not
> follow a URL.
>
> I made this argument two weeks ago and there was no response at all, so I
> assumed that this was all acceptable.
>
> I would like to ask again, What do others think?
> What does the approver think?

1) The approver (that's me) agrees with RFC 2278:

   3.5. Usage and Implementation Requirements

   Use of a large number of charsets in a given protocol may hamper
   interoperability. However, the use of a large number of undocumented
   and/or unlabelled charsets hampers interoperability even more.

   A charset should therefore be registered ONLY if it adds significant
   functionality that is valuable to a large community, OR if it
   documents existing practice in a large community. Note that charsets
   registered for the second reason should be explicitly marked as being
   of limited or specialized use and should only be used in Internet
   messages with prior bilateral agreement.

The approver has a hard time seeing that the added value over SCSU and UTF-8 is enough to be "significant".

Are there applications (not toolkits) today that have committed to converting to use of BOCU-1?

Is the Unicode Consortium considering adding BOCU-1 to its specifications?

2) The approver agrees with the submitter that putting the usability information into the registration is probably a Good Thing.

Harald
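As an illustration of what binary order preservation means in the message above, here is a minimal sketch. It does not implement BOCU-1 or SCSU; it assumes Python's built-in UTF-8 and UTF-16BE codecs as stand-ins for one encoding that keeps code point order and one that does not. The helper name `preserves_order` and the sample strings are hypothetical and chosen only for this example.

```python
def preserves_order(strings, encoding):
    """Return True if sorting by encoded bytes gives the same order
    as sorting by Unicode code points (hypothetical helper)."""
    # Python compares str values code point by code point.
    by_code_point = sorted(strings)
    # bytes objects compare byte by byte, i.e. in binary order.
    by_bytes = sorted(strings, key=lambda s: s.encode(encoding))
    return by_code_point == by_bytes


# Sample strings: ASCII, Latin-1 range, a BMP character above the
# surrogate range (U+FF5E), and a supplementary character (U+10400).
samples = ["a", "z", "\u00e9", "\uff5e", "\U00010400"]

# UTF-8 byte order matches code point order.
utf8_ok = preserves_order(samples, "utf-8")

# UTF-16BE does not: U+10400 is encoded with surrogates (D801 DC00),
# which sort before the bytes for U+FF5E (FF5E).
utf16_ok = preserves_order(samples, "utf-16-be")

print("utf-8 preserves order:", utf8_ok)
print("utf-16-be preserves order:", utf16_ok)
```

The same check could in principle be run against a BOCU-1 or SCSU encoder if one were available; Python's standard library ships neither, so none is assumed here.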
Received on Thursday, 25 July 2002 13:40:34 UTC