- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 10 Jul 2002 10:33:16 +0900
- To: Markus Scherer <markus.scherer@jtcsv.com>, charsets <ietf-charsets@iana.org>
Hello Markus,

Two comments:

At 10:34 02/07/09 -0700, Markus Scherer wrote:
>(This is a proposal for a registration; I am using the template from RFC
>2978.)
>
>Charset name: BOCU-1
>
>Charset aliases: (none, except for the implicit csBOCU-1)
>
>Suitability for use in MIME text: Yes
>
>Published specifications:
>     CCS & CES: The BOCU-1 charset is a combination of the
>                Unicode and ISO 10646 Coded Character Set (CCS)

'combination' sounds very strange. The CCS of Unicode and ISO 10646
is identical by design, but 'combination' suggests that BOCU-1
has brought them together.

>with
>                the Character Encoding Scheme (CES) specified in
>                the above document. It covers exactly the
>                UTF-16-reachable subset of ISO 10646.
>
>ISO 10646 equivalency table:
>     Algorithmic, see published specification and sample code.
>
>Additional information:

Given that you (correctly, in my view) say
"Intended usage: LIMITED USE", I would just cut out all of this,
because there is no need for marketing. I assume it's all
documented in the spec that you have already cited.

Regards,   Martin.

>     BOCU-1 is an encoding (CES/TES) of Unicode/ISO 10646
>     for the storage and exchange of text data.
>     It is stateful and provides a good byte/code point ratio while
>     being directly usable in SMTP emails, database fields and other contexts.
>
>     BOCU-1 combines the wide applicability of UTF-8 with the compactness
>     of SCSU.
>     It is useful for short strings and maintains code point order.
>
>     BOCU-1 does not encode most ASCII characters with US-ASCII byte values.
>
>     There is a Unicode signature byte sequence defined
>     (FB EE 28, see specification).
>
>     BOCU-1 is suitable for
>     - databases: maintains Unicode code point order
>     - emails: directly suitable for MIME text
>     - CVS and similar: deterministic and resets at CR and LF
>
>     BOCU-1 is not suitable for
>     - efficient internal processing (convert to UTF-8/16/32)
>     - contexts where encoding declarations _in_ documents _must_ be
>       ASCII-readable
>
>Person & email address to contact for further information:
>     Markus W. Scherer
>     IBM Globalization Center of Competency
>     5600 Cottle Road
>     Mail Stop: 50-2/B11
>     San Jose, CA 95193
>     USA
>
>     markus.scherer@jtcsv.com
>     markus.scherer@us.ibm.com
>
>Intended usage: LIMITED USE
>
>----
>Suggested MIBenum value: 1020
>     (first available in Unicode/ISO 10646 range; like SCSU [which is 1011])
>
>
>Thank you for your consideration,
>
>markus
>

Received on Tuesday, 9 July 2002 21:35:43 UTC