- From: unknown charset <yergeau@ALIS.COM>
- Date: Thu, 16 Dec 1999 15:36:39 -0500
- To: ietf-charsets@iana.org
> De: Harald Tveit Alvestrand [mailto:Harald@Alvestrand.no] > Date: jeudi 16 décembre 1999 10:24 > > My list of disadvantages: > > - No compatibility with cstrings due to NULL Ah, this is what I had in mind with my "C string and ASCII-thinking". It's real, but of limited scope. There's a corresponding advantage when you think "Java and Unicode" instead of "C and ASCII". In other words, it's not an intrinsic disadvantage or advantage, it depends on the programming environment. > - Inability to represent characters outside Planes 0-16 Real but very theoretical. In addition to what he wrote, Ken Whistler should have mentionned his calculation of when those planes will fill up if we maintain the current allocation rate (unlikely, most things of import are already done or in the pipeline). I don't remember the target date offhand, but I think it was 23rd century. > - VERY bad expansion factor for characters outside Plane 0 > (100% overhead) Oops! There's not much size difference between a 4-byte UTF-8 character and a 4-byte UTF-16 character. > - No ability to mix ASCII and UTF-16 elements in a simple viewer It's no harder to mix ASCII with UTF-16 than to mix ASCII with any other charset. You just transcode to your favorite encoding of Unicode and you're done. > - Two incompatible byte orders That one is really, really real. > My list of advantages: > > - Does not require conversion between UCS-2 and UTF-16 when > only Plane 0 > characters are used in the UTF-16 Plus the size advantage in plain text for many languages, some of them important in practice. As Ken pointed out, UTF-8 is denser only for ASCII. That said, I will concur with Martin, Harald and Ira: the important thing at this time is to get this out the door, not whether it's Info or PS. -- François Yergeau
Received on Thursday, 16 December 1999 15:50:01 UTC