Date: Mon, 21 Apr 1997 14:53:41 +0200 (MET DST)
From: "Martin J. Duerst" <firstname.lastname@example.org>
To: Chris Newman <Chris.Newman@innosoft.com>
Cc: John C Klensin <email@example.com>, IETF URI list <firstname.lastname@example.org>
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: <Pine.SOL.3.95.970415130735.22015Kemail@example.com>
Message-Id: <Pine.SUN.3.96.970421145201.245I-100000@enoshima>

On Tue, 15 Apr 1997, Chris Newman wrote:

> On Tue, 15 Apr 1997, John C Klensin wrote:

[About length problems with UTF-8.]

> UTF-8 requires 2 octets to encode characters from the 8859-1 set which
> normally take 1 octet. UTF-8 requires 3 octets to encode ideographic
> characters from UCS-2 which normally require 2 octets. So
> western Europeans take a worse storage hit from UTF-8 than ideographic
> languages do.

This is not exactly true. Western European languages contain
many characters from ASCII, and only occasionally a character
that needs two bytes in UTF-8.

But anyway, I think we agree that the size of UTF-8 is not
really an issue.

Regards,
Martin.
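
The octet counts discussed in this message can be checked with a short Python sketch. The helper name `encoded_sizes` and the sample strings are assumptions chosen for illustration; they do not come from the thread. UCS-2 is approximated here by Python's `utf-16-be` codec, which gives the same result for characters in the Basic Multilingual Plane.

```python
# Compare the encoded size of sample text in a legacy encoding and in UTF-8.
# The sample strings below are illustrative assumptions, not from the thread.

def encoded_sizes(text, legacy):
    """Return (legacy_octets, utf8_octets) for the given text."""
    return len(text.encode(legacy)), len(text.encode("utf-8"))

# One non-ASCII Latin-1 character (e with acute accent, U+00E9).
e_acute = encoded_sizes("\u00e9", "iso-8859-1")

# One CJK ideograph (U+8A9E), using UTF-16BE as a stand-in for UCS-2.
ideograph = encoded_sizes("\u8a9e", "utf-16-be")

# A Western European sentence: mostly ASCII, with two accented letters.
french = "Le caf\u00e9 est ferm\u00e9 aujourd'hui."
sentence = encoded_sizes(french, "iso-8859-1")

print("Latin-1 character:", e_acute)
print("Ideograph:        ", ideograph)
print("French sentence:  ", sentence)
```

For the single characters the ratios match the quoted figures (1 to 2 octets, and 2 to 3 octets). For the sample sentence, only the two accented letters cost an extra octet each, which is the point made in the reply: running Western European text grows far less than the per-character worst case suggests.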