- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Mon, 21 Apr 1997 14:53:41 +0200 (MET DST)
- To: Chris Newman <Chris.Newman@innosoft.com>
- Cc: John C Klensin <klensin@mci.net>, IETF URI list <uri@bunyip.com>
On Tue, 15 Apr 1997, Chris Newman wrote:

> On Tue, 15 Apr 1997, John C Klensin wrote:

[About length problems with UTF-8.]

> UTF-8 requires 2 octets to encode characters from the 8859-1 set which
> normally take 1 octet. UTF-8 requires 3 octets to encode ideographic
> characters from UCS-2 which normally require 2 octets. So
> western Europeans take a worse storage hit from UTF-8 than ideographic
> languages do.

This is not exactly true. Western European languages contain many
characters from ASCII, and only occasionally a character that needs
two bytes in UTF-8.

But anyway, I think we agree that the size of UTF-8 is not really
an issue.

Regards, Martin.
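To put rough numbers on the octet counts discussed in this message, here is a minimal sketch that measures encoded sizes with Python's built-in codecs. The two sample strings and the helper name `encoded_len` are assumptions chosen for illustration and do not come from the thread. UTF-16BE stands in for UCS-2, which is valid here only because every sample character lies in the Basic Multilingual Plane.

```python
# Illustrative sketch only; the sample strings are assumptions, not from the thread.

def encoded_len(text: str, encoding: str) -> int:
    """Return the number of octets `text` occupies in `encoding`."""
    return len(text.encode(encoding))

# Hypothetical German sample: 16 characters, 3 of them outside ASCII.
western = "Grüße aus Zürich"
# Hypothetical ideographic sample: 3 characters, all in the BMP.
ideographic = "日本語"

# ISO 8859-1 uses 1 octet per character; UTF-8 needs 2 only for the non-ASCII ones.
print(encoded_len(western, "latin-1"), encoded_len(western, "utf-8"))  # 16 19

# For BMP characters UTF-16BE coincides with UCS-2 (2 octets each); UTF-8 needs 3.
print(encoded_len(ideographic, "utf-16-be"), encoded_len(ideographic, "utf-8"))  # 6 9
```

For these particular samples the Western European text grows from 16 to 19 octets (about 19%), while the ideographic text grows from 6 to 9 octets (50%). Real text will vary with the proportion of non-ASCII characters, so the figures are indicative only.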
Received on Monday, 21 April 1997 08:54:49 UTC