- From: Keld Jørn Simonsen <keld@dkuug.dk>
- Date: Tue, 24 Dec 2002 09:55:08 +0100
- To: Markus Scherer <markus.scherer@jtcsv.com>
- Cc: charsets <ietf-charsets@iana.org>
On Thu, Dec 19, 2002 at 02:03:12PM -0800, Markus Scherer wrote:
>
> Remember that UTF-8 was designed to shoehorn Unicode/UCS into Unix file
> systems, nothing more. Where ASCII byte-stream compatibility is not an
> issue, there are Unicode charsets that are more efficient than UTF-8,
> different ones for different uses.

Well, it is true that UTF-FSS, the previous name for UTF-8, was designed
for UNIX file systems (FSS stands for File System Safe). But when SC2/WG2
renamed it to UTF-8, it at the same time replaced the UTF-1 encoding,
which was intended for network use. So the designers of ISO 10646
purposely meant UTF-8 for network interchange.

Furthermore, the IETF/IESG has stated the policy that UTF-8 is the
preferred encoding for all Internet protocols: all existing protocols
need to support it, and new protocols should only use UTF-8.

So nowadays UTF-8 is used for much more than just Unix file systems.
One wonders why W3C made UTF-16 the encoding of choice for XML.

Kind regards
Keld
Received on Tuesday, 24 December 2002 03:55:44 UTC