- From: Jon Hanna <jon@spin.ie>
- Date: Tue, 20 Aug 2002 11:39:28 +0100
- To: "WAI (E-mail)" <w3c-wai-ig@w3.org>
> Could somebody please explain the difference between UTF8 and UTF16 to me
> and why you would want to use UTF16 over UTF8?

Jukka has ably answered the first part of the question. The second part, why one would want to use UTF-16 rather than UTF-8, has two main answers.

The first is that it's easier to convert from UCS-2 to UTF-16; indeed, UTF-16 is identical to UCS-2 for code points within the UCS-2 range, and the two differ only for code points outside that range, which UTF-16 encodes as surrogate pairs. UCS-2 is used internally in some operating systems (Windows NT for example), and is the "natural" type of character in some languages (VB, Java).

The second is that the way UTF-8 encodes UCS shortens the byte stream when the characters are mainly from the ASCII range, keeps it the same size when the characters are mainly from the range U+0080 to U+07FF, and lengthens it when the characters are mainly above code point U+07FF. Hence for some languages UTF-16 may be more efficient on bandwidth.

I find it hard to believe that there are many user agents that can support UTF-8 but not UTF-16, but maybe I'm putting too much faith in common sense. Certainly any browser that can process XML must be able to support UTF-16, since XML processors are required to accept both the UTF-8 and UTF-16 encodings.
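To make the size trade-off and the surrogate-pair point concrete, here is a small, purely illustrative Java sketch (not part of the original mail; the sample strings and class name are arbitrary). It encodes the same text as UTF-8 and UTF-16 and prints the byte counts, and shows a code point outside the UCS-2 range occupying two Java chars.

    import java.nio.charset.StandardCharsets;

    // Illustration of the size trade-off described above: the same text
    // costs a different number of bytes in UTF-8 and in UTF-16.
    public class Utf8VsUtf16 {
        public static void main(String[] args) {
            String ascii = "hello";                          // code points below U+0080
            String greek = "\u03ba\u03cc\u03c3\u03bc\u03b5"; // code points in U+0080..U+07FF
            String cjk   = "\u65e5\u672c\u8a9e";             // code points above U+07FF

            report("ASCII", ascii);  // UTF-8 smaller than UTF-16
            report("Greek", greek);  // same size in both
            report("CJK",   cjk);    // UTF-8 larger than UTF-16

            // A code point outside the UCS-2 range (U+1D11E, MUSICAL SYMBOL
            // G CLEF) occupies two Java chars: a UTF-16 surrogate pair.
            String clef = new String(Character.toChars(0x1D11E));
            System.out.println("U+1D11E as UTF-16 code units: " + clef.length());
        }

        static void report(String label, String s) {
            int utf8  = s.getBytes(StandardCharsets.UTF_8).length;
            int utf16 = s.getBytes(StandardCharsets.UTF_16BE).length; // BE avoids counting a BOM
            System.out.println(label + ": UTF-8 = " + utf8 + " bytes, UTF-16 = " + utf16 + " bytes");
        }
    }

Running it prints 5 vs 10 bytes for the ASCII sample, 10 vs 10 for the Greek one, and 9 vs 6 for the CJK one, which is the pattern described above.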
Received on Tuesday, 20 August 2002 06:37:45 UTC