Re: Unicode encoding for web pages

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Thu, 31 Mar 2005 03:40:01 +0200
To: www-international@w3.org
Message-ID: <424B54F1.72A5@xyzzy.claranet.de>

Deborah Cawkwell wrote:

> For web pages, would you consider using a Unicode encoding
> other than UTF-8, eg UTF-16? If so, why? or why not?

No, because big-endian vs. little-endian byte order (BE vs. LE)
gives me headaches.  HTML 4 or later pages are (conceptually)
always translated to Unicode.  If you almost always need only
Latin-1, you could use it, and for the remaining characters use
symbolic or numeric character references like &euro; / &#8364; /
&#x20AC; (W3C recommends the hex form, but some old Netscape
browsers don't like hex).
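As a rough illustration of the Latin-1 plus character references idea, here is a minimal Python sketch.  The helper name to_latin1_with_refs and the sample string are made up for this example; the decimal form relies on Python's built-in "xmlcharrefreplace" error handler, and the hex form is written out by hand.

```python
def to_latin1_with_refs(text: str, hex_refs: bool = False) -> bytes:
    """Encode text as Latin-1, using numeric character references
    for every character that Latin-1 cannot represent.

    Hypothetical helper for illustration only.
    """
    if not hex_refs:
        # Built-in handler emits decimal references such as &#8364;
        return text.encode("latin-1", errors="xmlcharrefreplace")

    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 256:
            # Latin-1 maps code points 0..255 directly to single bytes
            out.append(cp)
        else:
            # Hex reference such as &#x20AC;
            out += f"&#x{cp:X};".encode("ascii")
    return bytes(out)


sample = "Preis: 5 € für Käse"
print(to_latin1_with_refs(sample))
print(to_latin1_with_refs(sample, hex_refs=True))
```

Only the euro sign becomes a reference here; the umlauts stay as single Latin-1 bytes.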

Latin-1 plus a few character references might be even shorter
than UTF-8, since each non-ASCII Latin-1 character takes one
byte instead of two.  For radical backwards compatibility or
radical "compression" you could try Windows-1252 instead of
Latin-1.
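Whether Latin-1 with references really comes out shorter depends on the text, so it may be worth measuring.  The sketch below compares byte counts for one string; the function name encoded_sizes and the sample text are assumptions made for this example, not anything from a particular tool.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes of text under several encodings.

    Characters missing from Latin-1 or Windows-1252 are written as
    decimal character references. Hypothetical helper for illustration.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "latin-1 + refs": len(
            text.encode("latin-1", errors="xmlcharrefreplace")
        ),
        "windows-1252": len(
            text.encode("cp1252", errors="xmlcharrefreplace")
        ),
        # Python's "utf-16" codec prepends a 2-byte byte order mark
        "utf-16": len(text.encode("utf-16")),
    }


# Six accented letters and one euro sign
for name, size in encoded_sizes("éééééé€").items():
    print(f"{name}: {size} bytes")
```

A reference like &#8364; costs 7 bytes against 3 bytes for the euro sign in UTF-8, so Latin-1 with references only wins when accented Latin-1 letters clearly outnumber the referenced characters.  Windows-1252 avoids that cost for the euro sign because it has a single-byte slot for it.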

Some really old browsers know Windows-1252 but not Unicode.
For these really old browsers UTF-16 would fail miserably.
OTOH, if your input is already UTF-16, you may not care and can
just use it as is.
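To make the BE vs. LE headache concrete, here is a small Python sketch using only the standard codecs.  The two-character sample string is an arbitrary choice for this example.

```python
import codecs

text = "A€"  # U+0041 and U+20AC

be = text.encode("utf-16-be")  # high byte first: 00 41 20 AC
le = text.encode("utf-16-le")  # low byte first:  41 00 AC 20
print(be.hex(" "), "|", le.hex(" "))

# Guessing the wrong byte order raises no error here; it just
# yields two unrelated characters (U+4100 and U+AC20).
wrong = be.decode("utf-16-le")
print([f"U+{ord(c):04X}" for c in wrong])

# With a byte order mark in front, the generic "utf-16" codec
# detects the order itself and strips the mark.
print((codecs.BOM_UTF16_BE + be).decode("utf-16") == text)
print((codecs.BOM_UTF16_LE + le).decode("utf-16") == text)
```

UTF-8 has no such ambiguity, because its byte sequence is the same on every platform.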
             Bye, Frank
Received on Thursday, 31 March 2005 01:44:07 UTC
