RE: Character set question

From: Liam Quinn <liam@htmlhelp.com>
Date: Wed, 7 Mar 2001 14:39:38 -0500 (EST)
To: Thanasis Kinias <tkinias@asu.edu>
cc: "'Kathleen Anderson'" <kathleen@spiderwebwoman.com>, <www-validator@w3.org>
Message-ID: <Pine.LNX.4.30.0103071432400.1146-100000@localhost.localdomain>

On Wed, 7 Mar 2001, Thanasis Kinias wrote:

> Kathleen Anderson wrote:
>
> > Could someone explain, in layperson's terms, if using <meta
> > http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> is
> > preferred over <meta http-equiv="Content-Type" content="text/html;
> > charset=windows-1252">
>
> The short answer is "you don't need either, most of the time."

This may be the case for XML, but it's not for HTML.

> The default
> charset is UTF-8, which is identical to ISO Latin-1 (ISO 8859-1).

There is no default charset for HTML, and UTF-8 is not identical to
ISO-8859-1.  UTF-8 and ISO-8859-1 are only identical for the 7-bit
(US-ASCII) characters.
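
As a rough illustration of that point, here is a minimal Python sketch. The
helper name `same_bytes` and the sample strings are made up for this example.
It encodes the same text both ways and compares the bytes:

```python
def same_bytes(text: str) -> bool:
    """Return True if UTF-8 and ISO-8859-1 encode `text` to identical bytes."""
    return text.encode("utf-8") == text.encode("iso-8859-1")


ascii_text = "charset"        # 7-bit US-ASCII only
accented_text = "caf\u00e9"   # ends in U+00E9, e with acute accent

# Pure US-ASCII text is byte-for-byte identical in both encodings.
print(same_bytes(ascii_text))

# U+00E9 is one byte (0xE9) in ISO-8859-1 but two bytes (0xC3 0xA9) in UTF-8.
print(accented_text.encode("iso-8859-1"))
print(accented_text.encode("utf-8"))

# Reading UTF-8 bytes as ISO-8859-1 turns the one character into two.
misread = accented_text.encode("utf-8").decode("iso-8859-1")
print(len(accented_text), len(misread))
```

So a browser that guesses the wrong one of the two will show garbage for
anything outside US-ASCII.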

> You only
> need to specify Windows 1252 if you are using non-Unicode Windows software
> and have "hard-coded" characters such as euro sign, daggers, em dash, which
> are where Latin-1 and Windows 1252 differ.

And you shouldn't generate such pages, since they will not render correctly
on most systems other than Windows and Mac.
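
To make the difference concrete, here is a small Python sketch. The function
name `decode_both` is invented for the example, and `cp1252` is Python's codec
name for windows-1252. It decodes the same bytes under both charsets:

```python
def decode_both(raw: bytes) -> tuple:
    """Decode `raw` as windows-1252 and as ISO-8859-1, in that order."""
    return raw.decode("cp1252"), raw.decode("iso-8859-1")


# Bytes in the 0x80-0x9F range are where the two charsets disagree.
for byte in (b"\x80", b"\x86", b"\x97"):
    as_1252, as_latin1 = decode_both(byte)
    # windows-1252 gives a printable character (euro sign, dagger, em dash);
    # ISO-8859-1 gives a C1 control character with the same code point number.
    print(hex(byte[0]), hex(ord(as_1252)), hex(ord(as_latin1)))

# Outside that range the two agree, e.g. 0xE9 is U+00E9 in both.
print(decode_both(b"\xe9"))
```

A page labelled iso-8859-1 that actually contains byte 0x97 is therefore
asking the browser to display a control character, not an em dash.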

> If you use entities (e.g.,
> &#8212; for an em dash) or compose with Unicode-compliant software, you are
> safe skipping the charset declaration.

The charset declaration is required for HTML documents, regardless of
whether you use entities.  In practice, you're probably "safe" if you skip
the charset and stick to US-ASCII, but there's no reason not to specify
the charset.
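
A sketch of why sticking to US-ASCII is "safe" in practice: markup that uses
only US-ASCII characters, with a numeric character reference for the em dash,
comes out as the same bytes under every charset mentioned in this thread. The
names `ENCODINGS`, `encoded_forms` and `snippet` are made up for this example,
and Python's `html.unescape` stands in for a browser resolving the reference.

```python
import html

# Python codec names for the charsets discussed in this thread.
ENCODINGS = ("us-ascii", "iso-8859-1", "cp1252", "utf-8")


def encoded_forms(markup: str) -> set:
    """Return the distinct byte strings produced by encoding `markup`
    in each charset. Assumes `markup` contains only US-ASCII characters."""
    return {markup.encode(enc) for enc in ENCODINGS}


# The em dash is written as a numeric character reference, so the markup
# itself contains nothing outside US-ASCII.
snippet = "<p>1999&#8212;2001</p>"

# One distinct byte string: a wrong charset guess cannot garble this markup.
print(len(encoded_forms(snippet)))

# The reference names a Unicode code point, independent of the charset.
print(hex(ord(html.unescape("&#8212;"))))
```

That covers only the bytes on the wire. It doesn't make the declaration
optional, which is the point above.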

-- 
Liam Quinn
Received on Wednesday, 7 March 2001 14:39:03 GMT