RE: Character set question

From: Liam Quinn (liam@htmlhelp.com)
Date: Wed, Mar 07 2001

    Date: Wed, 7 Mar 2001 14:39:38 -0500 (EST)
    From: Liam Quinn <liam@htmlhelp.com>
    To: Thanasis Kinias <tkinias@asu.edu>
    cc: "'Kathleen Anderson'" <kathleen@spiderwebwoman.com>, <www-validator@w3.org>
    Message-ID: <Pine.LNX.4.30.0103071432400.1146-100000@localhost.localdomain>
    Subject: RE: Character set question
    
    On Wed, 7 Mar 2001, Thanasis Kinias wrote:
    
    > Kathleen Anderson wrote:
    >
    > > Could someone explain, in layperson's terms, if using <meta
    > > http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> is
    > > preferred over <meta http-equiv="Content-Type" content="text/html;
    > > charset=windows-1252">
    >
    > The short answer is "you don't need either, most of the time."
    
    This may be the case for XML, but it's not for HTML.
    
    > The default
    > charset is UTF-8, which is identical to ISO Latin-1 (ISO 8859-1).
    
    There is no default charset for HTML, and UTF-8 is not identical to
    ISO-8859-1.  UTF-8 and ISO-8859-1 are only identical for the 7-bit
    (US-ASCII) characters.
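
    A minimal sketch of that difference, assuming Python 3 and its built-in
    codecs (the sample strings are only illustrative):

```python
# Encode the same text under both charsets and compare the bytes.
ascii_text = "charset"
accented = "caf\u00e9"  # "cafe" with e-acute (U+00E9), outside US-ASCII

# Pure US-ASCII text produces identical bytes in both encodings.
ascii_same = ascii_text.encode("utf-8") == ascii_text.encode("iso-8859-1")

# A character above U+007F takes two bytes in UTF-8, one in ISO-8859-1.
accented_utf8 = accented.encode("utf-8")
accented_latin1 = accented.encode("iso-8859-1")

print(ascii_same)       # True
print(accented_utf8)    # b'caf\xc3\xa9'
print(accented_latin1)  # b'caf\xe9'
```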
    
    > You only
    > need to specify Windows 1252 if you are using non-Unicode Windows software
    > and have "hard-coded" characters such as euro sign, daggers, em dash, which
    > are where Latin-1 and Windows 1252 differ.
    
    And you shouldn't generate such pages since they will not render correctly
    on most non-Windows and non-Mac systems.
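
    To see where the two charsets part ways, here is a small sketch,
    assuming Python 3's "windows-1252" and "iso-8859-1" codecs.  The three
    bytes are the Windows-1252 codes for the euro sign, dagger, and em dash
    mentioned above:

```python
# Bytes 0x80-0x9F are printable in Windows-1252 but are C1 control
# codes in ISO-8859-1.
raw = bytes([0x80, 0x86, 0x97])

as_cp1252 = raw.decode("windows-1252")
as_latin1 = raw.decode("iso-8859-1")

# Under Windows-1252: euro sign, dagger, em dash.
print([hex(ord(c)) for c in as_cp1252])  # ['0x20ac', '0x2020', '0x2014']
# Under ISO-8859-1: control characters, nothing printable.
print([hex(ord(c)) for c in as_latin1])  # ['0x80', '0x86', '0x97']

# The em dash has no ISO-8859-1 encoding at all.
try:
    "\u2014".encode("iso-8859-1")
    dash_encodable = True
except UnicodeEncodeError:
    dash_encodable = False
print(dash_encodable)  # False
```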
    
    > If you use entities (e.g.,
    > &#8212; for an em dash) or compose with Unicode-compliant software, you are
    > safe skipping the charset declaration.
    
    The charset declaration is required for HTML documents, regardless of
    whether you use entities.  In practice, you're probably "safe" if you skip
    the charset and stick to US-ASCII, but there's no reason not to specify
    the charset.
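
    A rough illustration of why a US-ASCII page is "safe" in practice,
    assuming Python 3 (the markup is a made-up sample): a page that uses
    only numeric character references decodes identically under all three
    charsets discussed here.

```python
import html

# Hypothetical page fragment: the em dash is written as a numeric
# character reference, so the source contains only US-ASCII characters.
page = "<p>Dash: &#8212;</p>"
raw = page.encode("ascii")  # would raise if any non-ASCII character were present

# Decode the same bytes under each charset from this thread.
decoded = {name: raw.decode(name)
           for name in ("utf-8", "iso-8859-1", "windows-1252")}
all_same = len(set(decoded.values())) == 1
print(all_same)  # True

# The reference still resolves to U+2014 regardless of the charset.
dash = html.unescape("&#8212;")
print(hex(ord(dash)))  # 0x2014
```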
    
    -- 
    Liam Quinn