XHTML 1.0 Erratum: charset in http-equiv

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Mon, 9 Oct 2000 22:09:34 +0200
Message-ID: <01f701c0322d$243d7b20$23cbb43e@de>
To: <www-html-editor@w3.org>

XHTML 1.0 [1] reads:

C.9 Character Encoding

To specify a character encoding in the document, use both the encoding
attribute specification on the xml declaration (e.g. <?xml version="1.0"
encoding="EUC-JP"?>) and a meta http-equiv statement (e.g. <meta
http-equiv="Content-type" content='text/html; charset="EUC-JP"' />). The value
of the encoding attribute of the xml processing instruction takes precedence.

RFC 2616 [2] says:

   HTTP character sets are identified by case-insensitive tokens. The
   complete set of tokens is defined by the IANA Character Set registry

       charset = token

   Although HTTP allows an arbitrary token to be used as a charset
   value, any token that has a predefined value within the IANA
   Character Set registry [19] MUST represent the character set defined
   by that registry. Applications SHOULD limit their use of character
   sets to those defined by the IANA registry.

The token in the XHTML example is '"EUC-JP"', quotation marks included. There
is no such character set in the registry. There is a character set
'Extended_UNIX_Code_Packed_Format_for_Japanese' with the alias 'EUC-JP', but
that is not the same token as '"EUC-JP"'. In other words, the quotes around
'EUC-JP' are wrong.
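
To illustrate the point, here is a minimal Python sketch that pulls the
charset value out of a meta content string and compares it literally, ignoring
case, against registry names. The helper names charset_from_content and
is_registered, and the two-entry KNOWN_CHARSETS table, are assumptions made up
for this example; the table holds only the two names mentioned above, not the
full IANA registry.

```python
# Hand-written excerpt (assumption): only the two names cited in this
# message. The real IANA Character Set registry is far larger.
KNOWN_CHARSETS = {
    "extended_unix_code_packed_format_for_japanese",
    "euc-jp",
}


def charset_from_content(content):
    """Return the raw charset value of a content string, or None if absent."""
    # Parameters follow the media type and are separated by ';'.
    for param in content.split(";")[1:]:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # Keep the value exactly as written, quotes and all.
            return value.strip()
    return None


def is_registered(token):
    """Compare the token literally, ignoring case, against the excerpt."""
    return token.lower() in KNOWN_CHARSETS


quoted = charset_from_content('text/html; charset="EUC-JP"')
plain = charset_from_content("text/html; charset=EUC-JP")

print(repr(quoted), is_registered(quoted))
print(repr(plain), is_registered(plain))
```

The comparison is deliberately literal. A lenient lookup that strips or
normalizes punctuation before matching would accept both spellings and hide the
problem described here.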

HTML 4.01 gets this right; see [3]:

<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">

[1] http://www.w3.org/TR/2000/REC-xhtml1-20000126
[2] http://www.ietf.org/rfc/rfc2616.txt
[3] http://www.w3.org/TR/html401/charset.html#h-5.2.2
Björn Höhrmann ^ mailto:bjoern@hoehrmann.de ^ http://www.bjoernsworld.de
am Badedeich 7 ° Telefon: +49(0)4667/981ASK ° http://www.websitedev.de/
25899 Dagebüll # PGP Pub. KeyID: 0xA4357E78 # http://learn.to/quote +{i}
..weaving a secure, well-formed, standard compliant WWW for =everyone=..
Received on Monday, 9 October 2000 16:13:01 UTC
