- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Mon, 9 Oct 2000 22:09:34 +0200
- To: <www-html-editor@w3.org>
Hi,

XHTML 1.0 [1] reads:

  [...]
  C.9 Character Encoding

  To specify a character encoding in the document, use both the
  encoding attribute specification on the xml declaration (e.g.
  <?xml version="1.0" encoding="EUC-JP"?>) and a meta http-equiv
  statement (e.g. <meta http-equiv="Content-type"
  content='text/html; charset="EUC-JP"' />). The value of the
  encoding attribute of the xml processing instruction takes
  precedence.
  [...]

RFC 2616 [2] says:

  [...]
  HTTP character sets are identified by case-insensitive tokens.
  The complete set of tokens is defined by the IANA Character Set
  registry [19].

    charset = token

  Although HTTP allows an arbitrary token to be used as a charset
  value, any token that has a predefined value within the IANA
  Character Set registry [19] MUST represent the character set
  defined by that registry. Applications SHOULD limit their use of
  character sets to those defined by the IANA registry.
  [...]

The token from the example is '"EUC-JP"'. There is no such character set. There is a character set 'Extended_UNIX_Code_Packed_Format_for_Japanese' with an alias 'EUC-JP' but this is a different charset than '"EUC-JP"'. In other words: the quotes around 'EUC-JP' are wrong. HTML 4.01 gets this right, see [3]

  [...]
  <META http-equiv="Content-Type" content="text/html; charset=EUC-JP">
  [...]

[1] http://www.w3.org/TR/2000/REC-xhtml1-20000126
[2] http://www.ietf.org/rfc/rfc2616.txt
[3] http://www.w3.org/TR/html401/charset.html#h-5.2.2

--
Björn Höhrmann ^ mailto:bjoern@hoehrmann.de ^ http://www.bjoernsworld.de
am Badedeich 7 ° Telefon: +49(0)4667/981ASK ° http://www.websitedev.de/
25899 Dagebüll # PGP Pub. KeyID: 0xA4357E78 # http://learn.to/quote +{i}
..weaving a secure, well-formed, standard compliant WWW for =everyone=..
Received on Monday, 9 October 2000 16:13:01 UTC
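
The distinction drawn in the message between the tokens 'EUC-JP' and '"EUC-JP"' can be illustrated with a small sketch. It assumes the literal reading used in the message, where the charset value is taken exactly as written and compared case-insensitively against registered names. The names REGISTERED_CHARSETS, charset_value and is_registered are hypothetical, and the set below is a tiny stand-in for the IANA registry, not the registry itself.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny stand-in for the IANA Character Set
# registry: a few names and aliases, stored lowercased because charset
# tokens are compared case-insensitively.
REGISTERED_CHARSETS = {
    "extended_unix_code_packed_format_for_japanese",
    "euc-jp",
    "iso-8859-1",
    "utf-8",
}


def charset_value(content: str) -> Optional[str]:
    """Return the raw charset value from a Content-Type string, or None.

    The value is returned exactly as written, surrounding quotes included,
    which is the literal reading the message applies to the XHTML example.
    """
    match = re.search(r"charset\s*=\s*([^\s;]+)", content, re.IGNORECASE)
    return match.group(1) if match else None


def is_registered(token: Optional[str]) -> bool:
    """Case-insensitive lookup of a token in the stand-in registry."""
    return token is not None and token.lower() in REGISTERED_CHARSETS


if __name__ == "__main__":
    xhtml_example = 'text/html; charset="EUC-JP"'   # content value from C.9
    html401_example = "text/html; charset=EUC-JP"   # content value from HTML 4.01
    for content in (xhtml_example, html401_example):
        token = charset_value(content)
        print(repr(token), is_registered(token))
```

Under these assumptions the XHTML 1.0 example yields a token that includes the quote characters and is not found, while the HTML 4.01 example yields EUC-JP, which matches the registered alias.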