- From: Karl Ove Hufthammer <karl@huftis.org>
- Date: Sat, 07 Jun 2003 17:15:26 +0200
- To: www-validator@w3.org
Terje Bless <link@pobox.com> wrote in
news:f02000001-1026-D90ABEF498F711D7B1DF0030657B83E8@[193.157.66.23]:

> Therefore, user agents [MUST NOT] assume any default value
> for the "charset" parameter.
> ]]] - W3C HTML 4.01 Recommendation 5.2.2
>
> Which puts us in a right pretty pickle.
>
> We've been over this discussion ad nauseum on this list
> several times before. The bottom line is that RFC 2616 and the
> HTML 4.01 Recommendation (and, by extension, XHTML as well[0])
> are incompatible on this point[1]

As I read them, they're not *really* incompatible. That is, the only
way for a document to conform to *both* RFC 2616 and the HTML 4.01
Rec. is to *always* explicitly send a 'charset' parameter.

There is one other interpretation, though. RFC 2616 talks about HTTP
clients and HTML 4.01 talks about user agents. When an HTML document
sent over HTTP with no explicit 'charset' parameter reaches a user
agent, it has already been through an HTTP client, which has given it
a 'charset' value of 'ISO-8859-1'. Therefore the paragraph describing
what user agents should do when receiving HTML documents without any
'charset' parameter never applies.

> and the _only_ safe way to
> achieve the correct character encoding for your documents is
> to explicitly specify it in the HTTP «Content-Type» header.

Yes.

> [0] - With the added complication that XHTML superficially is
> meant to obey XML defaulting rules for character encoding (e.g.
> unlabelled usually means UTF-8).

Except when sent as 'text/xml', where it means 'US-ASCII'. :/

-- 
Karl Ove Hufthammer
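
To make the competing defaulting rules in this thread concrete, here is a rough Python sketch, not taken from any validator or browser code. The function name `effective_charset` and the rule labels `"rfc2616"`, `"html401"` and `"xml"` are made up for illustration, and the XML branch is deliberately simplified: it only encodes the "usually UTF-8, but US-ASCII for text/xml" summary given above and ignores byte-order marks and XML declarations.

```python
from email.message import Message


def parse_content_type(header):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header
    # Both values come back lower-cased; the charset is None if absent.
    return msg.get_content_type(), msg.get_content_charset()


def effective_charset(header, rule):
    """Return the charset a recipient would use under one reading of the specs.

    `rule` is a hypothetical label, not anything defined by the specs:
      "rfc2616" - HTTP client view: text/* defaults to ISO-8859-1
      "html401" - HTML user agent view: no default may be assumed
      "xml"     - simplified XML view: UTF-8, except US-ASCII for text/xml
    A result of None means "no charset can be determined from the header".
    """
    media_type, charset = parse_content_type(header)
    if charset is not None:
        # An explicit parameter wins under every reading.
        return charset
    if rule == "rfc2616":
        return "iso-8859-1" if media_type.startswith("text/") else None
    if rule == "html401":
        return None
    if rule == "xml":
        return "us-ascii" if media_type == "text/xml" else "utf-8"
    raise ValueError("unknown rule: " + rule)


if __name__ == "__main__":
    for header in ("text/html", "text/xml", "text/html; charset=UTF-8"):
        for rule in ("rfc2616", "html401", "xml"):
            print(header, rule, effective_charset(header, rule))
```

Only the header with an explicit 'charset' parameter gives the same answer under all three rules, which is the point both posters agree on.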
Received on Saturday, 7 June 2003 11:15:52 UTC