- From: Terje Bless <link@pobox.com>
- Date: Sat, 7 Jun 2003 18:00:51 +0200
- To: W3C Validator <www-validator@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Karl Ove Hufthammer <karl@huftis.org> wrote:

>There is one other interpretation, though. RFC 2616 talks about HTTP
>clients and HTML 4.01 talks about user agents. When an HTML document sent
>by HTTP with no explicit 'charset' parameter is received by a user
>agent, it's already been through an HTTP client, and has been given a
>'charset' value of 'ISO-8859-1'. Therefore the paragraph describing what
>user agents should do when receiving HTML documents without any
>'charset' parameter never applies.

Yes, well, in the interest of full disclosure, let me add that another
significant factor in the Validator's current behaviour is that the HTTP
defaulting behaviour is considered harmful to i18n and to all those users
for whom ISO-8859-1 is insufficient.

I'm not saying this is an entirely uncontroversial position (and I think
Björn, among others, has pointed this out on several occasions), but it is
one factor that affects which way we've elected to go in the absence of
persuasive specification language.

In particular, if we allow for your interpretation above, we would in
effect default to ISO-8859-1 not only for pages such as Kjetil's (which
are most certainly correct, and whose author is very aware of what he is
doing), but also for Joe Web-duh-signer and his clueless little hosting
company, where there is _no_ conscious decision involved and ISO-8859-1 is
the _wrong_ value more often than not.

There is also the open question of HTTP's relationship with MIME, whose
rules would indicate that text/html without a charset parameter ought to
be interpreted as US-ASCII.

Taken together with the above and previous issues, this all leads to the
single conclusion that «Charset Defaulting Considered Harmful» (to invoke
a modern Godwin-equivalent ;D) and that the only reliable way to deal with
character encoding issues -- and therefore the behaviour the Validator
should be aiming to encourage and enforce -- is to label the encoding
explicitly.
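
To make the three competing readings concrete, here is a minimal Python sketch of what each would do with a Content-Type header that lacks a charset parameter. It is an illustration only, not the Validator's actual code: the function names and the policy labels ("explicit-only", "http", "mime") are made up for this example, and it ignores any encoding declared inside the document itself.

```python
from email.message import Message
from typing import Optional

# The two defaults under dispute in this thread.
HTTP_DEFAULT = "ISO-8859-1"  # RFC 2616 reading for text/* sent over HTTP
MIME_DEFAULT = "US-ASCII"    # MIME reading for text/* with no charset


def explicit_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter (lower-cased), or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def resolve_charset(content_type: str, policy: str = "explicit-only") -> str:
    """Pick a charset under one of three hypothetical policies."""
    charset = explicit_charset(content_type)
    if charset is not None:
        return charset  # an explicit label always wins
    if policy == "http":
        return HTTP_DEFAULT
    if policy == "mime":
        return MIME_DEFAULT
    # "explicit-only": refuse to guess, which is the position argued above.
    raise ValueError("no charset parameter; label the encoding explicitly")


if __name__ == "__main__":
    print(resolve_charset("text/html; charset=UTF-8"))
    print(resolve_charset("text/html", policy="http"))
    print(resolve_charset("text/html", policy="mime"))
```

The same unlabelled response yields a different answer under each defaulting policy, which is the ambiguity that explicit labelling avoids.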

- -- 
I have lobbied for the update and improvement of SGML. I've done it for
years. I consider it the jewel for which XML is a setting. It does deserve
a bit of polishing now and then.
                                                        -- Len Bullard

-----BEGIN PGP SIGNATURE-----
Version: PGP SDK 3.0.2

iQA/AwUBPuIMMqPyPrIkdfXsEQIhWgCg3hy2O5gifcpVNI08OzqT5KeB/jMAnRXj
lo1ZO96Vg+MafBRIi25x+bqj
=ij/A
-----END PGP SIGNATURE-----

Received on Saturday, 7 June 2003 12:00:54 UTC