- From: Kjetil Torgrim Homme <kjetilho@ifi.uio.no>
- Date: Sat, 07 Jun 2003 18:02:03 +0200
- To: W3C Validator <www-validator@w3.org>
[Terje Bless]:
>
> Uhm, what a wonderfully confrontationally phrased bug report;
> you're usually much more careful with your formulation in no.*
> Kjetil! :-)

sorry, I guess I was a bit terse.

> The relevant parts of the cited sections of RFC 2616 read [...]
>
> Which appears to support your claim. Unfortunately, the HTML 4.01
> Recommendation, Section 5.2.2, reads: [...]

> Which puts us in a right pretty pickle.

a Standards Track RFC can't be overridden by a Recommendation from
W3C.

> This new behaviour goes some way towards addressing your concern,
> but you will still find your documents labelled Invalid unless you
> specify a character encoding.

my document is valid, so this is incorrect behaviour.

> I would strongly encourage you to explicitly specify the character
> encoding. In particular, I direct your attention to the part of
> RFC2616 3.4.1 which reads: «Senders wishing to defeat this
> behavior MAY include a charset parameter even when the charset is
> ISO-8859-1 ***and SHOULD do so when it is known that it will not
> confuse the recipient.***» [emphasis added].
>
> In this particular case, not only is it known that specifying the
> encoding will not confuse the recipient; explicitly specifying it
> is the only way to _avoid_ confusing «the recipient» (IOW, the
> «SHOULD» certainly kicks in).

I don't subscribe to cargo cult coding, and I don't care about
catering to broken software. also note that this paragraph wasn't in
the original HTTP/1.1 RFC, and the text in 5.2.2 has not changed
since HTML 4.0 of December 1997.

furthermore, configuring Apache to include charset=iso-8859-1 for
all files of type text/html will make it impossible for a document to
use a different charset, since the HTTP header overrides META
HTTP-EQUIV. (another poor choice in the HTML Recommendation, IMHO).

> [3] - <http://validator.w3.org:8001/>. Feedback encouraged!

well, it didn't process http://www.usenet.no. in fact it assumed
UTF-8, which there is no basis for doing at all. IMO, that's a
further regression.

-- 
Kjetil T.
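
For illustration only, the charset precedence argued over in this message can be sketched roughly as below. This is not the validator's code. The function names `charset_param` and `effective_charset`, the regular expression, and the returned labels are all hypothetical, and the final fallback to ISO-8859-1 follows the RFC 2616 reading defended above rather than the HTML 4.01 section 5.2.2 advice.

```python
import re
from email.message import Message

# Assumption: fall back to the RFC 2616 default for text/* received over HTTP.
DEFAULT_CHARSET = "iso-8859-1"

# Hypothetical, deliberately simple pattern: expects http-equiv before content.
META_RE = re.compile(
    rb"""<meta[^>]+http-equiv\s*=\s*["']?content-type["']?[^>]*content\s*=\s*["']([^"'>]+)["']""",
    re.IGNORECASE,
)


def charset_param(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(http_content_type, body):
    """Pick a charset: HTTP header first, then META HTTP-EQUIV, then the default."""
    if http_content_type:
        from_http = charset_param(http_content_type)
        if from_http:
            return from_http, "http"
    match = META_RE.search(body)
    if match:
        from_meta = charset_param(match.group(1).decode("ascii", "replace"))
        if from_meta:
            return from_meta, "meta"
    return DEFAULT_CHARSET, "default"
```

With a server-wide charset=iso-8859-1 on the header, the first branch always wins, so a META declaration of another charset in the document is never consulted. That is the Apache problem described above.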
Received on Saturday, 7 June 2003 12:02:08 UTC