- From: Pete Forman <pete.forman@westerngeco.com>
- Date: Thu, 31 Aug 2006 15:14:24 +0100
- To: www-validator@w3.org
Frank Ellermann <nobody@xyzzy.claranet.de> writes:

> Jukka K. Korpela wrote:
>
>> They might also be using "free" web space on a server that
>> adds some code on each page sent, making it invalid.
>
> Yes, that would be a hopeless case. But RFC 2616 is more
> tolerant wrt the http header. If the choice is "no info" vs.
> "wrong info" I pick the former - some of my plain text files
> are pc-multilingual-850+euro, no decent Web server could get
> this right without direct instructions.
>
>> The charset issue is however much less serious
>
> This got a MAY, a SHOULD, and two MUSTs in 3.4.1 of RFC 2616.
> And probably my browser belongs to the "unfortunately" cases.
> Tough. At least this mess is limited to HTTP/1.0, so that
> can't confuse the validator.

RFC 2616 is trumped by the HTML spec, which states that an absent
HTTP Content-Type header may not be construed as ISO-8859-1. A user
agent must manage its own default encoding if none is specified via
the HTTP headers, a META declaration, or a charset attribute.

http://www.w3.org/TR/html4/charset.html#h-5.2.2

I'd suggest that a validating UA might use US-ASCII as its default
encoding and raise errors for out-of-range characters. Of course
there should still be a warning if neither the web server nor the
document specifies an encoding. There must be many pages which
render exactly the same whether interpreted as US-ASCII, ISO-8859-1,
WINDOWS-1252 or UTF-8. This applies to HTML; the rules are different
for XHTML.

-- 
Pete Forman                 -./\.- Disclaimer: This post is originated
WesternGeco                 -./\.- by myself and does not represent
pete.forman@westerngeco.com -./\.- the opinion of Schlumberger or
http://petef.port5.com      -./\.- WesternGeco.
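
A minimal sketch in Python of the fallback suggested above, i.e.
assume US-ASCII when nothing declares an encoding and report any
out-of-range bytes. This is only an illustration of the idea, not
the W3C validator's actual behaviour; the helper name and input
file are hypothetical.

    def check_ascii_fallback(raw: bytes) -> list[str]:
        """Report bytes that fall outside the US-ASCII range."""
        errors = []
        for pos, byte in enumerate(raw):
            if byte > 0x7F:
                errors.append(
                    "byte 0x%02X at offset %d is outside US-ASCII"
                    % (byte, pos)
                )
        return errors

    # Hypothetical usage on an undeclared document:
    page = open("page.html", "rb").read()
    problems = check_ascii_fallback(page)
    if problems:
        for msg in problems:
            print("Error:", msg)
    else:
        # A pure 7-bit document decodes identically under US-ASCII,
        # ISO-8859-1, WINDOWS-1252 and UTF-8, since all four agree
        # on the 0x00-0x7F range - the case noted above where the
        # choice of encoding makes no visible difference.
        print("All bytes are 7-bit; no encoding ambiguity.")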
Received on Thursday, 31 August 2006 14:41:14 UTC