- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Fri, 9 Dec 2005 08:26:30 +0200 (EET)
- To: www-validator@w3.org
On Thu, 8 Dec 2005, Jirka Kosek wrote:

> Jukka K. Korpela wrote:
>
>> Apparently the validator uses UTF-8 as the implied default.
>>
>> The choice is impractical
>
> I can't recall RFC number from the top of my head, but HTTP protocol assumes
> ISO-8859-1 for all text/* media types as a default.

HTTP/1.1 is RFC 2616. The clause you are referring to is 3.7.1.

> So it is no "impractical", it is clearly bug.

There is a definite contradiction between the HTTP protocol definition and the HTML 4.01 specification. This has been discussed on different fora several times. The consensus is that HTML as a higher-level protocol trumps the transfer protocol.

The HTML 4.01 specification, in clause 5.2.2, discusses this very theme and concludes: "user agents must not assume any default value for the 'charset' parameter".

(This does not exclude the possibility of ultimately falling back to a default, which may depend on the user agent. It just means that in the absence of an HTTP header with a 'charset' parameter, user agents must not imply ISO-8859-1 or any other 'charset' value but proceed to the algorithm of using other sources of information, such as a <meta> tag.)

> That's why text/xml was superseded by application/xml where is no such
> default assumed.

The media types for XML are a mess, as you can see from RFC 3023. The type text/xml has not been superseded; it is an alternative that can be used - and _should_ be used under some conditions (if we take RFC 3023 seriously).

> If there were no charset parameter, ISO-8859-1 should be
> assumed from HTTP point of view,

But not by the definition of the text/xml media type in RFC 3023, which specifies US-ASCII as the default when text/xml is transmitted over HTTP without a 'charset' parameter. Thus, if you submit an XML document to a validator without 'charset', then the validator is formally required to treat it as US-ASCII. (I hope the validator doesn't actually behave so, or at least issues an adequate warning.)
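To make the competing defaults concrete, here is a rough Python sketch of how a checker might pick the implied 'charset' from a Content-Type header alone. The function name implied_charset is hypothetical, and this is not how the W3C validator is implemented. It simply encodes the rules cited above: RFC 2616 clause 3.7.1 for text/* in general, HTML 4.01 clause 5.2.2 for text/html, and RFC 3023 for text/xml. Treating application/xml as having no header-level default is an assumption based on the remark quoted above. A return value of None means "no default; look inside the document", for example at a <meta> tag or an XML encoding declaration.

```python
from email.message import Message
from typing import Optional


def implied_charset(content_type: str) -> Optional[str]:
    """Return the charset implied by a Content-Type header value.

    Hypothetical helper for illustration only. None means that no
    default may be assumed and other sources must be consulted.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # An explicit 'charset' parameter always wins (returned lower-cased).
    explicit = msg.get_content_charset()
    if explicit:
        return explicit

    media_type = msg.get_content_type()  # lower-cased "type/subtype"

    if media_type == "text/html":
        # HTML 4.01, clause 5.2.2: must not assume any default value.
        return None
    if media_type == "text/xml":
        # RFC 3023: US-ASCII when no 'charset' parameter is given.
        return "us-ascii"
    if media_type == "application/xml":
        # Assumption: no header-level default; use the XML declaration.
        return None
    if media_type.startswith("text/"):
        # RFC 2616, clause 3.7.1: ISO-8859-1 for other text/* types.
        return "iso-8859-1"
    return None


for header in ("text/html", "text/xml", "text/plain",
               "text/html; charset=UTF-8"):
    print(header, "->", implied_charset(header))
```

A validator following this sketch would treat a None result as the cue to run its own detection and, ideally, to warn that no encoding was declared at the HTTP level.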
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Friday, 9 December 2005 06:30:01 UTC