- From: Karl Dubost <karl@w3.org>
- Date: Mon, 7 Apr 2008 11:13:22 +0900
- To: Nikita The Spider The Spider <nikitathespider@gmail.com>
- Cc: "W3C Validator Community" <www-validator@w3.org>
On 7 Apr 2008, at 10:15, Nikita The Spider The Spider wrote:
> I'm trying to understand the validator's logic for ignoring the
> http-equiv charset declaration in documents delivered as
> application/xhtml+xml.
>
> I get an encoding of UTF-8 when validating this file:
> http://NikitaTheSpider.com/boneyard/temp/meta-test-xhtml-as-xml.xhtml

Hmm, interesting. The file is not XML conformant. UTF-8 or UTF-16 is
assumed for XML files; if you use a different encoding, you have to
declare it in an XML declaration, such as

    <?xml version="1.0" encoding="ISO-8859-5"?>

in this case.

In your file, this doesn't seem to be correct:

    <meta http-equiv="Content-Type" content="charset=ISO-8859-5" />

The syntax for the HTTP header is

    Content-Type = "Content-Type" ":" media-type

Section 14.17
http://www.ietf.org/rfc/rfc2616.txt

It should be

    <meta http-equiv="Content-Type" content="application/xhtml+xml;charset=ISO-8859-5" />

I wonder if it's the way the validator processes the file. There are no
valid HTTP headers, so maybe content sniffing is going on to try to
guess the encoding, or the validator assumes UTF-8 because no valid
HTTP headers were given.

> And ISO-8859-5 when validating this file:
> http://NikitaTheSpider.com/boneyard/temp/meta-test-xhtml-as-html.html
>
> They're the same except for the media type. The former is delivered as
> application/xhtml+xml, the latter as text/html. In neither case is the
> encoding declared in the HTTP header, but both files contain an XHTML
> doctype and a META http-equiv statement that declares the encoding to
> be ISO-8859-5.
>
> Is this logic based on the last paragraph in section 3.3 here?
> http://www.w3.org/TR/xhtml-media-types/#application-xml
>
> Thanks
>
> --
> Philip
> http://NikitaTheSpider.com/
> Whole-site HTML validation, link checking and more

--
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool
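
P.S. To illustrate the Content-Type syntax point above, here is a minimal Python sketch. It is not the validator's code, only a rough check under my own assumptions: the function name `parse_content_type` is hypothetical, the token pattern is a simplified reading of the RFC 2616 grammar, and quoted parameter values that contain a semicolon are not handled.

```python
import re

# Simplified RFC 2616 "token" (assumption: common ASCII token characters only).
TOKEN = r"[A-Za-z0-9!#$%&'*+.^_`|~-]+"


def parse_content_type(value):
    """Return (media_type, charset), or None when no type/subtype is present.

    Hypothetical helper: it only checks that the value starts with a
    type "/" subtype pair, then looks for a charset parameter.
    """
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    if not re.fullmatch(rf"{TOKEN}/{TOKEN}", media_type):
        return None  # e.g. "charset=ISO-8859-5" has no media type at all
    charset = None
    for param in parts[1:]:
        name, sep, val = param.partition("=")
        if sep and name.strip().lower() == "charset":
            charset = val.strip().strip('"')
    return media_type, charset


if __name__ == "__main__":
    print(parse_content_type("charset=ISO-8859-5"))
    print(parse_content_type("application/xhtml+xml;charset=ISO-8859-5"))
```

With this reading, the content value in your file is rejected because it has no media type in front of the charset parameter, while the corrected value yields both a media type and a charset.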
Received on Monday, 7 April 2008 02:14:22 UTC