- From: David Brownell <david-b@pacbell.net>
- Date: Wed, 04 Dec 2002 08:01:12 -0800
- To: www-validator@w3.org
I recently validated an XHTML 1.0 page that used to validate just fine, and instead I got a message that said things like:

    I was not able to extract a character encoding labeling from any of
    the valid sources for such information. Without encoding information
    it is impossible to validate the document. The sources I tried are:

      * The HTTP Content-Type field.
      * The XML Declaration.
      * The HTML "META" element.

    And I even tried to autodetect it using the algorithm defined in
    Appendix F of the XML 1.0 Recommendation.

This seems pretty bogus. HTTP defaults to iso-8859-1, and the validator can and should know that this character encoding is the default. Or have people been playing with charset detection policies again?

- Dave

p.s. Given that it's XHTML, I find the fact that it even _tried_ using the META element to be worrisome ... that means that parsing this document as XML could give different results, which breaks all the XHTML goals I ever heard of. Not that I've tracked XHTML recently, but this seems like trouble.
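
For illustration, the lookup order named in the validator message quoted above, plus the HTTP default this post argues for, could be sketched roughly as follows. This is only a sketch under stated assumptions, not the validator's actual code: the function name `detect_encoding`, its parameters, the regular expressions, and the 1024-byte sniffing window are all hypothetical, the byte-order-mark check covers only a small part of what Appendix F of XML 1.0 describes, and the final iso-8859-1 fallback is the behavior being requested here, not something the validator is known to do.

```python
import re


def detect_encoding(content_type, body, http_default="iso-8859-1"):
    """Return (encoding, source) for a document served over HTTP.

    content_type: value of the HTTP Content-Type header, or None.
    body:         raw document bytes.
    http_default: fallback used when no label is found (assumption).
    """
    # 1. charset parameter of the HTTP Content-Type field.
    if content_type:
        m = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.I)
        if m:
            return m.group(1).lower(), "http"

    head = body[:1024]  # hypothetical sniffing window

    # 2. encoding pseudo-attribute of the XML declaration.
    m = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']',
        head,
    )
    if m:
        return m.group(1).decode("ascii").lower(), "xml-declaration"

    # 3. charset given in an HTML META element.
    m = re.search(rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._-]+)', head, re.I)
    if m:
        return m.group(1).decode("ascii").lower(), "meta"

    # 4. Byte-order-mark sniffing (a small subset of XML 1.0 Appendix F).
    if head.startswith(b"\xef\xbb\xbf"):
        return "utf-8", "bom"
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16", "bom"

    # 5. Nothing labeled: fall back to the HTTP default instead of failing.
    return http_default, "http-default"
```

With a sketch like this, an unlabeled page served as plain `text/html` would come back as `("iso-8859-1", "http-default")` rather than producing an error. Step 3 is exactly the part the postscript objects to for XHTML, since an XML parser would never look at META.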
Received on Wednesday, 4 December 2002 11:37:06 UTC