- From: David Brownell <david-b@pacbell.net>
- Date: Wed, 04 Dec 2002 08:01:12 -0800
- To: www-validator@w3.org
I recently tried to validate an XHTML 1.0 page that used to validate just fine,
and instead I got a message that said things like:
    I was not able to extract a character encoding labeling from any of
    the valid sources for such information. Without encoding information
    it is impossible to validate the document. The sources I tried are:
      * The HTTP Content-Type field.
      * The XML Declaration.
      * The HTML "META" element.
    And I even tried to autodetect it using the algorithm defined in
    Appendix F of the XML 1.0 Recommendation.
This seems pretty bogus. HTTP defaults text types to ISO-8859-1 when no
charset is given, and the validator can and should know that default.
Or have people been playing with charset detection policies again?
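To make the lookup order in that message concrete, here is a rough sketch of
the four sources it lists, plus the HTTP default argued for above as an
optional last resort. This is not the validator's actual code: the function
name detect_charset, the http_default parameter, and the regular expressions
are all assumptions made for illustration, and the signature table is only a
simplified subset of what Appendix F of XML 1.0 describes.

```python
import re

# Hypothetical patterns for illustration; not taken from the validator.
CHARSET_PARAM = re.compile(r'charset\s*=\s*"?([A-Za-z0-9._:-]+)"?', re.I)
XML_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')
META = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.I)

# Byte-order marks and "<?" signatures: a simplified subset of Appendix F.
SIGNATURES = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]


def detect_charset(content_type, body, http_default=None):
    """Return (charset, source), or (None, None) if every source fails."""
    # 1. The HTTP Content-Type field.
    m = CHARSET_PARAM.search(content_type or "")
    if m:
        return m.group(1).lower(), "http"
    # 2. The XML declaration at the very start of the document.
    m = XML_DECL.match(body)
    if m:
        return m.group(1).decode("ascii").lower(), "xml-declaration"
    # 3. The HTML META element.
    m = META.search(body)
    if m:
        return m.group(1).decode("ascii").lower(), "meta"
    # 4. Autodetection from the leading bytes.
    for prefix, name in SIGNATURES:
        if body.startswith(prefix):
            return name, "autodetect"
    # 5. Assumed fallback: whatever default the caller supplies, if any.
    if http_default:
        return http_default, "http-default"
    return None, None
```

With http_default left unset, an unlabeled page comes back as (None, None),
which corresponds to the error quoted above; passing
http_default="iso-8859-1" gives the behavior I expected instead.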
- Dave
p.s. Given that it's XHTML, I find the fact that it even _tried_
using the META element to be worrisome ... that means that
parsing this document as XML could give different results,
which breaks every XHTML goal I ever heard of. Not that I've
tracked XHTML recently, but this seems like trouble.
Received on Wednesday, 4 December 2002 11:37:06 UTC