Bug: Confusing messages for UTF-8 encoded XHTML1/XHTML11 docs

PROPOSAL: For some XHTML documents, the "not-NU" validation service 
determines the encoding according to the XML rules. This seems to 
happen, without regard to the MIME type, whenever the page uses an 
XHTML 1 or XHTML 1.1 doctype. In these cases, please make the 
validator check that the encoding determined via XML has also been 
declared with a meta http-equiv statement, for HTML compatibility, 
and issue a warning if it has not. The warning could point to XHTML 
Appendix C, points 1 and 9.[1][2]

   Appendix C, point 1:
]] you may want to avoid using processing instructions and XML
   declarations. Remember, however, that when the XML declaration is
   not included in a document, the document can only use the
   default character encodings UTF-8 or UTF-16. [[

   Appendix C, point 9:
]] a document that wants to set its character encoding explicitly
   must include both the XML declaration an encoding declaration
   and a meta http-equiv statement [[

(Note: Point 9 recommends the XML declaration, somewhat in tension with 
point 1, because point 9 discusses a non-UTF-8 encoding.)

Examples: 

1) * XML encoding declaration of an XHTML1 document says "UTF-8",
     but there is no meta http-equiv which says the same:
     ! ISSUE A WARNING
2) * The XML declaration of an XHTML1 document has been omitted,
     so the page is determined - by default - to be UTF-8,
     but there is no meta http-equiv which says the same:
     ! ISSUE A WARNING
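
To make the two cases above concrete, here is a minimal Python sketch of 
the proposed check. It is only an illustration, not the validator's 
actual code: all function names are hypothetical, it works on already 
decoded text, it uses simple regular expressions instead of a real 
parser, and it ignores the BOM-based UTF-16 default.

```python
import re

# Hypothetical helpers for illustration; not taken from the validator.
XML_DECL_RE = re.compile(
    r'<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    r'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)
META_TAG_RE = re.compile(r'<meta\b[^>]*>', re.IGNORECASE)
HTTP_EQUIV_RE = re.compile(
    r'http-equiv\s*=\s*["\']content-type["\']', re.IGNORECASE
)
CHARSET_RE = re.compile(r'charset\s*=\s*([A-Za-z0-9._-]+)', re.IGNORECASE)


def xml_encoding(doc: str) -> str:
    """Encoding per a simplified reading of the XML rules.

    Returns the value from the XML encoding declaration if present,
    otherwise assumes the UTF-8 default (UTF-16 detection is skipped).
    """
    m = XML_DECL_RE.match(doc.lstrip("\ufeff"))
    if m and m.group(1):
        return m.group(1).lower()
    return "utf-8"


def meta_charset(doc: str):
    """Charset from the first meta http-equiv Content-Type tag, or None."""
    for tag in META_TAG_RE.findall(doc):
        if HTTP_EQUIV_RE.search(tag):
            m = CHARSET_RE.search(tag)
            if m:
                return m.group(1).lower()
    return None


def check_meta_matches_xml(doc: str):
    """Return a warning string, or None if the meta agrees with XML."""
    expected = xml_encoding(doc)
    if meta_charset(doc) != expected:
        return (
            f"Warning: encoding '{expected}' was determined via the XML "
            "rules but is not declared with a meta http-equiv statement "
            "(see XHTML 1.0 Appendix C, points 1 and 9)."
        )
    return None
```

Under these assumptions, check_meta_matches_xml() returns a warning for 
both example 1 and example 2, and returns None once the document also 
carries a matching meta http-equiv statement.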

For background on this proposal, and more on the validator's confusing 
behavior, see the thread on the Unicode mailing list.[3]

[1] http://www.w3.org/TR/xhtml1/#C_1
[2] http://www.w3.org/TR/xhtml1/#C_9
[3] http://www.unicode.org/mail-arch/unicode-ml/y2012-m11/0289.html
-- 
leif halvard silli

Received on Wednesday, 28 November 2012 18:54:35 UTC