- From: Michael[tm] Smith <mike@w3.org>
- Date: Mon, 4 Jul 2011 16:11:26 +0900
- To: Richard Ishida <ishida@w3.org>
- Cc: www-validator@w3.org
Richard Ishida <ishida@w3.org>, 2011-07-03 10:32 +0100:

> Checking http://www.w3.org/International/tests/i18n-checker/utf16/utf16le-charset-html5.html
>
> I get the following error messages:
>
> [[
> Error Line 5, Column 70: Internal encoding declaration specified utf-16
> which is not an ASCII superset. Continuing as if the encoding had been
> utf-8.
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-16" />
>
> ✉
> Error Line 5, Column 70: Internal encoding declaration utf-8 disagrees with
> the actual encoding of the document (utf-16).
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-16" />
> ]]
>
> It is incorrect to parse the document as utf-8, since the document actually
> *is* a utf-16 document. You can report that use of the utf-16 meta
> declaration is against the spec in utf-16 documents, but not assume that the
> encoding is wrong.

According to the HTML5 spec, it is correct to parse the document as UTF-8.
In fact, the spec requires that behavior; see step 5.1.13 of the algorithm
in the "Determining the character encoding" section of the spec:

  "If charset is a UTF-16 encoding, change the value of charset to UTF-8."
  http://dev.w3.org/html5/spec/parsing.html#determining-the-character-encoding

The validator.nu backend includes a parser that conforms to the HTML5 spec
(which incidentally is the same parser that Firefox now uses). And both of
the error messages you cite above are being emitted by that parser, during
the parsing phase, before the backend actually gets around to starting the
validation stage at all.

Note also that any browser which conforms to the HTML5 spec will exhibit
this same behavior (that is, changing the charset from UTF-16 to UTF-8).

So as far as the spec goes, those messages are both correct and expected --
as well as being consistent with parsing behavior in browsers.

  --Mike

-- 
Michael[tm] Smith
http://people.w3.org/mike
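
For illustration only, here is a rough Python sketch of how the spec step
quoted in this message ("If charset is a UTF-16 encoding, change the value
of charset to UTF-8") leads to the two messages in the report. It is not the
validator.nu parser's code. The function names, the set of labels, and the
simplified message wording are all assumptions made for this example.

```python
# Hypothetical sketch; not taken from validator.nu or any browser.
# Assumed set of labels treated as "a UTF-16 encoding" for this example.
UTF16_LABELS = {"utf-16", "utf-16le", "utf-16be"}


def effective_meta_charset(declared):
    """Return the charset a parser would act on for a <meta> declaration.

    Mirrors the quoted spec step: a UTF-16 label found in a meta
    declaration is changed to UTF-8.
    """
    label = declared.strip().lower()
    if label in UTF16_LABELS:
        return "utf-8"
    return label


def parser_messages(declared, actual):
    """Return simplified messages for a declared vs. actual encoding.

    'actual' stands for the encoding determined by other means (for
    example a byte order mark); how it is obtained is outside this sketch.
    """
    messages = []
    declared_label = declared.strip().lower()
    effective = effective_meta_charset(declared)
    if effective != declared_label:
        messages.append(
            "Internal encoding declaration specified %s; continuing as if "
            "the encoding had been %s." % (declared_label, effective)
        )
    if effective != actual.strip().lower():
        messages.append(
            "Internal encoding declaration %s disagrees with the actual "
            "encoding of the document (%s)." % (effective, actual)
        )
    return messages


if __name__ == "__main__":
    # The case from the report: a UTF-16 document declaring charset=utf-16.
    for message in parser_messages("utf-16", "utf-16"):
        print(message)
```

Under these assumptions the reported case yields both messages: the first
because the declared label is replaced by utf-8, the second because the
replaced value no longer matches the document's actual encoding.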
Received on Monday, 4 July 2011 07:11:29 UTC