- From: Eliot Graff <eliotgra@microsoft.com>
- Date: Fri, 1 Oct 2010 15:12:29 +0000
- To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, Richard Ishida <ishida@w3.org>
- CC: "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
The Editor's Draft of 29 September contains the following edit, and I have therefore resolved bug 10150 as fixed.

]] Note that the W3C Internationalization (i18n) Group recommends to always include a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually. [[

I would like to link to a resource for this statement, though. Can you recommend one that's better than the i18n article, "Character encodings" [1]?

Thanks,
Eliot

[1] http://www.w3.org/International/O-charset

-----Original Message-----
From: Leif Halvard Silli [mailto:xn--mlform-iua@målform.no]
Sent: Thursday, July 15, 2010 1:56 PM
To: Richard Ishida
Cc: public-html@w3.org; public-i18n-core@w3.org; Eliot Graff
Subject: i18n Polyglot Markup/in-doc encoding declarations (2nd issue)

I resend my comments, on request from Richard, with one issue per message. This is about the 2nd issue on the i18n group's tracking page:

http://www.w3.org/International/reviews/1007-polyglot/

Excerpt of the 2nd issue:

]] In-document declarations always useful [...] So it's true to say that you strictly don't need it, but we would prefer that people do. Please could you reflect that in your document. [[

Comment: I have long since filed bug 9962, which says that only UTF-8 and UTF-16 should be permitted. (No other encodings should be allowed, as there is no HTML5-compatible way to specify them.) There is also an ongoing debate about limiting the encodings to UTF-8 only; see Sam's message and the replies [1].

In the following, I'll assume that only UTF-8 and UTF-16 are relevant.

For UTF-16, there is no HTML5-compatible way to have an in-document UTF-16 declaration, at least not yet. The i18n group can file a bug against HTML5 to make it valid, of course. Until that day, your 2nd issue is not relevant w.r.t. UTF-16.

When it comes to UTF-8, an in-document declaration is _necessary_, unless you want to rely on HTTP or BOM.
Without a BOM, HTTP or meta@charset, the HTML parser will most likely default to Windows-1252 or another locale-dependent 8-bit encoding, at least in offline parsing and other uncontrolled contexts. Thus I tend to be of the opinion that an in-document declaration is a requirement for UTF-8.

[1] http://www.w3.org/mid/4C3F56AB.7030105@intertwingly.net
--
leif halvard silli
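
To illustrate the fallback described above, here is a minimal Python sketch of how a consumer might pick an encoding when it checks for a BOM, then an HTTP charset, then an in-document meta declaration, and finally gives up and uses a legacy default. It is a simplified stand-in, not the HTML5 encoding sniffing algorithm: the function name `guess_encoding`, its parameters, the 1024-byte scan window, and the "windows-1252" default are all assumptions made for illustration.

```python
import codecs
import re
from typing import Optional

# Hypothetical, simplified pattern: finds charset=... inside a <meta> tag.
# Covers both <meta charset="..."> and the http-equiv Content-Type form.
_META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_\-]+)',
    re.IGNORECASE,
)


def guess_encoding(
    data: bytes,
    http_charset: Optional[str] = None,
    fallback: str = "windows-1252",  # assumed locale default
) -> str:
    """Return an encoding label for an HTML byte stream (illustrative only)."""
    # 1. A byte order mark at the very start of the stream.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 2. A charset supplied by the transport layer (HTTP Content-Type).
    if http_charset:
        return http_charset.lower()
    # 3. An in-document declaration near the top of the file.
    match = _META_CHARSET.search(data[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing declared: fall back to a legacy 8-bit encoding.
    return fallback


# A UTF-8 page read from disk with no BOM, no HTTP header, and no meta element.
page = "<!DOCTYPE html><title>m\u00e5lform</title>".encode("utf-8")
print(guess_encoding(page))                # falls through to the legacy default
print(page.decode(guess_encoding(page)))   # non-ASCII title is decoded wrongly

# The same page with an in-document declaration is identified as UTF-8.
declared = b'<!DOCTYPE html><meta charset="utf-8">' + page
print(guess_encoding(declared))
```

In this sketch the undeclared page is decoded with the 8-bit fallback and its non-ASCII character is garbled, while the page carrying the meta declaration is read as UTF-8 without help from HTTP or a BOM.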
Received on Friday, 1 October 2010 15:13:15 UTC