- From: Richard Ishida <ishida@w3.org>
- Date: Fri, 8 Oct 2010 18:51:41 +0100
- To: "'Eliot Graff'" <eliotgra@microsoft.com>, "'Leif Halvard Silli'" <xn--mlform-iua@xn--mlform-iua.no>
- Cc: <public-html@w3.org>, <public-i18n-core@w3.org>
There is another article, "Declaring character encodings in HTML"
http://www.w3.org/International/questions/qa-html-encoding-declarations ,
but there isn't currently a W3C Note or other rec track document that can be
cited.

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/

> -----Original Message-----
> From: Eliot Graff [mailto:eliotgra@microsoft.com]
> Sent: 01 October 2010 16:12
> To: Leif Halvard Silli; Richard Ishida
> Cc: public-html@w3.org; public-i18n-core@w3.org
> Subject: RE: i18n Polyglot Markup/in-doc encoding declarations (2nd issue)
> Importance: High
>
> The Editor's Draft of 29 September contains the following edit, and I have
> therefore resolved bug 10150 as fixed.
>
> ]]
> Note that the W3C Internationalization (i18n) Group recommends to always
> include a visible encoding declaration in a document, because it helps
> developers, testers, or translation production managers to check the
> encoding of a document visually.
> [[
>
> I would like to link to a resource for this statement, though. Can you
> recommend one that's better than the i18n article, "Character encodings" [1]?
>
> Thanks,
>
> Eliot
>
> [1] http://www.w3.org/International/O-charset
>
> -----Original Message-----
> From: Leif Halvard Silli [mailto:xn--mlform-iua@målform.no]
> Sent: Thursday, July 15, 2010 1:56 PM
> To: Richard Ishida
> Cc: public-html@w3.org; public-i18n-core@w3.org; Eliot Graff
> Subject: i18n Polyglot Markup/in-doc encoding declarations (2nd issue)
>
> I resend my comments, on request from Richard, with one issue per message.
> This is about the 2nd issue on the i18n group's tracking page:
> http://www.w3.org/International/reviews/1007-polyglot/
>
> Excerpt of the 2nd issue:
>
> ]] In-document declarations always useful [...] So it's true to
> say that you strictly don't need it, but we would prefer that people do.
> Please could you reflect that in your document.
> [[
>
> Comment: I have long since filed bug 9962, which says that only UTF-8
> and UTF-16 should be permitted. (No other encodings should be allowed, as
> there is no HTML5-compatible way to specify them.) Also, there is an
> ongoing debate about limiting the encodings to UTF-8 only - see Sam's
> message and the replies [1]. In the following, I'll assume that only
> UTF-8 and UTF-16 are relevant.
>
> For UTF-16, there is no HTML5-compatible way to have an in-document
> UTF-16 declaration. At least not as of yet. The i18n group can file a bug
> against HTML5 to make it valid, of course. Until that day, your 2nd issue
> is not relevant w.r.t. UTF-16.
>
> When it comes to UTF-8, an in-document declaration is _necessary_, unless
> you want to rely on HTTP or the BOM. Without BOM, HTTP or meta@charset, the
> HTML parser will most likely default to WIN-1252 or another
> locale-dependent 8-bit encoding - at least in off-line parsing and other
> uncontrolled contexts. Thus I tend to have the opinion that in-document
> declaration is a requirement for UTF-8.
>
> [1] http://www.w3.org/mid/4C3F56AB.7030105@intertwingly.net
> --
> leif halvard silli
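
Illustrative note on the quoted message above: the fallback behaviour described for UTF-8 (no BOM, no HTTP charset, no meta charset, hence a locale-dependent 8-bit default) can be sketched roughly as follows. This is only a simplified sketch, not the HTML5 encoding-sniffing algorithm. The function name guess_encoding, the lookup order (BOM, then HTTP, then meta), the 1024-byte prescan window and the fixed windows-1252 fallback are all assumptions made for illustration.

```python
import codecs
import re
from typing import Optional

# Assumed fallback; the message only says "WIN-1252 or another
# locale dependent 8 bit encoding", so a real parser may differ.
FALLBACK_ENCODING = "windows-1252"

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="...; charset=..."> form.
_META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?\s*([A-Za-z0-9_\-]+)", re.IGNORECASE
)
_HTTP_CHARSET = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE)


def guess_encoding(body: bytes, content_type: Optional[str] = None) -> str:
    """Hypothetical helper: pick an encoding label for an HTML byte stream."""
    # 1. A byte order mark at the very start of the document.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # 2. A charset parameter in the HTTP Content-Type header, if one was sent.
    if content_type:
        match = _HTTP_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()

    # 3. An in-document declaration near the start of the file
    #    (the 1024-byte window is an assumption of this sketch).
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. Nothing declared: fall back to the assumed 8-bit default.
    return FALLBACK_ENCODING
```

With no HTTP header available, as in off-line parsing, an undeclared document takes the fallback branch, while the same document with a meta charset declaration is labelled as UTF-8.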
Received on Friday, 8 October 2010 17:52:16 UTC