
RE: i18n Polyglot Markup/in-doc encoding declarations (2nd issue)

From: Eliot Graff <eliotgra@microsoft.com>
Date: Fri, 8 Oct 2010 18:30:44 +0000
To: Richard Ishida <ishida@w3.org>, 'Leif Halvard Silli' <xn--mlform-iua@xn--mlform-iua.no>
CC: "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <CE3A5BFD1228D84A8D9C158EEC195FD50EBCA788@TK5EX14MBXW605.wingroup.windeploy.ntdev.microsoft.com>
Thank you. I've added the link to the recommendation and will publish later today.

> -----Original Message-----
> From: Richard Ishida [mailto:ishida@w3.org]
> Sent: Friday, October 08, 2010 10:52 AM
> To: Eliot Graff; 'Leif Halvard Silli'
> Cc: public-html@w3.org; public-i18n-core@w3.org
> Subject: RE: i18n Polyglot Markup/in-doc encoding declarations (2nd issue)
> 
> There is another article, "Declaring character encodings in HTML"
> (http://www.w3.org/International/questions/qa-html-encoding-declarations),
> but there isn't currently a W3C Note or other Rec-track document that can
> be cited.
> 
> RI
> 
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
> 
> http://www.w3.org/International/
> http://rishida.net/
> 
> 
> 
> 
> > -----Original Message-----
> > From: Eliot Graff [mailto:eliotgra@microsoft.com]
> > Sent: 01 October 2010 16:12
> > To: Leif Halvard Silli; Richard Ishida
> > Cc: public-html@w3.org; public-i18n-core@w3.org
> > Subject: RE: i18n Polyglot Markup/in-doc encoding declarations (2nd
> > issue)
> > Importance: High
> >
> > The Editor's Draft of 29 September contains the following edit, and I
> > have therefore resolved bug 10150 as fixed.
> >
> > ]]
> > Note that the W3C Internationalization (i18n) Group recommends to
> > always include a visible encoding declaration in a document, because
> > it helps developers, testers, or translation production managers to
> > check the encoding of a document visually.
> > [[
> >
> > I would like to link to a resource for this statement, though. Can you
> > recommend one that's better than the i18n article, "Character encodings" [1]?
> >
> > Thanks,
> >
> > Eliot
> >
> > [1] http://www.w3.org/International/O-charset
> >
> > -----Original Message-----
> > From: Leif Halvard Silli [mailto:xn--mlform-iua@målform.no]
> > Sent: Thursday, July 15, 2010 1:56 PM
> > To: Richard Ishida
> > Cc: public-html@w3.org; public-i18n-core@w3.org; Eliot Graff
> > Subject: i18n Polyglot Markup/in-doc encoding declarations (2nd issue)
> >
> > I am resending my comments, at Richard's request, with one issue per
> > message.
> > This is about the 2nd issue on the i18n group's tracking page:
> > http://www.w3.org/International/reviews/1007-polyglot/
> >
> > 	Excerpt of the 2nd issue:
> >
> > 		]] In-document declarations always useful [...] So it's true to
> > 		say that you strictly don't need it, but we would prefer that
> > 		people do. Please could you reflect that in your document. [[
> >
> > 	Comment: I have long since filed bug 9962, which says that only UTF-8
> > and UTF-16 should be permitted. (No other encodings should be
> > allowed, as there is no HTML5-compatible way to specify them.) Also,
> > there is an ongoing debate about limiting the encodings to UTF-8 only;
> > see Sam's message and the replies [1]. In the following, I'll assume
> > that only UTF-8 and UTF-16 are relevant.
> >
> > For UTF-16, there is no HTML5-compatible way to have an in-document
> > UTF-16 declaration. At least not as of yet. The i18n group can file a
> > bug against HTML5 to make it valid, of course. Until then, your 2nd
> > issue is not relevant w.r.t. UTF-16.
> >
> > When it comes to UTF-8, an in-document declaration is _necessary_
> > unless you want to rely on HTTP or the BOM. Without BOM, HTTP or
> > meta@charset, the HTML parser will most likely default to WIN-1252 or
> > another locale-dependent 8-bit encoding, at least in off-line parsing
> > and other uncontrolled contexts. Thus I tend to think that an
> > in-document declaration is a requirement for UTF-8.
> >
> > [1] http://www.w3.org/mid/4C3F56AB.7030105@intertwingly.net
> > --
> > leif halvard silli
> >
> >
> 
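For readers of the archive, the fallback chain described in the quoted message (BOM, HTTP, meta@charset, otherwise a locale-dependent default) can be sketched roughly as below. This is only an illustration, not the HTML5 parsing algorithm: the function name `guess_encoding`, the constant `LOCALE_DEFAULT`, the 1024-byte scan window, the order in which the three sources are consulted, and the choice of windows-1252 as the default are all assumptions made for the example.

```python
import re
from typing import Optional

# Assumed fallback: the message says parsers "most likely" default to
# WIN-1252 or another locale-dependent 8-bit encoding.
LOCALE_DEFAULT = "windows-1252"

# Byte order marks checked first (the ordering of the three sources is an
# assumption of this sketch).
_BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
)

# Very rough match for <meta charset="...">; a real parser does far more.
_META_CHARSET = re.compile(
    rb"<meta\s+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def guess_encoding(data: bytes, http_charset: Optional[str] = None) -> str:
    """Return the encoding label a lenient parser might settle on."""
    # 1. A BOM at the very start of the byte stream.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. A charset supplied out of band, e.g. by an HTTP Content-Type header.
    if http_charset:
        return http_charset.strip().lower()
    # 3. An in-document declaration near the top of the file.
    match = _META_CHARSET.search(data[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing declared: fall back to the locale-dependent default.
    return LOCALE_DEFAULT
```

With no BOM, no HTTP charset and no meta element, a UTF-8 file saved to disk ends up at step 4, which is the off-line case the message is concerned with; adding `<meta charset="utf-8">` moves it to step 3.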
Received on Friday, 8 October 2010 18:31:24 GMT
