
FW: i18n comments on Polyglot Markup [issue #4]

From: Richard Ishida <ishida@w3.org>
Date: Thu, 7 Oct 2010 19:25:53 +0100
To: <public-i18n-core@w3.org>, <public-html@w3.org>
Cc: "'Eliot Graff'" <eliotgra@microsoft.com>
Message-ID: <013c01cb664d$13c89ae0$3b59d0a0$@org>
[forwarding to public-i18n-core, so they are kept in the loop.  Please reply to this email, rather than the previous one.]

From: Eliot Graff [mailto:eliotgra@microsoft.com] 
Sent: 01 October 2010 20:13
To: Richard Ishida; public-html@w3.org
Subject: RE: i18n comments on Polyglot Markup [issue #4]
Importance: High

Issue #4 of the Internationalization Comments on Polyglot Markup: HTML-Compatible XHTML Documents [1] asks that the either/or list in the character encoding section be omitted:

]]
" In short, for correct character encoding, polyglot markup must either: "
The MUST is too strong. There is no problem with using more than one declaration, and in an earlier comment we said that we recommend that you have a readable declaration in the source in addition to a UTF8/16 encoding.
I think it is better just to omit the list and its lead-in paragraph "In short, for correct ...".
The information is contained in the following paragraph that starts with "If polyglot markup uses an encoding other than..."
[[

The 1 October Editor's draft of the polyglot spec now reads as follows for this issue:

]]
Polyglot markup uses either UTF-8 or UTF-16. UTF-8 is preferred. When polyglot markup uses UTF-16, it must include the BOM indicating UTF-16LE or UTF-16BE. 

Polyglot markup declares character encoding one of two ways: 

By using the BOM. 
In the HTTP header of the response [HTTP11], as in the following: 
Content-type: text/html; charset=utf-8 
or 
Content-type: text/html; charset=utf-16 

Note that polyglot markup may use either text/html or application/xhtml+xml for the value of the content type. 

Using <meta charset="*"/> has no effect in XML. Therefore, polyglot markup may use <meta charset="*"/> in combination with the BOM, as long as the meta element specifies the same character encoding as the BOM. In addition, the meta tag may be used in the absence of a BOM as long as it matches the already specified encoding. Note that the W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually. 
 [[
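
To make the rules in the draft text above concrete, here is a minimal sketch in Python of how a checker might apply them. It is only an illustration under stated assumptions, not part of the spec: the names check_polyglot_encoding, bom_encoding, and family are hypothetical, the HTTP charset is passed in by the caller, and the meta element is found with a simple regular expression rather than a real parser.

```python
import codecs
import re
from typing import List, Optional

# Byte order marks and the encoding each one indicates.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
]

# Simplified pattern for <meta charset="..."/>; a real checker would parse the markup.
META_RE = re.compile(r'<meta\s+charset\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def bom_encoding(data: bytes) -> Optional[str]:
    """Return the encoding indicated by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def family(label: str) -> str:
    """Fold utf-16le/utf-16be into utf-16 so labels can be compared."""
    label = label.strip().lower()
    return "utf-16" if label.startswith("utf-16") else label


def check_polyglot_encoding(data: bytes, http_charset: Optional[str] = None) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    bom = bom_encoding(data)
    declared = bom or http_charset
    if declared is None:
        return ["no encoding declared by BOM or HTTP header"]
    enc = family(declared)
    if enc not in ("utf-8", "utf-16"):
        return ["encoding is neither UTF-8 nor UTF-16: " + enc]
    if enc == "utf-16" and bom is None:
        return ["UTF-16 is used without a BOM"]
    if bom and http_charset and family(http_charset) != enc:
        problems.append("HTTP charset does not match the BOM")
    # Decode so the meta element can be found in UTF-16 documents too.
    # "utf-8-sig" and "utf-16" both consume the BOM when one is present.
    text = data.decode("utf-8-sig" if enc == "utf-8" else "utf-16", errors="replace")
    match = META_RE.search(text)
    if match and family(match.group(1)) != enc:
        problems.append("meta charset does not match the declared encoding")
    return problems
```

For example, check_polyglot_encoding(codecs.BOM_UTF8 + b'<meta charset="utf-8"/>') returns an empty list, while the same bytes with charset="iso-8859-1" in the meta element produce one reported problem.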

I believe this satisfies this request. Accordingly, I resolved bug 10151 as closed. [2]

Can someone please edit the information on the Internationalization Comments on Polyglot Markup: HTML-Compatible XHTML Documents page to indicate that this change has been made? [1]

Thanks for your help and patience.

Eliot

[1] http://www.w3.org/International/reviews/1007-polyglot/ 
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=10151

-----Original Message-----
From: public-html-request@w3.org [mailto:public-html-request@w3.org] On Behalf Of Richard Ishida
Sent: Tuesday, July 13, 2010 12:40 PM
To: public-html@w3.org
Subject: i18n comments on Polyglot Markup

Hello Eliot,

Thank you for beginning work on the polyglot document.  I think it will be very useful.  FWIW, I would welcome an approach to the text that made it more like an author-friendly "how-to" guide, rather than spec text.

I am about to raise 8 bugs in bugzilla.  These comments have been discussed by the i18n WG.  I hope you find them helpful.

FWIW, the i18n group keeps track of comments on your doc at http://www.w3.org/International/reviews/1007-polyglot/

Cheers,
RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/








Received on Thursday, 7 October 2010 18:26:25 UTC
