
RE: i18n comments on Polyglot Markup [issue #4]

From: Eliot Graff <eliotgra@microsoft.com>
Date: Fri, 1 Oct 2010 19:13:22 +0000
To: Richard Ishida <ishida@w3.org>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <CE3A5BFD1228D84A8D9C158EEC195FD50EBC5C55@TK5EX14MBXW605.wingroup.windeploy.ntdev.microsoft.com>
Issue #4 of the Internationalization Comments on Polyglot Markup: HTML-Compatible XHTML Documents [1] asks that the either/or list in the character encoding section be omitted:

" In short, for correct character encoding, polyglot markup must either: "
The MUST is too strong. There is no problem with using more than one declaration, and in an earlier comment we said that we recommend that you have a readable declaration in the source in addition to a UTF8/16 encoding.
I think it is better just to omit the list and its lead-in paragraph "In short, for correct ...".
The information is contained in the following paragraph that starts with "If polyglot markup uses an encoding other than..."

The 1 October Editor's draft of the polyglot spec now reads as follows for this issue:

Polyglot markup uses either UTF-8 or UTF-16. UTF-8 is preferred. When polyglot markup uses UTF-16, it must include the BOM indicating UTF-16LE or UTF-16BE. 

Polyglot markup declares character encoding one of two ways:

* By using the BOM.
* In the HTTP header of the response [HTTP11], as in the following:
    Content-type: text/html; charset=utf-8
    Content-type: text/html; charset=utf-16

Note that polyglot markup may use either text/html or application/xhtml+xml for the value of the content type. 

Using <meta charset="*"/> has no effect in XML. Therefore, polyglot markup may use <meta charset="*"/> in combination with BOM, as long as the meta element specifies the same character encoding as the BOM. In addition, the meta tag may be used in the absence of a BOM as long as it matches the already specified encoding. Note that the W3C Internationalization (i18n) Group recommends to always include a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
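
As an illustration only (this is not part of the draft), the rules quoted above could be checked with something like the following Python sketch. The function names, the problem messages, and the regex-based meta lookup are all assumptions made for this example; a real checker would use a proper HTML/XML parser and would also need to rule out UTF-32 BOMs, which this sketch does not do.

```python
import codecs
import re
from typing import List, Optional

# BOMs for the only two encoding families the draft allows.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def bom_family(data: bytes) -> Optional[str]:
    """Return 'utf-8' or 'utf-16' if data starts with that BOM, else None."""
    for bom, fam in _BOMS:
        if data.startswith(bom):
            return fam
    return None


def label_family(label: Optional[str]) -> Optional[str]:
    """Map a charset label to 'utf-8' or 'utf-16'; None for anything else."""
    if label is None:
        return None
    label = label.strip().lower()
    if label in ("utf-8", "utf8"):
        return "utf-8"
    if label in ("utf-16", "utf-16le", "utf-16be"):
        return "utf-16"
    return None


def http_charset(content_type: Optional[str]) -> Optional[str]:
    """Extract the raw charset parameter from a Content-type header value."""
    if not content_type:
        return None
    m = re.search(r'charset\s*=\s*"?([^\s";]+)', content_type, re.IGNORECASE)
    return m.group(1) if m else None


def meta_charset(data: bytes, family: Optional[str]) -> Optional[str]:
    """Find <meta charset="..."> with a simple regex (hypothetical helper)."""
    codec = "utf-16" if family == "utf-16" else "utf-8-sig"
    text = data.decode(codec, errors="replace")
    m = re.search(r'<meta\s+charset\s*=\s*["\']([^"\']+)["\']', text, re.IGNORECASE)
    return m.group(1) if m else None


def check_declaration(data: bytes, content_type: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    bom = bom_family(data)
    http_label = http_charset(content_type)
    http = label_family(http_label)

    if bom is None and http_label is None:
        problems.append("no BOM and no HTTP charset")
    if http_label is not None and http is None:
        problems.append("HTTP charset is not UTF-8 or UTF-16")
    if bom and http and bom != http:
        problems.append("HTTP charset disagrees with BOM")
    if http == "utf-16" and bom is None:
        problems.append("UTF-16 requires a BOM")

    # The meta element is optional, but if present it must match.
    declared = bom or http
    meta = meta_charset(data, declared)
    if meta is not None and declared is not None and label_family(meta) != declared:
        problems.append("meta charset disagrees with declared encoding")
    return problems
```

For example, check_declaration(codecs.BOM_UTF8 + b'<meta charset="utf-8"/>') returns an empty list, while a UTF-16 document served with charset=utf-16 but no BOM is reported as a problem.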

I believe the revised draft text satisfies this request. Accordingly, I resolved bug 10151 as closed. [2]

Can someone please edit the information on the Internationalization Comments on Polyglot Markup: HTML-Compatible XHTML Documents page to indicate that this change has been made? [1]

Thanks for your help and patience.


[1] http://www.w3.org/International/reviews/1007-polyglot/ 
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=10151

-----Original Message-----
From: public-html-request@w3.org [mailto:public-html-request@w3.org] On Behalf Of Richard Ishida
Sent: Tuesday, July 13, 2010 12:40 PM
To: public-html@w3.org
Subject: i18n comments on Polyglot Markup

Hello Eliot,

Thank you for beginning work on the polyglot document.  I think it will be very useful.  FWIW, I would welcome an approach to the text that made it more like an author-friendly "how-to" guide, rather than spec text.

I am about to raise 8 bugs in bugzilla.  These comments have been discussed by the i18n WG.  I hope you find them helpful.

FWIW, the i18n group keeps track of comments on your doc at http://www.w3.org/International/reviews/1007-polyglot/


Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)



Received on Friday, 1 October 2010 19:14:06 UTC
