[Bug 12062] UTF-8 BOM should not be forbidden in Polyglot Markup

From: <bugzilla@jessica.w3.org>
Date: Thu, 17 Mar 2011 20:12:39 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Q0JYl-0006VR-OM@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12062

Eliot Graff <eliotgra@microsoft.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #15 from Eliot Graff <eliotgra@microsoft.com> 2011-03-17 20:12:36 UTC ---
I believe I have incorporated everything from the last two comments. The
editor's draft of 17 March now has this for section 3:

]]
3. Specifying a Document's Character Encoding

 Polyglot markup uses the UTF-8 character encoding, the only character encoding
for which both HTML and XML require support. HTML requires UTF-8 to be
explicitly declared to avoid fallback to a legacy encoding [HTML5]. For XML,
UTF-8 is an encoding default. As such, character encoding may be left
undeclared in XML with the result that UTF-8 is still supported [XML10]. 

Polyglot markup declares the UTF-8 character encoding in the following ways,
which may be used separately or in combination: 
• Within the document
  ◦ By using the Byte Order Mark (BOM) character (preferred).
  ◦ By using <meta charset="UTF-8"/> (the HTML encoding declaration).
• Outside the document
  ◦ By adding "charset=utf-8" to the MIME/HTTP Content-Type header
    [HTTP11], as the following examples show in HTML and XML, respectively:
    Example
    Content-type: text/html; charset=utf-8
    Example
    Content-type: application/xhtml+xml; charset=utf-8
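
As an illustration only (this is not text from the draft), the three
declaration mechanisms listed above could be checked with a short Python
sketch. The function name utf8_declarations and the labels "bom", "meta" and
"http" are hypothetical, and the 1024-byte search window for the meta element
is an assumption of this sketch rather than something the draft states.

```python
import codecs
import re
from email.message import Message
from typing import List, Optional

# Matches the polyglot form <meta charset="UTF-8"/> (case-insensitive value).
META_UTF8 = re.compile(rb'<meta\s+charset\s*=\s*["\']utf-8["\']\s*/>', re.IGNORECASE)


def utf8_declarations(body: bytes, content_type: Optional[str] = None) -> List[str]:
    """Return which UTF-8 declarations are present (hypothetical helper)."""
    found = []
    # In-document: a UTF-8 BOM is the byte sequence EF BB BF at the start.
    if body.startswith(codecs.BOM_UTF8):
        found.append("bom")
    # In-document: the HTML encoding declaration near the top of the file.
    if META_UTF8.search(body[:1024]):
        found.append("meta")
    # Out-of-document: the charset parameter of the Content-Type header.
    if content_type is not None:
        header = Message()
        header["Content-Type"] = content_type
        if header.get_content_charset() == "utf-8":
            found.append("http")
    return found


doc = codecs.BOM_UTF8 + b'<html><head><meta charset="UTF-8"/></head></html>'
print(utf8_declarations(doc, "application/xhtml+xml; charset=utf-8"))
```

Because the mechanisms may be used separately or in combination, the sketch
reports every declaration it finds instead of stopping at the first one.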

Note
 The HTML encoding declaration has no effect in XML. When the HTML encoding
declaration is the only encoding declaration, the encoding default from XML
makes XML parsers treat content as UTF-8. 
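
The behaviour described in the note can be observed with a small Python
sketch, offered as an illustration and not as part of the draft. It assumes
Python's standard xml.etree.ElementTree parser and a made-up document whose
only encoding declaration is the meta element; the title text is sample data.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# No BOM and no XML declaration: the meta element is the only declaration.
doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><meta charset="UTF-8"/><title>caf\u00e9</title></head>'
    '<body/></html>'
).encode("utf-8")

# The XML parser treats meta as an ordinary element and falls back to its
# own default encoding, UTF-8, so the non-ASCII title still decodes correctly.
root = ET.fromstring(doc)
title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
print(title.text)
```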

The W3C Internationalization (i18n) Group recommends always including a
visible encoding declaration in a document, because it helps developers,
testers, and translation production managers check the encoding of a document
visually.
[[

Thanks so much,

Eliot

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Thursday, 17 March 2011 20:12:41 GMT
