- From: <bugzilla@jessica.w3.org>
- Date: Sun, 13 Mar 2011 02:40:44 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12062

--- Comment #14 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-03-13 02:40:43 UTC ---
(In reply to comment #13)

As one piece of text, with some additional changes. Justification etc. at the bottom.

]]
3. Specifying a Document's Character Encoding

Polyglot markup uses UTF-8, the only character encoding for which both HTML and XML require support. For HTML, UTF-8 has to be explicitly declared, to avoid fallback to a legacy encoding. For XML, UTF-8 is the encoding default and as such MAY be left undeclared.

The UTF-8 encoding is declared in the following ways, which can be used together or separately:

 * Within the document
    o By using the Byte Order Mark (BOM) character (preferred).
    o By using <meta charset="UTF-8"/> (the HTML encoding declaration).
 * Outside the document
    o By adding "charset=utf-8" to the MIME/HTTP Content-Type header [HTTP11]:

      HTML Content-Type example:
        Content-type: text/html; charset=utf-8
      XHTML Content-Type example:
        Content-type: application/xhtml+xml; charset=utf-8

NOTE: The HTML encoding declaration has no effect in XML. So when this is the only encoding declaration, it is XML's encoding default that makes XML parsers treat the document as UTF-8.

The W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document, because it helps developers, testers, or translation production managers to check the encoding of a document visually.
[[

JUSTIFICATION for some of the wording choices above:

 * 'XML encoding declaration' is a wording used in XML 1.0. 'HTML encoding declaration' is made on the same pattern.
 * Tried to use '_character_ encoding' at least once.
 * 'legacy encoding' = HTML5 uses this wording about non-UTF-8 encodings.
 * Deleted "(if used in combination, each approach contains identical encoding information)" because it is irrelevant when the only encoding is UTF-8.
 * 'Outside the document' - for analogy with your 'Inside the document'.
 * Important to have both an XHTML example and an HTML example with regard to MIME/HTTP.
 * Added 'MIME', as that is what it is.
 * Tried to diminish the number of places where the text mentioned the encoding default of XML ...
 * 'in combination' does not make good sense, as it indicates that the methods cooperate. Tried 'together' instead.
 * Deleted the note about other MIME types, because the text says 'By adding "charset=utf-8" to the MIME/HTTP Content-Type header', which is valid for any MIME type. The examples are just examples.
 * Tried to be short.
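
As a rough illustration of the three declaration methods named in the proposed text (BOM, <meta charset="UTF-8"/>, and the Content-Type charset parameter), a minimal Python sketch might look like the following. The function name utf8_declarations, the regular expressions and the 1024-byte scan window are assumptions made for this example, not something the proposal specifies; real parsers apply more elaborate rules.

```python
import codecs
import re

# Simplified pattern for <meta charset="UTF-8"/> (assumption: no other
# attributes on the element; real HTML prescanning is more lenient).
META_CHARSET = re.compile(
    rb'<meta\s+charset\s*=\s*["\']?utf-8["\']?\s*/?>', re.IGNORECASE
)

# Simplified pattern for the charset parameter of a Content-Type value.
HEADER_CHARSET = re.compile(r'charset\s*=\s*"?utf-8"?', re.IGNORECASE)


def utf8_declarations(body: bytes, content_type: str = "") -> list[str]:
    """Return which UTF-8 declarations are present (hypothetical helper).

    "bom"  - the document starts with the UTF-8 Byte Order Mark
    "meta" - an HTML encoding declaration appears near the start
    "http" - the Content-Type value carries charset=utf-8
    """
    found = []
    if body.startswith(codecs.BOM_UTF8):
        found.append("bom")
    # Only the beginning of the document is scanned (assumed window).
    if META_CHARSET.search(body[:1024]):
        found.append("meta")
    if HEADER_CHARSET.search(content_type):
        found.append("http")
    return found


if __name__ == "__main__":
    doc = codecs.BOM_UTF8 + b'<html><head><meta charset="UTF-8"/></head></html>'
    print(utf8_declarations(doc, "application/xhtml+xml; charset=utf-8"))
```

An empty result for a document served as text/html would correspond to the case the text warns about, where HTML may fall back to a legacy encoding.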
Received on Sunday, 13 March 2011 02:40:46 UTC