- From: poot <cvsmail@w3.org>
- Date: Fri, 11 Mar 2011 13:17:15 -0500
- To: public-html-diffs@w3.org
eliot: Edited section 3 per requests in bug 12062, comment 11; removed UTF-16, per bug 12242;

http://dev.w3.org/cvsweb/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html?r1=1.65&r2=1.66&f=h
===================================================================
RCS file: /sources/public/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html,v
retrieving revision 1.65
retrieving revision 1.66
diff -u -d -r1.65 -r1.66
--- html-xhtml-authoring-guide.html 4 Mar 2011 23:23:34 -0000 1.65
+++ html-xhtml-authoring-guide.html 11 Mar 2011 18:16:01 -0000 1.66
@@ -14,7 +14,7 @@
 <a href="http://www.w3.org/"><img height="48" width="72" alt="W3C" src="http://www.w3.org/Icons/w3c_home"/></a>
 </p>
 <h1 class="title" id="title">Polyglot Markup: HTML-Compatible XHTML Documents</h1>
- <h2 id="w3c-editor-s-draft-05-january-2011">W3C Editor's Draft 4 March 2011</h2>
+ <h2 id="w3c-editor-s-draft-05-january-2011">W3C Editor's Draft 11 March 2011</h2>
 <dl>
 <dt>This version:</dt>
 <dd><a href="http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html">http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html</a></dd>
@@ -237,25 +237,17 @@
 <div id="character-encoding" class="section">
 <!--OddPage--><h2><span class="secno">3. </span>Specifying a Document's Character Encoding</h2>
 <p>
- <a class="internalDFN" href="#dfn-polyglot-markup" title="polyglot markup">Polyglot markup</a> uses either UTF-8 or UTF-16. UTF-8 is preferred.
- When <a class="internalDFN" href="#dfn-polyglot-markup">polyglot markup</a> uses UTF-16, it <em title="must" class="rfc2119">must</em> include the BOM, per <a href="http://www.w3.org/TR/REC-xml/#charencoding">Character Encoding in Entities</a>. [<cite><a href="#bib-XML10" rel="biblioentry" class="bibref">XML10</a></cite>]
- </p>
- <p>
 <a class="internalDFN" href="#dfn-polyglot-markup" title="polyglot markup">Polyglot markup</a> declares character encoding in the following ways, which may be used separately or in combination (if used in combination, each approach contains identical encoding information):
 </p><ul>
 <li>Within the document</li>
 <ul>
- <li>By using the BOM.</li>
+ <li>By using the Byte Order Mark (BOM) character (preferred).</li>
 <li>By relying on UTF-8 as the encoding default of XML, used in combination with the HTML <code><meta charset="UTF-8"/></code> element.</li>
 </ul>
 <li>In the HTTP header of the response [<cite><a href="#bib-HTTP11" rel="biblioentry" class="bibref">HTTP11</a></cite>], as in the following:
 <p>
 <code>Content-type: text/html; charset=utf-8</code>
- <br/>
- or
- <br/>
- <code>Content-type: text/html; charset=utf-16</code>
 </p>
 Note that <a class="internalDFN" href="#dfn-polyglot-markup">polyglot markup</a> may use either <code>text/html</code> or <code>application/xhtml+xml</code> for the value of the content type.
 </li>
@@ -266,10 +258,24 @@
 Therefore, <a class="internalDFN" href="#dfn-polyglot-markup">polyglot markup</a> may use <code><meta charset="*"/></code> provided the document is encoded as UTF-8 and the value of charset is a case-insensitive match for the string "utf-8".
 </p>
 <p>
- Note that the <a href="http://www.w3.org/International/questions/qa-html-encoding-declarations">W3C Internationalization (i18n) Group recommends</a>
- to always include a visible encoding declaration in a document, because
- it helps developers, testers, or translation production managers to
-check the encoding of a document visually.
+ <a class="internalDFN" href="#dfn-polyglot-markup" title="polyglot markup">Polyglot markup</a> uses UTF-8 encoding.
+ The BOM character <em title="may" class="rfc2119">may</em> be used with the UTF-8 encoding (see <a href="http://dev.w3.org/html5/spec/syntax.html#writing">Writing HTML documents</a> in [<cite><a href="#bib-HTML5" rel="biblioentry" class="bibref">HTML5</a></cite>]),
+
+ and using the BOM character is preferred to not using the BOM
+character.
+ Because the construct of the BOM character is the same for XML and
+HTML (unlike the encoding declaration inside the HTTP Content-Type
+header),
+ and because the BOM character works in both XML and HTML (unlike the <code><meta charset="UTF-8"/></code> declaration of HTML and
+ the UTF-8 encoding default of XML),
+ the BOM character can be said to be the most polyglot encoding declaration.
+ </p>
+ <p>
+ The <a href="http://www.w3.org/International/questions/qa-html-encoding-declarations">W3C Internationalization (i18n) Group recommends</a>
+ to always include
+ a visible encoding declaration in a document, because it helps
+developers, testers, or translation production managers to check the
+encoding of a document visually.
 </p>
 <!--End section: Specifying a Document's Character Encoding-->
 </div>
@@ -996,9 +1002,6 @@
 <!--End section: Example Document-->
 </div>
-
-
-
 <!-- Appendix -->
 <div id="acknowledgements" class="appendix section">
 <h2><span class="secno">A. </span>Acknowledgements</h2>
Received on Friday, 11 March 2011 18:17:16 UTC
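
The encoding rules in revision 1.66 of section 3 (UTF-8 only, an optional BOM, a meta charset value that matches "utf-8" case-insensitively, and an HTTP Content-type charset that carries identical information) can be sketched as a small check. The Python below is an illustration only and is not part of the guide or of any W3C tool: the function names are hypothetical, and the regular expression is a deliberate simplification that recognises only the literal `<meta charset="..."/>` form rather than parsing the markup.

```python
import codecs
import re

# UTF-8 encoded Byte Order Mark: the bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8

# Simplified pattern (an assumption of this sketch): matches only the
# self-closed <meta charset="..."/> form quoted in the diff.
META_CHARSET = re.compile(
    rb"""<meta\s+charset\s*=\s*["']([^"']*)["']\s*/>""", re.IGNORECASE
)


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the document starts with the UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


def meta_charset(data: bytes):
    """Return the value of the first <meta charset="..."/>, or None."""
    match = META_CHARSET.search(data)
    return match.group(1).decode("ascii", "replace") if match else None


def header_charset(content_type: str):
    """Return the charset parameter of a Content-type value, or None."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"')
    return None


def declarations_agree(data: bytes, content_type=None) -> bool:
    """Check that every declaration present names UTF-8 (case-insensitive)
    and that the bytes really decode as UTF-8."""
    declared = []
    in_document = meta_charset(data)
    if in_document is not None:
        declared.append(in_document)
    if content_type is not None:
        in_header = header_charset(content_type)
        if in_header is not None:
            declared.append(in_header)
    if any(value.lower() != "utf-8" for value in declared):
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, a byte string that starts with the BOM and contains `<meta charset="UTF-8"/>` passes `declarations_agree(doc, "text/html; charset=utf-8")`, while the same bytes paired with `charset=utf-16` in the header fail, since the declarations no longer carry identical information.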