- From: Eliot Graff via cvs-syncmail <cvsmail@w3.org>
- Date: Mon, 21 Jun 2010 21:08:10 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/html-xhtml-author-guide
In directory hutz:/tmp/cvs-serv5385

Modified Files:
	html-xhtml-authoring-guide.html
Log Message:
Changed 'polyglot document' to 'polyglot markup' throughout the spec to minimize confusion for those who might think the spec is about serving up documents of multilingual content.

Index: html-xhtml-authoring-guide.html
===================================================================
RCS file: /sources/public/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html,v
retrieving revision 1.16
retrieving revision 1.17
diff -u -d -r1.16 -r1.17
--- html-xhtml-authoring-guide.html 18 Jun 2010 17:03:53 -0000 1.16
+++ html-xhtml-authoring-guide.html 21 Jun 2010 21:08:08 -0000 1.17
@@ -20,8 +20,8 @@
 src="html-xhtml-authoring-guide_files/w3c_home.png" alt="W3C" width="72" height="48"></a></p><h1 class="title" id="title">Polyglot Markup: HTML-Compatible XHTML Documents</h1><h2
-id="w3c-editor-s-draft-18-june-2010"><acronym title="World Wide Web
-Consortium">W3C</acronym> Editor's Draft 18 June 2010</h2><dl><dt>This
+id="w3c-editor-s-draft-21-june-2010"><acronym title="World Wide Web
+Consortium">W3C</acronym> Editor's Draft 21 June 2010</h2><dl><dt>This
 version:</dt><dd><a href="http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html">http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html</a></dd><dt>Latest published version:</dt><dd><a href="http://www.w3.org/TR/xxx-xxx/">http://www.w3.org/TR/xxx-xxx/</a></dd><dt>Latest
@@ -45,16 +45,16 @@
 use</a> rules apply.</p><hr></div> <div class="introductory section" id="abstract"><h2>Abstract</h2> <p>
- A polyglot document is an HTML5 document which is at the same time
- an XML document and an HTML document, and which meets a well defined
-set of constraints.
- Polyglot documents that meet these constraints as interpreted as + A document that uses polyglot markup is an HTML5 document which is + at the same time an XML document and an HTML document, and which meets a + well defined set of constraints. + Polyglot markup that meets these constraints as interpreted as compatible, regardless of whether they are processed as HTML or as XHTML, per the HTML5 specification. - Polyglot documents use a specific doctype, namespace declarations, - and a specific case—normally lower case but occasionally camel case—for - element and attribute names. - Polyglot documents use lower case for certain attribute values. + Polyglot markup uses a specific doctype, namespace declarations, +and a specific case—normally lower case but occasionally camel case—for +element and attribute names. + Polyglot markup uses lower case for certain attribute values. Further constraints include those on empty elements, named entity references, and the use of scripts and style. </p> @@ -160,9 +160,10 @@ document, and they and others may process the document using XML tools. These documents are served as text/html. The language used to create documents that can be parsed by both HTML -and XML parsers is called <dfn id="dfn-polyglot">polyglot</dfn>. - Polyglot is the overlap language of documents which are both HTML5 -documents and XML documents. +and XML parsers is called <dfn id="dfn-polyglot-markup">polyglot markup</dfn>. + + Polyglot markup is the overlap language of documents which are both +HTML5 documents and XML documents. </p> </div> @@ -170,7 +171,7 @@ <!--OddPage--><h2><span class="secno">2. </span>Processing Instructions and the XML Declaration</h2> <p> - A polyglot document does not use processing instructions. + Polyglot markup does not use processing instructions. 
Note that the parsing rules for the XML declaration are not processing instructions and are defined separately in <a href="http://www.w3.org/TR/REC-xml/#NT-XMLDecl">Prolog and Document Type @@ -181,17 +182,17 @@ <div id="character-encoding" class="section"> <!--OddPage--><h2><span class="secno">3. </span>Character Encoding</h2> <p> - A polyglot document uses either UTF-8 or UTF-16, although generally -UTF-8 is preferred. - If a polyglot document uses UTF-16, it <em title="should" + Polyglot markup uses either UTF-8 or UTF-16, although generally UTF-8 + is preferred. + When polyglot markup uses UTF-16, it <em title="should" class="rfc2119">should</em> include the BOM indicating UTF-16LE or UTF-16BE. - In addition, a polyglot document need not include the meta charset + In addition, polyglot markup need not include the meta charset declaration, because the parser would have to read UTF-16 in order to parse it by definition. </p> <p> - In short, for correct character encoding, a polyglot document <em + In short, for correct character encoding, polyglot markup <em title="must" class="rfc2119">must</em> either: </p><ul> <li>Use UTF-8 or UTF-16 with the appropriate BOM.</li> @@ -204,12 +205,12 @@ <p> - If a polyglot document uses an encoding other than UTF-8 or UTF-16, -it <em title="must" class="rfc2119">must</em> include the XML -declaration; however, in this case the document <em title="must" -class="rfc2119">must</em> also include the HTML <code>meta</code> tag -specifying the character set. - When a polyglot document uses both the XML declaration and the HTML <code>meta</code> + If polyglot markup uses an encoding other than UTF-8 or UTF-16, it <em + title="must" class="rfc2119">must</em> include the XML declaration; +however, in this case the document <em title="must" class="rfc2119">must</em> + also include the HTML <code>meta</code> tag specifying the character +set. 
+ When polyglot markup uses both the XML declaration and the HTML <code>meta</code> tag, these <em title="must" class="rfc2119">must</em> specify the same character and coding. @@ -219,9 +220,8 @@ <div id="doctype" class="section"> <!--OddPage--><h2><span class="secno">4. </span>The DOCTYPE</h2> <p> - A polyglot document uses the <code><!DOCTYPE html></code> -doctype. - Note that for a polyglot document the string, <code>html</code>, <em + Polyglot markup uses the <code><!DOCTYPE html></code> doctype. + Note that for polyglot markup the string, <code>html</code>, <em title="must" class="rfc2119">must</em> be lower case. For a pure HTML document, the string is defined as case-insensitive. [<a href="#bib-HTML5" rel="biblioentry" class="bibref">HTML5</a>] @@ -231,8 +231,7 @@ <div id="namespaces" class="section"> <!--OddPage--><h2><span class="secno">5. </span>Namespaces</h2> <p> - The following rules apply to namespaces used in polyglot -documents. + The following rules apply to namespaces used in polyglot markup. </p> <ul> <li> @@ -271,8 +270,8 @@ <div class="section" id="required-elements"> <h3><span class="secno">6.1 </span>Required Elements</h3> <p> - Each polyglot document <em title="must" class="rfc2119">must</em> - have a root <code>html</code> element. + Each document using polyglot markup <em title="must" +class="rfc2119">must</em> have a root <code>html</code> element. The root <code>html</code> element <em title="must" class="rfc2119">must</em> contain both a <code>head</code> and a <code>body</code> element. @@ -284,9 +283,9 @@ <div id="tables" class="section"> <h4><span class="secno">6.1.1 </span>Tables</h4> <p> - Within a polyglot document, a table <em title="must" -class="rfc2119">must</em> explicitly have a <code>tbody</code> element -surrounding groups of <code>tr</code> elements. 
+ Polyglot markup <em title="must" class="rfc2119">must</em> +explicitly have a <code>tbody</code> element surrounding groups of <code>tr</code> + elements within a <code>table</code> element. HTML parsers insert the <code>tbody</code> element, but XML parsers do not, thus creating different DOMs. </p> @@ -307,28 +306,26 @@ <p> The following guidelines apply to any usage of element names, attribute names, or attribute values in markup, script, or CSS. - When required, a polyglot document uses lower case letters for all -ASCII letters; however, case requirements do not apply to non-ASCII -letters such as Greek, Cyrillic, or non-ASCII Latin letters. + When required, polyglot markup uses lower case letters for all ASCII +letters; however, case requirements do not apply to non-ASCII letters +such as Greek, Cyrillic, or non-ASCII Latin letters. </p> <div id="element-names" class="section"> <h4><span class="secno">6.2.1 </span>Element Names</h4> - <p>A polyglot document uses the correct case for element -names.</p> + <p>Polyglot markup uses the correct case for element names.</p> <ul> <li> - A polyglot document uses lowercase letters for all HTML element -names. + Polyglot markup uses lowercase letters for all HTML element names. </li> <li> - A polyglot document uses lowercase letters for all MathML element + Polyglot markup uses lowercase letters for all MathML element names. </li> <li> - A polyglot document uses lowercase letters for all SVG element -names except the following, which <em title="must" class="rfc2119">must</em> - be in mixed case: + Polyglot markup uses lowercase letters for all SVG element names +except the following, which <em title="must" class="rfc2119">must</em> +be in mixed case: <ul> <li><code>altGlyph</code></li> <li><code>altGlyphDef</code></li> @@ -374,22 +371,22 @@ <div id="attribute-names" class="section"> <h4><span class="secno">6.2.2 </span>Attribute Names</h4> <p> - A polyglot document uses the correct case for attribute names. 
+ Polyglot markup uses the correct case for attribute names. </p> <ul> <li> - A polyglot document uses lowercase letters in attribute -names for all HTML elements. + Polyglot markup uses lowercase letters in attribute names +for all HTML elements. </li> <li> - A polyglot document uses lowercase letters in attribute -names for all MathML elements except the following: + Polyglot markup uses lowercase letters in attribute names +for all MathML elements except the following: <p>The lowercase <code>definitionurl</code> <em title="must" class="rfc2119">must</em> be changed to the mixed case <code>definitionURL</code>.</p> </li> <li> - A polyglot document uses lowercase letters in attribute -names for all SVG elements except the following, which <em title="must" + Polyglot markup uses lowercase letters in attribute names +for all SVG elements except the following, which <em title="must" class="rfc2119">must</em> be in mixed case: <ul> <li><code>attributeName</code></li> @@ -462,13 +459,12 @@ <div id="attribute-values" class="section"> <h4><span class="secno">6.2.3 </span>Attribute Values</h4> <p> - A polyglot document uses lowercase letters for the values of the + Polyglot markup uses lowercase letters for the values of the attributes in the following list when they exist on HTML elements. - More specifically, where required, a polyglot document <em -title="must" class="rfc2119">must</em> use lower case letters for all -ASCII letters in these attribute values; however, case requirements do -not apply to non-ASCII letters such as Greek, Cyrillic, or non-ASCII -Latin letters. + More specifically, where required, polyglot markup <em title="must" +class="rfc2119">must</em> use lower case letters for all ASCII letters +in these attribute values; however, case requirements do not apply to +non-ASCII letters such as Greek, Cyrillic, or non-ASCII Latin letters. 
Attributes for HTML elements other than those in the following list <em title="may" class="rfc2119">may</em> have values made of mixed case letters. @@ -530,8 +526,8 @@ <div id="empty-elements" class="section"> <h3><span class="secno">6.3 </span>Empty Elements</h3> <p> - A polyglot document uses only the elements in the following -list as empty elements. + Polyglot markup uses only the elements in the following list as + empty elements. </p> <ul> <li><code>area</code></li> @@ -550,15 +546,15 @@ <li><code>source</code></li> </ul> <p> - A polyglot document uses the minimized tag syntax for empty + Polyglot markup uses the minimized tag syntax for empty elements, e.g. <code><br/></code>. The alternative syntax <code><br></br></code> allowed by XML gives uncertain results in many existing user agents. </p> <p> Given an empty instance of an element whose content model is not - EMPTY (for example, an empty title or paragraph) a polyglot document -does not use the minimized form (e.g. the document uses <code><p></p></code> + EMPTY (for example, an empty title or paragraph) polyglot markup does +not use the minimized form (e.g. the document uses <code><p></p></code> and not <code><p /></code>). </p> <p> @@ -570,10 +566,10 @@ <div id="attributes" class="section"> <!--OddPage--><h2><span class="secno">7. </span>Attributes</h2> - <p>A polyglot document does not contain line breaks and multiple white - space characters within attribute values. These are handled + <p>Polyglot markup does not contain line breaks and multiple white +space characters within attribute values. These are handled inconsistently by user agents.</p> - <p>A polyglot document surrounds all attribute values with quotation + <p>Polyglot markup surrounds all attribute values with quotation marks. Attribute values <em title="may" class="rfc2119">may</em> be surrounded either by single quotation marks or by double quotation marks.</p> @@ -585,8 +581,7 @@ <!--OddPage--><h2><span class="secno">8. 
</span>Named Entity References</h2> <p> - A polyglot document uses only the following named entity -references: + Polyglot markup uses only the following named entity references: </p> <ul> <li><code>amp</code></li> @@ -597,7 +592,7 @@ </ul> <p> For entities beyond the previous list, a ployglot document uses -character references. For example, a polyglot document uses <code>&#160;</code> +character references. For example, polyglot markup uses <code>&#160;</code> instead of <code>&nbsp;</code>. </p> </div> @@ -609,9 +604,9 @@ Script and style commands <em title="should" class="rfc2119">should</em> be included by linking to external files rather than including them in-line. - However, a polyglot document <em title="must not" -class="rfc2119">must not</em> link to an external stylesheet by using -the xml-stylesheet processing instruction. + However, polyglot markup <em title="must not" class="rfc2119">must + not</em> link to an external stylesheet by using the xml-stylesheet +processing instruction. See also <a href="#PI-and-xml">Processing Instructions and the XML Declaration</a>. </p> @@ -622,22 +617,22 @@ <p> Although <code>document.write()</code> and <code>document.writeln()</code> are valid in an HTML document, neither function may be used in XHTML. - Therefore, neither is used in a polyglot document. + Therefore, neither is used in polyglot markup. Instead, use the <code>innerHTML</code> property for both HTML and XHTML. Note that the <code>innerHTML</code> property takes a string. XML parsers parse the string as XML in XHTML. HTML parsers parse the string as HTML in HTML. Because of the difference in parsing, if you send the parser -content that does not follow the rules for a polyglot document the -results will differ for a DOM create with an XML parser and one created -with an HTML parser. +content that does not follow the rules for polyglot markup the results +will differ for a DOM create with an XML parser and one created with an +HTML parser. 
</p> <div id="external-script-and-style" class="section"> <h3><span class="secno">9.1 </span>External Script and Style</h3> <p> - A polyglot document uses external scripts if that document's -script or style sheet uses <code><</code> or <code>&</code> or <code>]]></code> + Polyglot markup uses external scripts if that document's script +or style sheet uses <code><</code> or <code>&</code> or <code>]]></code> or <code>--</code>. Note that XML parsers are permitted to silently remove the contents of comments; therefore, the historical practice of hiding @@ -648,10 +643,10 @@ <div id="in-line-script-and-style" class="section"> <h3><span class="secno">9.2 </span>In-line Script and Style</h3> <p> - If a polyglot document must use script or style commands within -its source code, either use safe content or wrap the command in a CDATA + If polyglot markup must use script or style commands within its +source code, either use safe content or wrap the command in a CDATA section. - However, a polyglot document does not use a <code>CDATA</code> + However, polyglot markup does not use a <code>CDATA</code> section unless it is being used within foreign content. </p><ul> <li>Safe content is content that does not contain a <code><</code> @@ -683,8 +678,8 @@ <p> When using MathML or SVG, the parser follows the XML parsing rules. - A polyglot document does not rely on getting a CDATA instance -from the DOM when using MathML or SVG, because the HTML parser does not + Polyglot markup does not rely on getting a CDATA instance from +the DOM when using MathML or SVG, because the HTML parser does not create a CDATA instance in the DOM. </p>
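Reviewer note, not part of the commit: several of the rules touched by this diff (numeric character references instead of `&nbsp;`, minimized syntax for empty elements, the lower-case `<!DOCTYPE html>`) can be spot-checked from the XML side with a short script. The following is a minimal sketch in Python using only the standard library. The sample document, the `POLYGLOT_SAMPLE` name, and the `is_well_formed_xml` helper are illustrative assumptions and do not appear in the spec. The check covers only XML well-formedness; it says nothing about how an HTML parser would treat the same markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample written to follow the rules quoted in the diff:
# lower-case doctype, XHTML namespace on the root element, explicit tbody,
# minimized empty element, and a numeric character reference (&#160;).
POLYGLOT_SAMPLE = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample</title></head>
<body>
<p>one&#160;two<br/></p>
<table><tbody><tr><td>cell</td></tr></tbody></table>
</body>
</html>"""


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the standard-library XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # The sample as written.
    print(is_well_formed_xml(POLYGLOT_SAMPLE))
    # Same sample with a named entity that XML does not predefine.
    print(is_well_formed_xml(POLYGLOT_SAMPLE.replace("&#160;", "&nbsp;")))
    # Same sample with an HTML-style unclosed empty element.
    print(is_well_formed_xml(POLYGLOT_SAMPLE.replace("<br/>", "<br>")))
```

A check like this catches only the XML half of the overlap. Rules such as the explicit `tbody` exist because HTML parsers build a different DOM, which an XML parser alone cannot reveal.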
Received on Monday, 21 June 2010 21:08:12 UTC