- From: Eliot Graff via cvs-syncmail <cvsmail@w3.org>
- Date: Mon, 21 Jun 2010 21:08:10 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/html-xhtml-author-guide
In directory hutz:/tmp/cvs-serv5385
Modified Files:
html-xhtml-authoring-guide.html
Log Message:
Changed 'polyglot document' to 'polyglot markup' throughout the spec to minimize confusion for those who might think the spec is about serving up documents of multilingual content.
Index: html-xhtml-authoring-guide.html
===================================================================
RCS file: /sources/public/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html,v
retrieving revision 1.16
retrieving revision 1.17
diff -u -d -r1.16 -r1.17
--- html-xhtml-authoring-guide.html 18 Jun 2010 17:03:53 -0000 1.16
+++ html-xhtml-authoring-guide.html 21 Jun 2010 21:08:08 -0000 1.17
@@ -20,8 +20,8 @@
src="html-xhtml-authoring-guide_files/w3c_home.png" alt="W3C" width="72"
height="48"></a></p><h1 class="title" id="title">Polyglot Markup:
HTML-Compatible XHTML Documents</h1><h2
-id="w3c-editor-s-draft-18-june-2010"><acronym title="World Wide Web
-Consortium">W3C</acronym> Editor's Draft 18 June 2010</h2><dl><dt>This
+id="w3c-editor-s-draft-21-june-2010"><acronym title="World Wide Web
+Consortium">W3C</acronym> Editor's Draft 21 June 2010</h2><dl><dt>This
version:</dt><dd><a
href="http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html">http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html</a></dd><dt>Latest
published version:</dt><dd><a href="http://www.w3.org/TR/xxx-xxx/">http://www.w3.org/TR/xxx-xxx/</a></dd><dt>Latest
@@ -45,16 +45,16 @@
use</a> rules apply.</p><hr></div>
<div class="introductory section" id="abstract"><h2>Abstract</h2>
<p>
- A polyglot document is an HTML5 document which is at the same time
- an XML document and an HTML document, and which meets a well defined
-set of constraints.
- Polyglot documents that meet these constraints are interpreted as
+ A document that uses polyglot markup is an HTML5 document which is
+ at the same time an XML document and an HTML document, and which meets a
+ well defined set of constraints.
+ Polyglot markup that meets these constraints is interpreted as
compatible, regardless of whether they are processed as HTML or as
XHTML, per the HTML5 specification.
- Polyglot documents use a specific doctype, namespace declarations,
- and a specific case—normally lower case but occasionally camel case—for
- element and attribute names.
- Polyglot documents use lower case for certain attribute values.
+ Polyglot markup uses a specific doctype, namespace declarations,
+and a specific case—normally lower case but occasionally camel case—for
+element and attribute names.
+ Polyglot markup uses lower case for certain attribute values.
Further constraints include those on empty elements, named entity
references, and the use of scripts and style.
</p>
@@ -160,9 +160,10 @@
document, and they and others may process the document using XML tools.
These documents are served as text/html.
The language used to create documents that can be parsed by both HTML
-and XML parsers is called <dfn id="dfn-polyglot">polyglot</dfn>.
- Polyglot is the overlap language of documents which are both HTML5
-documents and XML documents.
+and XML parsers is called <dfn id="dfn-polyglot-markup">polyglot markup</dfn>.
+
+ Polyglot markup is the overlap language of documents which are both
+HTML5 documents and XML documents.
</p>
</div>
@@ -170,7 +171,7 @@
<!--OddPage--><h2><span class="secno">2. </span>Processing Instructions
and the XML Declaration</h2>
<p>
- A polyglot document does not use processing instructions.
+ Polyglot markup does not use processing instructions.
Note that the parsing rules for the XML declaration are not
processing instructions and are defined separately in <a
href="http://www.w3.org/TR/REC-xml/#NT-XMLDecl">Prolog and Document Type
@@ -181,17 +182,17 @@
<div id="character-encoding" class="section">
<!--OddPage--><h2><span class="secno">3. </span>Character Encoding</h2>
<p>
- A polyglot document uses either UTF-8 or UTF-16, although generally
-UTF-8 is preferred.
- If a polyglot document uses UTF-16, it <em title="should"
+ Polyglot markup uses either UTF-8 or UTF-16, although generally UTF-8
+ is preferred.
+ When polyglot markup uses UTF-16, it <em title="should"
class="rfc2119">should</em> include the BOM indicating UTF-16LE or
UTF-16BE.
- In addition, a polyglot document need not include the meta charset
+ In addition, polyglot markup need not include the meta charset
declaration, because the parser would have to read UTF-16 in order to
parse it by definition.
</p>
<p>
- In short, for correct character encoding, a polyglot document <em
+ In short, for correct character encoding, polyglot markup <em
title="must" class="rfc2119">must</em> either:
</p><ul>
<li>Use UTF-8 or UTF-16 with the appropriate BOM.</li>
@@ -204,12 +205,12 @@
<p>
- If a polyglot document uses an encoding other than UTF-8 or UTF-16,
-it <em title="must" class="rfc2119">must</em> include the XML
-declaration; however, in this case the document <em title="must"
-class="rfc2119">must</em> also include the HTML <code>meta</code> tag
-specifying the character set.
- When a polyglot document uses both the XML declaration and the HTML <code>meta</code>
+ If polyglot markup uses an encoding other than UTF-8 or UTF-16, it <em
+ title="must" class="rfc2119">must</em> include the XML declaration;
+however, in this case the document <em title="must" class="rfc2119">must</em>
+ also include the HTML <code>meta</code> tag specifying the character
+set.
+ When polyglot markup uses both the XML declaration and the HTML <code>meta</code>
tag, these <em title="must" class="rfc2119">must</em> specify the same
character encoding.
@@ -219,9 +220,8 @@
<div id="doctype" class="section">
<!--OddPage--><h2><span class="secno">4. </span>The DOCTYPE</h2>
<p>
- A polyglot document uses the <code>&lt;!DOCTYPE html&gt;</code>
-doctype.
- Note that for a polyglot document the string, <code>html</code>, <em
+ Polyglot markup uses the <code>&lt;!DOCTYPE html&gt;</code> doctype.
+ Note that for polyglot markup the string, <code>html</code>, <em
title="must" class="rfc2119">must</em> be lower case.
For a pure HTML document, the string is defined as case-insensitive. [<a
href="#bib-HTML5" rel="biblioentry" class="bibref">HTML5</a>]
@@ -231,8 +231,7 @@
<div id="namespaces" class="section">
<!--OddPage--><h2><span class="secno">5. </span>Namespaces</h2>
<p>
- The following rules apply to namespaces used in polyglot
-documents.
+ The following rules apply to namespaces used in polyglot markup.
</p>
<ul>
<li>
@@ -271,8 +270,8 @@
<div class="section" id="required-elements">
<h3><span class="secno">6.1 </span>Required Elements</h3>
<p>
- Each polyglot document <em title="must" class="rfc2119">must</em>
- have a root <code>html</code> element.
+ Each document using polyglot markup <em title="must"
+class="rfc2119">must</em> have a root <code>html</code> element.
The root <code>html</code> element <em title="must"
class="rfc2119">must</em> contain both a <code>head</code> and a <code>body</code>
element.
@@ -284,9 +283,9 @@
<div id="tables" class="section">
<h4><span class="secno">6.1.1 </span>Tables</h4>
<p>
- Within a polyglot document, a table <em title="must"
-class="rfc2119">must</em> explicitly have a <code>tbody</code> element
-surrounding groups of <code>tr</code> elements.
+ Polyglot markup <em title="must" class="rfc2119">must</em>
+explicitly have a <code>tbody</code> element surrounding groups of <code>tr</code>
+ elements within a <code>table</code> element.
HTML parsers insert the <code>tbody</code> element, but XML
parsers do not, thus creating different DOMs.
</p>
@@ -307,28 +306,26 @@
<p>
The following guidelines apply to any usage of element names,
attribute names, or attribute values in markup, script, or CSS.
- When required, a polyglot document uses lower case letters for all
-ASCII letters; however, case requirements do not apply to non-ASCII
-letters such as Greek, Cyrillic, or non-ASCII Latin letters.
+ When required, polyglot markup uses lower case letters for all ASCII
+letters; however, case requirements do not apply to non-ASCII letters
+such as Greek, Cyrillic, or non-ASCII Latin letters.
</p>
<div id="element-names" class="section">
<h4><span class="secno">6.2.1 </span>Element Names</h4>
- <p>A polyglot document uses the correct case for element
-names.</p>
+ <p>Polyglot markup uses the correct case for element names.</p>
<ul>
<li>
- A polyglot document uses lowercase letters for all HTML element
-names.
+ Polyglot markup uses lowercase letters for all HTML element names.
</li>
<li>
- A polyglot document uses lowercase letters for all MathML element
+ Polyglot markup uses lowercase letters for all MathML element
names.
</li>
<li>
- A polyglot document uses lowercase letters for all SVG element
-names except the following, which <em title="must" class="rfc2119">must</em>
- be in mixed case:
+ Polyglot markup uses lowercase letters for all SVG element names
+except the following, which <em title="must" class="rfc2119">must</em>
+be in mixed case:
<ul>
<li><code>altGlyph</code></li>
<li><code>altGlyphDef</code></li>
@@ -374,22 +371,22 @@
<div id="attribute-names" class="section">
<h4><span class="secno">6.2.2 </span>Attribute Names</h4>
<p>
- A polyglot document uses the correct case for attribute names.
+ Polyglot markup uses the correct case for attribute names.
</p>
<ul>
<li>
- A polyglot document uses lowercase letters in attribute
-names for all HTML elements.
+ Polyglot markup uses lowercase letters in attribute names
+for all HTML elements.
</li>
<li>
- A polyglot document uses lowercase letters in attribute
-names for all MathML elements except the following:
+ Polyglot markup uses lowercase letters in attribute names
+for all MathML elements except the following:
<p>The lowercase <code>definitionurl</code> <em
title="must" class="rfc2119">must</em> be changed to the mixed case <code>definitionURL</code>.</p>
</li>
<li>
- A polyglot document uses lowercase letters in attribute
-names for all SVG elements except the following, which <em title="must"
+ Polyglot markup uses lowercase letters in attribute names
+for all SVG elements except the following, which <em title="must"
class="rfc2119">must</em> be in mixed case:
<ul>
<li><code>attributeName</code></li>
@@ -462,13 +459,12 @@
<div id="attribute-values" class="section">
<h4><span class="secno">6.2.3 </span>Attribute Values</h4>
<p>
- A polyglot document uses lowercase letters for the values of the
+ Polyglot markup uses lowercase letters for the values of the
attributes in the following list when they exist on HTML elements.
- More specifically, where required, a polyglot document <em
-title="must" class="rfc2119">must</em> use lower case letters for all
-ASCII letters in these attribute values; however, case requirements do
-not apply to non-ASCII letters such as Greek, Cyrillic, or non-ASCII
-Latin letters.
+ More specifically, where required, polyglot markup <em title="must"
+class="rfc2119">must</em> use lower case letters for all ASCII letters
+in these attribute values; however, case requirements do not apply to
+non-ASCII letters such as Greek, Cyrillic, or non-ASCII Latin letters.
Attributes for HTML elements other than those in the following list <em
title="may" class="rfc2119">may</em> have values made of mixed case
letters.
@@ -530,8 +526,8 @@
<div id="empty-elements" class="section">
<h3><span class="secno">6.3 </span>Empty Elements</h3>
<p>
- A polyglot document uses only the elements in the following
-list as empty elements.
+ Polyglot markup uses only the elements in the following list as
+ empty elements.
</p>
<ul>
<li><code>area</code></li>
@@ -550,15 +546,15 @@
<li><code>source</code></li>
</ul>
<p>
- A polyglot document uses the minimized tag syntax for empty
+ Polyglot markup uses the minimized tag syntax for empty
elements, e.g. <code>&lt;br/&gt;</code>.
The alternative syntax <code>&lt;br&gt;&lt;/br&gt;</code>
allowed by XML gives uncertain results in many existing user agents.
</p>
<p>
Given an empty instance of an element whose content model is not
- EMPTY (for example, an empty title or paragraph) a polyglot document
-does not use the minimized form (e.g. the document uses <code>&lt;p&gt;&lt;/p&gt;</code>
+ EMPTY (for example, an empty title or paragraph) polyglot markup does
+not use the minimized form (e.g. the document uses <code>&lt;p&gt;&lt;/p&gt;</code>
and not <code>&lt;p /&gt;</code>).
</p>
<p>
@@ -570,10 +566,10 @@
<div id="attributes" class="section">
<!--OddPage--><h2><span class="secno">7. </span>Attributes</h2>
- <p>A polyglot document does not contain line breaks and multiple white
- space characters within attribute values. These are handled
+ <p>Polyglot markup does not contain line breaks and multiple white
+space characters within attribute values. These are handled
inconsistently by user agents.</p>
- <p>A polyglot document surrounds all attribute values with quotation
+ <p>Polyglot markup surrounds all attribute values with quotation
marks. Attribute values <em title="may" class="rfc2119">may</em> be
surrounded either by single quotation marks or by double quotation
marks.</p>
@@ -585,8 +581,7 @@
<!--OddPage--><h2><span class="secno">8. </span>Named Entity
References</h2>
<p>
- A polyglot document uses only the following named entity
-references:
+ Polyglot markup uses only the following named entity references:
</p>
<ul>
<li><code>amp</code></li>
@@ -597,7 +592,7 @@
</ul>
<p>
For entities beyond the previous list, a polyglot document uses
-character references. For example, a polyglot document uses <code>&amp;#160;</code>
+character references. For example, polyglot markup uses <code>&amp;#160;</code>
instead of <code>&amp;nbsp;</code>.
</p>
</div>
@@ -609,9 +604,9 @@
Script and style commands <em title="should" class="rfc2119">should</em>
be included by linking to external files rather than including them
in-line.
- However, a polyglot document <em title="must not"
-class="rfc2119">must not</em> link to an external stylesheet by using
-the xml-stylesheet processing instruction.
+ However, polyglot markup <em title="must not" class="rfc2119">must
+ not</em> link to an external stylesheet by using the xml-stylesheet
+processing instruction.
See also <a href="#PI-and-xml">Processing Instructions and the
XML Declaration</a>.
</p>
@@ -622,22 +617,22 @@
<p>
Although <code>document.write()</code> and <code>document.writeln()</code>
are valid in an HTML document, neither function may be used in XHTML.
- Therefore, neither is used in a polyglot document.
+ Therefore, neither is used in polyglot markup.
Instead, use the <code>innerHTML</code> property for both HTML
and XHTML.
Note that the <code>innerHTML</code> property takes a string.
XML parsers parse the string as XML in XHTML.
HTML parsers parse the string as HTML in HTML.
Because of the difference in parsing, if you send the parser
-content that does not follow the rules for a polyglot document the
-results will differ for a DOM created with an XML parser and one created
-with an HTML parser.
+content that does not follow the rules for polyglot markup the results
+will differ for a DOM created with an XML parser and one created with an
+HTML parser.
</p>
<div id="external-script-and-style" class="section">
<h3><span class="secno">9.1 </span>External Script and Style</h3>
<p>
- A polyglot document uses external scripts if that document's
-script or style sheet uses <code>&lt;</code> or <code>&amp;</code> or <code>]]&gt;</code>
+ Polyglot markup uses external scripts if that document's script
+or style sheet uses <code>&lt;</code> or <code>&amp;</code> or <code>]]&gt;</code>
or <code>--</code>.
Note that XML parsers are permitted to silently remove the
contents of comments; therefore, the historical practice of hiding
@@ -648,10 +643,10 @@
<div id="in-line-script-and-style" class="section">
<h3><span class="secno">9.2 </span>In-line Script and Style</h3>
<p>
- If a polyglot document must use script or style commands within
-its source code, either use safe content or wrap the command in a CDATA
+ If polyglot markup must use script or style commands within its
+source code, either use safe content or wrap the command in a CDATA
section.
- However, a polyglot document does not use a <code>CDATA</code>
+ However, polyglot markup does not use a <code>CDATA</code>
section unless it is being used within foreign content.
</p><ul>
<li>Safe content is content that does not contain a <code>&lt;</code>
@@ -683,8 +678,8 @@
<p>
When using MathML or SVG, the parser follows the XML parsing
rules.
- A polyglot document does not rely on getting a CDATA instance
-from the DOM when using MathML or SVG, because the HTML parser does not
+ Polyglot markup does not rely on getting a CDATA instance from
+the DOM when using MathML or SVG, because the HTML parser does not
create a CDATA instance in the DOM.
</p>
Received on Monday, 21 June 2010 21:08:12 UTC
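
As a rough illustration of the constraints touched by this diff, a minimal document using polyglot markup might look like the sketch below. It is not taken from the specification. The file names style.css and script.js are hypothetical, the xmlns value is the standard XHTML namespace (the namespace rules themselves are elided from the hunks above), and the file is assumed to be saved and served as UTF-8.

```html
<!DOCTYPE html>
<!-- Sketch only: lower-case "html" in the doctype, no XML processing instructions -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Polyglot markup example</title>
    <!-- External style sheet via link, not the xml-stylesheet processing instruction -->
    <link rel="stylesheet" href="style.css"/>
  </head>
  <body>
    <!-- Numeric character reference instead of the nbsp named entity -->
    <p>Fish&#160;&amp;&#160;chips</p>
    <!-- Empty non-void element: start and end tag, not the minimized form -->
    <p></p>
    <table>
      <!-- Explicit tbody so HTML and XML parsers build the same tree -->
      <tbody>
        <tr><td>first line<br/>second line</td></tr>
      </tbody>
    </table>
    <!-- External script, so no CDATA wrapping or escaping is needed -->
    <script src="script.js"></script>
  </body>
</html>
```

Under these assumptions, the document should yield the same DOM whether it is parsed as HTML or as XML: the tbody is explicit, void elements use the minimized form, the empty p does not, and all attribute values are quoted.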