CVS html5/html-polyglot

Update of /sources/public/html5/html-polyglot
In directory roscoe:/tmp/cvs-serv7358

Modified Files:
	html-polyglot.html 
Log Message:
More language edits, up to but not including 4.6.2.1 The safe text content option

--- /sources/public/html5/html-polyglot/html-polyglot.html	2014/01/07 22:41:46	1.20
+++ /sources/public/html5/html-polyglot/html-polyglot.html	2014/01/08 00:44:58	1.21
@@ -218,7 +218,7 @@
     <section id="PI-and-xml" class="section">
     <h3>Processing instructions and the XML declaration</h3>
     <p>
-        Processing Instructions and the XML Declaration are both forbidden in <a>polyglot markup</a>.
+        Processing instructions and the XML declaration are both forbidden in <a>polyglot markup</a>.
     </p>
     <!--End section: Processing Instructions and the XML Declaration-->
 </section>
@@ -226,13 +226,14 @@
     <h3>Specifying a document’s character encoding</h3>
     <p>
         <a title="polyglot markup">Polyglot markup</a> uses the UTF-8 character encoding, the only character encoding for which both HTML and XML require support.
-        HTML requires UTF-8 to be explicitly declared to avoid <a href="http://www.w3.org/TR/html5/semantics.html#charset">fallback to a legacy encoding</a> [[!HTML5]].
-        For XML, UTF-8 is an <a href="http://www.w3.org/TR/2008/REC-xml-20081126/#charencoding">encoding default</a>.
-        As such, character encoding MAY be left undeclared in XML with the result that UTF-8 is still supported [[!XML10]].
+        HTML requires UTF-8 to be explicitly declared to avoid <a href="http://www.w3.org/TR/html5/semantics.html#charset">fallback to a legacy encoding</a>. [[!HTML5]]
+    </p>
+    <p> For XML, UTF-8 is an <a href="http://www.w3.org/TR/2008/REC-xml-20081126/#charencoding">encoding default</a>. 
+        As such, character encoding MAY be left undeclared in XML with the result that UTF-8 is still supported [[!XML10]].
     </p>
     <p>
         <a title="polyglot markup">Polyglot markup</a> declares the UTF-8 character encoding in the following ways, which may be used separately or
-        in combination (but note that here can only be a <em>single</em> <a title="HTML encoding declaration">HTML encoding declaration</a>):
+        in combination (but note that there can only be a <em>single</em> <a title="HTML encoding declaration">HTML encoding declaration</a>):
     </p>
     <ul>
         <li>Within the document
@@ -316,7 +317,7 @@
         <p>
             [[!HTML5]] introduces undeclared (native) default namespaces for the root HTML element, <code>html</code>, the root SVG element, <code>svg</code>,
             and the root MathML element, <code>math</code>.
-            <a title="polyglot markup">Polyglot markup</a> declares the following default namespaces, when the markup languages are included in the document, to maintain XML-compatibility [[!XML10]]:</p>
+            <a title="polyglot markup">Polyglot markup</a> declares the following default namespaces, when the markup languages are included in the document, to maintain XML compatibility [[!XML10]]:</p>
         <ul class="inline-list">
             <li><code>&lt;html xmlns="http://www.w3.org/1999/xhtml"></code></li>
             <li><code>&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML"></code></li>
@@ -354,13 +355,13 @@
         </ul>
         <p>
             Note that there are other prefixed attributes that can be used beyond <code>xlink:href</code> (such as <code>xml:base</code>).
-            <a title="polyglot markup">Polyglot markup</a> does not declare these prefixes via xmlns. The prefixes are implicitly declared
+            <a title="polyglot markup">Polyglot markup</a> does not declare these prefixes via <code>xmlns</code>. The prefixes are implicitly declared
             in XML and are automatically applied to the appropriate attributes in HTML.
         </p>
         <p>
             The namespaced attributes, such as <code>xml:lang=""</code> and <code>xmlns=""</code>, are "namespaced" within XHTML, SVG and MathML.
-            Thus, the rules for how they can be sued as CSS selectors is governed by CSS namespaces. [[!CSS3NAMESPACE]]
-            For more on the issues related to attribute selectors and namespaces, with and without prefix, see the section on <a
+            Thus, the rules for how they can be used as CSS selectors are governed by CSS namespaces. [[!CSS3NAMESPACE]]
+            For more about the issues related to attribute selectors and namespaces, with and without prefixes, see the section on <a
             href="#scripting-and-styling-polyglot-markup">Scripting and styling polyglot markup</a>.
         <p>
 
@@ -374,19 +375,21 @@
         <section id="required-elements" class="section">
     <h6>Required elements and tags</h6>
 
-    <p> HTML5’s concept of <dfn>optional tags</dfn> – start tags and/or end tags – covers <a
-            href="http://www.w3.org/TR/html5/syntax.html#optional-tags">elements that the
-        HTML parser itself automatically adds to the DOM</a> if the code doesn’t contain the tags for
-        them. However, since XML does not have a feature whereby elements with one or both tags that have been
-        omitted  from the code (such as when start and end tags of <code>html</code> are omitted) are added to the DOM,
-        omitting a tag in <a>polyglot markup</a> is equivalent of producing a not well-formed document or,
-        if both tags are omotted, equivalent of not adding the element at all. Therefore, <a>polyglot markup</a> does not
-        operate with <a>optional tags</a>.</p>
-
-    <p>That <a>polyglot markup</a> doesn’t operate with optional tags, may create surprises e.g. for someone not used
-        to adding e.g. the <code>tbody</code> tags in their code or to someone accustomed to omitting the end tag of the
-        <code>p</code> element. However, the requirement to be complete with regard to tags, is a key feature of <a>polyglot
-            markup</a> that makes the code <a title="robustness">robust</a> against subpar parsers and authoring surprises.</p>
+    <p> <a title="polyglot markup">Polyglot markup</a> does not employ <a>optional tags</a>.
+        HTML5’s concept of <dfn>optional tags</dfn> – missing start tags and/or end tags – covers 
+        <a href="http://www.w3.org/TR/html5/syntax.html#optional-tags">
+        elements that the HTML parser itself automatically adds to the DOM</a> 
+        if the code doesn’t contain the tags for them. 
+        Because XML has no such feature for adding missing start and/or end tags to the DOM, 
+        omitting a tag in <a>polyglot markup</a> is equivalent to producing a document that is not well-formed or,
+        if both tags are omitted, equivalent to not adding the element at all. </p>
+
+    <p>The fact that <a>polyglot markup</a> doesn’t operate with optional tags may create surprises for an author not used
+        to adding the <code>tbody</code> tags in their code, for example, 
+        or for someone accustomed to omitting the end tag of the <code>p</code> element. 
+        However, the requirement to be well-formed with regard to tags is a key feature of <a>polyglot markup</a> 
+        that makes the code <a title="robustness">robust</a> against subpar parsers and authoring surprises.
+    </p>
     <section id="minimal-polyglot-html-document">
         <h4>A minimal HTML document</h4>
         <p>
@@ -645,7 +648,7 @@
         	<a>polyglot markup</a> uses both the <code>lang</code> and the <code>xml:lang attributes</code> 
         	(see <a href="#language-attributes">Language attributes</a>); however, 
         	the <a href="http://www.w3.org/TR/css3-selectors/#lang-pseudo">CSS3 Selectors specification</a> stipulates that 
-        	language attributes, including <code>xml:lang</code>, are matched in a case-insensitive way. [[!SELECT]]
+        	language attributes, including <code>xml:lang</code>, are matched in a case insensitive way. [[!SELECT]]
         </p>
         <!--End section: Attribute values-->
     </section>
@@ -704,15 +707,17 @@
     </figure>
 
     <p>
-        In the HTML syntax, the contents of raw text elements is raw text, by which it is referred to the fact
-        that the HTML parser will not treat contained code that look like tags (element tags and comment tags), character references,
-        CDATA etc as tags, character references, CDATA etc, but as raw text. (See HTML5 for the exact rules.)
+        In HTML syntax, the content of raw text elements is raw text.
+        In other words, the HTML parser does not treat contained code that looks like tags (element tags and comment tags), 
+        character references, CDATA, etc. as tags, character references, CDATA, etc., but as raw text. 
+        (See HTML5 for the exact rules.)
         In the XHTML syntax, however, the same constructs <em>will</em> be treated as tags, character references, CDATA etc.
     </p>
-    <p>As result, in HTML, it is simpler than it is in XHTML, for authors to comply with the requirement of the default MIME
-        types of the raw text elements. On the other side, by the use of <code class="CDATA">CDATA</code>, the raw text contents
-        parsed as XHTML, can be made ven less semantic than the raw text data of HTML, leading to potential harms if the document
-        is parsed as HTML
+    <p>As a result, it is simpler for authors to comply with the requirement of the default MIME
+        types of the raw text elements in HTML than it is in XHTML. 
+        On the other hand, with <code class="CDATA">CDATA</code>, the raw text contents
+        parsed as XHTML can be made even less semantic than the raw text data of HTML, 
+        leading to potential harms if the document is parsed as HTML.
     </p>
 
     <figure id="ambiguous-table">
@@ -740,9 +745,9 @@
             <tr><td><code>cdata content</code></td><td>the content of CDATA sections</td><td></td><td>uninterpreted</td><td>—</td></tr>
             <tr><td><code>&lt;/script</code> </td><td>if occuring inside  <code>script</code> element and followed by one of "tab" (U+0009), "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), U+0020 SPACE, ">" (U+003E), or "/" (U+002F)</td><td>terminates parent</td><td>uninterpreted</td><td>interpreted</td></tr>
             <tr><td><code>&lt;/style</code></td><td>if occuring inside <code>style</code> element and followed by one of "tab" (U+0009), "LF" (U+000A), "FF" (U+000C), "CR" (U+000D), U+0020 SPACE, ">" (U+003E), or "/" (U+002F)</td><td>terminates parent</td><td>uninterpreted</td><td>interpreted</td></tr>
-            <tr><td><code>&lt;foo>&lt;/bar></code></td><td>all other tags, wellformed or not</td><td>uninterpreted</td><td>uninterpreted</td><td>interpreted <small>subject to normal parsing rules</small></td></tr>
+            <tr><td><code>&lt;foo>&lt;/bar></code></td><td>all other tags, well-formed or not</td><td>uninterpreted</td><td>uninterpreted</td><td>interpreted <small>subject to normal parsing rules</small></td></tr>
             <tr><td><code>&#38;#foo;</code></td><td>character references</td><td>uninterpreted</td><td>uninterpreted</td><td>interpreted <small>subject to normal parsing rules</small></td></tr> </tbody>        <tbody>
-        <tr><th><code>none of the above strings</code></th><td>Any other string</td><td>uninterpreted</td><td>uninterpreted</td><td>uninterpreted</td></tr>
+        <tr><td><code>none of the above strings</code></td><td>Any other string</td><td>uninterpreted</td><td>uninterpreted</td><td>uninterpreted</td></tr>
         </tbody>
         </table>
     </figure>
@@ -750,7 +755,7 @@
 
     <p>Syntactically, the polyglot subset is found by</p>
     <ul><li><em>either</em> <strong>limiting the content to <dfn>safe content</dfn></strong>, that
-        is: text that gets interpreted the same way in HTML and in XML.</li>
+        is, text that gets interpreted the same way in HTML and in XML.</li>
         <li><em>or</em> trying to <strong>even out the constraints differences</strong> by
             wrapping the contents in a <code>CDATA</code> section. The <code>CDATA</code> code is then seen as text
             by the HTML parser (and can thus interfere with the scripting or styling language!), while the XML parser sees the

Received on Wednesday, 8 January 2014 00:44:59 UTC