CVS html5/html-xhtml-author-guide from CVS User lsilli on 2013-09-26 (public-html-commits@w3.org from September 2013)

From: CVS User lsilli <cvsmail@w3.org>
Date: Thu, 26 Sep 2013 14:39:45 +0000
To: public-html-commits@w3.org
Message-Id: <E1VPCjJ-0001p9-8E@roscoe.w3.org>
Update of /sources/public/html5/html-xhtml-author-guide
In directory roscoe:/tmp/cvs-serv7011/html-xhtml-author-guide

Modified Files:
	html-xhtml-authoring-guide.html 
Log Message:
1) Finalized, for now, the element contents section.
2) Created '3.11 Scripting and styling polyglot markup'
  (mostly by regrouping and rewriting stuff already in the spec)
3) Other fixes.

--- /sources/public/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html	2013/09/05 04:50:55	1.134
+++ /sources/public/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html	2013/09/26 14:39:45	1.135
@@ -8,7 +8,7 @@
         var respecConfig = {
             specStatus:   "ED",
             shortName:    "html-polyglot",
-            publishDate:  "2013-09-03",
+            publishDate:  "2013-09-26",
             previousPublishDate:  "2010-10-19",
             previousMaturity:  "WD",
             edDraftURI:           "http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html",
@@ -33,9 +33,7 @@
     ul.inline-list li:last-child:after {content:"";}
     </style>
 </head>
-
 <body>
-
 <section id="abstract">
     A document that uses <a title="polyglot markup">polyglot markup</a> is a document that is a stream of bytes that parses into identical document trees
     (with some exceptions, as noted in the <a href="#introduction">Introduction</a>) when processed as HTML and when processed as XML.
@@ -45,7 +43,6 @@
     Further constraints include those on void elements, named entity references, and the use of scripts and style.
     <!--End section: Abstract-->
 </section>
-
 <section id="sotd">
     <p>
         This document summarizes design guidelines for authors who wish their XHTML or HTML documents to validate on both HTML and XML parsers.
@@ -72,9 +69,7 @@
     </p>
     <!--End section: Status of This Document-->
 </section>
-
-<!--
-note: for principle sectoin
+<!-- note: for principle section
 		In <a>polyglot markup</a>, the strings that XML and HTML interpret differently are considered <dfn>ambiguous
         strings</dfn> and MUST NOT be used except when they are explicitly permitted
 (such as for the ambigous namespace prefix <code>xml:</code>, which is permitted as prefix for the <code>lang</code> in the XML namespace – <code>xml:lang</code>).
@@ -166,7 +161,7 @@
 <!-- end intro-->
 
 <section id="syntax">
-    <h2>The syntax of polyglot markup</h2>
+    <h2>Syntax</h2>
     <section id="principles"><h3>Principles</h3>
         <p>
             <dfn>Polyglot markup</dfn> results in:
@@ -206,16 +201,14 @@
     <!--End section: principles-->
 </section>
 <section id="writing"><h2>Writing HTML documents</h2>
-<section id="PI-and-xml" class="section">
+    <section id="PI-and-xml" class="section">
     <h3>Processing instructions and the XML declaration</h3>
     <p>
         Processing Instructions and the XML Declaration are both forbidden in <a>polyglot markup</a>.
     </p>
     <!--End section: Processing Instructions and the XML Declaration-->
 </section>
-
-
-<section id="character-encoding" class="section">
+    <section id="character-encoding" class="section">
     <h3>Specifying a document’s character encoding</h3>
     <p>
         <a title="polyglot markup">Polyglot markup</a> uses the UTF-8 character encoding, the only character encoding for which both HTML and XML require support.
@@ -267,8 +260,7 @@
     </p>
     <!--End section: Specifying a Document's Character Encoding-->
 </section>
-
-<section id="doctype" class="section">
+    <section id="doctype" class="section">
     <h3>The DOCTYPE</h3>
     <p>
         <a title="polyglot markup">Polyglot markup</a> uses a document type declaration (DOCTYPE) specified by <a href="http://www.w3.org/TR/html5/syntax.html#the-doctype">section 8.1.1</a> of [[!HTML5]].
@@ -299,9 +291,7 @@
     </p>
     <!--End section: The DOCTYPE-->
 </section>
-
-
-<section id="namespaces" class="section">
+    <section id="namespaces" class="section">
     <h3>Namespaces</h3>
     <p>
         The following rules apply to namespaces used in <a>polyglot markup</a>.
@@ -348,34 +338,23 @@
         <p>
             Note that there are other prefixed attributes that can be used beyond <code>xlink:href</code> (such as <code>xml:base</code>).
             <a title="polyglot markup">Polyglot markup</a> does not declare these prefixes via xmlns. The prefixes are implicitly declared
-            in XML and are automatically  applied to the appropriate attributes in HTML.
+            in XML and are automatically applied to the appropriate attributes in HTML.
         </p>
         <p>
-            The prefixed attributes, such as <code>xml:lang=""</code>, are "namespaced" within XHTML, SVG and MathML.
-            Thus, they can be styled via CSS3 namespaces. [[!CSS3NAMESPACE]]
-            However, for the HTML serialization, <code>xml:lang</code> would then not have the xml namespace effect.
-            A style such as the following is valid in XHTML, SVG, and MathML,
-            it does not work in HTML and is therefore not used in <a>polyglot markup</a>.
-        </p>
-		<pre class="example highlight">
-&lt;style type="text/css">
-@namespace xml   "http://www.w3.org/XML/1998/namespace";
-*[xml|lang]{background:lime;}
-&lt;/style>
-		</pre>
+            The namespaced attributes, such as <code>xml:lang=""</code> and <code>xmlns=""</code>, are "namespaced" within XHTML, SVG and MathML.
+            Thus, the rules for how they can be used as CSS selectors are governed by CSS namespaces. [[!CSS3NAMESPACE]]
+            For more on the issues related to attribute selectors and namespaces, with and without prefix, see the section on <a
+            href="#scripting-and-styling-polyglot-markup">Scripting and styling polyglot markup</a>.
         <p>
 
-        </p>
         <!-- End section, "Attribute-Level Namespaces" -->
     </section>
     <!--End section: Namespaces-->
 </section>
-
-<section id="elements" class="section">
+    <section id="elements" class="section">
 <h3>Element syntax</h3>
 <p><a title="polyglot markup">Polyglot markup</a> conforms to the following rules regarding elements.</p>
-
-<section id="required-elements" class="section">
+        <section id="required-elements" class="section">
     <h6>Required elements and tags</h6>
 
     <p> HTML5’s concept of <dfn>optional tags</dfn> – start tags and/or end tags – covers <a
@@ -447,8 +426,7 @@
     </section>
 
 </section>
-
-<section id="excluded-eelements" class="section">
+        <section id="excluded-elements" class="section">
     <h3>Excluded elements and tags</h3>
 
     <p>
@@ -462,9 +440,7 @@
     </p>
     <!--End section: Elements that Cannot Be Used in Polyglot Markup-->
 </section>
-
-
-<section id="case-sensitivity" class="section">
+        <section id="case-sensitivity" class="section">
     <h3>Case-sensitivity</h3>
     <p>
         The following apply to any usage of element names, attribute names, or attribute values in markup, script, or CSS.
@@ -652,15 +628,14 @@
     </section>
     <!--End section: Case-Sensitivity-->
 </section>
-<!--End section: Elements -->
+    <!--End section: Elements -->
 </section>
-
-<section id="contents-of-elements" class="section">
+    <section id="contents-of-elements" class="section">
 <h3>Element contents</h3>
 <p>For the <a href="http://www.w3.org/TR/html5/syntax.html#elements-0">different kinds of elements</a> that HTML documents contain, <a>polyglot markup</a> conforms to the following contents rules.</p>
-<section id="empty-elements" class="section">
+        <section id="empty-elements" class="section">
     <h4>Void elements</h4>
-    <p>In the HTML syntax, void elements are elements that always are empty and never has an end tag. All elements
+    <p>In the HTML syntax, void elements are elements that always are empty and never have an end tag. All elements
         listed as void <a href="http://www.w3.org/TR/html5/syntax.html#void-elements" >in the HTML specification</a> or
         in an extension spec, MUST in <a title="polyglot markup">polyglot
             markup</a> have the syntactic form of an XML <a href="http://www.w3.org/TR/REC-xml/#dt-empty"
@@ -689,9 +664,7 @@
 
     <!--End section: void Elements-->
 </section>
-
-
-<section id="raw-text-elements">
+        <section id="raw-text-elements">
     <h4>Raw text elements (<code>script</code> and <code>style</code>)</h4>
     <p>
         In <a>polyglot markup</a>, the contents of all elements listed as raw text elements
@@ -767,13 +740,13 @@
         guarantee that the content is <a title="safe content">safe</a>.</p>
 
     <section id="safe-content">
-        <h5>The safe text option</h5>
-        <p>The <dfn>safe content</dfn> option comes in two variants:</p>
+        <h5>The safe text content option</h5>
+        <p>The <dfn>safe text content</dfn> option comes in two variants:</p>
         <ul>
-            <li>The <strong>external <a>safe content</a></strong> variant. This implies to include the scripts or stylesheet by linking to an
+            <li>The <strong>external <a>safe text content</a></strong> variant. This variant includes the scripts or stylesheet by linking to an
                 external file rather than including all the code
                 in-line. External files are parsed as the respective script or stylesheet, and are thus not limited
-                by the polyglot parsing restriction.
+                by the safe text content restrictions.
                 <figure>
                     <figcaption>Using external <a title="safe content">safe content</a>.</figcaption>
     <pre class="example highlight"
@@ -782,11 +755,11 @@
             />&lt;style>@import "external.css";&lt;/style></pre>
                 </figure>
             </li>
-            <li>The <strong>inline <a>safe content</a></strong> variant. This option implies to abstain from using  characters and constructs
+            <li>The <strong>inline <a>safe text content</a></strong> variant. This variant abstains from using characters and constructs
                 which HTML and XML interpret differently, namely the characters <code>&lt;</code> and <code>&amp;</code>
                 as well as the <code>CDATA</code> end mark string – <code>]]&gt;</code>.
                 <figure>
-                    <figcaption>Using inline <a title="safe content">safe content</a></figcaption>
+                    <figcaption>Using inline <a title="safe content">safe text content</a></figcaption>
 <pre class="example highlight">&lt;!-- Unsafe content: &lt; and &amp; are not escaped<br
         />     This code is not XML well-formed. --><br/>&lt;style>q::before{content:"&lt;";}&lt;/style><br/>&lt;script>var a = "&amp;";&lt;/script>
 
@@ -795,14 +768,14 @@
 
 &lt;!-- Safe content: <code>&lt;</code> and <code>&amp;</code> escaped at scripting/stylesheet level --><br/>&lt;style>q::before{content:"\00003c";}&lt;/style><br/>&lt;script>var a = "\u0026";&lt;/script></pre>
                 </figure>
-                <p>For CSS, the inline <a>safe content</a> option would work very well most of the time, as <code>&lt;</code> and
+                <p>For CSS, the inline <a>safe text content</a> option would work very well most of the time, as <code>&lt;</code> and
                     <code>&amp;</code> are not key parts of CSS and not very often used. But when it comes to JavaScript,
                     the <code>&#38;</code> and the <code>&lt;</code> are key verbs (operators) of the
                     language, and thus one soon runs into trouble – it is better to use <em>external</em> <a>safe content</a>.</p>
             </li>
         </ul>
         <figure>
-            <figcaption>An example of inline safe content in <code>script</code></figcaption>
+            <figcaption>An example of inline safe text content in <code>script</code></figcaption>
 <pre class="example highlight"
         >&lt;!-- The following the example is <a>polyglot markup</a> because there are no
 <a>     ambiguous strings</a> within the <code>script</code> element. --><br
@@ -824,8 +797,10 @@
 
         <p>But while CDATA evens out the constraints, it introduces a new problem: When consumed as HTML, the start and end mark of the
             CDATA section is seen by the script or stylesheet interpreter and can thus cause syntax errors or even halt the script
-            and stylesheet execution. The only way to deal with it is to comment out the CDATA start and end mark
-            using the comment methods of the script or stylesheet language.</p>
+            and stylesheet execution. The way to deal with it is to comment out the CDATA start and end marks
+            using the comment methods of the script or stylesheet language. Additionally, if, for example, <code>script</code> is used as a
+            coding block container, it may even be necessary to comment out the scripting/styling comments by hiding them
+            inside an XML comment.</p>
 
         <section id="CDATA-rules-raw-text">
             <h6>Safe CDATA usage rules</h6>
@@ -834,10 +809,11 @@
             <ul>
                 <li> The CDATA section is subject to HTML’s restrictions on <code>&lt;script></code>/<code>&lt;style></code></li>
                 <li> Only one CDATA section permitted per raw text element</li>
-                <li> Before the CDATA section there can only be one node, which may consist of whitespaces, one XML comment and/or one scripting level comment.</li>
-                <li> After the CDATA section, there can only be whitespace.</li>
-                <!--      <li> Only single line comments are permitted. (This rules out CDATA for "text/css".)</li> --></ul>
-            <p>The <code>]]&gt;</code> string:</p>
+                <li> Before the CDATA section there can only be one node, preferably only one line of code, which may
+                     consist of whitespace, an XML comment, or a construct of the scripting/styling language (usually
+                     a comment of the scripting/styling language).</li>
+                <li> After the CDATA section, the same rules apply as before the CDATA section.</li>
+            </ul><p>The <code>]]&gt;</code> string:</p>
             <ul>
                 <li> is always commented out if <code>&lt;![CDATA[</code> is commented out.</li>
                 <li> is never commented out if <code>&lt;![CDATA[</code> is not commented out.</li>
@@ -884,28 +860,25 @@
     </section>
 
 </section>
-
-
-<section id="escaped-raw-text-elements">
+        <section id="escaped-raw-text-elements">
     <h4>Escapable raw text elements</h4>
     <p>Escapable raw text elements are elements in which character references are permitted, but where the HTML parser treats elements as text rather than as markup. </p>
     <ul class="inline-list"><li><code>title</code></li><li><code>textarea</code></li></ul>
+    <p>Escapable raw text elements are subject to the same safe text content rules, with the exception that polyglot character entities are permitted.</p>
 </section>
-
-
-<section id="foreign-elements" class="section">
+        <section id="foreign-elements" class="section">
     <h4>Foreign elements</h4>
-    <p>Foo</p>
+    <p>The exact rules for foreign content elements are defined by the respective specifications. </p>
     <!--End section: White Space in textarea and pre Elements-->
 </section>
-
-<section id="normal-elements" class="section">
+        <section id="normal-elements" class="section">
     <h4>Normal elements</h4>
-    <p>Foo</p>
+    <p>Normal elements have no special restrictions other than those that normally apply to polyglot markup. But note that some elements, such as the <code>iframe</code>
+    element, must be empty in polyglot markup, since this is a requirement which the HTML specification sets on <code>iframe</code> in the XHTML syntax.</p>
     <!--End section: White Space in textarea and pre Elements-->
 </section>
 </section>
-<section id="text">
+    <section id="text">
     <h3>Text</h3>
     <section id="newlines-in-textarea-and-pre" class="section">
         <h4>Newlines in <code>textarea</code> and <code>pre</code> elements</h4>
@@ -915,10 +888,7 @@
     </section>
     <!-- End section: text-->
 </section>
-
-
-
-<section id="attributes" class="section">
+    <section id="attributes" class="section">
     <h2>Attributes</h2>
     <p>
         <a title="polyglot markup">Polyglot markup</a> surrounds all attribute values with quotation marks.
@@ -1003,29 +973,13 @@
         </section>
 
 
-        <section id="attributes-styled-via-css-namespaces" class="section">
-            <h4>Attributes styled using CSS namespaces</h4>
-            <p>
-                The prefixed attributes, such as <code>xml:lang=""</code>, are "namespaced" within XHTML, SVG and MathML.
-                Thus, they can be styled via CSS3 namespaces. [[!CSS3NAMESPACE]]
-                However, for the HTML serialization, <code>xml:lang</code> would then not have the xml namespace effect.
-                A style such as the following is valid in XHTML, SVG, and MathML,
-                it does not work in HTML and is therefore not used in <a>polyglot markup</a>.
-            </p>
-<pre class="example highlight">&lt;style type="text/css">
-@namespace xml   "http://www.w3.org/XML/1998/namespace";
-*[xml|lang]{background:lime;}
-&lt;/style></pre>
-        </section>
-
 
         <!-- End section: Attributes with Special Considerations -->
     </section>
 
     <!--End section: Attributes-->
 </section>
-
-<section id="named-entity-references" class="section">
+    <section id="named-entity-references" class="section">
     <h2>Named entity references</h2>
     <p>
         <a title="polyglot markup">Polyglot markup</a> uses only the following named entity references:</p>
@@ -1050,9 +1004,7 @@
 
     <!--End section: Named Entity References-->
 </section>
-
-
-<section id="comments" class="section">
+    <section id="comments" class="section">
     <h2>Comments</h2>
     <p>
         <a title="polyglot markup">Polyglot markup</a> does not begin a comment with either "<code>></code>" or "<code>&#x2D;></code>".
@@ -1060,25 +1012,83 @@
 
     <!--End section: Comments-->
 </section>
-<!--En section: authoring-->
-<section id="scripting-and-styling-polyglot-markup">
-    <h2>Scripting and styling polyglot markup</h2>
-    <section>
-        <h3>Scripting restrictions</h3>
-        <p>Although <code>document.write()</code> and <code>document.writeln()</code> are valid in an HTML document,
-            neither function works in XHTML.  Therefore, they are not used in <a>polyglot markup</a>.
-            Instead, on may use the <code>innerHTML</code> property, which works for both HTML and XHTML.
+    <!--End section: authoring-->
+    <section id="scripting-and-styling-polyglot-markup">
+    <h2>Scripting and styling <a>polyglot markup</a></h2>
+        <p>When applying JavaScript and CSS to <a>polyglot markup</a>, the goal is to get the same result whether consumed
+        as HTML or as XML. It is therefore important to be aware of scripting and styling features that give different
+        results in HTML vs XML. These issues come in addition to the polyglot usage rules for <a
+        href="#raw-text-elements">raw text elements</a>.</p>
+
+    <section id="javascript-and-document.write">
+        <h3>JavaScript: <code>innerHTML</code> vs <code>document.write()</code> </h3>
+        <p>Although <code>document.write()</code> and <code>document.writeln()</code> work in HTML,
+           neither function works in XHTML.  The polyglot alternative is the <code>innerHTML</code> property,
+           which works for both HTML and XHTML.
         </p>
         <p class="note">
-            The <code>innerHTML</code> property takes a string. However, XML parsers will parse that string as XML in XHTML while HTML parsers
-            parse will parse that string as HTML in HTML.  And because of this difference in parsing, the code that <code>innerHTML</code> inserts
-            must follow the guidelines for <a>polyglot markup</a> or else the DOM generated by the XML parser will
-            differ from the DOM generated by the HTML parser.
+            The <code>innerHTML</code> property takes a string. However, XML parsers will parse that string as XML in XHTML
+            while HTML parsers will parse that string as HTML in HTML.  And because of this difference in parsing,
+            the code that <code>innerHTML</code> inserts must follow the guidelines for <a>polyglot markup</a> so that
+            the resulting DOM generated by the XML parser does not differ from the DOM generated by the HTML parser.
         </p>
     </section>
+    <section id="css-and-namespaced-attributes" class="section">
+        <h3>CSS: Attribute selectors that require a namespace prefix</h3>
+        <p>CSS allows authors to target elements by referring to their attributes – so-called attribute selectors:
+           <code class="css">[attr]{rule:foo}</code>. And for the bulk of attributes, this method can be used freely,
+            since <a>polyglot markup</a> relies on default namespaces, which do not apply to attributes.
+            However, some of the attributes required by <a>polyglot markup</a> are namespaced – either by default (such
+            as for the <code>xmlns</code> attribute) or via a prefix that by default is namespaced (such as <code>xml:</code>,
+            <code>xmlns:</code>, <code>xlink:</code>). As a result, a selector such as <code class="css">[xmlns]{rule:foo}</code>
+            will only work in HTML – it will not work in XHTML, and the same goes for prefixed
+            attributes – even if you escape the colon (<code class="css">[xml\:lang]{rule:foo}</code>), such selectors
+            will only work in HTML.</p>

[64 lines skipped]
Received on Thursday, 26 September 2013 14:39:47 UTC