mike: made a number of refinements to the Syntax section

http://dev.w3.org/cvsweb/html5/markup/Overview.html?r1=1.345&r2=1.346&f=h

===================================================================
RCS file: /sources/public/html5/markup/Overview.html,v
retrieving revision 1.345
retrieving revision 1.346
diff -u -d -r1.345 -r1.346
--- Overview.html 6 Aug 2009 10:34:34 -0000 1.345
+++ Overview.html 7 Aug 2009 14:50:28 -0000 1.346
@@ -9,7 +9,7 @@
 <body>
 <div class="head">
 <h1>HTML 5: The Markup Language</h1>
-<h2>Editor&#8217;s Draft <em>6 August 2009</em>
+<h2>Editor&#8217;s Draft <em>7 August 2009</em>
 </h2>
 <dl>
 <dt>Latest Editor&#8217;s Draft:</dt>
@@ -42,7 +42,7 @@
     
     
     <p>
-        This document is the 6 August 2009 Editor&#8217;s Draft of 
+        This document is the 7 August 2009 Editor&#8217;s Draft of 
         <cite>HTML 5: The Markup Language</cite>.
       </p>
     <p>
@@ -191,7 +191,7 @@
 <span class="toc-section-number">&#8199;</span><a href="syntax.html#syntax"><span class="toc-section-number">6.</span> HTML syntax</a>
 <ul>
 <li id="doctype-syntax-toc">
-<span class="toc-section-number"></span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">6.01.</span> The DOCTYPE</a>
+<span class="toc-section-number"></span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">6.01.</span> The doctype</a>
 </li>
 <li id="character-encoding-toc">
 <span class="toc-section-number"></span><a href="syntax.html#character-encoding"><span class="toc-section-number">6.02.</span> Character encoding declaration</a>

Index: syntax.html
===================================================================
RCS file: /sources/public/html5/markup/syntax.html,v
retrieving revision 1.24
retrieving revision 1.25
diff -u -d -r1.24 -r1.25
--- syntax.html 6 Aug 2009 11:02:08 -0000 1.24
+++ syntax.html 7 Aug 2009 14:50:29 -0000 1.25
@@ -15,7 +15,7 @@
   <h2>6. HTML syntax <a class="hash" href="#syntax">#</a> <a class="toc-bak" href="Overview.html#syntax-toc">T</a></h2>
   <div class="toc">
 <ul>
-<li id="doctype-syntax-toc"><span class="toc-section-number">&#8199;</span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">1.</span> The DOCTYPE</a>
+<li id="doctype-syntax-toc"><span class="toc-section-number">&#8199;</span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">1.</span> The doctype</a>
 </li>
 <li id="character-encoding-toc"><span class="toc-section-number">&#8199;</span><a href="syntax.html#character-encoding"><span class="toc-section-number">2.</span> Character encoding declaration</a>
 </li>
@@ -38,59 +38,133 @@
 </ul>
 </div>
   <div id="doctype-syntax" class="section">
-    <h2>6.01. The DOCTYPE <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="Overview.html#doctype-syntax-toc">T</a></h2>
-    <p>A <dfn id="doctype" title="syntax-doctype">DOCTYPE</dfn> is
-    an special instruction which, for legacy reasons that have to
-    do with processing modes in browsers, is a required part of
-    any
-    <a href="documents.html#syntax-document-html">document in the HTML syntax</a>.</p>
-    <p>The DOCTYPE must match either the
-    <a href="syntax.html#doctype.pattern">doctype</a>
-    or
-    <a href="syntax.html#doctype.legacy">doctype.legacy</a>
-    patterns defined this specification, or must match the
-    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a>
-    production defined in the XML specification
-    <a href="references.html#refsXML">[XML]</a>.</p>
-    <p>The <code>doctype</code> pattern is defined as follows:</p>
-    <dl class="pattern-def">
-      <dt><a id="doctype.pattern" href="syntax.html#doctype.pattern">doctype</a> =</dt>
-      <dd>
-        A string that is an <a href="terminology.html#ascii-case-insensitive">ASCII
-          case-insensitive</a> match for the following regular
-        expression:
-        <pre><code class="regexp">&lt;!doctype\s+html\s*&gt;</code></pre>
-      </dd>
-    </dl>
-    <div class="example">
-    <p>The following are examples of some DOCTYPEs that match the
-    <a href="syntax.html#doctype">doctype</a> pattern.</p>
-    <pre>&lt;!doctype html&gt;</pre>
-    <pre>&lt;!DOCTYPE HTML&gt;</pre>
-    </div>
-    <p>The <code>doctype.legacy</code> pattern is defined as follows:</p>
-    <dl class="pattern-def">
-      <dt><a id="doctype.legacy" href="syntax.html#doctype.legacy">doctype.legacy</a> =</dt>
-      <dd>
-        A string that is an <a href="terminology.html#ascii-case-insensitive">ASCII
-          case-insensitive</a> match for the following regular
-        expression:
-        <pre><code class="regexp">&lt;!doctype\s+html\s+system\s+("about:legacy-compat"|'about:legacy-compat')\s*&gt;</code></pre>
-        &#8230;except for the <code>about:legacy-compat</code> part,
-        which must match exactly (not case-insensitively).
-        </dd>
-    </dl>
+    <h2>6.01. The doctype <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="Overview.html#doctype-syntax-toc">T</a></h2>
+    <p>A
+    <dfn id="doctype" title="doctype">doctype</dfn>
+    (sometimes capitalized as &#8220;DOCTYPE&#8221;) is a special instruction
+    which, for legacy reasons that have to do with processing
+    modes in browsers, is a required part of any
+    <a href="documents.html#syntax-document-html">document in the HTML syntax</a>;
+    it must either be a 
+    <a href="syntax.html#deprecated-doctype">deprecated doctype</a>,
+    or must consist of the following parts, in exactly the
+    following order:</p>
+    <ol>
+      <li>A
+      "<code title="U+003C LESS-THAN SIGN">&lt;</code>"
+      character.</li>
+      <li>A
+      "<code title="U+0021 EXCLAMATION MARK">!</code>"
+      character.</li>
+      <li>Any
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>DOCTYPE</code>".</li>
+      <li>One or more
+      <a href="terminology.html#space">space characters</a>.</li>
+      <li>Any
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>".</li>
+      <li>Optionally, a
+      <a href="syntax.html#doctype-legacy-string">doctype legacy string</a>.</li>
+      <li>Optionally, one or more
+      <a href="terminology.html#space">space characters</a>.</li>
+      <li>A
+      "<code title="U+003E GREATER-THAN SIGN">&gt;</code>"
+      character.</li>
+    </ol>
+    <p>A
+    <dfn id="doctype-legacy-string" title="doctype-legacy-string">doctype legacy string</dfn>
+    consists of the following parts, in exactly the following
+    order.</p>
+    <ol>
+      <li>One or more
+      <a href="terminology.html#space">space characters</a>.</li>
+      <li>Any
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>SYSTEM</code>".</li>
+      <li>One or more
+      <a href="terminology.html#space">space characters</a>.</li>
+      <li>A <i>quote mark</i>, consisting of either
+      a
+      "<code title="U+0022 QUOTATION MARK">"</code>"
+      character or a
+      "<code title="U+0027 APOSTROPHE">'</code>"
+      character.</li>
+      <li>The literal string
+      "<code>about:legacy-compat</code>".</li>
+      <li>A matching <i>quote mark</i>, identical to the
+      <i>quote mark</i> used earlier (either a
+      "<code title="U+0022 QUOTATION MARK">"</code>"
+      character or a
+      "<code title="U+0027 APOSTROPHE">'</code>"
+      character).</li>
+    </ol>
     <div class="example">
-    <p>The following are examples of some DOCTYPEs that match the
-    <a href="syntax.html#doctype.legacy">doctype.legacy</a> pattern.</p>
-    <pre>&lt;!doctype html system 'about:legacy-compat'&gt;</pre>
-    <pre>&lt;!DOCTYPE HTML system "about:legacy-compat"&gt;</pre>
+    <p>The following are examples of some conformant
+    <a href="syntax.html#doctype">doctypes</a>.</p>
+    <pre>&lt;!DOCTYPE html&gt;</pre>
+    <pre>&lt;!doctype HTML system "about:legacy-compat"&gt;</pre>
     </div>
-    <p>The following are examples of some DOCTYPEs that match the 
-    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a>
-    production defined in the XML specification
-    <a href="references.html#refsXML">[XML]</a>.</p>
+    <p>A
+    <dfn id="deprecated-doctype" title="deprecated-doctype">deprecated doctype</dfn>
+    is a
+    <dfn id="doctype-declaration" title="doctype-declaration">document type declaration</dfn>
+    as defined in the XML specification
+    <a href="references.html#refsXML">[XML]</a>,
+    with the further restriction that it must meet one of the
+    following sets of constraints:</p>
+    <ul>
+      <li>The
+      <a href="syntax.html#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD HTML 4.0//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/REC-html40/strict.dtd</code>".</li>
+      <li>The
+      <a href="syntax.html#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD HTML 4.01//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/html4/strict.dtd</code>".</li>
+      <li>The
+      <a href="syntax.html#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD XHTML 1.0 Strict//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code>".</li>
+      <li>The
+      <a href="syntax.html#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD XHTML 1.1//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</code>".</li>
+    </ul>
     <div class="example">
+    <p>The following are examples of
+    <a href="syntax.html#deprecated-doctype">deprecated doctypes</a>.</p>
     <pre>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;</pre>
     <pre>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
@@ -402,27 +476,29 @@
             syntax:</p>
             <pre>&lt;input <em>disabled</em>&gt;</pre>
           </div>
-          <p>If an attribute using the empty attribute syntax is
-          followed by another attribute, then there must be at
-          least one
-          <a href="terminology.html#space">space character</a>
-          between the value and the other attribute.</p>
         </dd>
         <dt><dfn id="syntax-attr-unquoted" title="syntax-attr-unquoted">Unquoted attribute-value syntax</dfn></dt>
         <dd>
-        <p>An attribute and its value may be specified by providing
-          the <a href="syntax.html#attribute-name">attribute name</a>,
-          followed by zero or more
-          <a href="terminology.html#space">space characters</a>,
-          followed by a single
-          "<code title="U+003D EQUALS SIGN">=</code>"
-          character, followed by zero or more
-          <a href="terminology.html#space">space characters</a>,
-          followed by the
-          <a href="syntax.html#syntax-attribute-value">attribute value</a>.</p>
+        <p>An
+          <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn>
+          is specified by providing the following parts in exactly
+          the following order:</p>
+          <ol>
+            <li>an
+            <a href="syntax.html#attribute-name">attribute name</a></li>
+            <li>zero or more
+            <a href="terminology.html#space">space characters</a></li>
+            <li>a single
+            "<code title="U+003D EQUALS SIGN">=</code>"
+            character</li>
+            <li>zero or more
+            <a href="terminology.html#space">space characters</a></li>
+            <li>an
+            <a href="syntax.html#syntax-attribute-value">attribute value</a></li>
+          </ol>
           <p>In addition to the general requirements given above for
-          attribute values, an 
-          <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn>:</p>
+          attribute values, an unquoted attribute value has the
+          following restrictions:</p>
           <ul>
             <li>must not contain any literal
             <a href="terminology.html#space">space characters</a></li>
@@ -441,11 +517,6 @@
             syntax:</p>
             <pre>&lt;input <em>value=yes</em>&gt;</pre>
           </div>
-          <p>If the value an attribute using the unquoted
-            attribute syntax is followed by another attribute,
-            then there must be at least one
-            <a href="terminology.html#space">space character</a>
-            between the value and the other attribute.</p>
           <p>If the value of an attribute using the unquoted
             attribute syntax is followed by a
             "<code title="U+002F SOLIDUS">/</code>"
@@ -457,28 +528,37 @@
         </dd>
         <dt><dfn id="syntax-attr-single-quoted">Single-quoted attribute-value syntax</dfn></dt>
         <dd>
-          <p>An attribute and its value may be specified by
-          providing the
-          <a href="syntax.html#attribute-name">attribute name</a>,
-          followed by zero or more
-          <a href="terminology.html#space">space characters</a>,
-          followed by a single
-          "<code title="U+003D EQUALS SIGN">=</code>"
-          character, followed by zero or more
-          <a href="terminology.html#space">space characters</a>,
-          followed by a single
-          "<code title="U+0027 APOSTROPHE">'</code>"
-          character, followed by the
-          <a href="syntax.html#syntax-attribute-value">attribute value</a>,
-          followed by a single
-          "<code title="U+0027 APOSTROPHE">'</code>"
-          character.</p>
-          <p>In addition to the general requirements given above
-          for attribute values, a
+          <p>A
           <dfn id="attr-value-single-quoted" title="attr-value-single-quoted">single-quoted attribute value</dfn>
-          must not contain any literal
-          "<code title="U+0027 APOSTROPHE">'</code>"
-          characters.</p>
+          is specified by providing the following parts in exactly
+          the following order:</p>
+          <ol>
+            <li>an
+            <a href="syntax.html#attribute-name">attribute name</a></li>
+            <li>zero or more
+            <a href="terminology.html#space">space characters</a></li>
+            <li>a
+            "<code title="U+003D EQUALS SIGN">=</code>"
+            character</li>
+            <li>zero or more
+            <a href="terminology.html#space">space characters</a></li>
+            <li>a single
+            "<code title="U+0027 APOSTROPHE">'</code>"
+            character</li>
+            <li>an
+            <a href="syntax.html#syntax-attribute-value">attribute value</a></li>
+            <li>a
+            "<code title="U+0027 APOSTROPHE">'</code>"
+            character.</li>
+          </ol>
+          <p>In addition to the general requirements given above
+          for attribute values, a single-quoted attribute value
+          has the following restriction:</p>
+          <ul>
+            <li>must not contain any literal
+            "<code title="U+0027 APOSTROPHE">'</code>"
+            characters</li>
+          </ul>
           <div class="example">
             <p>In the following example, the
             <code title="attr-input-type">type</code> attribute
@@ -486,47 +566,46 @@
             syntax:</p>
             <pre>&lt;input <em>type='checkbox'</em>&gt;</pre>
           </div>
-          <p>If the value of an attribute using the single-quoted
-          attribute syntax is followed by another attribute, then
-          there must be at least one
-          <a href="terminology.html#space">space character</a>
-          after the value and before the other attribute.</p>
         </dd>
         <dt><dfn id="syntax-attr-double-quoted">Double-quoted attribute-value syntax</dfn></dt>
         <dd>
-          <p>An attribute and its value may be specified by
-          providing the
-          <a href="syntax.html#attribute-name">attribute name</a>,
-          followed by zero or more
-          <a href="terminology.html#space">space characters</a>,
-          followed by a single
-          "<code title="U+003D EQUALS SIGN character">=</code>"
-          character, followed by zero or more
-          <a href="terminology.html#space">space characters</a>,
-          followed by a single
-          "<code title="U+0022 QUOTATION MARK">"</code>" character,
-          followed by the
-          <a href="syntax.html#syntax-attribute-value">attribute value</a>,
-          and followed by a
-          "<code title="double U+0022 QUOTATION MARK">"</code>"
-          character.</p>
-          <p>In addition to the general requirements given above for
-          attribute values, a
+          <p>A
           <dfn id="attr-value-double-quoted" title="attr-value-double-quoted">double-quoted attribute value</dfn>
-          must not contain any literal
-          "<code title="U+0022 QUOTATION MARK">"</code>"
-          characters.</p>
+          is specified by providing the following parts in exactly
+          the following order:</p>
+          <ol>
+            <li>an
+            <a href="syntax.html#attribute-name">attribute name</a></li>
+            <li>zero or more
+            <a href="terminology.html#space">space characters</a></li>
+            <li>a single
+            "<code title="U+003D EQUALS SIGN character">=</code>"
+            character</li>
+            <li>zero or more
+            <a href="terminology.html#space">space characters</a></li>
+            <li>a single
+            "<code title="U+0022 QUOTATION MARK">"</code>"
+            character</li>
+            <li>an
+            <a href="syntax.html#syntax-attribute-value">attribute value</a></li>
+            <li>a
+            "<code title="double U+0022 QUOTATION MARK">"</code>"
+            character</li>
+          </ol>
+          <p>In addition to the general requirements given above for
+          attribute values, a double-quoted attribute value has
+          the following restriction:</p>
+          <ul>
+            <li>must not contain any literal
+            "<code title="U+0022 QUOTATION MARK">"</code>"
+            characters</li>
+          </ul>
           <div class="example">
             <p>In the following example, the
             <code>title</code> attribute is
             given with the double-quoted attribute value syntax:</p>
             <pre>&lt;code title="U+003C LESS-THAN SIGN"&gt;&amp;lt;&lt;/code&gt;</pre>
           </div>
-          <p>If the value of attribute using the double-quoted
-          attribute syntax is followed by another attribute, then
-          there must be at least one
-          <a href="terminology.html#space">space character</a>
-          after the value and before the other attribute.</p>
         </dd>
       </dl>
     </div>
@@ -706,14 +785,20 @@
     </ul>
     <dl>
       <dt><dfn id="named-charref">Named character reference</dfn></dt>
-      <dd><p>A named character reference is an
-        "<code title="U+0026 AMPERSAND">&amp;</code>"
-        character followed by one of the entity names defined in
-        <cite>XML Entity definitions for Characters</cite>
-        <a href="references.html#refsEntities">[Entities]</a>,
-        using the same case, followed by a
-        "<code title="U+003B SEMICOLON">;</code>"
-        character.</p>
+      <dd><p>Named character references consist of the following
+        parts in exactly the following order:</p>
+        <ol>
+          <li>An
+          "<code title="U+0026 AMPERSAND">&amp;</code>"
+          character.</li>
+          <li>One of the entity names defined in
+          <cite>XML Entity definitions for Characters</cite>
+          <a href="references.html#refsEntities">[Entities]</a>,
+          using the same case.</li>
+          <li>A
+          "<code title="U+003B SEMICOLON">;</code>"
+          character.</li>
+        </ol>
         <div class="example">
           <p>The following is an example of a named character
           reference for the character
@@ -723,20 +808,27 @@
         </div>
       </dd>
       <dt><dfn id="dec-charref">Decimal numeric character reference</dfn></dt>
-      <dd><p>A decimal numerical character reference is an
-        "<code title="U+0026 AMPERSAND">&amp;</code>"
-        character, followed by a 
-        "<code title="U+0023 NUMBER SIGN">#</code>"
-        character, followed by one or more digits in the range
-        <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
-        representing a base-ten integer that itself is a Unicode
-        code point that is not
-        U+0000,
-        U+000D,
-        in the range U+0080&#8211;U+009F,
-        or in the range 0xD8000&#8211;0xDFFF (surrogates).
-        The digits must then be followed by a
-        "<code title="U+003B SEMICOLON">;</code>" character.</p>
+      <dd><p>Decimal numeric character references consist of the
+        following parts, in exactly the following order.</p>
+        <ol>
+          <li>An
+          "<code title="U+0026 AMPERSAND">&amp;</code>"
+          character.</li>
+          <li>A
+          "<code title="U+0023 NUMBER SIGN">#</code>"
+          character.</li>
+          <li>One or more digits in the range
+          <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
+          representing a base-ten integer that itself is a Unicode
+          code point that is not
+          U+0000,
+          U+000D,
+          in the range U+0080&#8211;U+009F,
+          or in the range 0xD800&#8211;0xDFFF (surrogates).</li>
+          <li>A
+          "<code title="U+003B SEMICOLON">;</code>"
+          character.</li>
+        </ol>
         <div class="example">
           <p>The following is an example of a decimal numeric
           character reference for the character
@@ -746,30 +838,36 @@
         </div>
       </dd>
       <dt><dfn id="hex-charref">Hexadecimal numeric character reference</dfn></dt>
-      <dd><p>A hexadecimal numeric character reference is an
-        "<code title="U+0026 AMPERSAND">&amp;</code>"
-        character, followed by a 
-        "<code title="U+0023 NUMBER SIGN">#</code>"
-      character, followed by either a
-      "<code title="U+0078 LATIN SMALL LETTER X">x</code>"
-      character
-      or a
-      "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>"
-      character, followed by
-      one or more digits in the range
-      <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
-      <code title="U+0061 LATIN SMALL LETTER A&#8211;U+0066 LATIN SMALL LETTER F">a&#8211;f</code>,
-      and
-      <code title="U+0041 LATIN CAPITAL LETTER A&#8211;U+0046 LATIN CAPITAL LETTER F">A&#8211;F</code>,
-      representing a base-sixteen integer that itself is a Unicode
-      code point that is not
-      U+0000,
-      U+000D,
-      in the range U+0080&#8211;U+009F,
-      or in the range 0xD800&#8211;0xDFFF (surrogates).
-      The digits must then be followed by a 
-      "<code title="U+003B SEMICOLON">;</code>"
-      character.</p>
+      <dd><p>Hexadecimal numeric character references consist of
+        the following parts, in exactly the following order.</p>
+        <ol>
+          <li>An
+          "<code title="U+0026 AMPERSAND">&amp;</code>"
+          character.</li>
+          <li>A
+          "<code title="U+0023 NUMBER SIGN">#</code>"
+          character.</li>
+          <li>Either a
+          "<code title="U+0078 LATIN SMALL LETTER X">x</code>"
+          character
+          or a
+          "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>"
+          character.</li>
+          <li>One or more digits in the range
+          <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
+          <code title="U+0061 LATIN SMALL LETTER A&#8211;U+0066 LATIN SMALL LETTER F">a&#8211;f</code>,
+          and
+          <code title="U+0041 LATIN CAPITAL LETTER A&#8211;U+0046 LATIN CAPITAL LETTER F">A&#8211;F</code>,
+          representing a base-sixteen integer that itself is a
+          Unicode code point that is not
+          U+0000,
+          U+000D,
+          in the range U+0080&#8211;U+009F,
+          or in the range 0xD800&#8211;0xDFFF (surrogates).</li>
+          <li>A
+          "<code title="U+003B SEMICOLON">;</code>"
+          character.</li>
+        </ol>
         <div class="example">
           <p>The following is an example of a hexadecimal numeric
           character reference for the character
@@ -844,7 +942,11 @@
       that is not itself in an
       <a href="syntax.html#syntax-escape">escaping text span</a>,
       and ends at the next
-      <a href="syntax.html#syntax-escape-end">escaping text span end</a>.</p>
+      <a href="syntax.html#syntax-escape-end">escaping text span end</a>.
+      Escaping text spans have the following restriction:</p>
+    <ul>
+      <li>must not contain any <a href="syntax.html#syntax-charref">character references</a></li>
+    </ul>
     <p>An
       <dfn id="syntax-escape-start">escaping text span start</dfn>
       is the
@@ -875,20 +977,16 @@
          <a href="syntax.html#syntax-text">text</a>;
          it is not a
          <a href="syntax.html#comment-end-delimiter">comment end delimiter</a>.</li>
+         <li>Any sequences of characters within an
+         <a href="syntax.html#syntax-escape">escaping text span</a>
+         that look like
+         <a href="syntax.html#syntax-charref">character references</a>
+         are
+         <a href="syntax.html#syntax-text">text</a>,
+         not 
+         <a href="syntax.html#syntax-charref">character references</a>.</li>
        </ul>
      </div>
-     <p>There cannot be any
-      <a href="syntax.html#syntax-charref">character references</a>
-      inside an
-      <a href="syntax.html#syntax-escape">escaping text span</a>;
-      any sequences of characters within an
-      <a href="syntax.html#syntax-escape">escaping text span</a>
-      that may look like
-      <a href="syntax.html#syntax-charref">character references</a>
-      are in fact 
-      <a href="syntax.html#syntax-text">text</a>,
-      not 
-      <a href="syntax.html#syntax-charref">character references</a>.</p>
      <p>An
      <a href="syntax.html#syntax-escape-start">escaping text span start</a>
      may share its

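The decimal and hexadecimal numeric character reference rules in the
syntax.html diff above can be sketched as a small checker. This is an
illustrative sketch only, not part of the commit: the function names are
hypothetical, and the upper bound of U+10FFFF is an assumption about what
counts as a Unicode code point (the diff does not state it).

```python
import re

# "&", "#", then either x/X plus hex digits, or plain decimal digits, then ";".
# [0-9] and [0-9A-Fa-f] match ASCII digits only, as the listed ranges require.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def is_permitted_code_point(cp: int) -> bool:
    """Apply the code point exclusions listed in the diff."""
    if cp in (0x0000, 0x000D):
        return False
    if 0x0080 <= cp <= 0x009F:
        return False
    if 0xD800 <= cp <= 0xDFFF:  # surrogates
        return False
    # Assumption: values above U+10FFFF are not Unicode code points.
    return cp <= 0x10FFFF


def is_valid_numeric_charref(text: str) -> bool:
    """Return True if text is a well-formed numeric character reference."""
    m = _NUMERIC_REF.fullmatch(text)
    if m is None:
        return False
    hex_digits, dec_digits = m.groups()
    if hex_digits is not None:
        cp = int(hex_digits, 16)
    else:
        cp = int(dec_digits, 10)
    return is_permitted_code_point(cp)
```

For example, is_valid_numeric_charref("&#60;") and
is_valid_numeric_charref("&#x3C;") both accept a reference to U+003C,
while "&#xD800;" is rejected as a surrogate.
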
Index: documents.html
===================================================================
RCS file: /sources/public/html5/markup/documents.html,v
retrieving revision 1.8
retrieving revision 1.9
diff -u -d -r1.8 -r1.9
--- documents.html 6 Aug 2009 10:34:34 -0000 1.8
+++ documents.html 7 Aug 2009 14:50:29 -0000 1.9
@@ -96,7 +96,7 @@
       <li>Any number of
       <a href="syntax.html#syntax-comments">comments</a> and
       <a href="terminology.html#space">space characters</a>.</li>
-      <li>A <a href="syntax.html#doctype">DOCTYPE</a>.</li>
+      <li>A <a href="syntax.html#doctype">doctype</a>.</li>
       <li>Any number of
       <a href="syntax.html#syntax-comments">comments</a> and
       <a href="terminology.html#space">space characters</a>.</li>
@@ -118,7 +118,7 @@
       character.</li>
       <li>Any number of comments and space characters, as defined
       in the XML specification <a href="references.html#refsXML">[XML]</a>.</li>
-      <li>Optionally, a DOCTYPE, as defined
+      <li>Optionally, a doctype declaration, as defined
       in the XML specification <a href="references.html#refsXML">[XML]</a>.</li>
       <li>Any number of comments and space characters, as defined
       in the XML specification <a href="references.html#refsXML">[XML]</a>.</li>

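Since the documents.html change above makes a doctype a required part of
a document in the HTML syntax, the non-deprecated doctype grammar from
the syntax.html diff can be sketched as a regular expression. This is an
illustrative sketch only, not part of the commit: the names are
hypothetical, the set of space characters is an assumption (the
terminology.html definition is not shown in this diff), and deprecated
doctypes are deliberately not handled.

```python
import re

# Assumption: "space characters" are space, tab, LF, FF and CR.
_SP = r"[ \t\n\f\r]"

# re.ASCII keeps the case-insensitive groups to ASCII letters only, so
# that lookalike non-ASCII letters do not match "SYSTEM".
_DOCTYPE = re.compile(
    "<!(?i:DOCTYPE){sp}+(?i:HTML)"
    # Optional doctype legacy string; \1 forces the closing quote mark
    # to be identical to the opening one.
    "(?:{sp}+(?i:SYSTEM){sp}+([\"'])about:legacy-compat\\1)?"
    "{sp}*>".format(sp=_SP),
    re.ASCII,
)


def is_conformant_doctype(text: str) -> bool:
    """Return True if text matches the non-deprecated doctype grammar."""
    return _DOCTYPE.fullmatch(text) is not None
```

The literal string about:legacy-compat sits outside the case-insensitive
groups, so it has to match exactly, as the diff requires.
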
Index: spec.html
===================================================================
RCS file: /sources/public/html5/markup/spec.html,v
retrieving revision 1.90
retrieving revision 1.91
diff -u -d -r1.90 -r1.91
--- spec.html 6 Aug 2009 11:02:08 -0000 1.90
+++ spec.html 7 Aug 2009 14:50:29 -0000 1.91
@@ -9,7 +9,7 @@
 <body>
 <div class="head">
 <h1>HTML 5: The Markup Language</h1>
-<h2>Editor&#8217;s Draft <em>6 August 2009</em>
+<h2>Editor&#8217;s Draft <em>7 August 2009</em>
 </h2>
 <dl>
 <dt>Latest Editor&#8217;s Draft:</dt>
@@ -41,7 +41,7 @@
     
     
     <p>
-        This document is the 6 August 2009 Editor&#8217;s Draft of 
+        This document is the 7 August 2009 Editor&#8217;s Draft of 
         <cite>HTML 5: The Markup Language</cite>.
       </p>
     <p>
@@ -190,7 +190,7 @@
 <span class="toc-section-number">&#8199;</span><a href="#syntax"><span class="toc-section-number">6.</span> HTML syntax</a>
 <ul>
 <li id="doctype-syntax-toc">
-<span class="toc-section-number"></span><a href="#doctype-syntax"><span class="toc-section-number">6.01.</span> The DOCTYPE</a>
+<span class="toc-section-number"></span><a href="#doctype-syntax"><span class="toc-section-number">6.01.</span> The doctype</a>
 </li>
 <li id="character-encoding-toc">
 <span class="toc-section-number"></span><a href="#character-encoding"><span class="toc-section-number">6.02.</span> Character encoding declaration</a>
@@ -907,7 +907,7 @@
       <li>Any number of
       <a href="#syntax-comments">comments</a> and
       <a href="#space">space characters</a>.</li>
-      <li>A <a href="#doctype">DOCTYPE</a>.</li>
+      <li>A <a href="#doctype">doctype</a>.</li>
       <li>Any number of
       <a href="#syntax-comments">comments</a> and
       <a href="#space">space characters</a>.</li>
@@ -929,7 +929,7 @@
       character.</li>
       <li>Any number of comments and space characters, as defined
       in the XML specification <a href="#refsXML">[XML]</a>.</li>
-      <li>Optionally, a DOCTYPE, as defined
+      <li>Optionally, a doctype declaration, as defined
       in the XML specification <a href="#refsXML">[XML]</a>.</li>
       <li>Any number of comments and space characters, as defined
       in the XML specification <a href="#refsXML">[XML]</a>.</li>
@@ -1014,7 +1014,7 @@
   <div class="toc">
 <ul>
 <li id="doctype-syntax-toc">
-<span class="toc-section-number">&#8199;</span><a href="#doctype-syntax"><span class="toc-section-number">1.</span> The DOCTYPE</a>
+<span class="toc-section-number">&#8199;</span><a href="#doctype-syntax"><span class="toc-section-number">1.</span> The doctype</a>
 </li>
 <li id="character-encoding-toc">
 <span class="toc-section-number">&#8199;</span><a href="#character-encoding"><span class="toc-section-number">2.</span> Character encoding declaration</a>
@@ -1046,62 +1046,135 @@
 </ul>
 </div>
   <div id="doctype-syntax" class="section">
-    <h2>6.01. The DOCTYPE <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="#doctype-syntax-toc">T</a>
+    <h2>6.01. The doctype <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="#doctype-syntax-toc">T</a>
 </h2>
-    <p>A <dfn id="doctype" title="syntax-doctype">DOCTYPE</dfn> is
-    an special instruction which, for legacy reasons that have to
-    do with processing modes in browsers, is a required part of
-    any
-    <a href="#syntax-document-html">document in the HTML syntax</a>.</p>
-    <p>The DOCTYPE must match either the
-    <a href="#doctype.pattern">doctype</a>
-    or
-    <a href="#doctype.legacy">doctype.legacy</a>
-    patterns defined this specification, or must match the
-    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a>
-    production defined in the XML specification
-    <a href="#refsXML">[XML]</a>.</p>
-    <p>The <code>doctype</code> pattern is defined as follows:</p>
-    <dl class="pattern-def">
-<dt>
-<a id="doctype.pattern" href="#doctype.pattern">doctype</a> =</dt>
-      <dd>
-        A string that is an <a href="#ascii-case-insensitive">ASCII
-          case-insensitive</a> match for the following regular
-        expression:
-        <pre><code class="regexp">&lt;!doctype\s+html\s*&gt;</code></pre>
-      </dd>
-    </dl>
+    <p>A
+    <dfn id="doctype" title="doctype">doctype</dfn>
+    (sometimes capitalized as &#8220;DOCTYPE&#8221;) is a special instruction
+    which, for legacy reasons that have to do with processing
+    modes in browsers, is a required part of any
+    <a href="#syntax-document-html">document in the HTML syntax</a>;
+    it must either be a 
+    <a href="#deprecated-doctype">deprecated doctype</a>,
+    or must consist of the following parts, in exactly the
+    following order:</p>
+    <ol>
+<li>A
+      "<code title="U+003C LESS-THAN SIGN">&lt;</code>"
+      character.</li>
+      <li>A
+      "<code title="U+0021 EXCLAMATION MARK">!</code>"
+      character.</li>
+      <li>Any
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>DOCTYPE</code>".</li>
+      <li>One or more
+      <a href="#space">space characters</a>.</li>
+      <li>Any
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>".</li>
+      <li>Optionally, a
+      <a href="#doctype-legacy-string">doctype legacy string</a>.</li>
+      <li>Optionally, one or more
+      <a href="#space">space characters</a>.</li>
+      <li>A
+      "<code title="U+003E GREATER-THAN SIGN">&gt;</code>"
+      character.</li>
+    </ol>
+<p>A
+    <dfn id="doctype-legacy-string" title="doctype-legacy-string">doctype legacy string</dfn>
+    consists of the following parts, in exactly the following
+    order:</p>
+    <ol>
+<li>One or more
+      <a href="#space">space characters</a>.</li>
+      <li>Any
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>SYSTEM</code>".</li>
+      <li>One or more
+      <a href="#space">space characters</a>.
+</li>
+      <li>A <i>quote mark</i>, consisting of either
+      a
+      "<code title="U+0022 QUOTATION MARK">"</code>"
+      character or a
+      "<code title="U+0027 APOSTROPHE">'</code>"
+      character.</li>
+      <li>The literal string
+      "<code>about:legacy-compat</code>".</li>
+      <li>A matching <i>quote mark</i>, identical to the
+      <i>quote mark</i> used earlier (either a
+      "<code title="U+0022 QUOTATION MARK">"</code>"
+      character or a
+      "<code title="U+0027 APOSTROPHE">'</code>"
+      character).</li>
+    </ol>
 <div class="example">
-    <p>The following are examples of some DOCTYPEs that match the
-    <a href="#doctype">doctype</a> pattern.</p>
-    <pre>&lt;!doctype html&gt;</pre>
-    <pre>&lt;!DOCTYPE HTML&gt;</pre>
+    <p>The following are examples of some conformant
+    <a href="#doctype">doctypes</a>.</p>
+    <pre>&lt;!DOCTYPE html&gt;</pre>
+    <pre>&lt;!doctype HTML system "about:legacy-compat"&gt;</pre>
     </div>
-    <p>The <code>doctype.legacy</code> pattern is defined as follows:</p>
-    <dl class="pattern-def">
-<dt>
-<a id="doctype.legacy" href="#doctype.legacy">doctype.legacy</a> =</dt>
-      <dd>
-        A string that is an <a href="#ascii-case-insensitive">ASCII
-          case-insensitive</a> match for the following regular
-        expression:
-        <pre><code class="regexp">&lt;!doctype\s+html\s+system\s+("about:legacy-compat"|'about:legacy-compat')\s*&gt;</code></pre>
-        &#8230;except for the <code>about:legacy-compat</code> part,
-        which must match exactly (not case-insensitively).
-        </dd>
-    </dl>
+    <p>A
+    <dfn id="deprecated-doctype" title="deprecated-doctype">deprecated doctype</dfn>
+    is a
+    <dfn id="doctype-declaration" title="doctype-declaration">document type declaration</dfn>
+    as defined in the XML specification
+    <a href="#refsXML">[XML]</a>,
+    with the further restriction that it must meet one of the
+    following sets of constraints:</p>
+    <ul>
+<li>The
+      <a href="#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD HTML 4.0//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/REC-html40/strict.dtd</code>".</li>
+      <li>The
+      <a href="#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD HTML 4.01//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/html4/strict.dtd</code>".</li>
+      <li>The
+      <a href="#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD XHTML 1.0 Strict//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code>".</li>
+      <li>The
+      <a href="#doctype-declaration">document type declaration&#8217;s</a>
+      name part is an
+      <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+      match for the string
+      "<code>HTML</code>",
+      its public identifier is an exact match for the literal string
+      "<code>-//W3C//DTD XHTML 1.1//EN</code>",
+      and its system identifier is either missing or is an exact
+      match for the literal string
+      "<code>http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</code>".</li>
+    </ul>
 <div class="example">
-    <p>The following are examples of some DOCTYPEs that match the
-    <a href="#doctype.legacy">doctype.legacy</a> pattern.</p>
-    <pre>&lt;!doctype html system 'about:legacy-compat'&gt;</pre>
-    <pre>&lt;!DOCTYPE HTML system "about:legacy-compat"&gt;</pre>
-    </div>
-    <p>The following are examples of some DOCTYPEs that match the 
-    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a>
-    production defined in the XML specification
-    <a href="#refsXML">[XML]</a>.</p>
-    <div class="example">
+    <p>The following are examples of
+    <a href="#deprecated-doctype">deprecated doctypes</a>.</p>
     <pre>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;</pre>
     <pre>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
@@ -1423,27 +1496,33 @@
             syntax:</p>
             <pre>&lt;input <em>disabled</em>&gt;</pre>
           </div>
-          <p>If an attribute using the empty attribute syntax is
-          followed by another attribute, then there must be at
-          least one
-          <a href="#space">space character</a>
-          between the value and the other attribute.</p>
         </dd>
         <dt><dfn id="syntax-attr-unquoted" title="syntax-attr-unquoted">Unquoted attribute-value syntax</dfn></dt>
         <dd>
-        <p>An attribute and its value may be specified by providing
-          the <a href="#attribute-name">attribute name</a>,
-          followed by zero or more
-          <a href="#space">space characters</a>,
-          followed by a single
-          "<code title="U+003D EQUALS SIGN">=</code>"
-          character, followed by zero or more
-          <a href="#space">space characters</a>,
-          followed by the
-          <a href="#syntax-attribute-value">attribute value</a>.</p>
-          <p>In addition to the general requirements given above for
-          attribute values, an 
-          <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn>:</p>
+        <p>An
+          <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn>
+          is specified by providing the following parts in exactly
+          the following order:</p>
+          <ol>
+<li>an
+            <a href="#attribute-name">attribute name</a>
+</li>
+            <li>zero or more
+            <a href="#space">space characters</a>
+</li>
+            <li>a single
+            "<code title="U+003D EQUALS SIGN">=</code>"
+            character</li>
+            <li>zero or more
+            <a href="#space">space characters</a>
+</li>
+            <li>an
+            <a href="#syntax-attribute-value">attribute value</a>
+</li>
+          </ol>
+<p>In addition to the general requirements given above for
+          attribute values, an unquoted attribute value has the
+          following restrictions:</p>
           <ul>
 <li>must not contain any literal
             <a href="#space">space characters</a>
@@ -1463,11 +1542,6 @@
             syntax:</p>
             <pre>&lt;input <em>value=yes</em>&gt;</pre>
           </div>
-          <p>If the value an attribute using the unquoted
-            attribute syntax is followed by another attribute,
-            then there must be at least one
-            <a href="#space">space character</a>
-            between the value and the other attribute.</p>
           <p>If the value of an attribute using the unquoted
             attribute syntax is followed by a
             "<code title="U+002F SOLIDUS">/</code>"
@@ -1479,76 +1553,92 @@
         </dd>
         <dt><dfn id="syntax-attr-single-quoted">Single-quoted attribute-value syntax</dfn></dt>
         <dd>
-          <p>An attribute and its value may be specified by
-          providing the
-          <a href="#attribute-name">attribute name</a>,
-          followed by zero or more
-          <a href="#space">space characters</a>,
-          followed by a single
-          "<code title="U+003D EQUALS SIGN">=</code>"
-          character, followed by zero or more
-          <a href="#space">space characters</a>,
-          followed by a single
-          "<code title="U+0027 APOSTROPHE">'</code>"
-          character, followed by the
-          <a href="#syntax-attribute-value">attribute value</a>,
-          followed by a single
-          "<code title="U+0027 APOSTROPHE">'</code>"
-          character.</p>
-          <p>In addition to the general requirements given above
-          for attribute values, a
+          <p>A
           <dfn id="attr-value-single-quoted" title="attr-value-single-quoted">single-quoted attribute value</dfn>
-          must not contain any literal
-          "<code title="U+0027 APOSTROPHE">'</code>"
-          characters.</p>
-          <div class="example">
+          is specified by providing the following parts in exactly
+          the following order:</p>
+          <ol>
+<li>an
+            <a href="#attribute-name">attribute name</a>
+</li>
+            <li>zero or more
+            <a href="#space">space characters</a>
+</li>
+            <li>a
+            "<code title="U+003D EQUALS SIGN">=</code>"
+            character</li>
+            <li>zero or more
+            <a href="#space">space characters</a>
+</li>
+            <li>a single
+            "<code title="U+0027 APOSTROPHE">'</code>"
+            character</li>
+            <li>an
+            <a href="#syntax-attribute-value">attribute value</a>
+</li>
+            <li>a
+            "<code title="U+0027 APOSTROPHE">'</code>"
+            character</li>
+          </ol>
+<p>In addition to the general requirements given above
+          for attribute values, a single-quoted attribute value
+          has the following restriction:</p>
+          <ul>
+<li>must not contain any literal
+            "<code title="U+0027 APOSTROPHE">'</code>"
+            characters</li>
+          </ul>
+<div class="example">
             <p>In the following example, the
             <code title="attr-input-type">type</code> attribute
             is given with the single-quoted attribute value
             syntax:</p>
             <pre>&lt;input <em>type='checkbox'</em>&gt;</pre>
           </div>
-          <p>If the value of an attribute using the single-quoted
-          attribute syntax is followed by another attribute, then
-          there must be at least one
-          <a href="#space">space character</a>
-          after the value and before the other attribute.</p>
         </dd>
         <dt><dfn id="syntax-attr-double-quoted">Double-quoted attribute-value syntax</dfn></dt>
         <dd>
-          <p>An attribute and its value may be specified by
-          providing the
-          <a href="#attribute-name">attribute name</a>,
-          followed by zero or more
-          <a href="#space">space characters</a>,
-          followed by a single
-          "<code title="U+003D EQUALS SIGN character">=</code>"
-          character, followed by zero or more
-          <a href="#space">space characters</a>,
-          followed by a single
-          "<code title="U+0022 QUOTATION MARK">"</code>" character,
-          followed by the
-          <a href="#syntax-attribute-value">attribute value</a>,
-          and followed by a
-          "<code title="double U+0022 QUOTATION MARK">"</code>"
-          character.</p>
-          <p>In addition to the general requirements given above for
-          attribute values, a
+          <p>A
           <dfn id="attr-value-double-quoted" title="attr-value-double-quoted">double-quoted attribute value</dfn>
-          must not contain any literal
-          "<code title="U+0022 QUOTATION MARK">"</code>"
-          characters.</p>
-          <div class="example">
+          is specified by providing the following parts in exactly
+          the following order:</p>
+          <ol>
+<li>an
+            <a href="#attribute-name">attribute name</a>
+</li>
+            <li>zero or more
+            <a href="#space">space characters</a>
+</li>
+            <li>a single
+            "<code title="U+003D EQUALS SIGN character">=</code>"
+            character</li>
+            <li>zero or more
+            <a href="#space">space characters</a>
+</li>
+            <li>a single
+            "<code title="U+0022 QUOTATION MARK">"</code>"
+            character</li>
+            <li>an
+            <a href="#syntax-attribute-value">attribute value</a>
+</li>
+            <li>a
+            "<code title="double U+0022 QUOTATION MARK">"</code>"
+            character</li>
+          </ol>
+<p>In addition to the general requirements given above for
+          attribute values, a double-quoted attribute value has
+          the following restriction:</p>
+          <ul>
+<li>must not contain any literal
+            "<code title="U+0022 QUOTATION MARK">"</code>"
+            characters</li>
+          </ul>
+<div class="example">
             <p>In the following example, the
             <code>title</code> attribute is
             given with the double-quoted attribute value syntax:</p>
             <pre>&lt;code title="U+003C LESS-THAN SIGN"&gt;&amp;lt;&lt;/code&gt;</pre>
           </div>
-          <p>If the value of attribute using the double-quoted
-          attribute syntax is followed by another attribute, then
-          there must be at least one
-          <a href="#space">space character</a>
-          after the value and before the other attribute.</p>
         </dd>
       </dl>
 </div>
@@ -1740,15 +1830,21 @@
 <dl>
 <dt><dfn id="named-charref">Named character reference</dfn></dt>
       <dd>
-<p>A named character reference is an
-        "<code title="U+0026 AMPERSAND">&amp;</code>"
-        character followed by one of the entity names defined in
-        <cite>XML Entity definitions for Characters</cite>
-        <a href="#refsEntities">[Entities]</a>,
-        using the same case, followed by a
-        "<code title="U+003B SEMICOLON">;</code>"
-        character.</p>
-        <div class="example">
+<p>Named character references consist of the following
+        parts in exactly the following order:</p>
+        <ol>
+<li>An
+          "<code title="U+0026 AMPERSAND">&amp;</code>"
+          character.</li>
+          <li>One of the entity names defined in
+          <cite>XML Entity definitions for Characters</cite>
+          <a href="#refsEntities">[Entities]</a>,
+          using the same case.</li>
+          <li>A
+          "<code title="U+003B SEMICOLON">;</code>"
+          character.</li>
+        </ol>
+<div class="example">
           <p>The following is an example of a named character
           reference for the character
           "<code title="U+2020 DAGGER">&#8224;</code>"
@@ -1758,21 +1854,28 @@
       </dd>
       <dt><dfn id="dec-charref">Decimal numeric character reference</dfn></dt>
       <dd>
-<p>A decimal numerical character reference is an
-        "<code title="U+0026 AMPERSAND">&amp;</code>"
-        character, followed by a 
-        "<code title="U+0023 NUMBER SIGN">#</code>"
-        character, followed by one or more digits in the range
-        <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
-        representing a base-ten integer that itself is a Unicode
-        code point that is not
-        U+0000,
-        U+000D,
-        in the range U+0080&#8211;U+009F,
-        or in the range 0xD8000&#8211;0xDFFF (surrogates).
-        The digits must then be followed by a
-        "<code title="U+003B SEMICOLON">;</code>" character.</p>
-        <div class="example">
+<p>Decimal numeric character references consist of the
+        following parts, in exactly the following order:</p>
+        <ol>
+<li>An
+          "<code title="U+0026 AMPERSAND">&amp;</code>"
+          character.</li>
+          <li>A
+          "<code title="U+0023 NUMBER SIGN">#</code>"
+          character.</li>
+          <li>One or more digits in the range
+          <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
+          representing a base-ten integer that itself is a Unicode
+          code point that is not
+          U+0000,
+          U+000D,
+          in the range U+0080&#8211;U+009F,
+          or in the range 0xD800&#8211;0xDFFF (surrogates).</li>
+          <li>A
+          "<code title="U+003B SEMICOLON">;</code>"
+          character.</li>
+        </ol>
+<div class="example">
           <p>The following is an example of a decimal numeric
           character reference for the character
           "<code title="U+2020 DAGGER">&#8224;</code>"
@@ -1782,31 +1885,37 @@
       </dd>
       <dt><dfn id="hex-charref">Hexadecimal numeric character reference</dfn></dt>
       <dd>
-<p>A hexadecimal numeric character reference is an
-        "<code title="U+0026 AMPERSAND">&amp;</code>"
-        character, followed by a 
-        "<code title="U+0023 NUMBER SIGN">#</code>"
-      character, followed by either a
-      "<code title="U+0078 LATIN SMALL LETTER X">x</code>"
-      character
-      or a
-      "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>"
-      character, followed by
-      one or more digits in the range
-      <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
-      <code title="U+0061 LATIN SMALL LETTER A&#8211;U+0066 LATIN SMALL LETTER F">a&#8211;f</code>,
-      and
-      <code title="U+0041 LATIN CAPITAL LETTER A&#8211;U+0046 LATIN CAPITAL LETTER F">A&#8211;F</code>,
-      representing a base-sixteen integer that itself is a Unicode
-      code point that is not
-      U+0000,
-      U+000D,
-      in the range U+0080&#8211;U+009F,
-      or in the range 0xD800&#8211;0xDFFF (surrogates).
-      The digits must then be followed by a 
-      "<code title="U+003B SEMICOLON">;</code>"
-      character.</p>
-        <div class="example">
+<p>Hexadecimal numeric character references consist of
+        the following parts, in exactly the following order:</p>
+        <ol>
+<li>An
+          "<code title="U+0026 AMPERSAND">&amp;</code>"
+          character.</li>
+          <li>A
+          "<code title="U+0023 NUMBER SIGN">#</code>"
+          character.</li>
+          <li>Either a
+          "<code title="U+0078 LATIN SMALL LETTER X">x</code>"
+          character
+          or a
+          "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>"
+          character.</li>
+          <li>One or more digits in the range
+          <code title="U+0030 DIGIT ZERO&#8211;U+0039 DIGIT NINE">0&#8211;9</code>,
+          <code title="U+0061 LATIN SMALL LETTER A&#8211;U+0066 LATIN SMALL LETTER F">a&#8211;f</code>,
+          and
+          <code title="U+0041 LATIN CAPITAL LETTER A&#8211;U+0046 LATIN CAPITAL LETTER F">A&#8211;F</code>,
+          representing a base-sixteen integer that itself is a
+          Unicode code point that is not
+          U+0000,
+          U+000D,
+          in the range U+0080&#8211;U+009F,
+          or in the range 0xD800&#8211;0xDFFF (surrogates).</li>
+          <li>A
+          "<code title="U+003B SEMICOLON">;</code>"
+          character.</li>
+        </ol>
+<div class="example">
           <p>The following is an example of a hexadecimal numeric
           character reference for the character
           "<code title="U+2020 DAGGER">&#8224;</code>"
@@ -1882,8 +1991,13 @@
       that is not itself in an
       <a href="#syntax-escape">escaping text span</a>,
       and ends at the next
-      <a href="#syntax-escape-end">escaping text span end</a>.</p>
-    <p>An
+      <a href="#syntax-escape-end">escaping text span end</a>.
+      Escaping text spans have the following restriction:</p>
+    <ul>
+<li>must not contain any <a href="#syntax-charref">character references</a>
+</li>
+    </ul>
+<p>An
       <dfn id="syntax-escape-start">escaping text span start</dfn>
       is the
       <a href="#syntax-text" title="syntax-text">text</a>
@@ -1913,20 +2027,16 @@
          <a href="#syntax-text">text</a>;
          it is not a
          <a href="#comment-end-delimiter">comment end delimiter</a>.</li>
+         <li>Any sequences of characters within an
+         <a href="#syntax-escape">escaping text span</a>
+         that look like
+         <a href="#syntax-charref">character references</a>
+         are
+         <a href="#syntax-text">text</a>,
+         not 
+         <a href="#syntax-charref">character references</a>.</li>
        </ul>
 </div>
-     <p>There cannot be any
-      <a href="#syntax-charref">character references</a>
-      inside an
-      <a href="#syntax-escape">escaping text span</a>;
-      any sequences of characters within an
-      <a href="#syntax-escape">escaping text span</a>
-      that may look like
-      <a href="#syntax-charref">character references</a>
-      are in fact 
-      <a href="#syntax-text">text</a>,
-      not 
-      <a href="#syntax-charref">character references</a>.</p>
      <p>An
      <a href="#syntax-escape-start">escaping text span start</a>
      may share its

Received on Friday, 7 August 2009 14:51:43 UTC
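
A note on the revised doctype wording in section 6.01: the ordered list of parts can be read as a single pattern. The following is a minimal sketch in Python, not part of the commit. It assumes that "space characters" means tab, line feed, form feed, carriage return and space (the diff links to that definition but does not show it), and the names ci, DOCTYPE_RE and is_conformant_doctype are hypothetical. It does not cover the "deprecated doctype" branch.

```python
import re

# Assumption: the spec's "space characters" are TAB, LF, FF, CR and SPACE.
SP = r"[ \t\n\f\r]"


def ci(word):
    """Build a regex fragment matching `word` ASCII case-insensitively."""
    return "".join(f"[{c.upper()}{c.lower()}]" for c in word)


# Quote mark, the literal string (matched exactly), then the same quote mark.
QUOTED = r"""(["'])about:legacy-compat\1"""

# Doctype legacy string: spaces, SYSTEM, spaces, quoted literal.
LEGACY = SP + "+" + ci("SYSTEM") + SP + "+" + QUOTED

# "<", "!", DOCTYPE, spaces, HTML, optional legacy string, optional spaces, ">".
DOCTYPE_RE = re.compile(
    "<!" + ci("DOCTYPE") + SP + "+" + ci("HTML")
    + "(?:" + LEGACY + ")?" + SP + "*>"
)


def is_conformant_doctype(text):
    """Return True if `text` is a non-deprecated doctype as listed above."""
    return DOCTYPE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in (
        '<!DOCTYPE html>',
        '<!doctype HTML system "about:legacy-compat">',
        '<!DOCTYPE html SYSTEM "about:legacy-compat\'>',
    ):
        print(sample, is_conformant_doctype(sample))
```

The per-letter character classes are used instead of a case-insensitive flag so that only ASCII letters match, and the back-reference enforces that the closing quote mark is identical to the opening one.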