- From: poot <cvsmail@w3.org>
- Date: Fri, 7 Aug 2009 23:50:57 +0900 (JST)
- To: public-html-diffs@w3.org
mike: made a number of refinements to the Syntax section http://dev.w3.org/cvsweb/html5/markup/Overview.html?r1=1.345&r2=1.346&f=h =================================================================== RCS file: /sources/public/html5/markup/Overview.html,v retrieving revision 1.345 retrieving revision 1.346 diff -u -d -r1.345 -r1.346 --- Overview.html 6 Aug 2009 10:34:34 -0000 1.345 +++ Overview.html 7 Aug 2009 14:50:28 -0000 1.346 @@ -9,7 +9,7 @@ <body> <div class="head"> <h1>HTML 5: The Markup Language</h1> -<h2>Editor’s Draft <em>6 August 2009</em> +<h2>Editor’s Draft <em>7 August 2009</em> </h2> <dl> <dt>Latest Editor’s Draft:</dt> @@ -42,7 +42,7 @@ <p> - This document is the 6 August 2009 Editor’s Draft of + This document is the 7 August 2009 Editor’s Draft of <cite>HTML 5: The Markup Language</cite>. </p> <p> @@ -191,7 +191,7 @@ <span class="toc-section-number"> </span><a href="syntax.html#syntax"><span class="toc-section-number">6.</span> HTML syntax</a> <ul> <li id="doctype-syntax-toc"> -<span class="toc-section-number"></span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">6.01.</span> The DOCTYPE</a> +<span class="toc-section-number"></span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">6.01.</span> The doctype</a> </li> <li id="character-encoding-toc"> <span class="toc-section-number"></span><a href="syntax.html#character-encoding"><span class="toc-section-number">6.02.</span> Character encoding declaration</a> Index: syntax.html =================================================================== RCS file: /sources/public/html5/markup/syntax.html,v retrieving revision 1.24 retrieving revision 1.25 diff -u -d -r1.24 -r1.25 --- syntax.html 6 Aug 2009 11:02:08 -0000 1.24 +++ syntax.html 7 Aug 2009 14:50:29 -0000 1.25 @@ -15,7 +15,7 @@ <h2>6. 
HTML syntax <a class="hash" href="#syntax">#</a> <a class="toc-bak" href="Overview.html#syntax-toc">T</a></h2> <div class="toc"> <ul> -<li id="doctype-syntax-toc"><span class="toc-section-number"> </span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">1.</span> The DOCTYPE</a> +<li id="doctype-syntax-toc"><span class="toc-section-number"> </span><a href="syntax.html#doctype-syntax"><span class="toc-section-number">1.</span> The doctype</a> </li> <li id="character-encoding-toc"><span class="toc-section-number"> </span><a href="syntax.html#character-encoding"><span class="toc-section-number">2.</span> Character encoding declaration</a> </li> @@ -38,59 +38,133 @@ </ul> </div> <div id="doctype-syntax" class="section"> - <h2>6.01. The DOCTYPE <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="Overview.html#doctype-syntax-toc">T</a></h2> - <p>A <dfn id="doctype" title="syntax-doctype">DOCTYPE</dfn> is - an special instruction which, for legacy reasons that have to - do with processing modes in browsers, is a required part of - any - <a href="documents.html#syntax-document-html">document in the HTML syntax</a>.</p> - <p>The DOCTYPE must match either the - <a href="syntax.html#doctype.pattern">doctype</a> - or - <a href="syntax.html#doctype.legacy">doctype.legacy</a> - patterns defined this specification, or must match the - <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a> - production defined in the XML specification - <a href="references.html#refsXML">[XML]</a>.</p> - <p>The <code>doctype</code> pattern is defined as follows:</p> - <dl class="pattern-def"> - <dt><a id="doctype.pattern" href="syntax.html#doctype.pattern">doctype</a> =</dt> - <dd> - A string that is an <a href="terminology.html#ascii-case-insensitive">ASCII - case-insensitive</a> match for the following regular - expression: - <pre><code class="regexp"><!doctype\s+html\s*></code></pre> - 
</dd> - </dl> - <div class="example"> - <p>The following are examples of some DOCTYPEs that match the - <a href="syntax.html#doctype">doctype</a> pattern.</p> - <pre><!doctype html></pre> - <pre><!DOCTYPE HTML></pre> - </div> - <p>The <code>doctype.legacy</code> pattern is defined as follows:</p> - <dl class="pattern-def"> - <dt><a id="doctype.legacy" href="syntax.html#doctype.legacy">doctype.legacy</a> =</dt> - <dd> - A string that is an <a href="terminology.html#ascii-case-insensitive">ASCII - case-insensitive</a> match for the following regular - expression: - <pre><code class="regexp"><!doctype\s+html\s+system\s+("about:legacy-compat"|'about:legacy-compat')\s*></code></pre> - …except for the <code>about:legacy-compat</code> part, - which must match exactly (not case-insensitively). - </dd> - </dl> + <h2>6.01. The doctype <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="Overview.html#doctype-syntax-toc">T</a></h2> + <p>A + <dfn id="doctype" title="doctype">doctype</dfn> + (sometimes capitalized as “DOCTYPE”) is an special instruction + which, for legacy reasons that have to do with processing + modes in browsers, is a required part of any + <a href="documents.html#syntax-document-html">document in the HTML syntax</a>; + it must either be a + <a href="syntax.html#deprecated-doctype">deprecated doctype</a>, + or must consist of the following parts, in exactly the + following order:</p> + <ol> + <li>A + "<code title="U+003C LESS-THAN SIGN"><</code>" + character.</li> + <li>A + "<code title="U+0021 EXCLAMATION MARK">!</code>" + character.</li> + <li>Any + <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>DOCTYPE</code>".</li> + <li>One or more + <a href="terminology.html#space">space characters</a>.</li> + <li>Any + <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>".</li> + <li>Optionally, a + <a 
href="syntax.html#doctype-legacy-string">doctype legacy string</a>.</li> + <li>Optionally, one or more + <a href="terminology.html#space">space characters</a>.</li> + <li>A + "<code title="U+003E GREATER-THAN SIGN">></code>" + character.</li> + </ol> + <p>A + <dfn id="doctype-legacy-string" title="doctype-legacy-string">doctype legacy string</dfn> + consists of the following parts, in exactly the following + order.</p> + <ol> + <li>One or more + <a href="terminology.html#space">space characters</a>.</li> + <li>Any + <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>SYSTEM</code>".</li> + <li>One or more + <a href="terminology.html#space">space characters</a></li> + <li>A <i>quote mark</i>, consisting of either + a + "<code title="U+0022 QUOTATION MARK">"</code>" + character or a + "<code title="U+0027 APOSTROPHE">'</code>" + character.</li> + <li>The literal string + "<code>about:legacy-compat</code>".</li> + <li>A matching <i>quote mark</i>, identical to the + <i>quote mark</i> used earlier (either a + "<code title="U+0022 QUOTATION MARK">"</code>" + character or a + "<code title="U+0027 APOSTROPHE">'</code>" + character).</li> + </ol> <div class="example"> - <p>The following are examples of some DOCTYPEs that match the - <a href="syntax.html#doctype.legacy">doctype.legacy</a> pattern.</p> - <pre><!doctype html system 'about:legacy-compat'></pre> - <pre><!DOCTYPE HTML system "about:legacy-compat"></pre> + <p>The following are examples of some conformant + <a href="syntax.html#doctype">doctypes</a>.</p> + <pre><!DOCTYPE html></pre> + <pre><!doctype HTML system "about:legacy-compat"></pre> </div> - <p>The following are examples of some DOCTYPEs that match the - <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a> - production defined in the XML specification - <a href="references.html#refsXML">[XML]</a>.</p> + <p>A + <dfn 
id="deprecated-doctype" title="deprecated-doctype">deprecated doctype</dfn> + is a + <dfn id="doctype-declaration" title="doctype-declaration">document type declaration</dfn> + as defined in the XML specification + <a href="references.html#refsXML">[XML]</a>, + with the further restriction that it must meet one of the + following sets of constraints:</p> + <ul> + <li>The + <a href="syntax.html#doctype-declaration">document type declaration’s</a> + name part is an + <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD HTML 4.0//EN</code>", + and its system identifier is either missing is an exact + match for the literal string + "<code>http://www.w3.org/TR/REC-html40/strict.dtd</code>".</li> + <li>The + <a href="syntax.html#doctype-declaration">document type declaration’s</a> + name part is an + <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD HTML 4.01//EN</code>", + and its system identifier is either missing is an exact + match for the literal string + "<code>http://www.w3.org/TR/html4/strict.dtd</code>".</li> + <li>The + <a href="syntax.html#doctype-declaration">document type declaration’s</a> + name part is an + <a href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD XHTML 1.0 Strict//EN</code>", + and its system identifier is either missing is an exact + match for the literal string + "<code>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code>".</li> + <li>The + <a href="syntax.html#doctype-declaration">document type declaration’s</a> + name part is an + <a 
href="terminology.html#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD XHTML 1.1//EN</code>", + and its system identifier is either missing is an exact + match for the literal string + "<code>http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</code>".</li> + </ul> <div class="example"> + <p>The following are examples of + <a href="syntax.html#deprecated-doctype">deprecated doctypes</a>.</p> <pre><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"></pre> <pre><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" @@ -402,27 +476,29 @@ syntax:</p> <pre><input <em>disabled</em>></pre> </div> - <p>If an attribute using the empty attribute syntax is - followed by another attribute, then there must be at - least one - <a href="terminology.html#space">space character</a> - between the value and the other attribute.</p> </dd> <dt><dfn id="syntax-attr-unquoted" title="syntax-attr-unquoted">Unquoted attribute-value syntax</dfn></dt> <dd> - <p>An attribute and its value may be specified by providing - the <a href="syntax.html#attribute-name">attribute name</a>, - followed by zero or more - <a href="terminology.html#space">space characters</a>, - followed by a single - "<code title="U+003D EQUALS SIGN">=</code>" - character, followed by zero or more - <a href="terminology.html#space">space characters</a>, - followed by the - <a href="syntax.html#syntax-attribute-value">attribute value</a>.</p> + <p>An + <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn> + is specified by providing the following parts in exactly + the following order:</p> + <ol> + <li>an + <a href="syntax.html#attribute-name">attribute name</a></li> + <li>zero or more + <a href="terminology.html#space">space characters</a></li> + <li>a single + "<code title="U+003D EQUALS SIGN">=</code>" + 
character</li> + <li>zero or more + <a href="terminology.html#space">space characters</a></li> + <li>an + <a href="syntax.html#syntax-attribute-value">attribute value</a></li> + </ol> <p>In addition to the general requirements given above for - attribute values, an - <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn>:</p> + attribute values, an unquoted attribute value has the + following restrictions:</p> <ul> <li>must not contain any literal <a href="terminology.html#space">space characters</a></li> @@ -441,11 +517,6 @@ syntax:</p> <pre><input <em>value=yes</em>></pre> </div> - <p>If the value an attribute using the unquoted - attribute syntax is followed by another attribute, - then there must be at least one - <a href="terminology.html#space">space character</a> - between the value and the other attribute.</p> <p>If the value of an attribute using the unquoted attribute syntax is followed by a "<code title="U+002F SOLIDUS">/</code>" @@ -457,28 +528,37 @@ </dd> <dt><dfn id="syntax-attr-single-quoted">Single-quoted attribute-value syntax</dfn></dt> <dd> - <p>An attribute and its value may be specified by - providing the - <a href="syntax.html#attribute-name">attribute name</a>, - followed by zero or more - <a href="terminology.html#space">space characters</a>, - followed by a single - "<code title="U+003D EQUALS SIGN">=</code>" - character, followed by zero or more - <a href="terminology.html#space">space characters</a>, - followed by a single - "<code title="U+0027 APOSTROPHE">'</code>" - character, followed by the - <a href="syntax.html#syntax-attribute-value">attribute value</a>, - followed by a single - "<code title="U+0027 APOSTROPHE">'</code>" - character.</p> - <p>In addition to the general requirements given above - for attribute values, a + <p>A <dfn id="attr-value-single-quoted" title="attr-value-single-quoted">single-quoted attribute value</dfn> - must not contain any literal - "<code title="U+0027 
APOSTROPHE">'</code>" - characters.</p> + is specified by providing the following parts in exactly + the following order:</p> + <ol> + <li>an + <a href="syntax.html#attribute-name">attribute name</a></li> + <li>zero or more + <a href="terminology.html#space">space characters</a></li> + <li>a + "<code title="U+003D EQUALS SIGN">=</code>" + character</li> + <li>zero or more + <a href="terminology.html#space">space characters</a></li> + <li>a single + "<code title="U+0027 APOSTROPHE">'</code>" + character</li> + <li>an + <a href="syntax.html#syntax-attribute-value">attribute value</a></li> + <li>a + "<code title="U+0027 APOSTROPHE">'</code>" + character.</li> + </ol> + <p>In addition to the general requirements given above + for attribute values, a single-quoted attribute value + has the following restriction:</p> + <ul> + <li>must not contain any literal + "<code title="U+0027 APOSTROPHE">'</code>" + characters</li> + </ul> <div class="example"> <p>In the following example, the <code title="attr-input-type">type</code> attribute @@ -486,47 +566,46 @@ syntax:</p> <pre><input <em>type='checkbox'</em>></pre> </div> - <p>If the value of an attribute using the single-quoted - attribute syntax is followed by another attribute, then - there must be at least one - <a href="terminology.html#space">space character</a> - after the value and before the other attribute.</p> </dd> <dt><dfn id="syntax-attr-double-quoted">Double-quoted attribute-value syntax</dfn></dt> <dd> - <p>An attribute and its value may be specified by - providing the - <a href="syntax.html#attribute-name">attribute name</a>, - followed by zero or more - <a href="terminology.html#space">space characters</a>, - followed by a single - "<code title="U+003D EQUALS SIGN character">=</code>" - character, followed by zero or more - <a href="terminology.html#space">space characters</a>, - followed by a single - "<code title="U+0022 QUOTATION MARK">"</code>" character, - followed by the - <a 
href="syntax.html#syntax-attribute-value">attribute value</a>, - and followed by a - "<code title="double U+0022 QUOTATION MARK">"</code>" - character.</p> - <p>In addition to the general requirements given above for - attribute values, a + <p>A <dfn id="attr-value-double-quoted" title="attr-value-double-quoted">double-quoted attribute value</dfn> - must not contain any literal - "<code title="U+0022 QUOTATION MARK">"</code>" - characters.</p> + is specified by providing the following parts in exactly + the following order:</p> + <ol> + <li>an + <a href="syntax.html#attribute-name">attribute name</a></li> + <li>zero or more + <a href="terminology.html#space">space characters</a></li> + <li>a single + "<code title="U+003D EQUALS SIGN character">=</code>" + character</li> + <li>zero or more + <a href="terminology.html#space">space characters</a></li> + <li>a single + "<code title="U+0022 QUOTATION MARK">"</code>" + character</li> + <li>an + <a href="syntax.html#syntax-attribute-value">attribute value</a></li> + <li>a + "<code title="double U+0022 QUOTATION MARK">"</code>" + character</li> + </ol> + <p>In addition to the general requirements given above for + attribute values, a double-quoted attribute value has + the following restriction:</p> + <ul> + <li>must not contain any literal + "<code title="U+0022 QUOTATION MARK">"</code>" + characters</li> + </ul> <div class="example"> <p>In the following example, the <code>title</code> attribute is given with the double-quoted attribute value syntax:</p> <pre><code title="U+003C LESS-THAN SIGN">&lt;</code></pre> </div> - <p>If the value of attribute using the double-quoted - attribute syntax is followed by another attribute, then - there must be at least one - <a href="terminology.html#space">space character</a> - after the value and before the other attribute.</p> </dd> </dl> </div> @@ -706,14 +785,20 @@ </ul> <dl> <dt><dfn id="named-charref">Named character reference</dfn></dt> - <dd><p>A named character reference is an 
- "<code title="U+0026 AMPERSAND">&</code>" - character followed by one of the entity names defined in - <cite>XML Entity definitions for Characters</cite> - <a href="references.html#refsEntities">[Entities]</a>, - using the same case, followed by a - "<code title="U+003B SEMICOLON">;</code>" - character.</p> + <dd><p>Named character references consist of the following + parts in exactly the following order:</p> + <ol> + <li>An + "<code title="U+0026 AMPERSAND">&</code>" + character.</li> + <li>One of the entity names defined in + <cite>XML Entity definitions for Characters</cite> + <a href="references.html#refsEntities">[Entities]</a>, + using the same case.</li> + <li>A + "<code title="U+003B SEMICOLON">;</code>" + character.</li> + </ol> <div class="example"> <p>The following is an example of a named character reference for the character @@ -723,20 +808,27 @@ </div> </dd> <dt><dfn id="dec-charref">Decimal numeric character reference</dfn></dt> - <dd><p>A decimal numerical character reference is an - "<code title="U+0026 AMPERSAND">&</code>" - character, followed by a - "<code title="U+0023 NUMBER SIGN">#</code>" - character, followed by one or more digits in the range - <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, - representing a base-ten integer that itself is a Unicode - code point that is not - U+0000, - U+000D, - in the range U+0080–U+009F, - or in the range 0xD8000–0xDFFF (surrogates). 
- The digits must then be followed by a - "<code title="U+003B SEMICOLON">;</code>" character.</p> + <dd><p>Decimal numerical character references consist of the + following parts, in exactly the following order.</p> + <ol> + <li>An + "<code title="U+0026 AMPERSAND">&</code>" + character.</li> + <li>A + "<code title="U+0023 NUMBER SIGN">#</code>" + character.</li> + <li>One or more digits in the range + <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, + representing a base-ten integer that itself is a Unicode + code point that is not + U+0000, + U+000D, + in the range U+0080–U+009F, + or in the range 0xD8000–0xDFFF (surrogates).</li> + <li>A + "<code title="U+003B SEMICOLON">;</code>" + character.</li> + </ol> <div class="example"> <p>The following is an example of a decimal numeric character reference for the character @@ -746,30 +838,36 @@ </div> </dd> <dt><dfn id="hex-charref">Hexadecimal numeric character reference</dfn></dt> - <dd><p>A hexadecimal numeric character reference is an - "<code title="U+0026 AMPERSAND">&</code>" - character, followed by a - "<code title="U+0023 NUMBER SIGN">#</code>" - character, followed by either a - "<code title="U+0078 LATIN SMALL LETTER X">x</code>" - character - or a - "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>" - character, followed by - one or more digits in the range - <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, - <code title="U+0061 LATIN SMALL LETTER A–U+0066 LATIN SMALL LETTER F">a–f</code>, - and - <code title="U+0041 LATIN CAPITAL LETTER A–U+0046 LATIN CAPITAL LETTER F">A–F</code>, - representing a base-sixteen integer that itself is a Unicode - code point that is not - U+0000, - U+000D, - in the range U+0080–U+009F, - or in the range 0xD800–0xDFFF (surrogates). 
- The digits must then be followed by a - "<code title="U+003B SEMICOLON">;</code>" - character.</p> + <dd><p>Hexadecimal numeric character references consist of + the following parts, in exactly the following order.</p> + <ol> + <li>An + "<code title="U+0026 AMPERSAND">&</code>" + character.</li> + <li>A + "<code title="U+0023 NUMBER SIGN">#</code>" + character.</li> + <li>Either a + "<code title="U+0078 LATIN SMALL LETTER X">x</code>" + character + or a + "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>" + character.</li> + <li>One or more digits in the range + <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, + <code title="U+0061 LATIN SMALL LETTER A–U+0066 LATIN SMALL LETTER F">a–f</code>, + and + <code title="U+0041 LATIN CAPITAL LETTER A–U+0046 LATIN CAPITAL LETTER F">A–F</code>, + representing a base-sixteen integer that itself is a + Unicode code point that is not + U+0000, + U+000D, + in the range U+0080–U+009F, + or in the range 0xD800–0xDFFF (surrogates).</li> + <li>A + "<code title="U+003B SEMICOLON">;</code>" + character.</li> + </ol> <div class="example"> <p>The following is an example of a hexadecimal numeric character reference for the character @@ -844,7 +942,11 @@ that is not itself in an <a href="syntax.html#syntax-escape">escaping text span</a>, and ends at the next - <a href="syntax.html#syntax-escape-end">escaping text span end</a>.</p> + <a href="syntax.html#syntax-escape-end">escaping text span end</a>. 
+ Escaping text spans have the following restriction:</p> + <ul> + <li>must not contain any <a href="syntax.html#syntax-charref">character references</a></li> + </ul> <p>An <dfn id="syntax-escape-start">escaping text span start</dfn> is the @@ -875,20 +977,16 @@ <a href="syntax.html#syntax-text">text</a>; it is not a <a href="syntax.html#comment-end-delimiter">comment end delimiter</a>.</li> + <li>Any sequences of characters within an + <a href="syntax.html#syntax-escape">escaping text span</a> + that look like + <a href="syntax.html#syntax-charref">character references</a> + are + <a href="syntax.html#syntax-text">text</a>, + not + <a href="syntax.html#syntax-charref">character references</a>.</li> </ul> </div> - <p>There cannot be any - <a href="syntax.html#syntax-charref">character references</a> - inside an - <a href="syntax.html#syntax-escape">escaping text span</a>; - any sequences of characters within an - <a href="syntax.html#syntax-escape">escaping text span</a> - that may look like - <a href="syntax.html#syntax-charref">character references</a> - are in fact - <a href="syntax.html#syntax-text">text</a>, - not - <a href="syntax.html#syntax-charref">character references</a>.</p> <p>An <a href="syntax.html#syntax-escape-start">escaping text span start</a> may share its Index: documents.html =================================================================== RCS file: /sources/public/html5/markup/documents.html,v retrieving revision 1.8 retrieving revision 1.9 diff -u -d -r1.8 -r1.9 --- documents.html 6 Aug 2009 10:34:34 -0000 1.8 +++ documents.html 7 Aug 2009 14:50:29 -0000 1.9 @@ -96,7 +96,7 @@ <li>Any number of <a href="syntax.html#syntax-comments">comments</a> and <a href="terminology.html#space">space characters</a>.</li> - <li>A <a href="syntax.html#doctype">DOCTYPE</a>.</li> + <li>A <a href="syntax.html#doctype">doctype</a>.</li> <li>Any number of <a href="syntax.html#syntax-comments">comments</a> and <a href="terminology.html#space">space 
characters</a>.</li> @@ -118,7 +118,7 @@ character.</li> <li>Any number of comments and space characters, as defined in the XML specification <a href="references.html#refsXML">[XML]</a>.</li> - <li>Optionally, a DOCTYPE, as defined + <li>Optionally, a doctype declaration, as defined in the XML specification <a href="references.html#refsXML">[XML]</a>.</li> <li>Any number of comments and space characters, as defined in the XML specification <a href="references.html#refsXML">[XML]</a>.</li> Index: spec.html =================================================================== RCS file: /sources/public/html5/markup/spec.html,v retrieving revision 1.90 retrieving revision 1.91 diff -u -d -r1.90 -r1.91 --- spec.html 6 Aug 2009 11:02:08 -0000 1.90 +++ spec.html 7 Aug 2009 14:50:29 -0000 1.91 @@ -9,7 +9,7 @@ <body> <div class="head"> <h1>HTML 5: The Markup Language</h1> -<h2>Editor’s Draft <em>6 August 2009</em> +<h2>Editor’s Draft <em>7 August 2009</em> </h2> <dl> <dt>Latest Editor’s Draft:</dt> @@ -41,7 +41,7 @@ <p> - This document is the 6 August 2009 Editor’s Draft of + This document is the 7 August 2009 Editor’s Draft of <cite>HTML 5: The Markup Language</cite>. 
</p> <p> @@ -190,7 +190,7 @@ <span class="toc-section-number"> </span><a href="#syntax"><span class="toc-section-number">6.</span> HTML syntax</a> <ul> <li id="doctype-syntax-toc"> -<span class="toc-section-number"></span><a href="#doctype-syntax"><span class="toc-section-number">6.01.</span> The DOCTYPE</a> +<span class="toc-section-number"></span><a href="#doctype-syntax"><span class="toc-section-number">6.01.</span> The doctype</a> </li> <li id="character-encoding-toc"> <span class="toc-section-number"></span><a href="#character-encoding"><span class="toc-section-number">6.02.</span> Character encoding declaration</a> @@ -907,7 +907,7 @@ <li>Any number of <a href="#syntax-comments">comments</a> and <a href="#space">space characters</a>.</li> - <li>A <a href="#doctype">DOCTYPE</a>.</li> + <li>A <a href="#doctype">doctype</a>.</li> <li>Any number of <a href="#syntax-comments">comments</a> and <a href="#space">space characters</a>.</li> @@ -929,7 +929,7 @@ character.</li> <li>Any number of comments and space characters, as defined in the XML specification <a href="#refsXML">[XML]</a>.</li> - <li>Optionally, a DOCTYPE, as defined + <li>Optionally, a doctype declaration, as defined in the XML specification <a href="#refsXML">[XML]</a>.</li> <li>Any number of comments and space characters, as defined in the XML specification <a href="#refsXML">[XML]</a>.</li> @@ -1014,7 +1014,7 @@ <div class="toc"> <ul> <li id="doctype-syntax-toc"> -<span class="toc-section-number"> </span><a href="#doctype-syntax"><span class="toc-section-number">1.</span> The DOCTYPE</a> +<span class="toc-section-number"> </span><a href="#doctype-syntax"><span class="toc-section-number">1.</span> The doctype</a> </li> <li id="character-encoding-toc"> <span class="toc-section-number"> </span><a href="#character-encoding"><span class="toc-section-number">2.</span> Character encoding declaration</a> @@ -1046,62 +1046,135 @@ </ul> </div> <div id="doctype-syntax" class="section"> - <h2>6.01. 
The DOCTYPE <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="#doctype-syntax-toc">T</a> + <h2>6.01. The doctype <a class="hash" href="#doctype-syntax">#</a> <a class="toc-bak" href="#doctype-syntax-toc">T</a> </h2> - <p>A <dfn id="doctype" title="syntax-doctype">DOCTYPE</dfn> is - an special instruction which, for legacy reasons that have to - do with processing modes in browsers, is a required part of - any - <a href="#syntax-document-html">document in the HTML syntax</a>.</p> - <p>The DOCTYPE must match either the - <a href="#doctype.pattern">doctype</a> - or - <a href="#doctype.legacy">doctype.legacy</a> - patterns defined this specification, or must match the - <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a> - production defined in the XML specification - <a href="#refsXML">[XML]</a>.</p> - <p>The <code>doctype</code> pattern is defined as follows:</p> - <dl class="pattern-def"> -<dt> -<a id="doctype.pattern" href="#doctype.pattern">doctype</a> =</dt> - <dd> - A string that is an <a href="#ascii-case-insensitive">ASCII - case-insensitive</a> match for the following regular - expression: - <pre><code class="regexp"><!doctype\s+html\s*></code></pre> - </dd> - </dl> + <p>A + <dfn id="doctype" title="doctype">doctype</dfn> + (sometimes capitalized as “DOCTYPE”) is an special instruction + which, for legacy reasons that have to do with processing + modes in browsers, is a required part of any + <a href="#syntax-document-html">document in the HTML syntax</a>; + it must either be a + <a href="#deprecated-doctype">deprecated doctype</a>, + or must consist of the following parts, in exactly the + following order:</p> + <ol> +<li>A + "<code title="U+003C LESS-THAN SIGN"><</code>" + character.</li> + <li>A + "<code title="U+0021 EXCLAMATION MARK">!</code>" + character.</li> + <li>Any + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + 
"<code>DOCTYPE</code>".</li> + <li>One or more + <a href="#space">space characters</a>.</li> + <li>Any + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>".</li> + <li>Optionally, a + <a href="#doctype-legacy-string">doctype legacy string</a>.</li> + <li>Optionally, one or more + <a href="#space">space characters</a>.</li> + <li>A + "<code title="U+003E GREATER-THAN SIGN">></code>" + character.</li> + </ol> +<p>A + <dfn id="doctype-legacy-string" title="doctype-legacy-string">doctype legacy string</dfn> + consists of the following parts, in exactly the following + order.</p> + <ol> +<li>One or more + <a href="#space">space characters</a>.</li> + <li>Any + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>SYSTEM</code>".</li> + <li>One or more + <a href="#space">space characters</a> +</li> + <li>A <i>quote mark</i>, consisting of either + a + "<code title="U+0022 QUOTATION MARK">"</code>" + character or a + "<code title="U+0027 APOSTROPHE">'</code>" + character.</li> + <li>The literal string + "<code>about:legacy-compat</code>".</li> + <li>A matching <i>quote mark</i>, identical to the + <i>quote mark</i> used earlier (either a + "<code title="U+0022 QUOTATION MARK">"</code>" + character or a + "<code title="U+0027 APOSTROPHE">'</code>" + character).</li> + </ol> <div class="example"> - <p>The following are examples of some DOCTYPEs that match the - <a href="#doctype">doctype</a> pattern.</p> - <pre><!doctype html></pre> - <pre><!DOCTYPE HTML></pre> + <p>The following are examples of some conformant + <a href="#doctype">doctypes</a>.</p> + <pre><!DOCTYPE html></pre> + <pre><!doctype HTML system "about:legacy-compat"></pre> </div> - <p>The <code>doctype.legacy</code> pattern is defined as follows:</p> - <dl class="pattern-def"> -<dt> -<a id="doctype.legacy" href="#doctype.legacy">doctype.legacy</a> =</dt> - <dd> - A string that is an <a 
href="#ascii-case-insensitive">ASCII - case-insensitive</a> match for the following regular - expression: - <pre><code class="regexp"><!doctype\s+html\s+system\s+("about:legacy-compat"|'about:legacy-compat')\s*></code></pre> - …except for the <code>about:legacy-compat</code> part, - which must match exactly (not case-insensitively). - </dd> - </dl> + <p>A + <dfn id="deprecated-doctype" title="deprecated-doctype">deprecated doctype</dfn> + is a + <dfn id="doctype-declaration" title="doctype-declaration">document type declaration</dfn> + as defined in the XML specification + <a href="#refsXML">[XML]</a>, + with the further restriction that it must meet one of the + following sets of constraints:</p> + <ul> +<li>The + <a href="#doctype-declaration">document type declaration’s</a> + name part is an + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD HTML 4.0//EN</code>", + and its system identifier is either missing is an exact + match for the literal string + "<code>http://www.w3.org/TR/REC-html40/strict.dtd</code>".</li> + <li>The + <a href="#doctype-declaration">document type declaration’s</a> + name part is an + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD HTML 4.01//EN</code>", + and its system identifier is either missing is an exact + match for the literal string + "<code>http://www.w3.org/TR/html4/strict.dtd</code>".</li> + <li>The + <a href="#doctype-declaration">document type declaration’s</a> + name part is an + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD XHTML 1.0 Strict//EN</code>", + and its system identifier is either 
missing or is an exact + match for the literal string + "<code>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code>".</li> + <li>The + <a href="#doctype-declaration">document type declaration’s</a> + name part is an + <a href="#ascii-case-insensitive">ASCII case-insensitive</a> + match for the string + "<code>HTML</code>", + its public identifier is an exact match for the literal string + "<code>-//W3C//DTD XHTML 1.1//EN</code>", + and its system identifier is either missing or is an exact + match for the literal string + "<code>http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</code>".</li> + </ul> <div class="example"> - <p>The following are examples of some DOCTYPEs that match the - <a href="#doctype.legacy">doctype.legacy</a> pattern.</p> - <pre><!doctype html system 'about:legacy-compat'></pre> - <pre><!DOCTYPE HTML system "about:legacy-compat"></pre> - </div> - <p>The following are examples of some DOCTYPEs that match the - <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"><code class="defined-elsewhere">doctypedecl</code></a> - production defined in the XML specification - <a href="#refsXML">[XML]</a>.</p> - <div class="example"> + <p>The following are examples of + <a href="#deprecated-doctype">deprecated doctypes</a>.</p> <pre><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"></pre> <pre><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" @@ -1423,27 +1496,33 @@ syntax:</p> <pre><input <em>disabled</em>></pre> </div> - <p>If an attribute using the empty attribute syntax is - followed by another attribute, then there must be at - least one - <a href="#space">space character</a> - between the value and the other attribute.</p> </dd> <dt><dfn id="syntax-attr-unquoted" title="syntax-attr-unquoted">Unquoted attribute-value syntax</dfn></dt> <dd> - <p>An attribute and its value may be specified by providing - the <a href="#attribute-name">attribute name</a>, - followed by zero or more - <a 
href="#space">space characters</a>, - followed by a single - "<code title="U+003D EQUALS SIGN">=</code>" - character, followed by zero or more - <a href="#space">space characters</a>, - followed by the - <a href="#syntax-attribute-value">attribute value</a>.</p> - <p>In addition to the general requirements given above for - attribute values, an - <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn>:</p> + <p>An + <dfn id="attr-value-unquoted" title="attr-value-unquoted">unquoted attribute value</dfn> + is specified by providing the following parts in exactly + the following order:</p> + <ol> +<li>an + <a href="#attribute-name">attribute name</a> +</li> + <li>zero or more + <a href="#space">space characters</a> +</li> + <li>a single + "<code title="U+003D EQUALS SIGN">=</code>" + character</li> + <li>zero or more + <a href="#space">space characters</a> +</li> + <li>an + <a href="#syntax-attribute-value">attribute value</a> +</li> + </ol> +<p>In addition to the general requirements given above for + attribute values, an unquoted attribute value has the + following restrictions:</p> <ul> <li>must not contain any literal <a href="#space">space characters</a> @@ -1463,11 +1542,6 @@ syntax:</p> <pre><input <em>value=yes</em>></pre> </div> - <p>If the value an attribute using the unquoted - attribute syntax is followed by another attribute, - then there must be at least one - <a href="#space">space character</a> - between the value and the other attribute.</p> <p>If the value of an attribute using the unquoted attribute syntax is followed by a "<code title="U+002F SOLIDUS">/</code>" @@ -1479,76 +1553,92 @@ </dd> <dt><dfn id="syntax-attr-single-quoted">Single-quoted attribute-value syntax</dfn></dt> <dd> - <p>An attribute and its value may be specified by - providing the - <a href="#attribute-name">attribute name</a>, - followed by zero or more - <a href="#space">space characters</a>, - followed by a single - "<code title="U+003D EQUALS 
SIGN">=</code>" - character, followed by zero or more - <a href="#space">space characters</a>, - followed by a single - "<code title="U+0027 APOSTROPHE">'</code>" - character, followed by the - <a href="#syntax-attribute-value">attribute value</a>, - followed by a single - "<code title="U+0027 APOSTROPHE">'</code>" - character.</p> - <p>In addition to the general requirements given above - for attribute values, a + <p>A <dfn id="attr-value-single-quoted" title="attr-value-single-quoted">single-quoted attribute value</dfn> - must not contain any literal - "<code title="U+0027 APOSTROPHE">'</code>" - characters.</p> - <div class="example"> + is specified by providing the following parts in exactly + the following order:</p> + <ol> +<li>an + <a href="#attribute-name">attribute name</a> +</li> + <li>zero or more + <a href="#space">space characters</a> +</li> + <li>a + "<code title="U+003D EQUALS SIGN">=</code>" + character</li> + <li>zero or more + <a href="#space">space characters</a> +</li> + <li>a single + "<code title="U+0027 APOSTROPHE">'</code>" + character</li> + <li>an + <a href="#syntax-attribute-value">attribute value</a> +</li> + <li>a + "<code title="U+0027 APOSTROPHE">'</code>" + character.</li> + </ol> +<p>In addition to the general requirements given above + for attribute values, a single-quoted attribute value + has the following restriction:</p> + <ul> +<li>must not contain any literal + "<code title="U+0027 APOSTROPHE">'</code>" + characters</li> + </ul> +<div class="example"> <p>In the following example, the <code title="attr-input-type">type</code> attribute is given with the single-quoted attribute value syntax:</p> <pre><input <em>type='checkbox'</em>></pre> </div> - <p>If the value of an attribute using the single-quoted - attribute syntax is followed by another attribute, then - there must be at least one - <a href="#space">space character</a> - after the value and before the other attribute.</p> </dd> <dt><dfn 
id="syntax-attr-double-quoted">Double-quoted attribute-value syntax</dfn></dt> <dd> - <p>An attribute and its value may be specified by - providing the - <a href="#attribute-name">attribute name</a>, - followed by zero or more - <a href="#space">space characters</a>, - followed by a single - "<code title="U+003D EQUALS SIGN character">=</code>" - character, followed by zero or more - <a href="#space">space characters</a>, - followed by a single - "<code title="U+0022 QUOTATION MARK">"</code>" character, - followed by the - <a href="#syntax-attribute-value">attribute value</a>, - and followed by a - "<code title="double U+0022 QUOTATION MARK">"</code>" - character.</p> - <p>In addition to the general requirements given above for - attribute values, a + <p>A <dfn id="attr-value-double-quoted" title="attr-value-double-quoted">double-quoted attribute value</dfn> - must not contain any literal - "<code title="U+0022 QUOTATION MARK">"</code>" - characters.</p> - <div class="example"> + is specified by providing the following parts in exactly + the following order:</p> + <ol> +<li>an + <a href="#attribute-name">attribute name</a> +</li> + <li>zero or more + <a href="#space">space characters</a> +</li> + <li>a single + "<code title="U+003D EQUALS SIGN character">=</code>" + character</li> + <li>zero or more + <a href="#space">space characters</a> +</li> + <li>a single + "<code title="U+0022 QUOTATION MARK">"</code>" + character</li> + <li>an + <a href="#syntax-attribute-value">attribute value</a> +</li> + <li>a + "<code title="double U+0022 QUOTATION MARK">"</code>" + character</li> + </ol> +<p>In addition to the general requirements given above for + attribute values, a double-quoted attribute value has + the following restriction:</p> + <ul> +<li>must not contain any literal + "<code title="U+0022 QUOTATION MARK">"</code>" + characters</li> + </ul> +<div class="example"> <p>In the following example, the <code>title</code> attribute is given with the double-quoted 
attribute value syntax:</p> <pre><code title="U+003C LESS-THAN SIGN">&lt;</code></pre> </div> - <p>If the value of attribute using the double-quoted - attribute syntax is followed by another attribute, then - there must be at least one - <a href="#space">space character</a> - after the value and before the other attribute.</p> </dd> </dl> </div> @@ -1740,15 +1830,21 @@ <dl> <dt><dfn id="named-charref">Named character reference</dfn></dt> <dd> -<p>A named character reference is an - "<code title="U+0026 AMPERSAND">&</code>" - character followed by one of the entity names defined in - <cite>XML Entity definitions for Characters</cite> - <a href="#refsEntities">[Entities]</a>, - using the same case, followed by a - "<code title="U+003B SEMICOLON">;</code>" - character.</p> - <div class="example"> +<p>Named character references consist of the following + parts in exactly the following order:</p> + <ol> +<li>An + "<code title="U+0026 AMPERSAND">&</code>" + character.</li> + <li>One of the entity names defined in + <cite>XML Entity definitions for Characters</cite> + <a href="#refsEntities">[Entities]</a>, + using the same case.</li> + <li>A + "<code title="U+003B SEMICOLON">;</code>" + character.</li> + </ol> +<div class="example"> <p>The following is an example of a named character reference for the character "<code title="U+2020 DAGGER">†</code>" @@ -1758,21 +1854,28 @@ </dd> <dt><dfn id="dec-charref">Decimal numeric character reference</dfn></dt> <dd> -<p>A decimal numerical character reference is an - "<code title="U+0026 AMPERSAND">&</code>" - character, followed by a - "<code title="U+0023 NUMBER SIGN">#</code>" - character, followed by one or more digits in the range - <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, - representing a base-ten integer that itself is a Unicode - code point that is not - U+0000, - U+000D, - in the range U+0080–U+009F, - or in the range 0xD800–0xDFFF (surrogates). 
- The digits must then be followed by a - "<code title="U+003B SEMICOLON">;</code>" character.</p> - <div class="example"> +<p>Decimal numeric character references consist of the + following parts, in exactly the following order.</p> + <ol> +<li>An + "<code title="U+0026 AMPERSAND">&</code>" + character.</li> + <li>A + "<code title="U+0023 NUMBER SIGN">#</code>" + character.</li> + <li>One or more digits in the range + <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, + representing a base-ten integer that itself is a Unicode + code point that is not + U+0000, + U+000D, + in the range U+0080–U+009F, + or in the range 0xD800–0xDFFF (surrogates).</li> + <li>A + "<code title="U+003B SEMICOLON">;</code>" + character.</li> + </ol> +<div class="example"> <p>The following is an example of a decimal numeric character reference for the character "<code title="U+2020 DAGGER">†</code>" @@ -1782,31 +1885,37 @@ </dd> <dt><dfn id="hex-charref">Hexadecimal numeric character reference</dfn></dt> <dd> -<p>A hexadecimal numeric character reference is an - "<code title="U+0026 AMPERSAND">&</code>" - character, followed by a - "<code title="U+0023 NUMBER SIGN">#</code>" - character, followed by either a - "<code title="U+0078 LATIN SMALL LETTER X">x</code>" - character - or a - "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>" - character, followed by - one or more digits in the range - <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, - <code title="U+0061 LATIN SMALL LETTER A–U+0066 LATIN SMALL LETTER F">a–f</code>, - and - <code title="U+0041 LATIN CAPITAL LETTER A–U+0046 LATIN CAPITAL LETTER F">A–F</code>, - representing a base-sixteen integer that itself is a Unicode - code point that is not - U+0000, - U+000D, - in the range U+0080–U+009F, - or in the range 0xD800–0xDFFF (surrogates). 
- The digits must then be followed by a - "<code title="U+003B SEMICOLON">;</code>" - character.</p> - <div class="example"> +<p>Hexadecimal numeric character references consist of + the following parts, in exactly the following order.</p> + <ol> +<li>An + "<code title="U+0026 AMPERSAND">&</code>" + character.</li> + <li>A + "<code title="U+0023 NUMBER SIGN">#</code>" + character.</li> + <li>Either a + "<code title="U+0078 LATIN SMALL LETTER X">x</code>" + character + or a + "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>" + character.</li> + <li>One or more digits in the range + <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>, + <code title="U+0061 LATIN SMALL LETTER A–U+0066 LATIN SMALL LETTER F">a–f</code>, + and + <code title="U+0041 LATIN CAPITAL LETTER A–U+0046 LATIN CAPITAL LETTER F">A–F</code>, + representing a base-sixteen integer that itself is a + Unicode code point that is not + U+0000, + U+000D, + in the range U+0080–U+009F, + or in the range 0xD800–0xDFFF (surrogates).</li> + <li>A + "<code title="U+003B SEMICOLON">;</code>" + character.</li> + </ol> +<div class="example"> <p>The following is an example of a hexadecimal numeric character reference for the character "<code title="U+2020 DAGGER">†</code>" @@ -1882,8 +1991,13 @@ that is not itself in an <a href="#syntax-escape">escaping text span</a>, and ends at the next - <a href="#syntax-escape-end">escaping text span end</a>.</p> - <p>An + <a href="#syntax-escape-end">escaping text span end</a>. 
+ Escaping text spans have the following restriction:</p> + <ul> +<li>must not contain any <a href="#syntax-charref">character references</a> +</li> + </ul> +<p>An <dfn id="syntax-escape-start">escaping text span start</dfn> is the <a href="#syntax-text" title="syntax-text">text</a> @@ -1913,20 +2027,16 @@ <a href="#syntax-text">text</a>; it is not a <a href="#comment-end-delimiter">comment end delimiter</a>.</li> + <li>Any sequences of characters within an + <a href="#syntax-escape">escaping text span</a> + that look like + <a href="#syntax-charref">character references</a> + are + <a href="#syntax-text">text</a>, + not + <a href="#syntax-charref">character references</a>.</li> </ul> </div> - <p>There cannot be any - <a href="#syntax-charref">character references</a> - inside an - <a href="#syntax-escape">escaping text span</a>; - any sequences of characters within an - <a href="#syntax-escape">escaping text span</a> - that may look like - <a href="#syntax-charref">character references</a> - are in fact - <a href="#syntax-text">text</a>, - not - <a href="#syntax-charref">character references</a>.</p> <p>An <a href="#syntax-escape-start">escaping text span start</a> may share its
Received on Friday, 7 August 2009 14:51:43 UTC
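
Illustrative note, not part of the commit: the ordered-parts definition of the doctype and of the doctype legacy string in the syntax.html hunk can be approximated with a regular expression. The following is a minimal Python sketch under stated assumptions. The names SPACE, LEGACY, DOCTYPE_RE and is_conformant_doctype are hypothetical; the set of "space characters" is assumed to be space, tab, LF, FF and CR; and the leading "<!" is inferred from the examples in the diff. It does not cover deprecated doctypes.

```python
import re

# Assumption: "space characters" are space, tab, LF, FF and CR.
SPACE = r"[ \t\n\f\r]"

# Doctype legacy string: one or more spaces, SYSTEM (ASCII case-insensitive),
# one or more spaces, then the literal about:legacy-compat (case-sensitive)
# enclosed in a matching pair of quote marks.
LEGACY = rf"""{SPACE}+(?i:SYSTEM){SPACE}+(?P<quote>["'])about:legacy-compat(?P=quote)"""

# Doctype: "<!", DOCTYPE, spaces, HTML, optional legacy string,
# optional spaces, ">".
DOCTYPE_RE = re.compile(
    rf"<!(?i:DOCTYPE){SPACE}+(?i:HTML)(?:{LEGACY})?{SPACE}*>"
)


def is_conformant_doctype(text: str) -> bool:
    """Return True if text, taken as a whole, matches the sketch above."""
    return DOCTYPE_RE.fullmatch(text) is not None
```

The backreference (?P=quote) is what enforces the "matching quote mark" part, and the scoped (?i:...) groups keep about:legacy-compat case-sensitive while the keywords stay case-insensitive.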