- From: Michael Smith via cvs-syncmail <cvsmail@w3.org>
- Date: Fri, 07 Aug 2009 14:50:31 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/markup/src
In directory hutz:/tmp/cvs-serv6373/src

Modified Files:
    documents.html syntax.html
Log Message:
made a number of refinements to the Syntax section

Index: documents.html
===================================================================
RCS file: /sources/public/html5/markup/src/documents.html,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- documents.html 6 Aug 2009 10:34:35 -0000 1.5
+++ documents.html 7 Aug 2009 14:50:29 -0000 1.6
@@ -84,7 +84,7 @@
  <li>Any number of
  <a href="#syntax-comments">comments</a> and
  <a href="#space">space characters</a>.</li>
- <li>A <a href="#doctype">DOCTYPE</a>.</li>
+ <li>A <a href="#doctype">doctype</a>.</li>
  <li>Any number of
  <a href="#syntax-comments">comments</a> and
  <a href="#space">space characters</a>.</li>
@@ -107,7 +107,7 @@
  character.</li>
  <li>Any number of comments and space characters, as defined
  in the XML specification <a href="#refsXML">[XML]</a>.</li>
- <li>Optionally, a DOCTYPE, as defined
+ <li>Optionally, a doctype declaration, as defined
  in the XML specification <a href="#refsXML">[XML]</a>.</li>
  <li>Any number of comments and space characters, as defined
  in the XML specification <a href="#refsXML">[XML]</a>.</li>

Index: syntax.html
===================================================================
RCS file: /sources/public/html5/markup/src/syntax.html,v
retrieving revision 1.65
retrieving revision 1.66
diff -u -d -r1.65 -r1.66
--- syntax.html 6 Aug 2009 11:02:08 -0000 1.65
+++ syntax.html 7 Aug 2009 14:50:29 -0000 1.66
@@ -3,65 +3,143 @@
  <h2>HTML syntax</h2>
  <div class="toc"/>
  <section id="doctype-syntax">
- <h2>The DOCTYPE</h2>
- <p>A <dfn id="doctype" title="syntax-doctype">DOCTYPE</dfn> is
- an special instruction which, for legacy reasons that have to
- do with processing modes in browsers, is a required part of
- any
- <a href="#syntax-document-html">document in the HTML syntax</a>.</p>
- <p>The DOCTYPE must match either the
- <a href="#doctype.pattern">doctype</a>
- or
- <a href="#doctype.legacy">doctype.legacy</a>
- patterns defined this specification, or must match the
- <a
- href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"
- ><code class="defined-elsewhere">doctypedecl</code></a>
- production defined in the XML specification
- <a href="#refsXML">[XML]</a>.</p>
- <p>The <code>doctype</code> pattern is defined as follows:</p>
- <dl class="pattern-def">
- <dt><a id="doctype.pattern"
- href="#doctype.pattern">doctype</a> =</dt>
- <dd>
- A string that is an <a href="#ascii-case-insensitive">ASCII
- case-insensitive</a> match for the following regular
- expression:
- <pre><code class="regexp"><!doctype\s+html\s*></code></pre>
- </dd>
- </dl>
- <div class="example">
- <p>The following are examples of some DOCTYPEs that match the
- <a href="#doctype">doctype</a> pattern.</p>
- <pre><!doctype html></pre>
- <pre><!DOCTYPE HTML></pre>
- </div>
- <p>The <code>doctype.legacy</code> pattern is defined as follows:</p>
- <dl class="pattern-def">
- <dt><a id="doctype.legacy"
- href="#doctype.legacy">doctype.legacy</a> =</dt>
- <dd>
- A string that is an <a href="#ascii-case-insensitive">ASCII
- case-insensitive</a> match for the following regular
- expression:
- <pre><code class="regexp"><!doctype\s+html\s+system\s+("about:legacy-compat"|'about:legacy-compat')\s*></code></pre>
- …except for the <code>about:legacy-compat</code> part,
- which must match exactly (not case-insensitively).
- </dd>
- </dl>
+ <h2>The doctype</h2>
+ <p>A
+ <dfn
+ id="doctype"
+ title="doctype">doctype</dfn>
+ (sometimes capitalized as “DOCTYPE”) is an special instruction
+ which, for legacy reasons that have to do with processing
+ modes in browsers, is a required part of any
+ <a href="#syntax-document-html">document in the HTML syntax</a>;
+ it must either be a
+ <a href="#deprecated-doctype">deprecated doctype</a>,
+ or must consist of the following parts, in exactly the
+ following order:</p>
+ <ol>
+ <li>A
+ "<code title="U+003C LESS-THAN SIGN"><</code>"
+ character.</li>
+ <li>A
+ "<code title="U+0021 EXCLAMATION MARK">!</code>"
+ character.</li>
+ <li>Any
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>DOCTYPE</code>".</li>
+ <li>One or more
+ <a href="#space">space characters</a>.</li>
+ <li>Any
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>HTML</code>".</li>
+ <li>Optionally, a
+ <a href="#doctype-legacy-string" >doctype legacy string</a>.</li>
+ <li>Optionally, one or more
+ <a href="#space">space characters</a>.</li>
+ <li>A
+ "<code title="U+003E GREATER-THAN SIGN">></code>"
+ character.</li>
+ </ol>
+ <p>A
+ <dfn
+ id="doctype-legacy-string"
+ title="doctype-legacy-string">doctype legacy string</dfn>
+ consists of the following parts, in exactly the following
+ order.</p>
+ <ol>
+ <li>One or more
+ <a href="#space">space characters</a>.</li>
+ <li>Any
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>SYSTEM</code>".</li>
+ <li>One or more
+ <a href="#space">space characters</a></li>
+ <li>A <i>quote mark</i>, consisting of either
+ a
+ "<code title="U+0022 QUOTATION MARK">"</code>"
+ character or a
+ "<code title="U+0027 APOSTROPHE">'</code>"
+ character.</li>
+ <li>The literal string
+ "<code>about:legacy-compat</code>".</li>
+ <li>A matching <i>quote mark</i>, identical to the
+ <i>quote mark</i> used earlier (either a
+ "<code title="U+0022 QUOTATION MARK">"</code>"
+ character or a
+ "<code title="U+0027 APOSTROPHE">'</code>"
+ character).</li>
+ </ol>
  <div class="example">
- <p>The following are examples of some DOCTYPEs that match the
- <a href="#doctype.legacy">doctype.legacy</a> pattern.</p>
- <pre><!doctype html system 'about:legacy-compat'></pre>
- <pre><!DOCTYPE HTML system "about:legacy-compat"></pre>
+ <p>The following are examples of some conformant
+ <a href="#doctype">doctypes</a>.</p>
+ <pre><!DOCTYPE html></pre>
+ <pre><!doctype HTML system "about:legacy-compat"></pre>
  </div>
- <p>The following are examples of some DOCTYPEs that match the
- <a
- href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-doctypedecl"
- ><code class="defined-elsewhere">doctypedecl</code></a>
- production defined in the XML specification
- <a href="#refsXML">[XML]</a>.</p>
+ <p>A
+ <dfn
+ id="deprecated-doctype"
+ title="deprecated-doctype"
+ >deprecated doctype</dfn>
+ is a
+ <dfn
+ id="doctype-declaration"
+ title="doctype-declaration"
+ >document type declaration</dfn>
+ as defined in the XML specification
+ <a href="#refsXML">[XML]</a>,
+ with the further restriction that it must meet one of the
+ following sets of constraints:</p>
+ <ul>
+ <li>The
+ <a href="#doctype-declaration">document type declaration’s</a>
+ name part is an
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>HTML</code>",
+ its public identifier is an exact match for the literal string
+ "<code>-//W3C//DTD HTML 4.0//EN</code>",
+ and its system identifier is either missing or is an exact
+ match for the literal string
+ "<code>http://www.w3.org/TR/REC-html40/strict.dtd</code>".</li>
+ <li>The
+ <a href="#doctype-declaration">document type declaration’s</a>
+ name part is an
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>HTML</code>",
+ its public identifier is an exact match for the literal string
+ "<code>-//W3C//DTD HTML 4.01//EN</code>",
+ and its system identifier is either missing or is an exact
+ match for the literal string
+ "<code>http://www.w3.org/TR/html4/strict.dtd</code>".</li>
+ <li>The
+ <a href="#doctype-declaration">document type declaration’s</a>
+ name part is an
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>HTML</code>",
+ its public identifier is an exact match for the literal string
+ "<code>-//W3C//DTD XHTML 1.0 Strict//EN</code>",
+ and its system identifier is either missing or is an exact
+ match for the literal string
+ "<code>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code>".</li>
+ <li>The
+ <a href="#doctype-declaration">document type declaration’s</a>
+ name part is an
+ <a href="#ascii-case-insensitive">ASCII case-insensitive</a>
+ match for the string
+ "<code>HTML</code>",
+ its public identifier is an exact match for the literal string
+ "<code>-//W3C//DTD XHTML 1.1//EN</code>",
+ and its system identifier is either missing or is an exact
+ match for the literal string
+ "<code>http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</code>".</li>
+ </ul>
  <div class="example">
+ <p>The following are examples of
+ <a href="#deprecated-doctype">deprecated doctypes</a>.</p>
  <pre><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"></pre>
  <pre><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
@@ -383,30 +461,32 @@
  syntax:</p>
  <pre><input <em>disabled</em>></pre>
  </div>
- <p>If an attribute using the empty attribute syntax is
- followed by another attribute, then there must be at
- least one
- <a href="#space">space character</a>
- between the value and the other attribute.</p>
  </dd>
  <dt><dfn id="syntax-attr-unquoted" title="syntax-attr-unquoted"
  >Unquoted attribute-value syntax</dfn></dt>
  <dd>
- <p>An attribute and its value may be specified by providing
- the <a href="#attribute-name">attribute name</a>,
- followed by zero or more
- <a href="#space">space characters</a>,
- followed by a single
- "<code title="U+003D EQUALS SIGN">=</code>"
- character, followed by zero or more
- <a href="#space">space characters</a>,
- followed by the
- <a href="#syntax-attribute-value" >attribute value</a>.</p>
- <p>In addition to the general requirements given above for
- attribute values, an
+ <p>An
  <dfn id="attr-value-unquoted"
- title="attr-value-unquoted">unquoted attribute value</dfn>:</p>
+ title="attr-value-unquoted">unquoted attribute value</dfn>
+ is specified by providing the following parts in exactly
+ the following order:</p>
+ <ol>
+ <li>an
+ <a href="#attribute-name">attribute name</a></li>
+ <li>zero or more
+ <a href="#space">space characters</a></li>
+ <li>a single
+ "<code title="U+003D EQUALS SIGN">=</code>"
+ character</li>
+ <li>zero or more
+ <a href="#space">space characters</a></li>
+ <li>an
+ <a href="#syntax-attribute-value" >attribute value</a></li>
+ </ol>
+ <p>In addition to the general requirements given above for
+ attribute values, an unquoted attribute value has the
+ following restrictions:</p>
  <ul>
  <li>must not contain any literal
  <a href="#space">space characters</a></li>
@@ -425,11 +505,6 @@
  syntax:</p>
  <pre><input <em>value=yes</em>></pre>
  </div>
- <p>If the value an attribute using the unquoted
- attribute syntax is followed by another attribute,
- then there must be at least one
- <a href="#space">space character</a>
- between the value and the other attribute.</p>
  <p>If the value of an attribute using the unquoted
  attribute syntax is followed by a
  "<code title="U+002F SOLIDUS">/</code>"
@@ -441,30 +516,39 @@
  </dd>
  <dt><dfn id="syntax-attr-single-quoted">Single-quoted attribute-value syntax</dfn></dt>
  <dd>
- <p>An attribute and its value may be specified by
- providing the
- <a href="#attribute-name">attribute name</a>,
- followed by zero or more
- <a href="#space">space characters</a>,
- followed by a single
- "<code title="U+003D EQUALS SIGN">=</code>"
- character, followed by zero or more
- <a href="#space">space characters</a>,
- followed by a single
- "<code title="U+0027 APOSTROPHE">'</code>"
- character, followed by the
- <a href="#syntax-attribute-value">attribute value</a>,
- followed by a single
- "<code title="U+0027 APOSTROPHE">'</code>"
- character.</p>
- <p>In addition to the general requirements given above
- for attribute values, a
+ <p>A
  <dfn id="attr-value-single-quoted"
  title="attr-value-single-quoted"
  >single-quoted attribute value</dfn>
- must not contain any literal
- "<code title="U+0027 APOSTROPHE">'</code>"
- characters.</p>
+ is specified by providing the following parts in exactly
+ the following order:</p>
+ <ol>
+ <li>an
+ <a href="#attribute-name">attribute name</a></li>
+ <li>zero or more
+ <a href="#space">space characters</a></li>
+ <li>a
+ "<code title="U+003D EQUALS SIGN">=</code>"
+ character</li>
+ <li>zero or more
+ <a href="#space">space characters</a></li>
+ <li>a single
+ "<code title="U+0027 APOSTROPHE">'</code>"
+ character</li>
+ <li>an
+ <a href="#syntax-attribute-value">attribute value</a></li>
+ <li>a
+ "<code title="U+0027 APOSTROPHE">'</code>"
+ character.</li>
+ </ol>
+ <p>In addition to the general requirements given above
+ for attribute values, a single-quoted attribute value
+ has the following restriction:</p>
+ <ul>
+ <li>must not contain any literal
+ "<code title="U+0027 APOSTROPHE">'</code>"
+ characters</li>
+ </ul>
  <div class="example">
  <p>In the following example, the
  <code title="attr-input-type">type</code> attribute
@@ -472,49 +556,48 @@
  syntax:</p>
  <pre><input <em>type='checkbox'</em>></pre>
  </div>
- <p>If the value of an attribute using the single-quoted
- attribute syntax is followed by another attribute, then
- there must be at least one
- <a href="#space">space character</a>
- after the value and before the other attribute.</p>
  </dd>
  <dt><dfn id="syntax-attr-double-quoted">Double-quoted attribute-value syntax</dfn></dt>
  <dd>
- <p>An attribute and its value may be specified by
- providing the
- <a href="#attribute-name">attribute name</a>,
- followed by zero or more
- <a href="#space">space characters</a>,
- followed by a single
- "<code title="U+003D EQUALS SIGN character">=</code>"
- character, followed by zero or more
- <a href="#space">space characters</a>,
- followed by a single
- "<code title="U+0022 QUOTATION MARK">"</code>" character,
- followed by the
- <a href="#syntax-attribute-value">attribute value</a>,
- and followed by a
- "<code title="double U+0022 QUOTATION MARK">"</code>"
- character.</p>
- <p>In addition to the general requirements given above for
- attribute values, a
+ <p>A
  <dfn id="attr-value-double-quoted"
  title="attr-value-double-quoted"
  >double-quoted attribute value</dfn>
- must not contain any literal
- "<code title="U+0022 QUOTATION MARK">"</code>"
- characters.</p>
+ is specified by providing the following parts in exactly
+ the following order:</p>
+ <ol>
+ <li>an
+ <a href="#attribute-name">attribute name</a></li>
+ <li>zero or more
+ <a href="#space">space characters</a></li>
+ <li>a single
+ "<code title="U+003D EQUALS SIGN character">=</code>"
+ character</li>
+ <li>zero or more
+ <a href="#space">space characters</a></li>
+ <li>a single
+ "<code title="U+0022 QUOTATION MARK">"</code>"
+ character</li>
+ <li>an
+ <a href="#syntax-attribute-value">attribute value</a></li>
+ <li>a
+ "<code title="double U+0022 QUOTATION MARK">"</code>"
+ character</li>
+ </ol>
+ <p>In addition to the general requirements given above for
+ attribute values, a double-quoted attribute value has
+ the following restriction:</p>
+ <ul>
+ <li>must not contain any literal
+ "<code title="U+0022 QUOTATION MARK">"</code>"
+ characters</li>
+ </ul>
  <div class="example">
  <p>In the following example, the
  <code>title</code> attribute is given with the
  double-quoted attribute value syntax:</p>
  <pre><code title="U+003C LESS-THAN SIGN">&lt;</code></pre>
  </div>
- <p>If the value of attribute using the double-quoted
- attribute syntax is followed by another attribute, then
- there must be at least one
- <a href="#space">space character</a>
- after the value and before the other attribute.</p>
  </dd>
  </dl>
  </section>
@@ -705,14 +788,20 @@
  </ul>
  <dl>
  <dt><dfn id="named-charref">Named character reference</dfn></dt>
- <dd><p>A named character reference is an
- "<code title="U+0026 AMPERSAND">&</code>"
- character followed by one of the entity names defined in
- <cite>XML Entity definitions for Characters</cite>
- <a href="#refsEntities">[Entities]</a>,
- using the same case, followed by a
- "<code title="U+003B SEMICOLON">;</code>"
- character.</p>
+ <dd><p>Named character references consist of the following
+ parts in exactly the following order:</p>
+ <ol>
+ <li>An
+ "<code title="U+0026 AMPERSAND">&</code>"
+ character.</li>
+ <li>One of the entity names defined in
+ <cite>XML Entity definitions for Characters</cite>
+ <a href="#refsEntities">[Entities]</a>,
+ using the same case.</li>
+ <li>A
+ "<code title="U+003B SEMICOLON">;</code>"
+ character.</li>
+ </ol>
  <div class="example">
  <p>The following is an example of a named character reference
  for the character
@@ -722,20 +811,27 @@
  </div>
  </dd>
  <dt><dfn id="dec-charref">Decimal numeric character reference</dfn></dt>
- <dd><p>A decimal numerical character reference is an
- "<code title="U+0026 AMPERSAND">&</code>"
- character, followed by a
- "<code title="U+0023 NUMBER SIGN">#</code>"
- character, followed by one or more digits in the range
- <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>,
- representing a base-ten integer that itself is a Unicode
- code point that is not
- U+0000,
- U+000D,
- in the range U+0080–U+009F,
- or in the range 0xD8000–0xDFFF (surrogates).
- The digits must then be followed by a
- "<code title="U+003B SEMICOLON">;</code>" character.</p>
+ <dd><p>Decimal numerical character references consist of the
+ following parts, in exactly the following order.</p>
+ <ol>
+ <li>An
+ "<code title="U+0026 AMPERSAND">&</code>"
+ character.</li>
+ <li>A
+ "<code title="U+0023 NUMBER SIGN">#</code>"
+ character.</li>
+ <li>One or more digits in the range
+ <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>,
+ representing a base-ten integer that itself is a Unicode
+ code point that is not
+ U+0000,
+ U+000D,
+ in the range U+0080–U+009F,
+ or in the range 0xD8000–0xDFFF (surrogates).</li>
+ <li>A
+ "<code title="U+003B SEMICOLON">;</code>"
+ character.</li>
+ </ol>
  <div class="example">
  <p>The following is an example of a decimal numeric character
  reference for the character
@@ -745,30 +841,36 @@
  </div>
  </dd>
  <dt><dfn id="hex-charref">Hexadecimal numeric character reference</dfn></dt>
- <dd><p>A hexadecimal numeric character reference is an
- "<code title="U+0026 AMPERSAND">&</code>"
- character, followed by a
- "<code title="U+0023 NUMBER SIGN">#</code>"
- character, followed by either a
- "<code title="U+0078 LATIN SMALL LETTER X">x</code>"
- character
- or a
- "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>"
- character, followed by
- one or more digits in the range
- <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>,
- <code title="U+0061 LATIN SMALL LETTER A–U+0066 LATIN SMALL LETTER F">a–f</code>,
- and
- <code title="U+0041 LATIN CAPITAL LETTER A–U+0046 LATIN CAPITAL LETTER F">A–F</code>,
- representing a base-sixteen integer that itself is a Unicode
- code point that is not
- U+0000,
- U+000D,
- in the range U+0080–U+009F,
- or in the range 0xD800–0xDFFF (surrogates).
- The digits must then be followed by a
- "<code title="U+003B SEMICOLON">;</code>"
- character.</p>
+ <dd><p>Hexadecimal numeric character references consist of
+ the following parts, in exactly the following order.</p>
+ <ol>
+ <li>An
+ "<code title="U+0026 AMPERSAND">&</code>"
+ character.</li>
+ <li>A
+ "<code title="U+0023 NUMBER SIGN">#</code>"
+ character.</li>
+ <li>Either a
+ "<code title="U+0078 LATIN SMALL LETTER X">x</code>"
+ character
+ or a
+ "<code title="U+0058 LATIN CAPITAL LETTER X">X</code>"
+ character.</li>
+ <li>One or more digits in the range
+ <code title="U+0030 DIGIT ZERO–U+0039 DIGIT NINE">0–9</code>,
+ <code title="U+0061 LATIN SMALL LETTER A–U+0066 LATIN SMALL LETTER F">a–f</code>,
+ and
+ <code title="U+0041 LATIN CAPITAL LETTER A–U+0046 LATIN CAPITAL LETTER F">A–F</code>,
+ representing a base-sixteen integer that itself is a
+ Unicode code point that is not
+ U+0000,
+ U+000D,
+ in the range U+0080–U+009F,
+ or in the range 0xD800–0xDFFF (surrogates).</li>
+ <li>A
+ "<code title="U+003B SEMICOLON">;</code>"
+ character.</li>
+ </ol>
  <div class="example">
  <p>The following is an example of a hexadecimal numeric
  character reference for the character
@@ -859,7 +961,12 @@
  that is not itself in an
  <a href="#syntax-escape">escaping text span</a>,
  and ends at the next
- <a href="#syntax-escape-end">escaping text span end</a>.</p>
+ <a href="#syntax-escape-end">escaping text span end</a>.
+ Escaping text spans have the following restriction:</p>
+ <ul>
+ <li>must not contain any <a
+ href="#syntax-charref">character references</a></li>
+ </ul>
  <p>An
  <dfn id="syntax-escape-start">escaping text span start</dfn>
  is the
@@ -890,20 +997,16 @@
  <a href="#syntax-text">text</a>;
  it is not a
  <a href="#comment-end-delimiter">comment end delimiter</a>.</li>
+ <li>Any sequences of characters within an
+ <a href="#syntax-escape">escaping text span</a>
+ that look like
+ <a href="#syntax-charref">character references</a>
+ are
+ <a href="#syntax-text">text</a>,
+ not
+ <a href="#syntax-charref">character references</a>.</li>
  </ul>
  </div>
- <p>There cannot be any
- <a href="#syntax-charref">character references</a>
- inside an
- <a href="#syntax-escape">escaping text span</a>;
- any sequences of characters within an
- <a href="#syntax-escape">escaping text span</a>
- that may look like
- <a href="#syntax-charref">character references</a>
- are in fact
- <a href="#syntax-text">text</a>,
- not
- <a href="#syntax-charref">character references</a>.</p>
  <p>An
  <a href="#syntax-escape-start">escaping text span start</a>
  may share its
Received on Friday, 7 August 2009 14:50:40 UTC
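
Reader's note (not part of the commit): the ordered-parts definition of a doctype introduced in syntax.html r1.66 can be checked mechanically. The following is a minimal Python sketch under stated assumptions, not the specification's own algorithm. The names SPACE, DOCTYPE_RE and is_conformant_doctype are hypothetical, and the set of "space characters" (tab, LF, FF, CR, space) is an assumption, since the diff only links to #space without restating it. Deprecated doctypes are deliberately not handled.

```python
import re

# Assumed set of "space characters": tab, LF, FF, CR, space.
# The diff links to #space but does not list them, so this is a guess.
SPACE = r"[ \t\n\f\r]"

# One regex fragment per numbered part of the r1.66 definition.
_PARTS = [
    r"<",                      # 1. "<" character
    r"!",                      # 2. "!" character
    r"(?i:DOCTYPE)",           # 3. ASCII case-insensitive "DOCTYPE"
    SPACE + r"+",              # 4. one or more space characters
    r"(?i:HTML)",              # 5. ASCII case-insensitive "HTML"
    # 6. optional doctype legacy string:
    #    spaces, SYSTEM (case-insensitive), spaces, quote mark,
    #    the literal about:legacy-compat (case-sensitive), same quote mark
    r"(?:" + SPACE + r"+(?i:SYSTEM)" + SPACE + r"+"
    + r"""(?P<quote>["'])about:legacy-compat(?P=quote))?""",
    SPACE + r"*",              # 7. optional space characters
    r">",                      # 8. ">" character
]
DOCTYPE_RE = re.compile("".join(_PARTS))


def is_conformant_doctype(text: str) -> bool:
    """Return True if text is exactly one non-deprecated doctype."""
    return DOCTYPE_RE.fullmatch(text) is not None
```

The backreference (?P=quote) is what enforces the "matching quote mark" rule, and leaving about:legacy-compat outside any (?i:...) group keeps that part case-sensitive, as both the old regular expression and the new prose require.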