- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Wed, 21 Oct 2009 11:59:31 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv2938
Modified Files: Overview.html
Log Message: A general editorial cleanup, primarily around how Unicode characters are presented. (whatwg r4261)
Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.3403
retrieving revision 1.3404
diff -u -d -r1.3403 -r1.3404
--- Overview.html 21 Oct 2009 11:46:12 -0000 1.3403
+++ Overview.html 21 Oct 2009 11:59:28 -0000 1.3404
@@ -4244,9 +4244,9 @@
  <!-- http://www.hixie.ch/tests/adhoc/html/navigation/javascript-url/ -->
- <!-- XXX this should be tested in the case of a browsing context
- that was navigated to about:blank after having been elsewhere,
- as opposed to the about:blank used at the time of the browsing
+ <!-- this should be tested in the case of a browsing context that
+ was navigated to about:blank after having been elsewhere, as
+ opposed to the about:blank used at the time of the browsing
  context's creation. -->
  <p>If <var title="">fallback base url</var> is
@@ -45377,9 +45377,9 @@
  "NETWORK" followed by a U+003A COLON character (:)), then set
  <var title="">mode</var> to "online whitelist" and jump back to the
  step labeled "start of line".</li>
- <li><p>If <var title="">line</var> ends with a U+003A COLON (:)
- character, then set <var title="">mode</var> to "unknown" and jump
- back to the step labeled "start of line".</li>
+ <li><p>If <var title="">line</var> ends with a U+003A COLON
+ character (:), then set <var title="">mode</var> to "unknown" and
+ jump back to the step labeled "start of line".</li>
  <li><p>This is either a data line or it is syntactically
  incorrect.</li>
@@ -53511,14 +53511,14 @@
  incompatible with some specifications.
  Including the DOCTYPE in a document ensures that the browser makes a
  best-effort attempt at following the relevant specifications.<p>A DOCTYPE must consist of the following characters, in this
- order:<ol class="brief"><li>A U+003C LESS-THAN SIGN (<code><</code>) character.</li>
- <li>A U+0021 EXCLAMATION MARK (<code>!</code>) character.</li>
+ order:<ol class="brief"><li>A U+003C LESS-THAN SIGN character (<).</li>
+ <li>A U+0021 EXCLAMATION MARK character (!).</li>
  <li>A string that is an <a href="#ascii-case-insensitive">ASCII case-insensitive</a> match for the string "<code title="">DOCTYPE</code>".</li>
  <li>One or more <a href="#space-character" title="space character">space characters</a>.</li>
  <li>A string that is an <a href="#ascii-case-insensitive">ASCII case-insensitive</a> match for the string "<code title="">HTML</code>".</li>
  <li>Optionally, a <a href="#doctype-legacy-string">DOCTYPE legacy string</a> (defined below).</li>
  <li>Zero or more <a href="#space-character" title="space character">space characters</a>.</li>
- <li>A U+003E GREATER-THAN SIGN (<code>></code>) character.</li>
+ <li>A U+003E GREATER-THAN SIGN character (>).</li>
  </ol><p class="note">In other words, <code><!DOCTYPE HTML></code>, case-insensitively.<p>For the purposes of HTML generators that cannot output HTML
  markup with the short DOCTYPE "<code title=""><!DOCTYPE
@@ -53597,9 +53597,7 @@
  end tag, no content can be put between the start tag and the end tag).
  <a href="#foreign-elements">Foreign elements</a> whose start tag is <em>not</em> marked as self-closing
  can have <a href="#syntax-text" title="syntax-text">text</a>, <a href="#syntax-charref" title="syntax-charref">character references</a>, <a href="#syntax-cdata" title="syntax-cdata">CDATA sections</a>, other <a href="#syntax-elements" title="syntax-elements">elements</a>, and <a href="#syntax-comments" title="syntax-comments">comments</a>, but the text must not
- contain the character U+003C LESS-THAN SIGN (<code><</code>) or
- an <a href="#syntax-ambiguous-ampersand" title="syntax-ambiguous-ampersand">ambiguous
- ampersand</a>.<div class="note">
+ contain the character U+003C LESS-THAN SIGN (<) or an <a href="#syntax-ambiguous-ampersand" title="syntax-ambiguous-ampersand">ambiguous ampersand</a>.<div class="note">
  <p>The HTML syntax does not support namespace declarations, even
  in <a href="#foreign-elements">foreign elements</a>.</p>
@@ -53622,9 +53620,8 @@
  specification does not define any elements called "<code title="">cdr:license</code>" in the SVG namespace.</p>
  </div><p><a href="#normal-elements">Normal elements</a> can have <a href="#syntax-text" title="syntax-text">text</a>, <a href="#syntax-charref" title="syntax-charref">character references</a>, other <a href="#syntax-elements" title="syntax-elements">elements</a>, and <a href="#syntax-comments" title="syntax-comments">comments</a>, but the text must not
- contain the character U+003C LESS-THAN SIGN (<code><</code>) or
- an <a href="#syntax-ambiguous-ampersand" title="syntax-ambiguous-ampersand">ambiguous
- ampersand</a>. Some <a href="#normal-elements">normal elements</a> also have <a href="#element-restrictions">yet more restrictions</a> on what
+ contain the character U+003C LESS-THAN SIGN (<) or an <a href="#syntax-ambiguous-ampersand" title="syntax-ambiguous-ampersand">ambiguous ampersand</a>. Some
+ <a href="#normal-elements">normal elements</a> also have <a href="#element-restrictions">yet more restrictions</a> on what
  content they are allowed to hold, beyond the restrictions imposed by
  the content model and those described in this paragraph. Those
  restrictions are described below.<p>Tags contain a <dfn id="syntax-tag-name" title="syntax-tag-name">tag name</dfn>,
@@ -53637,7 +53634,7 @@
  letters that, when converted to all-lowercase, matches the element's
  tag name; tag names are case-insensitive.<h5 id="start-tags"><span class="secno">9.1.2.1 </span>Start tags</h5><p><dfn id="syntax-start-tag" title="syntax-start-tag">Start tags</dfn> must have the
  following format:<ol><li>The first character of a start tag must be a U+003C LESS-THAN
- SIGN (<code><</code>).</li>
+ SIGN character (<).</li>
  <li>The next few characters of a start tag must be the element's
  <a href="#syntax-tag-name" title="syntax-tag-name">tag name</a>.</li>
@@ -53655,20 +53652,20 @@
  <li>Then, if the element is one of the <a href="#void-elements">void elements</a>,
  or if the element is a <a href="#foreign-elements" title="foreign elements">foreign
- element</a>, then there may be a single U+002F SOLIDUS
- (<code>/</code>) character. This character has no effect on
- <a href="#void-elements">void elements</a>, but on <a href="#foreign-elements">foreign elements</a> it
- marks the start tag as self-closing.</li>
+ element</a>, then there may be a single U+002F SOLIDUS character
+ (/). This character has no effect on <a href="#void-elements">void elements</a>,
+ but on <a href="#foreign-elements">foreign elements</a> it marks the start tag as
+ self-closing.</li>
  <li>Finally, start tags must be closed by a U+003E GREATER-THAN
- SIGN (<code>></code>) character.</li>
+ SIGN character (>).</li>
  </ol><h5 id="end-tags"><span class="secno">9.1.2.2 </span>End tags</h5><p><dfn id="syntax-end-tag" title="syntax-end-tag">End tags</dfn> must have the
  following format:<ol><li>The first character of an end tag must be a U+003C LESS-THAN
- SIGN (<code><</code>).</li>
+ SIGN character (<).</li>
  <li>The second character of an end tag must be a U+002F SOLIDUS
- (<code>/</code>).</li>
+ character (/).</li>
  <li>The next few characters of an end tag must be the element's
  <a href="#syntax-tag-name" title="syntax-tag-name">tag name</a>.</li>
@@ -53676,8 +53673,8 @@
  <li>After the tag name, there may be one or more <a href="#space-character" title="space character">space characters</a>.</li>
- <li>Finally, end tags must be closed by a U+003E GREATER-THAN
- SIGN (<code>></code>) character.</li>
+ <li>Finally, end tags must be closed by a U+003E GREATER-THAN SIGN
+ character (>).</li>
  </ol><h5 id="attributes"><span class="secno">9.1.2.3 </span>Attributes</h5><p><dfn id="syntax-attributes" title="syntax-attributes">Attributes</dfn> for an element
  are expressed inside the element's start tag.<p>Attributes have a name and a value.
  <dfn id="syntax-attribute-name" title="syntax-attribute-name">Attribute names</dfn> must consist of
@@ -53724,12 +53721,11 @@
  character">space characters</a>, followed by the <a href="#syntax-attribute-value" title="syntax-attribute-value">attribute value</a>, which, in
  addition to the requirements given above for attribute values,
  must not contain any literal <a href="#space-character" title="space character">space
- characters</a>, any U+0022 QUOTATION MARK (<code>"</code>)
- characters, U+0027 APOSTROPHE (<code>'</code>) characters,
- U+003D EQUALS SIGN (<code>=</code>) characters, U+003C LESS-THAN
- SIGN (<code><</code>) characters, U+003E GREATER-THAN SIGN
- (<code>></code>) characters, or U+0060 GRAVE ACCENT (`)
- characters, and must not be the empty string.</p>
+ characters</a>, any U+0022 QUOTATION MARK characters ("),
+ U+0027 APOSTROPHE characters ('), U+003D EQUALS SIGN
+ characters (=), U+003C LESS-THAN SIGN characters (<), U+003E
+ GREATER-THAN SIGN characters (>), or U+0060 GRAVE ACCENT
+ characters (`), and must not be the empty string.</p>
  <!-- The ` character is in this list on a temporary basis,
  waiting for IE to fix it's parsing bug whereby it treats ` as an
@@ -53786,11 +53782,11 @@
  characters</a>, followed by a single U+003D EQUALS SIGN character,
  followed by zero or more <a href="#space-character" title="space character">space
  characters</a>, followed by a single U+0027
- APOSTROPHE (<code>'</code>) character, followed by the <a href="#syntax-attribute-value" title="syntax-attribute-value">attribute value</a>, which, in
+ APOSTROPHE character ('), followed by the <a href="#syntax-attribute-value" title="syntax-attribute-value">attribute value</a>, which, in
  addition to the requirements given above for attribute values,
- must not contain any literal U+0027 APOSTROPHE (<code>'</code>)
- characters, and finally followed by a second single U+0027
- APOSTROPHE (<code>'</code>) character.</p>
+ must not contain any literal U+0027 APOSTROPHE characters ('), and
+ finally followed by a second single U+0027 APOSTROPHE character
+ (').</p>
  <div class="example">
@@ -53816,11 +53812,11 @@
  characters</a>, followed by a single U+003D EQUALS SIGN character,
  followed by zero or more <a href="#space-character" title="space character">space
  characters</a>, followed by a single U+0022
- QUOTATION MARK (<code>"</code>) character, followed by the <a href="#syntax-attribute-value" title="syntax-attribute-value">attribute value</a>, which, in
+ QUOTATION MARK character ("), followed by the <a href="#syntax-attribute-value" title="syntax-attribute-value">attribute value</a>, which, in
  addition to the requirements given above for attribute values,
- must not contain any literal U+0022 QUOTATION MARK
- (<code>"</code>) characters, and finally followed by a second
- single U+0022 QUOTATION MARK (<code>"</code>) character.</p>
+ must not contain any literal U+0022 QUOTATION MARK characters ("),
+ and finally followed by a second single U+0022 QUOTATION MARK
+ character (").</p>
  <div class="example">
@@ -53993,9 +53989,9 @@
  LINE FEED (LF) characters, or pairs of U+000D CARRIAGE RETURN (CR),
  U+000A LINE FEED (LF) characters in that order.<h4 id="character-references"><span class="secno">9.1.4 </span>Character references</h4><p>In certain cases described in other sections, <a href="#syntax-text" title="syntax-text">text</a> may be mixed with <dfn id="syntax-charref" title="syntax-charref">character references</dfn>. These can be used
  to escape characters that couldn't otherwise legally be included in
- <a href="#syntax-text" title="syntax-text">text</a>.<p>Character references must start with a U+0026 AMPERSAND
- (<code>&</code>). Following this, there are three possible kinds
- of character references:<dl><dt>Named character references</dt>
+ <a href="#syntax-text" title="syntax-text">text</a>.<p>Character references must start with a U+0026 AMPERSAND character
+ (&). Following this, there are three possible kinds of character
+ references:<dl><dt>Named character references</dt>
  <dd>The ampersand must be followed by one of the names given in the
  <a href="#named-character-references">named character references</a> section, using the same
@@ -54006,22 +54002,22 @@
  <dt>Decimal numeric character reference</dt>
  <dd>The ampersand must be followed by a U+0023 NUMBER SIGN
- (<code>#</code>) character, followed by one or more digits in the
- range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), representing
- a base-ten integer that corresponds to a Unicode code point that is
- allowed according to the definition below. The digits must then be
- followed by a U+003B SEMICOLON character (;).</dd>
+ character (#), followed by one or more digits in the range U+0030
+ DIGIT ZERO (0) to U+0039 DIGIT NINE (9), representing a base-ten
+ integer that corresponds to a Unicode code point that is allowed
+ according to the definition below. The digits must then be followed
+ by a U+003B SEMICOLON character (;).</dd>
  <dt>Hexadecimal numeric character reference</dt>
  <dd>The ampersand must be followed by a U+0023 NUMBER SIGN
- (<code>#</code>) character, which must be followed by either a
- U+0078 LATIN SMALL LETTER X character (x) or a U+0058 LATIN CAPITAL
- LETTER X character (X), which must then be followed by one or more
- digits in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9),
- U+0061 LATIN SMALL LETTER A to U+0066 LATIN SMALL LETTER F, and
- U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F,
+ character (#), which must be followed by either a U+0078 LATIN
+ SMALL LETTER X character (x) or a U+0058 LATIN CAPITAL LETTER X
+ character (X), which must then be followed by one or more digits in
+ the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0061
+ LATIN SMALL LETTER A to U+0066 LATIN SMALL LETTER F, and U+0041
+ LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F,
  representing a base-sixteen integer that
  corresponds to a Unicode code
  point that is allowed according to the definition below. The digits
  must then be followed by a U+003B SEMICOLON character
@@ -54035,8 +54031,7 @@
  ampersand</dfn> is a U+0026 AMPERSAND character (&) that is
  followed by some <a href="#syntax-text" title="syntax-text">text</a> other than a
  <a href="#space-character">space character</a>, a U+003C LESS-THAN SIGN character
- (<), or another U+0026 AMPERSAND character
- (<code>&</code>).<h4 id="cdata-sections"><span class="secno">9.1.5 </span>CDATA sections</h4><p><dfn id="syntax-cdata" title="syntax-cdata">CDATA sections</dfn> must start with
+ (<), or another U+0026 AMPERSAND character (&).<h4 id="cdata-sections"><span class="secno">9.1.5 </span>CDATA sections</h4><p><dfn id="syntax-cdata" title="syntax-cdata">CDATA sections</dfn> must start with
  the character sequence U+003C LESS-THAN SIGN, U+0021 EXCLAMATION
  MARK, U+005B LEFT SQUARE BRACKET, U+0043 LATIN CAPITAL LETTER C,
  U+0044 LATIN CAPITAL LETTER D, U+0041 LATIN CAPITAL LETTER A, U+0054
@@ -54053,11 +54048,11 @@
  MARK, U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS (<code title=""><!--</code>). Following this sequence, the comment may
  have <a href="#syntax-text" title="syntax-text">text</a>, with the additional
  restriction that the text must not start with a single U+003E
- GREATER-THAN SIGN (>) character, nor start with a U+002D
- HYPHEN-MINUS character (-) followed by a
- U+003E GREATER-THAN SIGN (>) character, nor contain two
- consecutive U+002D HYPHEN-MINUS (<code title="">-</code>)
- characters, nor end with a U+002D HYPHEN-MINUS (<code title="">-</code>) character. Finally, the comment must be ended by
+ GREATER-THAN SIGN character (>), nor start with a U+002D
+ HYPHEN-MINUS character (-) followed by a U+003E GREATER-THAN SIGN
+ (>) character, nor contain two consecutive U+002D HYPHEN-MINUS
+ characters (<code title="">--</code>), nor end with a U+002D
+ HYPHEN-MINUS character (-). Finally, the comment must be ended by
  the three character sequence U+002D HYPHEN-MINUS, U+002D
  HYPHEN-MINUS, U+003E GREATER-THAN SIGN (<code title="">--></code>).<div class="impl">
@@ -56536,8 +56531,8 @@
  <h5 id="markup-declaration-open-state"><span class="secno">9.2.4.44 </span><dfn>Markup declaration open state</dfn></h5>
- <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
- characters, consume those two characters, create a comment token
+ <p>If the next two characters are both U+002D HYPHEN-MINUS
+ characters (-), consume those two characters, create a comment token
  whose data is the empty string, and switch to the <a href="#comment-start-state">comment
  start state</a>.</p>
@@ -56646,8 +56641,8 @@
  <dt>U+000C FORM FEED (FF)</dt>
  <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
  <dt>U+0020 SPACE</dt>
- <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
- characters and the <a href="#current-input-character">current input character</a> to the
+ <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS
+ characters (-) and the <a href="#current-input-character">current input character</a> to the
  comment token's data. Switch to the <a href="#comment-end-space-state">comment end space
  state</a>.</dd>
@@ -56669,8 +56664,8 @@
  be treated as live code -->
  <dt>Anything else</dt>
- <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
- characters and the <a href="#current-input-character">current input character</a> to the
+ <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS
+ characters (-) and the <a href="#current-input-character">current input character</a> to the
  comment token's data.
  Switch to the <a href="#comment-state">comment
  state</a>.</dd>
@@ -56679,7 +56674,7 @@
  <p>Consume the <a href="#next-input-character">next input character</a>:</p>
  <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd>Append two U+002D HYPHEN-MINUS (-) characters and a U+0021
+ <dd>Append two U+002D HYPHEN-MINUS characters (-) and a U+0021
  EXCLAMATION MARK character (!) to the comment token's data. Switch
  to the <a href="#comment-end-dash-state">comment end dash state</a>.</dd>
@@ -56693,7 +56688,7 @@
  comment in comment end state -->
  <dt>Anything else</dt>
- <dd>Append two U+002D HYPHEN-MINUS (-) characters, a U+0021
+ <dd>Append two U+002D HYPHEN-MINUS characters (-), a U+0021
  EXCLAMATION MARK character (!), and the <a href="#current-input-character">current input
  character</a> to the comment token's data. Switch to the
  <a href="#comment-state">comment state</a>.</dd>
@@ -57344,17 +57339,18 @@
  error</a>. No characters are consumed, and nothing is
  returned.</p>
- <p>If the last character matched is not a U+003B SEMICOLON (<code title="">;</code>), there is a <a href="#parse-error">parse error</a>.</p>
+ <p>If the last character matched is not a U+003B SEMICOLON
+ character (;), there is a <a href="#parse-error">parse error</a>.</p>
  <p>If the character reference is being consumed <a href="#character-reference-in-attribute-value-state" title="character reference in attribute value state">as part of an
  attribute</a>, and the last character matched is not a U+003B
- SEMICOLON character (<code title="">;</code>), and the next
- character is in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT
- NINE (9), U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL
- LETTER Z, or U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL
- LETTER Z, then, for historical reasons, all the characters that
- were matched after the U+0026 AMPERSAND character (&) must be
- unconsumed, and nothing is returned.</p>
+ SEMICOLON character (;), and the next character is in the range
+ U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0041 LATIN
+ CAPITAL LETTER A to U+005A LATIN CAPITAL LETTER Z, or U+0061 LATIN
+ SMALL LETTER A to U+007A LATIN SMALL LETTER Z, then, for
+ historical reasons, all the characters that were matched after the
+ U+0026 AMPERSAND character (&) must be unconsumed, and nothing
+ is returned.</p>
  <p>Otherwise, return a character token for the character
  corresponding to the character reference name (as given by the
@@ -61439,19 +61435,18 @@
  <dd>
- <p>Append a U+003C LESS-THAN SIGN (<code title=""><</code>)
- character, followed by the element's tag name. (For nodes
- created by the <a href="#html-parser">HTML parser</a> or <code title="">Document.createElement()</code>, the tag name will be
+ <p>Append a U+003C LESS-THAN SIGN character character (<),
+ followed by the element's tag name. (For nodes created by the
+ <a href="#html-parser">HTML parser</a> or <code title="">Document.createElement()</code>, the tag name will be
  lowercase.)</p>
  <p>For each attribute that the element has, append a U+0020
  SPACE character, the attribute's name (which, for attributes set
  by the <a href="#html-parser">HTML parser</a> or by <code title="">Element.setAttributeNode()</code> or <code title="">Element.setAttribute()</code>, will be lowercase), a
- U+003D EQUALS SIGN (<code title="">=</code>) character, a
- U+0022 QUOTATION MARK (<code title="">"</code>)
- character, the attribute's value, <a href="#escapingString" title="escaping a
- string">escaped as described below</a> in <i>attribute
- mode</i>, and a second U+0022 QUOTATION MARK (<code title="">"</code>) character.</p>
+ U+003D EQUALS SIGN character (=), a U+0022 QUOTATION MARK
+ character ("), the attribute's value, <a href="#escapingString" title="escaping a string">escaped as described below</a> in
+ <i>attribute mode</i>, and a second U+0022 QUOTATION MARK
+ character (").</p>
  <p>While the exact order of attributes is UA-defined, and may
  depend on factors such as the order that the
  attributes were
@@ -61459,8 +61454,7 @@
  such that consecutive invocations of this algorithm serialize an
  element's attributes in the same order.</p>
- <p>Append a U+003E GREATER-THAN SIGN (<code title="">></code>)
- character.</p>
+ <p>Append a U+003E GREATER-THAN SIGN character (>).</p>
  <p>If <var title="">current node</var> is an
  <code><a href="#the-area-element">area</a></code>, <code><a href="#the-base-element">base</a></code>, <code><a href="#basefont">basefont</a></code>,
@@ -61481,8 +61475,10 @@
  <p>Append the value of running the <a href="#html-fragment-serialization-algorithm">HTML fragment
  serialization algorithm</a> on the <var title="">current
  node</var> element (thus recursing into this algorithm for
- that element), followed by a U+003C LESS-THAN SIGN (<code title=""><</code>) character, a U+002F SOLIDUS (<code title="">/</code>) character, the element's tag name again,
- and finally a U+003E GREATER-THAN SIGN (<code title="">></code>) character.</p>
+ that element), followed by a U+003C LESS-THAN SIGN character
+ (<), a U+002F SOLIDUS character (/), the element's tag name
+ again, and finally a U+003E GREATER-THAN SIGN character
+ (>).</p>
  </dd>
@@ -64068,7 +64064,7 @@
  string "<code title="">]]></code>".</li> (these can be split)-->
  <li>A <code>Comment</code> node whose data contains two adjacent
- U+002D HYPHEN-MINUS (-) characters or ends with such a
+ U+002D HYPHEN-MINUS characters (-) or ends with such a
  character.</li>
  <li>A <code>ProcessingInstruction</code> node whose target name is
Received on Wednesday, 21 October 2009 11:59:39 UTC