- From: poot <cvsmail@w3.org>
- Date: Thu, 1 Apr 2010 14:34:34 +0900 (JST)
- To: public-html-diffs@w3.org
hixie: Make &#13; map to U+000D and not U+000A. This has ramifications throughout the parser. (whatwg r4933)

http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.3953&r2=1.3954&f=h
http://html5.org/tools/web-apps-tracker?from=4932&to=4933

===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.3953
retrieving revision 1.3954
diff -u -d -r1.3953 -r1.3954
--- Overview.html 1 Apr 2010 01:00:43 -0000 1.3953
+++ Overview.html 1 Apr 2010 01:21:40 -0000 1.3954
@@ -51637,7 +51637,10 @@
   to be put, as described in the other sections.<h5 id="newlines"><span class="secno">8.1.3.1 </span>Newlines</h5><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p><dfn id="syntax-newlines" title="syntax-newlines">Newlines</dfn> in HTML may be
   represented either as U+000D CARRIAGE RETURN (CR) characters, U+000A
   LINE FEED (LF) characters, or pairs of U+000D CARRIAGE RETURN (CR),
-  U+000A LINE FEED (LF) characters in that order.<h4 id="character-references"><span class="secno">8.1.4 </span>Character references</h4><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p>In certain cases described in other sections, <a href="#syntax-text" title="syntax-text">text</a> may be mixed with <dfn id="syntax-charref" title="syntax-charref">character references</dfn>. These can be used
+  U+000A LINE FEED (LF) characters in that order.<p>Where <a href="#syntax-charref" title="syntax-charref">character references</a>
+  are allowed, a character reference of a U+000A LINE FEED (LF)
+  character (but not a U+000D CARRIAGE RETURN (CR) character) also
+  represents a <a href="#syntax-newlines" title="syntax-newlines">newline</a>.<h4 id="character-references"><span class="secno">8.1.4 </span>Character references</h4><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p>In certain cases described in other sections, <a href="#syntax-text" title="syntax-text">text</a> may be mixed with <dfn id="syntax-charref" title="syntax-charref">character references</dfn>. These can be used
   to escape characters that couldn't otherwise legally be included in
   <a href="#syntax-text" title="syntax-text">text</a>.<p>Character references must start with a U+0026 AMPERSAND
   character (&). Following this, there are three possible kinds of character
@@ -51674,9 +51677,9 @@
   (;).</dd>
   </dl><p>The numeric character reference forms described above are allowed
-  to reference any Unicode code point other than U+0000, permanently
-  undefined Unicode characters (noncharacters), and control characters
-  other than <a href="#space-character" title="space character">space
+  to reference any Unicode code point other than U+0000, U+000D,
+  permanently undefined Unicode characters (noncharacters), and
+  control characters other than <a href="#space-character" title="space character">space
   characters</a>.<p>An <dfn id="syntax-ambiguous-ampersand" title="syntax-ambiguous-ampersand">ambiguous
   ampersand</dfn> is a U+0026 AMPERSAND
   character (&) that is followed by some <a href="#syntax-text" title="syntax-text">text</a> other than a
@@ -54978,7 +54981,7 @@
   <table><thead><tr><th>Number <th colspan="2">Unicode character
   <tbody><tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
-  <tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
+  <tr><td>0x0D <td>U+000D <td>CARRIAGE RETURN (CR)
   <tr><td>0x80 <td>U+20AC <td>EURO SIGN (€)
   <tr><td>0x81 <td>U+0081 <td><control>
   <tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK (‚)
@@ -55400,7 +55403,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p>Ignore the token.</p>
   </dd>
@@ -55606,7 +55609,7 @@
   <dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p>Ignore the token.</p>
   </dd>
@@ -55678,7 +55681,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p>Ignore the token.</p> <!-- :-( -->
   </dd>
@@ -55744,7 +55747,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="#current-node">current
   node</a>.</p>
@@ -55929,7 +55932,7 @@
   <dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dt>A comment token</dt>
   <dt>A start tag whose tag name is one of: "link", "meta", "noframes", "style"</dt>
   <dd>
@@ -55966,7 +55969,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="#current-node">current
   node</a>.</p>
@@ -56064,8 +56067,8 @@
   character</a> into the <a href="#current-node">current node</a>.</p>
   <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-  LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-  (CR),--> or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
+  LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+  (CR), or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
   flag</a> to "not ok".</p>
   </dd>
@@ -56261,6 +56264,9 @@
   one. (Newlines at the start of <code><a href="#the-pre-element">pre</a></code> blocks are
   ignored as an authoring convenience.)</p>
+  <!-- <pre>[CR]X will eat the [CR], <pre>&#10;X will eat the
+  &#10;, but <pre>&#13;X will not eat the &#13;. -->
+
   <p>Set the <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
   </dd>
@@ -56997,6 +57003,8 @@
   token, then ignore that token and move on to the next
   one. (Newlines at the start of <code><a href="#the-textarea-element">textarea</a></code>
   elements are ignored as an authoring convenience.)</li>
+
+  <!-- see comment in <pre> start tag bit -->
   <li><p>Switch the tokenizer to the <a href="#rcdata-state">RCDATA
   state</a>.</li>
@@ -57624,7 +57632,7 @@
   <p>If any of the tokens in the <var><a href="#pending-table-character-tokens">pending table character
   tokens</a></var> list are character tokens that are not one of U+0009
   CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED
-  (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE, then
+  (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE, then
   reprocess those character tokens using the rules given in the
   "anything else" entry in the <a href="#parsing-main-intable" title="insertion mode: in table">in
   table</a>" insertion mode.</p>
@@ -57703,7 +57711,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="#current-node">current
   node</a>.</p>
@@ -58249,8 +58257,8 @@
   character</a> into the <a href="#current-node">current node</a>.</p>
   <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-  LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-  (CR),--> or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
+  LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+  (CR), or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
   flag</a> to "not ok".</p>
   </dd>
@@ -58470,7 +58478,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p>Process the token <a href="#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="#insertion-mode">insertion
   mode</a>.</p>
@@ -58528,7 +58536,7 @@
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="#current-node">current
   node</a>.</p>
@@ -58622,7 +58630,7 @@
   <!-- due to rules in the "in frameset" mode, this can't be entered in the fragment case -->
   <dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dd>
   <p><a href="#insert-a-character" title="insert a character">Insert the character</a> into the <a href="#current-node">current
   node</a>.</p>
@@ -58683,7 +58691,7 @@
   <dt>A DOCTYPE token</dt>
   <dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dt>A start tag whose tag name is "html"</dt>
   <dd>
   <p>Process the token <a href="#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="#insertion-mode">insertion
@@ -58717,7 +58725,7 @@
   <dt>A DOCTYPE token</dt>
   <dt>A character token that is one of U+0009 CHARACTER
   TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-  <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+  U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
   <dt>A start tag whose tag name is "html"</dt>
   <dd>
   <p>Process the token <a href="#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="#insertion-mode">insertion
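
For readers following the change, here is a minimal Python sketch of its two visible effects: the numeric character reference override for 0x0D now yields U+000D, and the tree construction whitespace set now includes CR. The names `CHARREF_OVERRIDES`, `resolve_numeric_charref`, `TREE_BUILDER_WHITESPACE` and `is_tree_builder_whitespace` are hypothetical, only the table rows visible in the diff are included, and range checks and parse errors are left out, so this is an illustration rather than the spec's algorithm.

```python
# Excerpt of the override table as it reads after this revision.
# Only rows visible in the diff are listed; the real table is longer.
CHARREF_OVERRIDES = {
    0x00: 0xFFFD,  # REPLACEMENT CHARACTER
    0x0D: 0x000D,  # CARRIAGE RETURN (CR); mapped to U+000A before r4933
    0x80: 0x20AC,  # EURO SIGN
    0x81: 0x0081,  # <control>
    0x82: 0x201A,  # SINGLE LOW-9 QUOTATION MARK
}


def resolve_numeric_charref(number: int) -> str:
    """Return the character for a numeric reference such as &#13;.

    Simplified: surrogates, out-of-range values and parse errors are ignored.
    """
    return chr(CHARREF_OVERRIDES.get(number, number))


# Whitespace recognised by the tree construction rules touched in the diff.
# U+000D is now part of the set instead of being commented out.
TREE_BUILDER_WHITESPACE = frozenset("\t\n\f\r ")


def is_tree_builder_whitespace(ch: str) -> bool:
    """True for U+0009, U+000A, U+000C, U+000D and U+0020."""
    return ch in TREE_BUILDER_WHITESPACE
```

Because `&#13;` now survives as a CR character token, the insertion modes have to list U+000D explicitly, which is why the same one-line change repeats across so many hunks.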
Received on Thursday, 1 April 2010 05:35:04 UTC