- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Thu, 01 Apr 2010 01:21:43 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv823
Modified Files:
Overview.html
Log Message:
Make &#x0D; map to U+000D and not U+000A. This has ramifications throughout the parser. (whatwg r4933)
Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.3953
retrieving revision 1.3954
diff -u -d -r1.3953 -r1.3954
--- Overview.html 1 Apr 2010 01:00:43 -0000 1.3953
+++ Overview.html 1 Apr 2010 01:21:40 -0000 1.3954
@@ -51637,7 +51637,10 @@
to be put, as described in the other sections.<h5 id="newlines"><span class="secno">8.1.3.1 </span>Newlines</h5><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p><dfn id="syntax-newlines" title="syntax-newlines">Newlines</dfn> in HTML may be
represented either as U+000D CARRIAGE RETURN (CR) characters, U+000A
LINE FEED (LF) characters, or pairs of U+000D CARRIAGE RETURN (CR),
- U+000A LINE FEED (LF) characters in that order.<h4 id="character-references"><span class="secno">8.1.4 </span>Character references</h4><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p>In certain cases described in other sections, <a href="#syntax-text" title="syntax-text">text</a> may be mixed with <dfn id="syntax-charref" title="syntax-charref">character references</dfn>. These can be used
+ U+000A LINE FEED (LF) characters in that order.<p>Where <a href="#syntax-charref" title="syntax-charref">character references</a>
+ are allowed, a character reference of a U+000A LINE FEED (LF)
+ character (but not a U+000D CARRIAGE RETURN (CR) character) also
+ represents a <a href="#syntax-newlines" title="syntax-newlines">newline</a>.<h4 id="character-references"><span class="secno">8.1.4 </span>Character references</h4><p class="XXX annotation"><b>Status: </b><i>Last call for comments</i><p>In certain cases described in other sections, <a href="#syntax-text" title="syntax-text">text</a> may be mixed with <dfn id="syntax-charref" title="syntax-charref">character references</dfn>. These can be used
to escape characters that couldn't otherwise legally be included in
<a href="#syntax-text" title="syntax-text">text</a>.<p>Character references must start with a U+0026 AMPERSAND character
(&). Following this, there are three possible kinds of character
@@ -51674,9 +51677,9 @@
(;).</dd>
</dl><p>The numeric character reference forms described above are allowed
- to reference any Unicode code point other than U+0000, permanently
- undefined Unicode characters (noncharacters), and control characters
- other than <a href="#space-character" title="space character">space
+ to reference any Unicode code point other than U+0000, U+000D,
+ permanently undefined Unicode characters (noncharacters), and
+ control characters other than <a href="#space-character" title="space character">space
characters</a>.<p>An <dfn id="syntax-ambiguous-ampersand" title="syntax-ambiguous-ampersand">ambiguous
ampersand</dfn> is a U+0026 AMPERSAND character (&) that is
followed by some <a href="#syntax-text" title="syntax-text">text</a> other than a
@@ -54978,7 +54981,7 @@
<table><thead><tr><th>Number <th colspan="2">Unicode character
<tbody><tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
- <tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
+ <tr><td>0x0D <td>U+000D <td>CARRIAGE RETURN (CR)
<tr><td>0x80 <td>U+20AC <td>EURO SIGN (€)
<tr><td>0x81 <td>U+0081 <td><control>
<tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK (‚)
@@ -55400,7 +55403,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p>Ignore the token.</p>
</dd>
@@ -55606,7 +55609,7 @@
<dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p>Ignore the token.</p>
</dd>
@@ -55678,7 +55681,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p>Ignore the token.</p> <!-- :-( -->
</dd>
@@ -55744,7 +55747,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
the <a href="#current-node">current node</a>.</p>
@@ -55929,7 +55932,7 @@
<dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dt>A comment token</dt>
<dt>A start tag whose tag name is one of: "link", "meta", "noframes", "style"</dt>
<dd>
@@ -55966,7 +55969,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
the <a href="#current-node">current node</a>.</p>
@@ -56064,8 +56067,8 @@
character</a> into the <a href="#current-node">current node</a>.</p>
<p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
- LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
- (CR),--> or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
+ LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+ (CR), or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
flag</a> to "not ok".</p>
</dd>
@@ -56261,6 +56264,9 @@
one. (Newlines at the start of <code><a href="#the-pre-element">pre</a></code> blocks are
ignored as an authoring convenience.)</p>
+ <!-- <pre>[CR]X will eat the [CR], <pre>&#x0A;X will eat the
+ &#x0A;, but <pre>&#x0D;X will not eat the &#x0D;. -->
+
<p>Set the <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
</dd>
@@ -56997,6 +57003,8 @@
token, then ignore that token and move on to the next
one. (Newlines at the start of <code><a href="#the-textarea-element">textarea</a></code> elements are
ignored as an authoring convenience.)</li>
+
+ <!-- see comment in <pre> start tag bit -->
<li><p>Switch the tokenizer to the <a href="#rcdata-state">RCDATA
state</a>.</li>
@@ -57624,7 +57632,7 @@
<p>If any of the tokens in the <var><a href="#pending-table-character-tokens">pending table character
tokens</a></var> list are character tokens that are not one of U+0009
CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED
- (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE, then
+ (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE, then
reprocess those character tokens using the rules given in the
"anything else" entry in the <a href="#parsing-main-intable" title="insertion mode: in
table">in table</a>" insertion mode.</p>
@@ -57703,7 +57711,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
the <a href="#current-node">current node</a>.</p>
@@ -58249,8 +58257,8 @@
character</a> into the <a href="#current-node">current node</a>.</p>
<p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
- LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
- (CR),--> or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
+ LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+ (CR), or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
flag</a> to "not ok".</p>
</dd>
@@ -58470,7 +58478,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p>Process the token <a href="#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="#insertion-mode">insertion
mode</a>.</p>
@@ -58528,7 +58536,7 @@
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
the <a href="#current-node">current node</a>.</p>
@@ -58622,7 +58630,7 @@
<!-- due to rules in the "in frameset" mode, this can't be entered in the fragment case -->
<dl class="switch"><dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the character</a> into
the <a href="#current-node">current node</a>.</p>
@@ -58683,7 +58691,7 @@
<dt>A DOCTYPE token</dt>
<dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dt>A start tag whose tag name is "html"</dt>
<dd>
<p>Process the token <a href="#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="#insertion-mode">insertion
@@ -58717,7 +58725,7 @@
<dt>A DOCTYPE token</dt>
<dt>A character token that is one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
- <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+ U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
<dt>A start tag whose tag name is "html"</dt>
<dd>
<p>Process the token <a href="#using-the-rules-for">using the rules for</a> the "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>" <a href="#insertion-mode">insertion
Received on Thursday, 1 April 2010 01:21:45 UTC
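
The effect of this revision can be sketched in a few lines of Python. This is only an illustration, not the spec's algorithm: the names `CHARREF_REPLACEMENTS`, `PARSER_WHITESPACE`, `resolve_numeric_charref` and `is_parser_whitespace` are hypothetical, the table holds only the rows visible in the hunk above (the full table in the spec is longer), and parse errors, surrogates and out-of-range numbers are not handled.

```python
# Rows of the numeric character reference replacement table that appear in
# this diff. Hypothetical name; the real table has more rows than shown here.
CHARREF_REPLACEMENTS = {
    0x00: "\uFFFD",  # REPLACEMENT CHARACTER
    0x0D: "\u000D",  # CARRIAGE RETURN (CR); was U+000A before r1.3954
    0x80: "\u20AC",  # EURO SIGN
    0x81: "\u0081",  # <control>
    0x82: "\u201A",  # SINGLE LOW-9 QUOTATION MARK
}

# Whitespace set used by the tree-construction rules after this change:
# tab, LF, FF, CR (no longer commented out), and space.
PARSER_WHITESPACE = frozenset("\u0009\u000A\u000C\u000D\u0020")


def resolve_numeric_charref(number: int) -> str:
    """Return the character for a numeric reference, applying the table.

    Assumption: numbers not listed in the table map to themselves.
    """
    return CHARREF_REPLACEMENTS.get(number, chr(number))


def is_parser_whitespace(ch: str) -> bool:
    """True if a character token falls in the whitespace set listed above."""
    return ch in PARSER_WHITESPACE


if __name__ == "__main__":
    cr = resolve_numeric_charref(0x0D)
    print(repr(cr), is_parser_whitespace(cr))
```

Because a reference to 0x0D now yields a real CR character token, the insertion modes have to list U+000D explicitly in their whitespace checks, which is why the same one-line change repeats throughout the diff.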