- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Tue, 02 Nov 2010 02:09:01 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv15288
Modified Files:
Overview.html
Log Message:
Parser: don't convert U+0000 to U+FFFD in the input stream processor; do it (mostly) in the tokenizer instead, so that U+0000 characters can be swallowed in body. (whatwg r5666)
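
To illustrate the tokenizer side of this change, here is a minimal sketch in Python. It is not taken from the spec or from any real parser: the function name `emit_characters` and the lowercase state labels are hypothetical, and all other tokenizer behaviour (tags, character references, EOF) is left out. It only models what the diff below says about U+0000 in the text-emitting states: the data state reports a parse error and emits the character unchanged, while the RCDATA, RAWTEXT, script data and PLAINTEXT states report a parse error and emit U+FFFD.

```python
REPLACEMENT_CHARACTER = "\ufffd"
NULL = "\u0000"

# Hypothetical labels for the states in which, per the diff, U+0000 is
# emitted as U+FFFD REPLACEMENT CHARACTER.
REPLACING_STATES = {"rcdata", "rawtext", "script data", "plaintext"}


def emit_characters(text, state):
    """Return (character tokens, parse error count) for plain text.

    Simplification: only U+0000 handling is modelled; '<', '&' and EOF
    are not treated specially here.
    """
    if state != "data" and state not in REPLACING_STATES:
        raise ValueError("state not covered by this sketch: %r" % state)

    tokens = []
    parse_errors = 0
    for ch in text:
        if ch == NULL:
            parse_errors += 1
            if state == "data":
                # Data state: emit the current input character as-is, so
                # the tree constructor can decide what to do with it.
                tokens.append(ch)
            else:
                tokens.append(REPLACEMENT_CHARACTER)
        else:
            tokens.append(ch)
    return tokens, parse_errors
```

Passing the U+0000 through in the data state is what lets the tree constructor drop it later; in the other states the replacement has already happened by the time a token is emitted.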
Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4532
retrieving revision 1.4533
diff -u -d -r1.4532 -r1.4533
--- Overview.html 2 Nov 2010 01:06:09 -0000 1.4532
+++ Overview.html 2 Nov 2010 02:08:58 -0000 1.4533
@@ -54457,12 +54457,12 @@
motivated by a desire to increase the resilience of user agents in
the face of naïve transcoders.</p>
- <p>All U+0000 NULL characters and code points in the range U+D800 to
- U+DFFF<!-- surrogates not allowed e.g. in UTF-8, and we don't want
- them to suddenly turn into codepoints when they go through a UTF-16
- pipe --> in the input must be replaced by U+FFFD REPLACEMENT
- CHARACTERs. Any occurrences of such characters and code points are
- <a href="#parse-error" title="parse error">parse errors</a>.</p>
+ <p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not
+ allowed e.g. in UTF-8, and we don't want them to suddenly turn into
+ codepoints when they go through a UTF-16 pipe --> in the input must
+ be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of
+ such characters and code points are <a href="#parse-error" title="parse error">parse
+ errors</a>.</p>
<p>Any occurrences of any characters in the ranges U+0001 to U+0008,
<!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF,
@@ -55095,6 +55095,10 @@
<dt>U+003C LESS-THAN SIGN (<)</dt>
<dd>Switch to the <a href="#tag-open-state">tag open state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit the <a href="#current-input-character">current input
+ character</a> as a character token.</dd>
+
<dt>EOF</dt>
<dd>Emit an end-of-file token.</dd>
@@ -55126,6 +55130,10 @@
<dt>U+003C LESS-THAN SIGN (<)</dt>
<dd>Switch to the <a href="#rcdata-less-than-sign-state">RCDATA less-than sign state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd>Emit an end-of-file token.</dd>
@@ -55153,6 +55161,10 @@
<dl class="switch"><dt>U+003C LESS-THAN SIGN (<)</dt>
<dd>Switch to the <a href="#rawtext-less-than-sign-state">RAWTEXT less-than sign state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd>Emit an end-of-file token.</dd>
@@ -55167,6 +55179,10 @@
<dl class="switch"><dt>U+003C LESS-THAN SIGN (<)</dt>
<dd>Switch to the <a href="#script-data-less-than-sign-state">script data less-than sign state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd>Emit an end-of-file token.</dd>
@@ -55178,7 +55194,11 @@
<p>Consume the <a href="#next-input-character">next input character</a>:</p>
- <dl class="switch"><dt>EOF</dt>
+ <dl class="switch"><dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
+ <dt>EOF</dt>
<dd>Emit an end-of-file token.</dd>
<dt>Anything else</dt>
@@ -55270,6 +55290,10 @@
character</a> (add 0x0020 to the character's code point) to the
current tag token's tag name.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current tag token's tag name.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55576,6 +55600,10 @@
<dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55596,6 +55624,11 @@
<dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign
state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-escaped-state">script data
+ escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER character
+ token.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55619,6 +55652,11 @@
<dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E
GREATER-THAN SIGN character token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-escaped-state">script data
+ escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER character
+ token.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55769,6 +55807,10 @@
sign state</a>. Emit a U+003C LESS-THAN SIGN character
token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55790,6 +55832,11 @@
sign state</a>. Emit a U+003C LESS-THAN SIGN character
token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-double-escaped-state">script data
+ double escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55815,6 +55862,11 @@
<dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E
GREATER-THAN SIGN character token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-double-escaped-state">script data
+ double escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER
+ character token.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -55893,6 +55945,12 @@
value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Start a new attribute in the current
+ tag token. Set that attribute's name to a U+FFFD REPLACEMENT
+ CHARACTER character, and its value to the empty string. Switch to
+ the <a href="#attribute-name-state">attribute name state</a>.</dd>
+
<dt>U+0022 QUOTATION MARK (")</dt>
<dt>U+0027 APOSTROPHE (')</dt>
<dt>U+003C LESS-THAN SIGN (<)</dt>
@@ -55906,8 +55964,8 @@
<dt>Anything else</dt>
<dd>Start a new attribute in the current tag token. Set that
- attribute's name to the <a href="#current-input-character">current input character</a>, and its value to
- the empty string. Switch to the <a href="#attribute-name-state">attribute name
+ attribute's name to the <a href="#current-input-character">current input character</a>, and
+ its value to the empty string. Switch to the <a href="#attribute-name-state">attribute name
state</a>.</dd>
</dl><h5 id="attribute-name-state"><span class="secno">8.2.4.35 </span><dfn>Attribute name state</dfn></h5>
@@ -55936,6 +55994,10 @@
character</a> (add 0x0020 to the character's code point) to the
current attribute's name.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current attribute's name.</dd>
+
<dt>U+0022 QUOTATION MARK (")</dt>
<dt>U+0027 APOSTROPHE (')</dt>
<dt>U+003C LESS-THAN SIGN (<)</dt>
@@ -55987,6 +56049,12 @@
and its value to the empty string. Switch to the <a href="#attribute-name-state">attribute
name state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Start a new attribute in the current
+ tag token. Set that attribute's name to a U+FFFD REPLACEMENT
+ CHARACTER character, and its value to the empty string. Switch to
+ the <a href="#attribute-name-state">attribute name state</a>.</dd>
+
<dt>U+0022 QUOTATION MARK (")</dt>
<dt>U+0027 APOSTROPHE (')</dt>
<dt>U+003C LESS-THAN SIGN (<)</dt>
@@ -56024,6 +56092,11 @@
<dt>U+0027 APOSTROPHE (')</dt>
<dd>Switch to the <a href="#attribute-value-single-quoted-state">attribute value (single-quoted) state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current attribute's value. Switch to the
+ <a href="#attribute-value-unquoted-state">attribute value (unquoted) state</a>.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
state</a>. Emit the current tag token.</dd>
@@ -56056,6 +56129,10 @@
state</a>, with the <a href="#additional-allowed-character">additional allowed character</a>
being U+0022 QUOTATION MARK (").</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current attribute's value.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the
<a href="#data-state">data state</a>.</dd>
@@ -56105,6 +56182,10 @@
<dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag
token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current attribute's value.</dd>
+
<dt>U+0022 QUOTATION MARK (")</dt>
<dt>U+0027 APOSTROPHE (')</dt>
<dt>U+003C LESS-THAN SIGN (<)</dt>
@@ -56183,12 +56264,13 @@
<p>Consume every character up to and including the first U+003E
GREATER-THAN SIGN character (>) or the end of the file (EOF),
whichever comes first. Emit a comment token whose data is the
- concatenation of all the characters starting from and including
- the character that caused the state machine to switch into the
- bogus comment state, up to and including the character immediately
- before the last consumed character (i.e. up to the character just
- before the U+003E or EOF character). (If the comment was started
- by the end of the file (EOF), the token is empty.)</p>
+ concatenation of all the characters starting from and including the
+ character that caused the state machine to switch into the bogus
+ comment state, up to and including the character immediately before
+ the last consumed character (i.e. up to the character just before
+ the U+003E or EOF character), but with any U+0000 NULL characters
+ replaced by U+FFFD REPLACEMENT CHARACTER characters. (If the comment
+ was started by the end of the file (EOF), the token is empty.)</p>
<p>Switch to the <a href="#data-state">data state</a>.</p>
@@ -56228,6 +56310,11 @@
<dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
<dd>Switch to the <a href="#comment-start-dash-state">comment start dash state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the comment token's data. Switch to the <a href="#comment-state">comment
+ state</a>.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
state</a>. Emit the comment token.</dd> <!-- see comment in
@@ -56248,6 +56335,12 @@
<dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
<dd>Switch to the <a href="#comment-end-state">comment end state</a></dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+002D HYPHEN-MINUS
+ character (-) and a U+FFFD REPLACEMENT CHARACTER character to the
+ comment token's data. Switch to the <a href="#comment-state">comment
+ state</a>.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#data-state">data
state</a>. Emit the comment token.</dd>
@@ -56269,6 +56362,10 @@
<dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
<dd>Switch to the <a href="#comment-end-dash-state">comment end dash state</a></dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the comment token's data.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Emit the comment token. Reconsume the
EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment
@@ -56285,6 +56382,12 @@
<dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt>
<dd>Switch to the <a href="#comment-end-state">comment end state</a></dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+002D HYPHEN-MINUS
+ character (-) and a U+FFFD REPLACEMENT CHARACTER character to the
+ comment token's data. Switch to the <a href="#comment-state">comment
+ state</a>.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Emit the comment token. Reconsume the
EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment
@@ -56303,6 +56406,12 @@
<dd>Switch to the <a href="#data-state">data state</a>. Emit the comment
token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS
+ characters (-) and a U+FFFD REPLACEMENT CHARACTER character to the
+ comment token's data. Switch to the <a href="#comment-state">comment
+ state</a>.</dd>
+
<dt>U+0021 EXCLAMATION MARK (!)</dt>
<dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#comment-end-bang-state">comment end bang
state</a>.</dd>
@@ -56338,6 +56447,12 @@
<dd>Switch to the <a href="#data-state">data state</a>. Emit the comment
token.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS
+ characters (-), a U+0021 EXCLAMATION MARK character (!), and a
+ U+FFFD REPLACEMENT CHARACTER character to the comment token's data.
+ Switch to the <a href="#comment-state">comment state</a>.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Emit the comment token. Reconsume
the EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see
@@ -56386,6 +56501,11 @@
character's code point). Switch to the <a href="#doctype-name-state">DOCTYPE name
state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Set the token's name to a U+FFFD
+ REPLACEMENT CHARACTER character. Switch to the <a href="#doctype-name-state">DOCTYPE name
+ state</a>.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Create a new DOCTYPE token. Set its
<i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
@@ -56421,6 +56541,10 @@
character</a> (add 0x0020 to the character's code point) to the
current DOCTYPE token's name.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current DOCTYPE token's name.</dd>
+
<dt>EOF</dt>
<dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's
<i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
@@ -56550,6 +56674,10 @@
<dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt>
<dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current DOCTYPE token's public identifier.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's
<i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
@@ -56561,8 +56689,8 @@
Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
<dt>Anything else</dt>
- <dd>Append the <a href="#current-input-character">current input character</a> to the current DOCTYPE
- token's public identifier.</dd>
+ <dd>Append the <a href="#current-input-character">current input character</a> to the current
+ DOCTYPE token's public identifier.</dd>
</dl><h5 id="doctype-public-identifier-single-quoted-state"><span class="secno">8.2.4.59 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
@@ -56571,6 +56699,10 @@
<dl class="switch"><dt>U+0027 APOSTROPHE (')</dt>
<dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current DOCTYPE token's public identifier.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's
<i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
@@ -56582,8 +56714,8 @@
Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd>
<dt>Anything else</dt>
- <dd>Append the <a href="#current-input-character">current input character</a> to the current DOCTYPE
- token's public identifier.</dd>
+ <dd>Append the <a href="#current-input-character">current input character</a> to the current
+ DOCTYPE token's public identifier.</dd>
</dl><h5 id="after-doctype-public-identifier-state"><span class="secno">8.2.4.60 </span><dfn>After DOCTYPE public identifier state</dfn></h5>
@@ -56737,6 +56869,10 @@
<dd>Switch to the <a href="#after-doctype-system-identifier-state">after DOCTYPE system identifier
state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current DOCTYPE token's system identifier.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's
<i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
@@ -56759,6 +56895,10 @@
<dd>Switch to the <a href="#after-doctype-system-identifier-state">after DOCTYPE system identifier
state</a>.</dd>
+ <dt>U+0000 NULL</dt>
+ <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER
+ character to the current DOCTYPE token's system identifier.</dd>
+
<dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's
<i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data
@@ -56821,7 +56961,9 @@
end of the file (EOF), whichever comes first. Emit a series of
character tokens consisting of all the characters consumed except
the matching three character sequence at the end (if one was found
- before the end of the file).</p>
+ before the end of the file)<!--(not needed; taken care of by the
+ tree constructor), but with any U+0000 NULL characters replaced by
+ U+FFFD REPLACEMENT CHARACTER characters-->.</p>
<p>Switch to the <a href="#data-state">data state</a>.</p>
@@ -58013,7 +58155,23 @@
<p>When the <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion
mode: in body">in body</a>", tokens must be handled as follows:</p>
- <dl class="switch"><dt>A character token</dt>
+ <dl class="switch"><dt>A character token that is U+0000 NULL</dt>
+ <dd>
+
+ <p><a href="#parse-error">Parse error</a>. Ignore the token.</p>
+
+ <!-- The D-Link DSL-G604T ADSL router has a zero byte in its
+ configuration UI before a <frameset>, which is why U+0000 is
+ special-cased here.
+ refs: https://bugzilla.mozilla.org/show_bug.cgi?id=563526
+ http://www.w3.org/Bugs/Public/show_bug.cgi?id=9659
+ -->
+
+ </dd>
+
+ <dt>A character token that is one of U+0009 CHARACTER TABULATION,
+ U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE
+ RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
@@ -58022,19 +58180,18 @@
<p><a href="#insert-a-character" title="insert a character">Insert the token's
character</a> into the <a href="#current-node">current node</a>.</p>
- <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
- LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
- (CR), U+0020 SPACE, or U+FFFD REPLACEMENT CHARACTER, then set the
- <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
+ </dd>
- <!-- U+FFFD REPLACEMENT CHARACTER is in this list because the
- D-Link DSL-G604T ADSL router has a zero byte in its
- configuration UI before a <frameset>. Zero bytes get
- converted to U+FFFD, which (without that character in this
- list) would mean the <frameset> would be ignored.
- refs: https://bugzilla.mozilla.org/show_bug.cgi?id=563526
- http://www.w3.org/Bugs/Public/show_bug.cgi?id=9659
- -->
+ <dt>Any other character token</dt>
+ <dd>
+
+ <p><a href="#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if
+ any.</p>
+
+ <p><a href="#insert-a-character" title="insert a character">Insert the token's
+ character</a> into the <a href="#current-node">current node</a>.</p>
+
+ <p>Set the <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
</dd>
@@ -59257,6 +59414,10 @@
<p><a href="#insert-a-character" title="insert a character">Insert the token's
character</a> into the <a href="#current-node">current node</a>.</p>
+ <p class="note">This can never be a U+0000 NULL character; the
+ tokenizer converts those to U+FFFD REPLACEMENT CHARACTER
+ characters.</p>
+
</dd>
<dt>An end-of-file token</dt>
@@ -60053,7 +60214,12 @@
<p>When the <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inselect" title="insertion
mode: in select">in select</a>", tokens must be handled as follows:</p>
- <dl class="switch"><dt>A character token</dt>
+ <dl class="switch"><dt>A character token that is U+0000 NULL</dt>
+ <dd>
+ <p><a href="#parse-error">Parse error</a>. Ignore the token.</p>
+ </dd>
+
+ <dt>Any other character token</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the token's
character</a> into the <a href="#current-node">current node</a>.</p>
@@ -60254,16 +60420,32 @@
</ol></dd>
- <dt>A character token</dt>
+ <dt>A character token that is U+0000 NULL</dt>
+ <dd>
+
+ <p><a href="#parse-error">Parse error</a>. <a href="#insert-a-character" title="insert a
+ character">Insert a U+FFFD REPLACEMENT CHARACTER character</a>
+ into the <a href="#current-node">current node</a>.</p>
+
+ </dd>
+
+ <dt>A character token that is one of U+0009 CHARACTER TABULATION,
+ U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE
+ RETURN (CR), or U+0020 SPACE</dt>
<dd>
<p><a href="#insert-a-character" title="insert a character">Insert the token's
character</a> into the <a href="#current-node">current node</a>.</p>
- <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
- LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
- (CR), or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok
- flag</a> to "not ok".</p>
+ </dd>
+
+ <dt>Any other character token</dt>
+ <dd>
+
+ <p><a href="#insert-a-character" title="insert a character">Insert the token's
+ character</a> into the <a href="#current-node">current node</a>.</p>
+
+ <p>Set the <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p>
</dd>
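
As a rough illustration of the "in body" hunk above, the following Python sketch models the three character-token cases: U+0000 is a parse error and is ignored, whitespace is inserted without touching the frameset-ok flag, and any other character is inserted and sets the flag to "not ok". The class `InBodySketch` and its attributes are hypothetical names introduced for this example; reconstructing the active formatting elements is omitted, and the current node is reduced to a list of characters.

```python
# The five characters listed in the whitespace <dt> of the "in body" hunk.
WHITESPACE = {"\t", "\n", "\f", "\r", " "}


class InBodySketch:
    """Toy model of character-token handling in the "in body" mode."""

    def __init__(self):
        self.current_node_text = []  # stands in for the current node
        self.frameset_ok = "ok"
        self.parse_errors = 0

    def character_token(self, ch):
        if ch == "\u0000":
            # Parse error. Ignore the token.
            self.parse_errors += 1
            return
        # (Reconstructing the active formatting elements is omitted.)
        self.current_node_text.append(ch)
        if ch not in WHITESPACE:
            self.frameset_ok = "not ok"
```

With this shape, a stray zero byte followed only by whitespace leaves the flag at "ok", which is the situation the source comment about the D-Link DSL-G604T configuration UI describes.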
Received on Tuesday, 2 November 2010 02:09:05 UTC