- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Tue, 02 Nov 2010 02:09:01 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv15288

Modified Files:
	Overview.html
Log Message:
Parser: don't convert 0000 to FFFD in the input stream processor, instead do it (mostly) in the tokenizer, so that we can instead swallow 0000s in body. (whatwg r5666)

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.4532
retrieving revision 1.4533
diff -u -d -r1.4532 -r1.4533
--- Overview.html 2 Nov 2010 01:06:09 -0000 1.4532
+++ Overview.html 2 Nov 2010 02:08:58 -0000 1.4533
@@ -54457,12 +54457,12 @@ motivated by a desire to increase the resilience of user agents in the face of naïve transcoders.</p> - <p>All U+0000 NULL characters and code points in the range U+D800 to - U+DFFF<!-- surrogates not allowed e.g. in UTF-8, and we don't want - them to suddenly turn into codepoints when they go through a UTF-16 - pipe --> in the input must be replaced by U+FFFD REPLACEMENT - CHARACTERs. Any occurrences of such characters and code points are - <a href="#parse-error" title="parse error">parse errors</a>.</p> + <p>Code points in the range U+D800 to U+DFFF<!-- surrogates are not + allowed e.g. in UTF-8, and we don't want them to suddenly turn into + codepoints when they go through a UTF-16 pipe --> in the input must + be replaced by U+FFFD REPLACEMENT CHARACTERs. Any occurrences of + such characters and code points are <a href="#parse-error" title="parse error">parse + errors</a>.</p> <p>Any occurrences of any characters in the ranges U+0001 to U+0008, <!-- HT, LF allowed --> <!-- U+000B is in the next list --> <!-- FF, @@ -55095,6 +55095,10 @@ <dt>U+003C LESS-THAN SIGN (<)</dt> <dd>Switch to the <a href="#tag-open-state">tag open state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. 
Emit the <a href="#current-input-character">current input + character</a> as a character token.</dd> + <dt>EOF</dt> <dd>Emit an end-of-file token.</dd> @@ -55126,6 +55130,10 @@ <dt>U+003C LESS-THAN SIGN (<)</dt> <dd>Switch to the <a href="#rcdata-less-than-sign-state">RCDATA less-than sign state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd>Emit an end-of-file token.</dd> @@ -55153,6 +55161,10 @@ <dl class="switch"><dt>U+003C LESS-THAN SIGN (<)</dt> <dd>Switch to the <a href="#rawtext-less-than-sign-state">RAWTEXT less-than sign state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd>Emit an end-of-file token.</dd> @@ -55167,6 +55179,10 @@ <dl class="switch"><dt>U+003C LESS-THAN SIGN (<)</dt> <dd>Switch to the <a href="#script-data-less-than-sign-state">script data less-than sign state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd>Emit an end-of-file token.</dd> @@ -55178,7 +55194,11 @@ <p>Consume the <a href="#next-input-character">next input character</a>:</p> - <dl class="switch"><dt>EOF</dt> + <dl class="switch"><dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + + <dt>EOF</dt> <dd>Emit an end-of-file token.</dd> <dt>Anything else</dt> @@ -55270,6 +55290,10 @@ character</a> (add 0x0020 to the character's code point) to the current tag token's tag name.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current tag token's tag name.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. 
Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55576,6 +55600,10 @@ <dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55596,6 +55624,11 @@ <dd><p>Switch to the <a href="#script-data-escaped-less-than-sign-state">script data escaped less-than sign state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-escaped-state">script data + escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER character + token.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55619,6 +55652,11 @@ <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E GREATER-THAN SIGN character token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-escaped-state">script data + escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER character + token.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55769,6 +55807,10 @@ sign state</a>. Emit a U+003C LESS-THAN SIGN character token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55790,6 +55832,11 @@ sign state</a>. 
Emit a U+003C LESS-THAN SIGN character token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-double-escaped-state">script data + double escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55815,6 +55862,11 @@ <dd>Switch to the <a href="#script-data-state">script data state</a>. Emit a U+003E GREATER-THAN SIGN character token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#script-data-double-escaped-state">script data + double escaped state</a>. Emit a U+FFFD REPLACEMENT CHARACTER + character token.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -55893,6 +55945,12 @@ value to the empty string. Switch to the <a href="#attribute-name-state">attribute name state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Start a new attribute in the current + tag token. Set that attribute's name to a U+FFFD REPLACEMENT + CHARACTER character, and its value to the empty string. Switch to + the <a href="#attribute-name-state">attribute name state</a>.</dd> + <dt>U+0022 QUOTATION MARK (")</dt> <dt>U+0027 APOSTROPHE (')</dt> <dt>U+003C LESS-THAN SIGN (<)</dt> @@ -55906,8 +55964,8 @@ <dt>Anything else</dt> <dd>Start a new attribute in the current tag token. Set that - attribute's name to the <a href="#current-input-character">current input character</a>, and its value to - the empty string. Switch to the <a href="#attribute-name-state">attribute name + attribute's name to the <a href="#current-input-character">current input character</a>, and + its value to the empty string. 
Switch to the <a href="#attribute-name-state">attribute name state</a>.</dd> </dl><h5 id="attribute-name-state"><span class="secno">8.2.4.35 </span><dfn>Attribute name state</dfn></h5> @@ -55936,6 +55994,10 @@ character</a> (add 0x0020 to the character's code point) to the current attribute's name.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current attribute's name.</dd> + <dt>U+0022 QUOTATION MARK (")</dt> <dt>U+0027 APOSTROPHE (')</dt> <dt>U+003C LESS-THAN SIGN (<)</dt> @@ -55987,6 +56049,12 @@ and its value to the empty string. Switch to the <a href="#attribute-name-state">attribute name state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Start a new attribute in the current + tag token. Set that attribute's name to a U+FFFD REPLACEMENT + CHARACTER character, and its value to the empty string. Switch to + the <a href="#attribute-name-state">attribute name state</a>.</dd> + <dt>U+0022 QUOTATION MARK (")</dt> <dt>U+0027 APOSTROPHE (')</dt> <dt>U+003C LESS-THAN SIGN (<)</dt> @@ -56024,6 +56092,11 @@ <dt>U+0027 APOSTROPHE (')</dt> <dd>Switch to the <a href="#attribute-value-single-quoted-state">attribute value (single-quoted) state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current attribute's value. Switch to the + <a href="#attribute-value-unquoted-state">attribute value (unquoted) state</a>.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Emit the current tag token.</dd> @@ -56056,6 +56129,10 @@ state</a>, with the <a href="#additional-allowed-character">additional allowed character</a> being U+0022 QUOTATION MARK (").</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. 
Append a U+FFFD REPLACEMENT CHARACTER + character to the current attribute's value.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> @@ -56105,6 +56182,10 @@ <dd>Switch to the <a href="#data-state">data state</a>. Emit the current tag token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current attribute's value.</dd> + <dt>U+0022 QUOTATION MARK (")</dt> <dt>U+0027 APOSTROPHE (')</dt> <dt>U+003C LESS-THAN SIGN (<)</dt> @@ -56183,12 +56264,13 @@ <p>Consume every character up to and including the first U+003E GREATER-THAN SIGN character (>) or the end of the file (EOF), whichever comes first. Emit a comment token whose data is the - concatenation of all the characters starting from and including - the character that caused the state machine to switch into the - bogus comment state, up to and including the character immediately - before the last consumed character (i.e. up to the character just - before the U+003E or EOF character). (If the comment was started - by the end of the file (EOF), the token is empty.)</p> + concatenation of all the characters starting from and including the + character that caused the state machine to switch into the bogus + comment state, up to and including the character immediately before + the last consumed character (i.e. up to the character just before + the U+003E or EOF character), but with any U+0000 NULL characters + replaced by U+FFFD REPLACEMENT CHARACTER characters. (If the comment + was started by the end of the file (EOF), the token is empty.)</p> <p>Switch to the <a href="#data-state">data state</a>.</p> @@ -56228,6 +56310,11 @@ <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt> <dd>Switch to the <a href="#comment-start-dash-state">comment start dash state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. 
Append a U+FFFD REPLACEMENT CHARACTER + character to the comment token's data. Switch to the <a href="#comment-state">comment + state</a>.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Emit the comment token.</dd> <!-- see comment in @@ -56248,6 +56335,12 @@ <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt> <dd>Switch to the <a href="#comment-end-state">comment end state</a></dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+002D HYPHEN-MINUS + character (-) and a U+FFFD REPLACEMENT CHARACTER character to the + comment token's data. Switch to the <a href="#comment-state">comment + state</a>.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#data-state">data state</a>. Emit the comment token.</dd> @@ -56269,6 +56362,10 @@ <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt> <dd>Switch to the <a href="#comment-end-dash-state">comment end dash state</a></dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the comment token's data.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Emit the comment token. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment @@ -56285,6 +56382,12 @@ <dl class="switch"><dt>U+002D HYPHEN-MINUS (-)</dt> <dd>Switch to the <a href="#comment-end-state">comment end state</a></dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+002D HYPHEN-MINUS + character (-) and a U+FFFD REPLACEMENT CHARACTER character to the + comment token's data. Switch to the <a href="#comment-state">comment + state</a>.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Emit the comment token. 
Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see comment @@ -56303,6 +56406,12 @@ <dd>Switch to the <a href="#data-state">data state</a>. Emit the comment token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS + characters (-) and a U+FFFD REPLACEMENT CHARACTER character to the + comment token's data. Switch to the <a href="#comment-state">comment + state</a>.</dd> + <dt>U+0021 EXCLAMATION MARK (!)</dt> <dd><a href="#parse-error">Parse error</a>. Switch to the <a href="#comment-end-bang-state">comment end bang state</a>.</dd> @@ -56338,6 +56447,12 @@ <dd>Switch to the <a href="#data-state">data state</a>. Emit the comment token.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append two U+002D HYPHEN-MINUS + characters (-), a U+0021 EXCLAMATION MARK character (!), and a + U+FFFD REPLACEMENT CHARACTER character to the comment token's data. + Switch to the <a href="#comment-state">comment state</a>.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Emit the comment token. Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> <!-- see @@ -56386,6 +56501,11 @@ character's code point). Switch to the <a href="#doctype-name-state">DOCTYPE name state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Set the token's name to a U+FFFD + REPLACEMENT CHARACTER character. Switch to the <a href="#doctype-name-state">DOCTYPE name + state</a>.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Create a new DOCTYPE token. Set its <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data @@ -56421,6 +56541,10 @@ character</a> (add 0x0020 to the character's code point) to the current DOCTYPE token's name.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. 
Append a U+FFFD REPLACEMENT CHARACTER + character to the current DOCTYPE token's name.</dd> + <dt>EOF</dt> <dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. @@ -56550,6 +56674,10 @@ <dl class="switch"><dt>U+0022 QUOTATION MARK (")</dt> <dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current DOCTYPE token's public identifier.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data @@ -56561,8 +56689,8 @@ Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> <dt>Anything else</dt> - <dd>Append the <a href="#current-input-character">current input character</a> to the current DOCTYPE - token's public identifier.</dd> + <dd>Append the <a href="#current-input-character">current input character</a> to the current + DOCTYPE token's public identifier.</dd> </dl><h5 id="doctype-public-identifier-single-quoted-state"><span class="secno">8.2.4.59 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5> @@ -56571,6 +56699,10 @@ <dl class="switch"><dt>U+0027 APOSTROPHE (')</dt> <dd>Switch to the <a href="#after-doctype-public-identifier-state">after DOCTYPE public identifier state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current DOCTYPE token's public identifier.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. 
Switch to the <a href="#data-state">data @@ -56582,8 +56714,8 @@ Reconsume the EOF character in the <a href="#data-state">data state</a>.</dd> <dt>Anything else</dt> - <dd>Append the <a href="#current-input-character">current input character</a> to the current DOCTYPE - token's public identifier.</dd> + <dd>Append the <a href="#current-input-character">current input character</a> to the current + DOCTYPE token's public identifier.</dd> </dl><h5 id="after-doctype-public-identifier-state"><span class="secno">8.2.4.60 </span><dfn>After DOCTYPE public identifier state</dfn></h5> @@ -56737,6 +56869,10 @@ <dd>Switch to the <a href="#after-doctype-system-identifier-state">after DOCTYPE system identifier state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current DOCTYPE token's system identifier.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data @@ -56759,6 +56895,10 @@ <dd>Switch to the <a href="#after-doctype-system-identifier-state">after DOCTYPE system identifier state</a>.</dd> + <dt>U+0000 NULL</dt> + <dd><a href="#parse-error">Parse error</a>. Append a U+FFFD REPLACEMENT CHARACTER + character to the current DOCTYPE token's system identifier.</dd> + <dt>U+003E GREATER-THAN SIGN (>)</dt> <dd><a href="#parse-error">Parse error</a>. Set the DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href="#data-state">data @@ -56821,7 +56961,9 @@ end of the file (EOF), whichever comes first. 
Emit a series of character tokens consisting of all the characters consumed except the matching three character sequence at the end (if one was found - before the end of the file).</p> + before the end of the file)<!--(not needed; taken care of by the + tree constructor), but with any U+0000 NULL characters replaced by + U+FFFD REPLACEMENT CHARACTER characters-->.</p> <p>Switch to the <a href="#data-state">data state</a>.</p> @@ -58013,7 +58155,23 @@ <p>When the <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>", tokens must be handled as follows:</p> - <dl class="switch"><dt>A character token</dt> + <dl class="switch"><dt>A character token that is U+0000 NULL</dt> + <dd> + + <p><a href="#parse-error">Parse error</a>. Ignore the token.</p> + + <!-- The D-Link DSL-G604T ADSL router has a zero byte in its + configuration UI before a <frameset>, which is why U+0000 is + special-cased here. + refs: https://bugzilla.mozilla.org/show_bug.cgi?id=563526 + http://www.w3.org/Bugs/Public/show_bug.cgi?id=9659 + --> + + </dd> + + <dt>A character token that is one of U+0009 CHARACTER TABULATION, + U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE + RETURN (CR), or U+0020 SPACE</dt> <dd> <p><a href="#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if @@ -58022,19 +58180,18 @@ <p><a href="#insert-a-character" title="insert a character">Insert the token's character</a> into the <a href="#current-node">current node</a>.</p> - <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A - LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN - (CR), U+0020 SPACE, or U+FFFD REPLACEMENT CHARACTER, then set the - <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p> + </dd> - <!-- U+FFFD REPLACEMENT CHARACTER is in this list because the - D-Link DSL-G604T ADSL router has a zero byte in its - configuration UI before a <frameset>. 
Zero bytes get - converted to U+FFFD, which (without that character in this - list) would mean the <frameset> would be ignored. - refs: https://bugzilla.mozilla.org/show_bug.cgi?id=563526 - http://www.w3.org/Bugs/Public/show_bug.cgi?id=9659 - --> + <dt>Any other character token</dt> + <dd> + + <p><a href="#reconstruct-the-active-formatting-elements">Reconstruct the active formatting elements</a>, if + any.</p> + + <p><a href="#insert-a-character" title="insert a character">Insert the token's + character</a> into the <a href="#current-node">current node</a>.</p> + + <p>Set the <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p> </dd> @@ -59257,6 +59414,10 @@ <p><a href="#insert-a-character" title="insert a character">Insert the token's character</a> into the <a href="#current-node">current node</a>.</p> + <p class="note">This can never be a U+0000 NULL character; the + tokenizer converts those to U+FFFD REPLACEMENT CHARACTER + characters.</p> + </dd> <dt>An end-of-file token</dt> @@ -60053,7 +60214,12 @@ <p>When the <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inselect" title="insertion mode: in select">in select</a>", tokens must be handled as follows:</p> - <dl class="switch"><dt>A character token</dt> + <dl class="switch"><dt>A character token that is U+0000 NULL</dt> + <dd> + <p><a href="#parse-error">Parse error</a>. Ignore the token.</p> + </dd> + + <dt>Any other character token</dt> <dd> <p><a href="#insert-a-character" title="insert a character">Insert the token's character</a> into the <a href="#current-node">current node</a>.</p> @@ -60254,16 +60420,32 @@ </ol></dd> - <dt>A character token</dt> + <dt>A character token that is U+0000 NULL</dt> + <dd> + + <p><a href="#parse-error">Parse error</a>. 
<a href="#insert-a-character" title="insert a + character">Insert a U+FFFD REPLACEMENT CHARACTER character</a> + into the <a href="#current-node">current node</a>.</p> + + </dd> + + <dt>A character token that is one of U+0009 CHARACTER TABULATION, + U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE + RETURN (CR), or U+0020 SPACE</dt> <dd> <p><a href="#insert-a-character" title="insert a character">Insert the token's character</a> into the <a href="#current-node">current node</a>.</p> - <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A - LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN - (CR), or U+0020 SPACE, then set the <a href="#frameset-ok-flag">frameset-ok - flag</a> to "not ok".</p> + </dd> + + <dt>Any other character token</dt> + <dd> + + <p><a href="#insert-a-character" title="insert a character">Insert the token's + character</a> into the <a href="#current-node">current node</a>.</p> + + <p>Set the <a href="#frameset-ok-flag">frameset-ok flag</a> to "not ok".</p> </dd>
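
The net effect of the change above on plain character data can be illustrated with a small model. This is only a sketch under stated assumptions, not the specification's algorithm: the function names `tokenize_text` and `insert_characters`, the state and mode strings, and the simple error counter are all hypothetical, and only three behaviours from the diff are modelled (the data state emits U+0000 unchanged, the other text states emit U+FFFD, and the "in body" and "in select" insertion modes ignore U+0000 character tokens).

```python
# Toy model of the U+0000 handling described in the diff above.
# All names here are illustrative; they do not come from the spec.

NULL = "\x00"
REPLACEMENT = "\ufffd"
WHITESPACE = {"\t", "\n", "\x0c", "\r", " "}


def tokenize_text(text, state="data"):
    """Yield (character, is_parse_error) pairs for plain text input.

    In the data state a U+0000 is emitted unchanged (with a parse error);
    in any other modelled text state (e.g. "rcdata", "rawtext") it is
    emitted as U+FFFD instead.
    """
    for ch in text:
        if ch == NULL:
            yield (NULL if state == "data" else REPLACEMENT), True
        else:
            yield ch, False


def insert_characters(tokens, mode="in body"):
    """Consume character tokens and return (text, frameset_ok, errors).

    "in body" and "in select" drop U+0000 tokens with a parse error.
    In "in body", any inserted non-whitespace character sets the
    frameset-ok flag to "not ok" (False here); a dropped U+0000 does not.
    """
    out = []
    frameset_ok = True
    errors = 0
    for ch, is_error in tokens:
        if is_error:
            errors += 1  # parse error reported by the tokenizer
        if ch == NULL and mode in ("in body", "in select"):
            errors += 1  # parse error reported by the tree constructor
            continue     # the token is ignored
        out.append(ch)
        if mode == "in body" and ch not in WHITESPACE:
            frameset_ok = False
    return "".join(out), frameset_ok, errors


if __name__ == "__main__":
    # A stray zero byte between whitespace no longer blocks a later <frameset>.
    print(insert_characters(tokenize_text(" \x00\n")))
```

In this model a zero byte that reaches "in body" through the data state disappears without touching the frameset-ok flag, whereas a U+FFFD produced by another tokenizer state is inserted like any other non-whitespace character.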
Received on Tuesday, 2 November 2010 02:09:05 UTC