- From: poot <cvsmail@w3.org>
- Date: Tue, 23 Jun 2009 10:34:15 +0900 (JST)
- To: public-html-diffs@w3.org
Write some explanatory text around the HTML parser. (whatwg r3304) http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.2449&r2=1.2450&f=h http://html5.org/tools/web-apps-tracker?from=3303&to=3304 =================================================================== RCS file: /sources/public/html5/spec/Overview.html,v retrieving revision 1.2449 retrieving revision 1.2450 diff -u -d -r1.2449 -r1.2450 --- Overview.html 17 Jun 2009 07:12:12 -0000 1.2449 +++ Overview.html 23 Jun 2009 01:33:49 -0000 1.2450 @@ -146,13 +146,28 @@ -webkit-column-width: 25em; -webkit-column-gap: 1em; } + + ul.domTree, ul.domTree ul { padding: 0 0 0 1em; margin: 0; } + ul.domTree li { padding: 0; margin: 0; list-style: none; position: relative; } + ul.domTree li li { list-style: none; } + ul.domTree li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; } + ul.domTree li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; } + ul.domTree span { font-style: italic; font-family: serif; } + ul.domTree .t1 code { color: purple; font-weight: bold; } + ul.domTree .t2 { font-style: normal; font-family: monospace; } + ul.domTree .t2 .name { color: black; font-weight: bold; } + ul.domTree .t2 .value { color: blue; font-weight: normal; } + ul.domTree .t3 code, .domTree .t4 code, .domTree .t5 code { color: gray; } + ul.domTree .t7 code, .domTree .t8 code { color: green; } + ul.domTree .t10 code { color: teal; } + </style><link href="data:text/css," rel="stylesheet" title="Complete specification"><link href="data:text/css,.impl%20{%20display:%20none;%20}" rel="alternate stylesheet" title="Author documentation only"><link href="data:text/css,.impl%20{%20background:%20%23FFEEEE;%20}" rel="alternate stylesheet" title="Highlight implementation requirements"><link 
href="http://www.w3.org/StyleSheets/TR/W3C-ED" rel="stylesheet" type="text/css"><!-- ZZZ ED vs WD --><div class="head"> <p><a href="http://www.w3.org/"><img alt="W3C" height="48" src="http://www.w3.org/Icons/w3c_home" width="72"></a></p> <h1>HTML 5</h1> <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2> <!--ZZZ:--> <!--<h2 class="no-num no-toc">W3C Working Draft 23 April 2009</h2>--> - <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 17 June 2009</h2> + <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 23 June 2009</h2> <!--:ZZZ--> <dl><!-- ZZZ: update the month/day (twice), (un)comment out <dt>This Version:</dt> @@ -245,7 +260,7 @@ track. <!--ZZZ:--> <!--This specification is the 23 April 2009 Working Draft.--> - This specification is the 17 June 2009 Editor's Draft. + This specification is the 23 June 2009 Editor's Draft. <!--:ZZZ--> </p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>This specification is also being produced by the <a href="http://www.whatwg.org/">WHATWG</a>. 
The two specifications are identical from the table of contents onwards.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- context and rationale (required) --><p>This specification is intended to replace (be a new version of) @@ -1058,7 +1073,12 @@ <li><a href="#the-after-after-body-insertion-mode"><span class="secno">9.2.5.24 </span>The "after after body" insertion mode</a></li> <li><a href="#the-after-after-frameset-insertion-mode"><span class="secno">9.2.5.25 </span>The "after after frameset" insertion mode</a></ol></li> <li><a href="#the-end"><span class="secno">9.2.6 </span>The end</a></li> - <li><a href="#coercing-an-html-dom-into-an-infoset"><span class="secno">9.2.7 </span>Coercing an HTML DOM into an infoset</a></ol></li> + <li><a href="#coercing-an-html-dom-into-an-infoset"><span class="secno">9.2.7 </span>Coercing an HTML DOM into an infoset</a></li> + <li><a href="#an-introduction-to-error-handling-in-the-parser"><span class="secno">9.2.8 </span>An introduction to error handling in the parser</a> + <ol> + <li><a href="#misnested-tags:-b-i-b-i"><span class="secno">9.2.8.1 </span>Misnested tags: <b><i></b></i></a></li> + <li><a href="#misnested-tags:-b-p-b-p"><span class="secno">9.2.8.2 </span>Misnested tags: <b><p></b></p></a></li> + <li><a href="#unexpected-markup-in-tables"><span class="secno">9.2.8.3 </span>Unexpected markup in tables</a></ol></ol></li> <li><a href="#namespaces"><span class="secno">9.3 </span>Namespaces</a></li> <li><a href="#serializing-html-fragments"><span class="secno">9.4 </span>Serializing HTML fragments</a></li> <li><a href="#parsing-html-fragments"><span class="secno">9.5 </span>Parsing HTML fragments</a></li> @@ -52234,6 +52254,7 @@ pause flag</dfn>, which must be initially set to false.</p> + <h4 id="the-input-stream"><span class="secno">9.2.2 
</span>The <dfn>input stream</dfn></h4> <p>The stream of Unicode characters that comprises the input to the @@ -53057,8 +53078,13 @@ category, and scope markers. The scope markers are inserted when entering <code><a href="#the-applet-element">applet</a></code> elements, buttons, <code><a href="#the-object-element">object</a></code> elements, marquees, table cells, and table captions, and are used to - prevent formatting from "leaking" into <code><a href="#the-applet-element">applet</a></code> elements, - buttons, <code><a href="#the-object-element">object</a></code> elements, marquees, and tables.</p> + prevent formatting from "leaking" <em>into</em> <code><a href="#the-applet-element">applet</a></code> + elements, buttons, <code><a href="#the-object-element">object</a></code> elements, marquees, and + tables.</p> + + <p class="note">The scope markers are unrelated to the concept of an + element being <a href="#has-an-element-in-scope" title="has an element in scope">in + scope</a>.</p> <p>In addition, each element in the <a href="#list-of-active-formatting-elements">list of active formatting elements</a> is associated with the token for which it was @@ -54835,9 +54861,9 @@ must be inserted into the <i><a href="#foster-parent-element">foster parent element</a></i>, and the <a href="#current-table">current table</a> must be marked as <dfn id="tainted">tainted</dfn>. 
(Once the <a href="#current-table">current table</a> has been - <a href="#tainted">tainted</a>, whitespace characters are inserted into the - <i><a href="#foster-parent-element">foster parent element</a></i> instead of the <a href="#current-node">current - node</a>.)</p> + <a href="#tainted">tainted</a>, <a href="#space-character" title="space character">space + characters</a> are inserted into the <i><a href="#foster-parent-element">foster parent element</a></i> + instead of the <a href="#current-node">current node</a>.)</p> <p>The <dfn id="foster-parent-element">foster parent element</dfn> is the parent element of the last <code><a href="#the-table-element">table</a></code> element in the <a href="#stack-of-open-elements">stack of open @@ -58265,7 +58291,192 @@ - <h3 id="namespaces"><span class="secno">9.3 </span>Namespaces</h3> + <h4 id="an-introduction-to-error-handling-in-the-parser"><span class="secno">9.2.8 </span>An introduction to error handling in the parser</h4> + + <p><em>This section is non-normative.</em></p> + + <p>This section examines some erroneous markup and discusses how + the <a href="#html-parser">HTML parser</a> handles these cases.</p> + + + <h5 id="misnested-tags:-b-i-b-i"><span class="secno">9.2.8.1 </span>Misnested tags: <b><i></b></i></h5> + + <p><em>This section is non-normative.</em></p> + + <p>The most-often discussed example of erroneous markup is as + follows:</p> + + <pre><p>1<b>2<i>3</b>4</i>5</p></pre> + + <p>The parsing of this markup is straightforward up to the "3". 
At + this point, the DOM looks like this:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span><li class="t1"><code><a href="#the-i-element">i</a></code><ul><li class="t3"><code>#text</code>: <span title="">3</span></ul></ul></ul></ul></ul></ul><p>Here, the <a href="#stack-of-open-elements">stack of open elements</a> has five elements + on it: <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-p-element">p</a></code>, + <code><a href="#the-b-element">b</a></code>, and <code><a href="#the-i-element">i</a></code>. The <a href="#list-of-active-formatting-elements">list of active + formatting elements</a> just has two: <code><a href="#the-b-element">b</a></code> and + <code><a href="#the-i-element">i</a></code>. The <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>".</p> + + <p>Upon receiving the end tag token with the tag name "b", the "<a href="#adoptionAgency">adoption agency algorithm</a>" is + invoked. This is a simple case, in that the <var title="">formatting + element</var> is the <code><a href="#the-b-element">b</a></code> element, and there is no + <var title="">furthest block</var>. 
Thus, the <a href="#stack-of-open-elements">stack of open + elements</a> ends up with just three elements: <code><a href="#the-html-element">html</a></code>, + <code><a href="#the-body-element">body</a></code>, and <code><a href="#the-p-element">p</a></code>, while the <a href="#list-of-active-formatting-elements">list of + active formatting elements</a> has just one: <code><a href="#the-i-element">i</a></code>. The + DOM tree is unmodified at this point.</p> + + <p>The next token, a character ("4"), triggers the <a href="#reconstruct-the-active-formatting-elements" title="reconstruct the active formatting elements">reconstruction of + the active formatting elements</a>, in this case just the + <code><a href="#the-i-element">i</a></code> element. A new <code><a href="#the-i-element">i</a></code> element is thus created + for the "4" text node. After the end tag token for the "i" is also + received, and the "5" text node is inserted, the DOM looks as + follows:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span><li class="t1"><code><a href="#the-i-element">i</a></code><ul><li class="t3"><code>#text</code>: <span title="">3</span></ul></ul><li class="t1"><code><a href="#the-i-element">i</a></code><ul><li class="t3"><code>#text</code>: <span title="">4</span></ul><li class="t3"><code>#text</code>: <span title="">5</span></ul></ul></ul></ul><h5 id="misnested-tags:-b-p-b-p"><span class="secno">9.2.8.2 </span>Misnested tags: <b><p></b></p></h5> + + <p><em>This section is non-normative.</em></p> + + <p>A case similar to the previous one is the following:</p> + 
+ <pre><b>1<p>2</b>3</p></pre> + + <p>Up to the "2" the parsing here is straightforward:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul></ul></ul></ul><p>The interesting part is when the end tag token with the tag name + "b" is parsed.</p> + + <p>Before that token is seen, the <a href="#stack-of-open-elements">stack of open + elements</a> has four elements on it: <code><a href="#the-html-element">html</a></code>, + <code><a href="#the-body-element">body</a></code>, <code><a href="#the-b-element">b</a></code>, and <code><a href="#the-p-element">p</a></code>. The + <a href="#list-of-active-formatting-elements">list of active formatting elements</a> just has the one: + <code><a href="#the-b-element">b</a></code>. The <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>".</p> + + <p>Upon receiving the end tag token with the tag name "b", the "<a href="#adoptionAgency">adoption agency algorithm</a>" is invoked, as + in the previous example. However, in this case, there <em>is</em> a + <var title="">furthest block</var>, namely the <code><a href="#the-p-element">p</a></code> element. Thus, + this time the adoption agency algorithm isn't skipped over.</p> + + <p>The <var title="">common ancestor</var> is the <code><a href="#the-body-element">body</a></code> + element. 
A conceptual "bookmark" marks the position of the + <code><a href="#the-b-element">b</a></code> in the <a href="#list-of-active-formatting-elements">list of active formatting + elements</a>, but since that list has only one element in it, + it won't have much effect.</p> + + <p>As the algorithm progresses, <var title="">node</var> ends up set + to the formatting element (<code><a href="#the-b-element">b</a></code>), and <var title="">last + node</var> ends up set to the <var title="">furthest block</var> + (<code><a href="#the-p-element">p</a></code>).</p> + + <p>The <var title="">last node</var> gets appended (moved) to the + <var title="">common ancestor</var>, so that the DOM looks like:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul></ul></ul><p>A new <code><a href="#the-b-element">b</a></code> element is created, and the children of the + <code><a href="#the-p-element">p</a></code> element are moved to it:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code></ul></ul></ul><ul class="domTree"><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul><p>Finally, the new <code><a 
href="#the-b-element">b</a></code> element is appended to the + <code><a href="#the-p-element">p</a></code> element, so that the DOM looks like:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul></ul></ul></ul><p>The <code><a href="#the-b-element">b</a></code> element is removed from the <a href="#list-of-active-formatting-elements">list of + active formatting elements</a> and the <a href="#stack-of-open-elements">stack of open + elements</a>, so that when the "3" is parsed, it is appended to + the <code><a href="#the-p-element">p</a></code> element:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul><li class="t3"><code>#text</code>: <span title="">3</span></ul></ul></ul></ul><h5 id="unexpected-markup-in-tables"><span class="secno">9.2.8.3 </span>Unexpected markup in tables</h5> + + <p><em>This section is non-normative.</em></p> + + <p>Error handling in tables is, for historical reasons, especially + strange. 
For example, consider the following markup:</p> + + <pre><table><strong><b></strong><tr><td>aaa</td></tr><strong>bbb</strong></table>ccc</pre> + + <p>The highlighted <code><a href="#the-b-element">b</a></code> element start tag is not allowed + directly inside a table like that, and the parser handles this case + by placing the element <em>before</em> the table. (This is called <i title="foster parent"><a href="#foster-parent">foster parenting</a></i>.) This can be seen by + examining the DOM tree as it stands just after the + <code><a href="#the-table-element">table</a></code> element's start tag has been seen:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-table-element">table</a></code></ul></ul></ul><p>...and then immediately after the <code><a href="#the-b-element">b</a></code> element start + tag has been seen:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code></ul></ul></ul><p>At this point, the <a href="#stack-of-open-elements">stack of open elements</a> has on it + the elements <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, + <code><a href="#the-table-element">table</a></code>, and <code><a href="#the-b-element">b</a></code> (in that order, despite the + resulting DOM tree); the <a href="#list-of-active-formatting-elements">list of active formatting + elements</a> just has the <code><a href="#the-b-element">b</a></code> element in it; the + <a href="#insertion-mode">insertion mode</a> is "<a 
href="#parsing-main-intable" title="insertion mode: in + table">in table</a>"; and the <code><a href="#the-table-element">table</a></code> element is + <a href="#tainted">tainted</a>.</p> + + <p>The <code><a href="#the-tr-element">tr</a></code> start tag causes the <code><a href="#the-b-element">b</a></code> element + to be popped off the stack and a <code><a href="#the-tbody-element">tbody</a></code> start tag to be + implied; the <code><a href="#the-tbody-element">tbody</a></code> and <code><a href="#the-tr-element">tr</a></code> elements are + then handled in a rather straight-forward manner, taking the parser + through the "<a href="#parsing-main-intbody" title="insertion mode: in table body">in table + body</a>" and "<a href="#parsing-main-intr" title="insertion mode: in row">in + row</a>" insertion modes, after which the DOM looks as + follows:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code></ul></ul></ul></ul></ul><p>Here, the <a href="#stack-of-open-elements">stack of open elements</a> has on it the + elements <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-table-element">table</a></code>, + <code><a href="#the-tbody-element">tbody</a></code>, and <code><a href="#the-tr-element">tr</a></code>; the <a href="#list-of-active-formatting-elements">list of active + formatting elements</a> still has the <code><a href="#the-b-element">b</a></code> element in + it; the <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intr" title="insertion 
mode: + in row">in row</a>"; and the <code><a href="#the-table-element">table</a></code> element is still + <a href="#tainted">tainted</a>.</p> + + <p>The <code><a href="#the-td-element">td</a></code> element start tag token, after putting a + <code><a href="#the-td-element">td</a></code> element on the tree, puts a marker on the <a href="#list-of-active-formatting-elements">list + of active formatting elements</a> (it also switches to the "<a href="#parsing-main-intd" title="insertion mode: in cell">in cell</a>" <a href="#insertion-mode">insertion + mode</a>).</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code></ul></ul></ul></ul></ul></ul><p>The marker means that when the "aaa" character tokens are seen, + no <code><a href="#the-b-element">b</a></code> element is created to hold the resulting text + node:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code><ul><li class="t3"><code>#text</code>: <span title="">aaa</span></ul></ul></ul></ul></ul></ul></ul><p>The end tags are handled in a 
straight-forward manner; after + handling them, the <a href="#stack-of-open-elements">stack of open elements</a> has on it the + elements <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-table-element">table</a></code>, + and <code><a href="#the-tbody-element">tbody</a></code>; the <a href="#list-of-active-formatting-elements">list of active formatting + elements</a> still has the <code><a href="#the-b-element">b</a></code> element in it (the + marker having been removed by the "td" end tag token); the + <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intbody" title="insertion mode: in + table body">in table body</a>"; and the <code><a href="#the-table-element">table</a></code> + element is still <a href="#tainted">tainted</a>.</p> + + <p>Thus it is that the "bbb" character tokens are found. When <a href="#reconstruct-the-active-formatting-elements" title="reconstruct the active formatting elements">the active + formatting elements are reconstructed</a>, a <code><a href="#the-b-element">b</a></code> + element is created and <a href="#foster-parent" title="foster parent">foster + parented</a>, and then the "bbb" text node is appended to it:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">bbb</span></ul><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code><ul><li class="t3"><code>#text</code>: <span 
title="">aaa</span></ul></ul></ul></ul></ul></ul></ul><p>The <a href="#stack-of-open-elements">stack of open elements</a> has on it the elements + <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-table-element">table</a></code>, + <code><a href="#the-tbody-element">tbody</a></code>, and the new <code><a href="#the-b-element">b</a></code> (again, note that + this doesn't match the resulting tree!); the <a href="#list-of-active-formatting-elements">list of active + formatting elements</a> has the new <code><a href="#the-b-element">b</a></code> element in it; + the <a href="#insertion-mode">insertion mode</a> is still "<a href="#parsing-main-intbody" title="insertion + mode: in table body">in table body</a>"; and the + <code><a href="#the-table-element">table</a></code> element is still <a href="#tainted">tainted</a>.</p> + + <p>Had the character tokens been <a href="#space-character" title="space character">space + characters</a> instead of "bbb", the result would have been the + same, but only because the table is <a href="#tainted">tainted</a>. Had the + <code><a href="#the-b-element">b</a></code> element's start tag been before the + <code><a href="#the-table-element">table</a></code> instead of after, then the table wouldn't have + been <a href="#tainted">tainted</a> and such <a href="#space-character" title="space + character">space characters</a> would just be appended to the + <code><a href="#the-tbody-element">tbody</a></code> element.</p> + + <p>Finally, the <code><a href="#the-table-element">table</a></code> is closed by a "table" end + tag. 
This pops all the nodes from the <a href="#stack-of-open-elements">stack of open + elements</a> up to and including the <code><a href="#the-table-element">table</a></code> element, + but it doesn't affect the <a href="#list-of-active-formatting-elements">list of active formatting + elements</a>, so the "ccc" character tokens after the table + result in yet another <code><a href="#the-b-element">b</a></code> element being created, this + time after the table:</p> + + <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">bbb</span></ul><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code><ul><li class="t3"><code>#text</code>: <span title="">aaa</span></ul></ul></ul></ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">ccc</span></ul></ul></ul></ul><h3 id="namespaces"><span class="secno">9.3 </span>Namespaces</h3> <p>The <dfn id="html-namespace-0">HTML namespace</dfn> is: <code>http://www.w3.org/1999/xhtml</code></p>
Received on Tuesday, 23 June 2009 01:34:50 UTC
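
The first walkthrough in the new section 9.2.8.1 (the markup <p>1<b>2<i>3</b>4</i>5</p>) can be replayed with a small Python sketch. This is an illustration only, not the spec's algorithm: it assumes a pre-tokenized input, omits the html and head elements, treats only b and i as formatting elements, and handles just the "no furthest block" branch of the adoption agency algorithm. The names Node, build, FORMATTING and TOKENS are hypothetical and do not come from the spec.

```python
FORMATTING = {"b", "i"}  # assumption: only these count as formatting elements


class Node:
    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.children = []

    def dump(self):
        """Serialize the subtree so the resulting DOM shape is easy to read."""
        if self.name == "#text":
            return self.text
        inner = "".join(child.dump() for child in self.children)
        return f"<{self.name}>{inner}</{self.name}>"


def build(tokens):
    body = Node("body")
    stack = [body]  # stack of open elements (html and head omitted)
    active = []     # list of active formatting elements

    def insert_element(name):
        node = Node(name)
        stack[-1].children.append(node)
        stack.append(node)
        return node

    def reconstruct():
        # Re-create each active formatting element that is no longer open,
        # replacing its entry in the list with the new element.
        for index, entry in enumerate(active):
            if entry not in stack:
                active[index] = insert_element(entry.name)

    for kind, value in tokens:
        if kind == "text":
            reconstruct()
            stack[-1].children.append(Node("#text", value))
        elif kind == "start":
            reconstruct()
            node = insert_element(value)
            if value in FORMATTING:
                active.append(node)
        elif kind == "end" and value in FORMATTING:
            # Simplified branch with no furthest block: pop the stack up to
            # and including the formatting element, then drop it from the list.
            target = next((n for n in reversed(active) if n.name == value), None)
            if target is None or target not in stack:
                continue
            while stack.pop() is not target:
                pass
            active.remove(target)
        elif kind == "end":
            # Ordinary end tag: pop up to and including the matching element.
            if value in [n.name for n in stack]:
                while stack.pop().name != value:
                    pass
    return body


TOKENS = [
    ("start", "p"), ("text", "1"), ("start", "b"), ("text", "2"),
    ("start", "i"), ("text", "3"), ("end", "b"), ("text", "4"),
    ("end", "i"), ("text", "5"), ("end", "p"),
]

print(build(TOKENS).dump())
```

In this sketch the "4" ends up in a second i element that is a sibling of the b element, and the "5" is a direct child of the p element, matching the final tree shown in 9.2.8.1. The furthest-block case from 9.2.8.2 and foster parenting from 9.2.8.3 are deliberately left out.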