- From: poot <cvsmail@w3.org>
- Date: Tue, 23 Jun 2009 10:34:15 +0900 (JST)
- To: public-html-diffs@w3.org
Write some explanatory text around the HTML parser. (whatwg r3304)
http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.2449&r2=1.2450&f=h
http://html5.org/tools/web-apps-tracker?from=3303&to=3304
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.2449
retrieving revision 1.2450
diff -u -d -r1.2449 -r1.2450
--- Overview.html 17 Jun 2009 07:12:12 -0000 1.2449
+++ Overview.html 23 Jun 2009 01:33:49 -0000 1.2450
@@ -146,13 +146,28 @@
-webkit-column-width: 25em;
-webkit-column-gap: 1em;
}
+
+ ul.domTree, ul.domTree ul { padding: 0 0 0 1em; margin: 0; }
+ ul.domTree li { padding: 0; margin: 0; list-style: none; position: relative; }
+ ul.domTree li li { list-style: none; }
+ ul.domTree li:first-child::before { position: absolute; top: 0; height: 0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
+ ul.domTree li:not(:last-child)::after { position: absolute; top: 0; bottom: -0.6em; left: -0.75em; width: 0.5em; border-style: none none solid solid; content: ''; border-width: 0.1em; }
+ ul.domTree span { font-style: italic; font-family: serif; }
+ ul.domTree .t1 code { color: purple; font-weight: bold; }
+ ul.domTree .t2 { font-style: normal; font-family: monospace; }
+ ul.domTree .t2 .name { color: black; font-weight: bold; }
+ ul.domTree .t2 .value { color: blue; font-weight: normal; }
+ ul.domTree .t3 code, .domTree .t4 code, .domTree .t5 code { color: gray; }
+ ul.domTree .t7 code, .domTree .t8 code { color: green; }
+ ul.domTree .t10 code { color: teal; }
+
</style><link href="data:text/css," rel="stylesheet" title="Complete specification"><link href="data:text/css,.impl%20{%20display:%20none;%20}" rel="alternate stylesheet" title="Author documentation only"><link href="data:text/css,.impl%20{%20background:%20%23FFEEEE;%20}" rel="alternate stylesheet" title="Highlight implementation requirements"><link href="http://www.w3.org/StyleSheets/TR/W3C-ED" rel="stylesheet" type="text/css"><!-- ZZZ ED vs WD --><div class="head">
<p><a href="http://www.w3.org/"><img alt="W3C" height="48" src="http://www.w3.org/Icons/w3c_home" width="72"></a></p>
<h1>HTML 5</h1>
<h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
<!--ZZZ:-->
<!--<h2 class="no-num no-toc">W3C Working Draft 23 April 2009</h2>-->
- <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 17 June 2009</h2>
+ <h2 class="no-num no-toc" id="editor-s-draft-date-1-january-1970">Editor's Draft 23 June 2009</h2>
<!--:ZZZ-->
<dl><!-- ZZZ: update the month/day (twice), (un)comment out
<dt>This Version:</dt>
@@ -245,7 +260,7 @@
track.
<!--ZZZ:-->
<!--This specification is the 23 April 2009 Working Draft.-->
- This specification is the 17 June 2009 Editor's Draft.
+ This specification is the 23 June 2009 Editor's Draft.
<!--:ZZZ-->
</p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>This specification is also being produced by the <a href="http://www.whatwg.org/">WHATWG</a>. The two specifications are
identical from the table of contents onwards.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- context and rationale (required) --><p>This specification is intended to replace (be a new version of)
@@ -1058,7 +1073,12 @@
<li><a href="#the-after-after-body-insertion-mode"><span class="secno">9.2.5.24 </span>The "after after body" insertion mode</a></li>
<li><a href="#the-after-after-frameset-insertion-mode"><span class="secno">9.2.5.25 </span>The "after after frameset" insertion mode</a></ol></li>
<li><a href="#the-end"><span class="secno">9.2.6 </span>The end</a></li>
- <li><a href="#coercing-an-html-dom-into-an-infoset"><span class="secno">9.2.7 </span>Coercing an HTML DOM into an infoset</a></ol></li>
+ <li><a href="#coercing-an-html-dom-into-an-infoset"><span class="secno">9.2.7 </span>Coercing an HTML DOM into an infoset</a></li>
+ <li><a href="#an-introduction-to-error-handling-in-the-parser"><span class="secno">9.2.8 </span>An introduction to error handling in the parser</a>
+ <ol>
+ <li><a href="#misnested-tags:-b-i-b-i"><span class="secno">9.2.8.1 </span>Misnested tags: &lt;b&gt;&lt;i&gt;&lt;/b&gt;&lt;/i&gt;</a></li>
+ <li><a href="#misnested-tags:-b-p-b-p"><span class="secno">9.2.8.2 </span>Misnested tags: &lt;b&gt;&lt;p&gt;&lt;/b&gt;&lt;/p&gt;</a></li>
+ <li><a href="#unexpected-markup-in-tables"><span class="secno">9.2.8.3 </span>Unexpected markup in tables</a></ol></ol></li>
<li><a href="#namespaces"><span class="secno">9.3 </span>Namespaces</a></li>
<li><a href="#serializing-html-fragments"><span class="secno">9.4 </span>Serializing HTML fragments</a></li>
<li><a href="#parsing-html-fragments"><span class="secno">9.5 </span>Parsing HTML fragments</a></li>
@@ -52234,6 +52254,7 @@
pause flag</dfn>, which must be initially set to false.</p>
+
<h4 id="the-input-stream"><span class="secno">9.2.2 </span>The <dfn>input stream</dfn></h4>
<p>The stream of Unicode characters that comprises the input to the
@@ -53057,8 +53078,13 @@
category, and scope markers. The scope markers are inserted when
entering <code><a href="#the-applet-element">applet</a></code> elements, buttons, <code><a href="#the-object-element">object</a></code>
elements, marquees, table cells, and table captions, and are used to
- prevent formatting from "leaking" into <code><a href="#the-applet-element">applet</a></code> elements,
- buttons, <code><a href="#the-object-element">object</a></code> elements, marquees, and tables.</p>
+ prevent formatting from "leaking" <em>into</em> <code><a href="#the-applet-element">applet</a></code>
+ elements, buttons, <code><a href="#the-object-element">object</a></code> elements, marquees, and
+ tables.</p>
+
+ <p class="note">The scope markers are unrelated to the concept of an
+ element being <a href="#has-an-element-in-scope" title="has an element in scope">in
+ scope</a>.</p>
<p>In addition, each element in the <a href="#list-of-active-formatting-elements">list of active formatting
elements</a> is associated with the token for which it was
@@ -54835,9 +54861,9 @@
must be inserted into the <i><a href="#foster-parent-element">foster parent element</a></i>, and the
<a href="#current-table">current table</a> must be marked as
<dfn id="tainted">tainted</dfn>. (Once the <a href="#current-table">current table</a> has been
- <a href="#tainted">tainted</a>, whitespace characters are inserted into the
- <i><a href="#foster-parent-element">foster parent element</a></i> instead of the <a href="#current-node">current
- node</a>.)</p>
+ <a href="#tainted">tainted</a>, <a href="#space-character" title="space character">space
+ characters</a> are inserted into the <i><a href="#foster-parent-element">foster parent element</a></i>
+ instead of the <a href="#current-node">current node</a>.)</p>
<p>The <dfn id="foster-parent-element">foster parent element</dfn> is the parent element of the
last <code><a href="#the-table-element">table</a></code> element in the <a href="#stack-of-open-elements">stack of open
@@ -58265,7 +58291,192 @@
- <h3 id="namespaces"><span class="secno">9.3 </span>Namespaces</h3>
+ <h4 id="an-introduction-to-error-handling-in-the-parser"><span class="secno">9.2.8 </span>An introduction to error handling in the parser</h4>
+
+ <p><em>This section is non-normative.</em></p>
+
+ <p>This section examines some erroneous markup and discusses how
+ the <a href="#html-parser">HTML parser</a> handles these cases.</p>
+
+
+ <h5 id="misnested-tags:-b-i-b-i"><span class="secno">9.2.8.1 </span>Misnested tags: &lt;b&gt;&lt;i&gt;&lt;/b&gt;&lt;/i&gt;</h5>
+
+ <p><em>This section is non-normative.</em></p>
+
+ <p>The most-often discussed example of erroneous markup is as
+ follows:</p>
+
+ <pre>&lt;p&gt;1&lt;b&gt;2&lt;i&gt;3&lt;/b&gt;4&lt;/i&gt;5&lt;/p&gt;</pre>
+
+ <p>The parsing of this markup is straightforward up to the "3". At
+ this point, the DOM looks like this:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span><li class="t1"><code><a href="#the-i-element">i</a></code><ul><li class="t3"><code>#text</code>: <span title="">3</span></ul></ul></ul></ul></ul></ul><p>Here, the <a href="#stack-of-open-elements">stack of open elements</a> has five elements
+ on it: <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-p-element">p</a></code>,
+ <code><a href="#the-b-element">b</a></code>, and <code><a href="#the-i-element">i</a></code>. The <a href="#list-of-active-formatting-elements">list of active
+ formatting elements</a> just has two: <code><a href="#the-b-element">b</a></code> and
+ <code><a href="#the-i-element">i</a></code>. The <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>".</p>
+
+ <p>Upon receiving the end tag token with the tag name "b", the "<a href="#adoptionAgency">adoption agency algorithm</a>" is
+ invoked. This is a simple case, in that the <var title="">formatting
+ element</var> is the <code><a href="#the-b-element">b</a></code> element, and there is no
+ <var title="">furthest block</var>. Thus, the <a href="#stack-of-open-elements">stack of open
+ elements</a> ends up with just three elements: <code><a href="#the-html-element">html</a></code>,
+ <code><a href="#the-body-element">body</a></code>, and <code><a href="#the-p-element">p</a></code>, while the <a href="#list-of-active-formatting-elements">list of
+ active formatting elements</a> has just one: <code><a href="#the-i-element">i</a></code>. The
+ DOM tree is unmodified at this point.</p>
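
The bookkeeping in this simple case can be sketched in a few lines of Python. This is only a toy model under stated assumptions: elements are plain tag-name strings rather than DOM nodes, and the names BLOCK_LIKE, furthest_block and close_without_furthest_block are hypothetical helpers that do not appear in the specification. BLOCK_LIKE is an illustrative stand-in for "not in the phrasing or formatting categories".

```python
# Toy model: elements are tag-name strings, not real DOM nodes.
# BLOCK_LIKE is an illustrative assumption, not the spec's category list.
BLOCK_LIKE = {"p", "div", "ul", "li", "table"}


def furthest_block(open_elements, formatting_name):
    """Return the first block-like entry below the formatting element, or None."""
    start = len(open_elements) - 1 - open_elements[::-1].index(formatting_name)
    for name in open_elements[start + 1:]:
        if name in BLOCK_LIKE:
            return name
    return None


def close_without_furthest_block(open_elements, active_formatting, formatting_name):
    """Simple branch only: pop the stack through the formatting element and
    drop that element from the active formatting list. Returns new lists."""
    if furthest_block(open_elements, formatting_name) is not None:
        raise ValueError("this sketch only covers the no-furthest-block case")
    start = len(open_elements) - 1 - open_elements[::-1].index(formatting_name)
    new_stack = open_elements[:start]
    new_active = list(active_formatting)
    pos = len(new_active) - 1 - new_active[::-1].index(formatting_name)
    del new_active[pos]
    return new_stack, new_active


# State just before the "b" end tag in <p>1<b>2<i>3</b>4</i>5</p>.
stack = ["html", "body", "p", "b", "i"]
active = ["b", "i"]
stack, active = close_without_furthest_block(stack, active, "b")
print(stack)   # ['html', 'body', 'p']
print(active)  # ['i']
```

Note that the i element leaves the stack along with the b element but stays in the list of active formatting elements, which is what allows it to be reconstructed later.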
+
+ <p>The next token is a character ("4"), which triggers the <a href="#reconstruct-the-active-formatting-elements" title="reconstruct the active formatting elements">reconstruction of
+ the active formatting elements</a>, in this case just the
+ <code><a href="#the-i-element">i</a></code> element. A new <code><a href="#the-i-element">i</a></code> element is thus created
+ for the "4" text node. After the end tag token for the "i" is also
+ received, and the "5" text node is inserted, the DOM looks as
+ follows:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span><li class="t1"><code><a href="#the-i-element">i</a></code><ul><li class="t3"><code>#text</code>: <span title="">3</span></ul></ul><li class="t1"><code><a href="#the-i-element">i</a></code><ul><li class="t3"><code>#text</code>: <span title="">4</span></ul><li class="t3"><code>#text</code>: <span title="">5</span></ul></ul></ul></ul><h5 id="misnested-tags:-b-p-b-p"><span class="secno">9.2.8.2 </span>Misnested tags: &lt;b&gt;&lt;p&gt;&lt;/b&gt;&lt;/p&gt;</h5>
+
+ <p><em>This section is non-normative.</em></p>
+
+ <p>A case similar to the previous one is the following:</p>
+
+ <pre>&lt;b&gt;1&lt;p&gt;2&lt;/b&gt;3&lt;/p&gt;</pre>
+
+ <p>Up to the "2" the parsing here is straightforward:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul></ul></ul></ul><p>The interesting part is when the end tag token with the tag name
+ "b" is parsed.</p>
+
+ <p>Before that token is seen, the <a href="#stack-of-open-elements">stack of open
+ elements</a> has four elements on it: <code><a href="#the-html-element">html</a></code>,
+ <code><a href="#the-body-element">body</a></code>, <code><a href="#the-b-element">b</a></code>, and <code><a href="#the-p-element">p</a></code>. The
+ <a href="#list-of-active-formatting-elements">list of active formatting elements</a> just has the one:
+ <code><a href="#the-b-element">b</a></code>. The <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-inbody" title="insertion mode: in body">in body</a>".</p>
+
+ <p>Upon receiving the end tag token with the tag name "b", the "<a href="#adoptionAgency">adoption agency algorithm</a>" is invoked, as
+ in the previous example. However, in this case, there <em>is</em> a
+ <var title="">furthest block</var>, namely the <code><a href="#the-p-element">p</a></code> element. Thus,
+ this time the adoption agency algorithm isn't skipped over.</p>
+
+ <p>The <var title="">common ancestor</var> is the <code><a href="#the-body-element">body</a></code>
+ element. A conceptual "bookmark" marks the position of the
+ <code><a href="#the-b-element">b</a></code> in the <a href="#list-of-active-formatting-elements">list of active formatting
+ elements</a>, but since that list has only one element in it,
+ it won't have much effect.</p>
+
+ <p>As the algorithm progresses, <var title="">node</var> ends up set
+ to the formatting element (<code><a href="#the-b-element">b</a></code>), and <var title="">last
+ node</var> ends up set to the <var title="">furthest block</var>
+ (<code><a href="#the-p-element">p</a></code>).</p>
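
With node and last node settled, what remains in this example is a short sequence of DOM moves. The sketch below replays them on a minimal tree model. It is an illustration under assumptions, not the algorithm itself: the Node class and its outline() rendering are hypothetical helpers, and the loop over intermediate nodes that the full algorithm runs is omitted because this example has none.

```python
class Node:
    """Minimal stand-in for a DOM node; text nodes use the name "#text"."""

    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.parent = None
        self.children = []

    def append(self, child):
        """Append child, detaching it from its old parent first (a DOM-style move)."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
        return child

    def outline(self):
        """Compact one-line rendering, e.g. body(b(1) p(2))."""
        if self.name == "#text":
            return self.text
        if not self.children:
            return self.name
        return self.name + "(" + " ".join(c.outline() for c in self.children) + ")"


def run_example():
    # Tree for <b>1<p>2 just before the "b" end tag is processed.
    body = Node("body")
    b = body.append(Node("b"))
    b.append(Node("#text", "1"))
    p = b.append(Node("p"))
    p.append(Node("#text", "2"))
    steps = [body.outline()]

    body.append(p)                   # last node (p) moves to the common ancestor
    steps.append(body.outline())

    new_b = Node("b")                # fresh b element, not yet in the tree
    for child in list(p.children):   # copy the list: append() mutates p.children
        new_b.append(child)
    p.append(new_b)                  # the new b becomes the only child of p
    steps.append(body.outline())

    p.append(Node("#text", "3"))     # "3" arrives with p as the current node
    steps.append(body.outline())
    return steps


for line in run_example():
    print(line)
```

Each printed line is a snapshot of the body subtree after one move, in the same order in which this subsection walks through them.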
+
+ <p>The <var title="">last node</var> gets appended (moved) to the
+ <var title="">common ancestor</var>, so that the DOM looks like:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul></ul></ul><p>A new <code><a href="#the-b-element">b</a></code> element is created, and the children of the
+ <code><a href="#the-p-element">p</a></code> element are moved to it:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code></ul></ul></ul><ul class="domTree"><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul><p>Finally, the new <code><a href="#the-b-element">b</a></code> element is appended to the
+ <code><a href="#the-p-element">p</a></code> element, so that the DOM looks like:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul></ul></ul></ul></ul><p>The <code><a href="#the-b-element">b</a></code> element is removed from the <a href="#list-of-active-formatting-elements">list of
+ active formatting elements</a> and the <a href="#stack-of-open-elements">stack of open
+ elements</a>, so that when the "3" is parsed, it is appended to
+ the <code><a href="#the-p-element">p</a></code> element:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">1</span></ul><li class="t1"><code><a href="#the-p-element">p</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">2</span></ul><li class="t3"><code>#text</code>: <span title="">3</span></ul></ul></ul></ul><h5 id="unexpected-markup-in-tables"><span class="secno">9.2.8.3 </span>Unexpected markup in tables</h5>
+
+ <p><em>This section is non-normative.</em></p>
+
+ <p>Error handling in tables is, for historical reasons, especially
+ strange. For example, consider the following markup:</p>
+
+ <pre>&lt;table&gt;<strong>&lt;b&gt;</strong>&lt;tr&gt;&lt;td&gt;aaa&lt;/td&gt;&lt;/tr&gt;<strong>bbb</strong>&lt;/table&gt;ccc</pre>
+
+ <p>The highlighted <code><a href="#the-b-element">b</a></code> element start tag is not allowed
+ directly inside a table like that, and the parser handles this case
+ by placing the element <em>before</em> the table. (This is called <i title="foster parent"><a href="#foster-parent">foster parenting</a></i>.) This can be seen by
+ examining the DOM tree as it stands just after the
+ <code><a href="#the-table-element">table</a></code> element's start tag has been seen:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-table-element">table</a></code></ul></ul></ul><p>...and then immediately after the <code><a href="#the-b-element">b</a></code> element start
+ tag has been seen:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code></ul></ul></ul><p>At this point, the <a href="#stack-of-open-elements">stack of open elements</a> has on it
+ the elements <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>,
+ <code><a href="#the-table-element">table</a></code>, and <code><a href="#the-b-element">b</a></code> (in that order, despite the
+ resulting DOM tree); the <a href="#list-of-active-formatting-elements">list of active formatting
+ elements</a> just has the <code><a href="#the-b-element">b</a></code> element in it; the
+ <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intable" title="insertion mode: in
+ table">in table</a>"; and the <code><a href="#the-table-element">table</a></code> element is
+ <a href="#tainted">tainted</a>.</p>
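
The mismatch between the stack order and the tree order is easy to reproduce in a small model. The following is a rough sketch, assuming only the common case in which the last table on the stack has a parent element. The Node class and the foster_parent function are hypothetical names introduced here for illustration.

```python
class Node:
    """Minimal stand-in for a DOM element."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def insert_before(self, child, reference):
        child.parent = self
        self.children.insert(self.children.index(reference), child)
        return child


def foster_parent(open_elements, node):
    """Insert node just before the last table on the stack, inside that
    table's parent. Only the case where the table has a parent is covered."""
    table = next(e for e in reversed(open_elements) if e.name == "table")
    return table.parent.insert_before(node, table)


# Tree and parser state just after the table start tag.
html = Node("html")
html.append(Node("head"))
body = html.append(Node("body"))
table = body.append(Node("table"))

open_elements = [html, body, table]
active_formatting = []
tainted = False

# The b start tag arrives in the "in table" insertion mode.
b = foster_parent(open_elements, Node("b"))
open_elements.append(b)        # pushed after table, although it sits before it in the tree
active_formatting.append(b)
tainted = True

print([c.name for c in body.children])   # ['b', 'table']
print([e.name for e in open_elements])   # ['html', 'body', 'table', 'b']
```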
+
+ <p>The <code><a href="#the-tr-element">tr</a></code> start tag causes the <code><a href="#the-b-element">b</a></code> element
+ to be popped off the stack and a <code><a href="#the-tbody-element">tbody</a></code> start tag to be
+ implied; the <code><a href="#the-tbody-element">tbody</a></code> and <code><a href="#the-tr-element">tr</a></code> elements are
+ then handled in a rather straightforward manner, taking the parser
+ through the "<a href="#parsing-main-intbody" title="insertion mode: in table body">in table
+ body</a>" and "<a href="#parsing-main-intr" title="insertion mode: in row">in
+ row</a>" insertion modes, after which the DOM looks as
+ follows:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code></ul></ul></ul></ul></ul><p>Here, the <a href="#stack-of-open-elements">stack of open elements</a> has on it the
+ elements <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-table-element">table</a></code>,
+ <code><a href="#the-tbody-element">tbody</a></code>, and <code><a href="#the-tr-element">tr</a></code>; the <a href="#list-of-active-formatting-elements">list of active
+ formatting elements</a> still has the <code><a href="#the-b-element">b</a></code> element in
+ it; the <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intr" title="insertion mode:
+ in row">in row</a>"; and the <code><a href="#the-table-element">table</a></code> element is still
+ <a href="#tainted">tainted</a>.</p>
+
+ <p>The <code><a href="#the-td-element">td</a></code> element start tag token, after putting a
+ <code><a href="#the-td-element">td</a></code> element on the tree, puts a marker on the <a href="#list-of-active-formatting-elements">list
+ of active formatting elements</a> (it also switches to the "<a href="#parsing-main-intd" title="insertion mode: in cell">in cell</a>" <a href="#insertion-mode">insertion
+ mode</a>).</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code></ul></ul></ul></ul></ul></ul><p>The marker means that when the "aaa" character tokens are seen,
+ no <code><a href="#the-b-element">b</a></code> element is created to hold the resulting text
+ node:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code><ul><li class="t3"><code>#text</code>: <span title="">aaa</span></ul></ul></ul></ul></ul></ul></ul><p>The end tags are handled in a straightforward manner; after
+ handling them, the <a href="#stack-of-open-elements">stack of open elements</a> has on it the
+ elements <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-table-element">table</a></code>,
+ and <code><a href="#the-tbody-element">tbody</a></code>; the <a href="#list-of-active-formatting-elements">list of active formatting
+ elements</a> still has the <code><a href="#the-b-element">b</a></code> element in it (the
+ marker having been removed by the "td" end tag token); the
+ <a href="#insertion-mode">insertion mode</a> is "<a href="#parsing-main-intbody" title="insertion mode: in
+ table body">in table body</a>"; and the <code><a href="#the-table-element">table</a></code>
+ element is still <a href="#tainted">tainted</a>.</p>
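
The effect of the marker can be sketched as follows. This is a toy model under assumptions: entries are tag-name strings (real entries are element references), MARKER is a sentinel object, and entries_to_reconstruct and clear_to_last_marker are hypothetical helper names. The first function only reports which entries reconstruction would re-create; it does not build any elements.

```python
MARKER = object()  # sentinel standing in for a scope marker


def entries_to_reconstruct(active_formatting, open_elements):
    """Walk back from the end of the list until a marker or an entry that is
    still open; everything after that point would be re-created, oldest first."""
    pending = []
    for entry in reversed(active_formatting):
        if entry is MARKER or entry in open_elements:
            break
        pending.append(entry)
    pending.reverse()
    return pending


def clear_to_last_marker(active_formatting):
    """Pop entries up to and including the most recent marker."""
    while active_formatting:
        if active_formatting.pop() is MARKER:
            break


# State after the tr start tag: b is no longer open but is still active.
active = ["b"]
stack = ["html", "body", "table", "tbody", "tr"]

# td start tag: push the element and a marker.
stack.append("td")
active.append(MARKER)
in_cell = entries_to_reconstruct(active, stack)
print(in_cell)      # [] : the marker shields "aaa" from the b entry

# td and tr end tags: the marker is cleared and both elements are popped.
clear_to_last_marker(active)
stack.pop()
stack.pop()
after_row = entries_to_reconstruct(active, stack)
print(after_row)    # ['b'] : the next character token would re-create b
```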
+
+ <p>Thus it is that the "bbb" character tokens are found. When <a href="#reconstruct-the-active-formatting-elements" title="reconstruct the active formatting elements">the active
+ formatting elements are reconstructed</a>, a <code><a href="#the-b-element">b</a></code>
+ element is created and <a href="#foster-parent" title="foster parent">foster
+ parented</a>, and then the "bbb" text node is appended to it:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">bbb</span></ul><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code><ul><li class="t3"><code>#text</code>: <span title="">aaa</span></ul></ul></ul></ul></ul></ul></ul><p>The <a href="#stack-of-open-elements">stack of open elements</a> has on it the elements
+ <code><a href="#the-html-element">html</a></code>, <code><a href="#the-body-element">body</a></code>, <code><a href="#the-table-element">table</a></code>,
+ <code><a href="#the-tbody-element">tbody</a></code>, and the new <code><a href="#the-b-element">b</a></code> (again, note that
+ this doesn't match the resulting tree!); the <a href="#list-of-active-formatting-elements">list of active
+ formatting elements</a> has the new <code><a href="#the-b-element">b</a></code> element in it;
+ the <a href="#insertion-mode">insertion mode</a> is still "<a href="#parsing-main-intbody" title="insertion
+ mode: in table body">in table body</a>"; and the
+ <code><a href="#the-table-element">table</a></code> element is still <a href="#tainted">tainted</a>.</p>
+
+ <p>Had the character tokens been <a href="#space-character" title="space character">space
+ characters</a> instead of "bbb", the result would have been the
+ same, but only because the table is <a href="#tainted">tainted</a>. Had the
+ <code><a href="#the-b-element">b</a></code> element's start tag been before the
+ <code><a href="#the-table-element">table</a></code> element's start tag instead of after it, then the table wouldn't have
+ been <a href="#tainted">tainted</a> and such <a href="#space-character" title="space
+ character">space characters</a> would just be appended to the
+ <code><a href="#the-tbody-element">tbody</a></code> element.</p>
+
+ <p>Finally, the <code><a href="#the-table-element">table</a></code> is closed by a "table" end
+ tag. This pops all the nodes from the <a href="#stack-of-open-elements">stack of open
+ elements</a> up to and including the <code><a href="#the-table-element">table</a></code> element,
+ but it doesn't affect the <a href="#list-of-active-formatting-elements">list of active formatting
+ elements</a>, so the "ccc" character tokens after the table
+ result in yet another <code><a href="#the-b-element">b</a></code> element being created, this
+ time after the table:</p>
+
+ <ul class="domTree"><li class="t1"><code><a href="#the-html-element">html</a></code><ul><li class="t1"><code><a href="#the-head-element">head</a></code><li class="t1"><code><a href="#the-body-element">body</a></code><ul><li class="t1"><code><a href="#the-b-element">b</a></code><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">bbb</span></ul><li class="t1"><code><a href="#the-table-element">table</a></code><ul><li class="t1"><code><a href="#the-tbody-element">tbody</a></code><ul><li class="t1"><code><a href="#the-tr-element">tr</a></code><ul><li class="t1"><code><a href="#the-td-element">td</a></code><ul><li class="t3"><code>#text</code>: <span title="">aaa</span></ul></ul></ul></ul><li class="t1"><code><a href="#the-b-element">b</a></code><ul><li class="t3"><code>#text</code>: <span title="">ccc</span></ul></ul></ul></ul><h3 id="namespaces"><span class="secno">9.3 </span>Namespaces</h3>
<p>The <dfn id="html-namespace-0">HTML namespace</dfn> is: <code>http://www.w3.org/1999/xhtml</code></p>
Received on Tuesday, 23 June 2009 01:34:50 UTC