[whatwg] Bug in 12.2.5.4.8 (The "text" insertion mode) when invoking the "spin the event loop" algorithm

12.2.5.4.8 (The "text" insertion mode) defines the following algorithm for
dealing with inline <script> tags that aren't ready to execute when parsed.
I believe there are some subtle bugs in the way the algorithm is
specified.  More importantly, the invocation of the "spin the event loop"
algorithm makes it harder to reason about the system as a whole.  The
algorithm in question runs when parsing a </script> at a script nesting
level of zero (i.e. not one generated by document.write()):

1.) Let *the script* be the pending parsing-blocking
script<http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#pending-parsing-blocking-script>.
There is no longer a pending parsing-blocking
script<http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#pending-parsing-blocking-script>.

2.) Block the tokenizer<http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenization> for
this instance of the HTML
parser<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#html-parser>,
such that the event
loop<http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#event-loop> will
not run
tasks<http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#concept-task> that
invoke the
tokenizer<http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenization>.

3.) If the parser's
Document<http://www.whatwg.org/specs/web-apps/current-work/multipage/dom.html#document>
has
a style sheet that is blocking
scripts<http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#has-a-style-sheet-that-is-blocking-scripts> or
*the script*'s "ready to be
parser-executed"<http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#ready-to-be-parser-executed> flag
is not set: spin
the event loop<http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#spin-the-event-loop> until
the parser's
Document<http://www.whatwg.org/specs/web-apps/current-work/multipage/dom.html#document>
has
no style sheet that is blocking
scripts<http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#has-no-style-sheet-that-is-blocking-scripts> and
*the script*'s "ready to be
parser-executed"<http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#ready-to-be-parser-executed> flag
is set.

4.) Unblock the
tokenizer<http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenization> for
this instance of the HTML
parser<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#html-parser>,
such that tasks<http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#concept-task> that
invoke the
tokenizer<http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenization> can
again be run.

5.) Let the insertion
point<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#insertion-point> be
just before the next
input character<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#next-input-character>.

6.) Increment the parser's script nesting
level<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#script-nesting-level> by
one (it should be zero before this step, so this sets it to one).

7.) Execute<http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#execute-the-script-block>
*the script*.

...


Step 3 spins the event loop.  The issue is that while the tokenizer is
blocked, other tasks can run whenever the event loop is spun and can make
changes that invalidate the rest of the algorithm.  For example,
consider:

<!DOCTYPE html>
<script>
window.setTimeout(function() {
document.write("Goodbye");
}, 50);
</script>
<link rel="stylesheet" type="text/css" href="long_loading.css"></link>
<script>
window.alert("Hello");
</script>

The algorithm in question will run when parsing the last </script>.  The
second script can't execute until the stylesheet loads, so the spec spins
the event loop until that happens.  However, if the setTimeout fires before
long_loading.css loads, then the document.write() call will first perform an
implicit document.open(), since there is no insertion point, and blow away
the entire Document.  This cancels any pending tasks but doesn't (as far as
I can tell) cancel already started tasks.  By my reading of the spec, the
rest of the steps of the algorithm should still run and the script should
execute.  However, in every browser I can test (Chrome Canary* / Firefox 22
/ IE10), the alert never fires.  What the Blink code actually does is simply
suspend the tokenizer and then return control to the underlying (non-nested,
usually) event loop.
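
To make the hazard concrete, here is a minimal sketch of the scenario above
as a toy event loop in JavaScript.  It is not spec text or browser code:
runSpinModel, spinEventLoopUntil, handleScriptEndTag and the doc object are
all made-up names, and the model assumes that reopening the Document also
means no style sheet is blocking scripts any more.

```javascript
// Toy model: every name here is hypothetical, not a spec or browser API.
function runSpinModel() {
  const log = [];
  const taskQueue = [];
  const doc = { destroyed: false, styleSheetBlockingScripts: true };

  // Stand-in for "spin the event loop until condition() holds":
  // other tasks run nested inside the task that is still parsing.
  function spinEventLoopUntil(condition) {
    while (!condition() && taskQueue.length > 0) {
      const task = taskQueue.shift();
      task();
    }
  }

  // Steps 2 to 7 of the quoted algorithm, reduced to what matters here.
  function handleScriptEndTag() {
    log.push("tokenizer blocked");
    spinEventLoopUntil(() => !doc.styleSheetBlockingScripts);
    log.push("tokenizer unblocked");
    // Nothing in the quoted steps re-checks the Document at this point.
    log.push(doc.destroyed
      ? "script executed after the Document was destroyed"
      : "script executed");
  }

  // The 50 ms timer fires while long_loading.css is still loading.
  taskQueue.push(() => {
    doc.destroyed = true;
    // Assumption: once the Document is reopened, no style sheet blocks scripts.
    doc.styleSheetBlockingScripts = false;
    log.push("timer: document.write() reopened the Document");
  });

  handleScriptEndTag();
  return log;
}

console.log(runSpinModel().join("\n"));
```

In this model the timer task runs in the middle of handleScriptEndTag, and
the steps after the spin carry on as if nothing had changed, which is the
behavior I read out of the spec but not what the browsers do.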

I propose that, instead of spinning the event loop, step 3 enter an
asynchronous section if the script isn't ready to run yet, and that this
section queue a task once the script is ready to run.  Since this algorithm
only runs at a script nesting level of zero, this is a fairly minor tweak in
overall behavior, but I believe it means that invoking the tokenizer can
never spin the event loop, which is a nice property to have.
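
A rough sketch of the shape I have in mind, again as a toy JavaScript model
rather than proposed spec text.  All names (runQueuedTaskModel,
resumeAndExecute, the parser object) are hypothetical, and the model assumes
that reopening the Document aborts the parser, so that a resume task queued
afterwards finds nothing to resume.

```javascript
// Toy model: every name here is hypothetical, not a spec or browser API.
function runQueuedTaskModel(timerFiresBeforeLoad) {
  const log = [];
  const taskQueue = [];
  const parser = { aborted: false, tokenizerBlocked: false };

  // Queued as its own task once the script is ready to run.
  function resumeAndExecute() {
    if (parser.aborted) {
      // Assumption: reopening the Document aborted this parser.
      log.push("resume skipped: parser was aborted");
      return;
    }
    parser.tokenizerBlocked = false;
    log.push("script executed");
  }

  // Replacement for step 3: block the tokenizer and return, no nested loop.
  function handleScriptEndTag() {
    parser.tokenizerBlocked = true;
    log.push("tokenizer blocked, returning to the event loop");
  }

  const timerTask = () => {
    parser.aborted = true;
    log.push("timer: document.write() reopened the Document");
  };
  const styleSheetLoadedTask = () => {
    log.push("style sheet loaded");
    taskQueue.push(resumeAndExecute);
  };

  taskQueue.push(handleScriptEndTag);
  if (timerFiresBeforeLoad) {
    taskQueue.push(timerTask);
  }
  // When the flag is false the timer is simply left out of the model.
  taskQueue.push(styleSheetLoadedTask);

  // A single, non-nested event loop: each task runs to completion.
  while (taskQueue.length > 0) {
    const task = taskQueue.shift();
    task();
  }
  return log;
}

console.log(runQueuedTaskModel(true).join("\n"));
```

With runQueuedTaskModel(true) the script never executes, which matches the
"alert never fires" behavior; with runQueuedTaskModel(false) the last log
entry is "script executed".  Either way, no task ever runs inside another.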

- James


* This testcase actually crashes Chrome versions other than recent
canaries.  Oops.

Received on Friday, 27 September 2013 22:38:21 UTC