Re: Insertion point for script@onload from Adam Barth on 2010-06-22 (public-html@w3.org from June 2010)

From: Adam Barth <w3c@adambarth.com>
Date: Tue, 22 Jun 2010 09:52:14 -0700
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Eric Seidel <eric@webkit.org>, HTML WG <public-html@w3.org>
Message-ID: <AANLkTilfVBryq3zT1gKNzKUIZxts2g21UoWO8nNUJ1fr@mail.gmail.com>
On Tue, Jun 22, 2010 at 2:36 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
> "Adam Barth" <w3c@adambarth.com> wrote:
>> In double-checking our work with the HTML5 parser implementation in
>> Minefield, we noticed that Minefield handles this case slightly
>> differently (likely due to the spec ambiguity above).  In particular,
>> when the page calls document.write from the load event of a script
>> tag, Minefield seems to believe there is no current insertion point
>> and blows away the entire document:
>>
>> http://trac.webkit.org/export/LATEST/trunk/LayoutTests/fast/tokenizer/write-on-load.html
>>
>> Under the above interpretation of the spec, the load event shares the
>> same insertion point record as the external script itself, resulting
>> in the numerals in that test being printed in order from 1 to 7.  That
>> behavior appears to match the legacy WebKit and Firefox behavior and
>> is, therefore, likely compatible with the web.
>
> Is there evidence of sites calling document.write() from the load handler of a script so often that calling document.write() from a script load handler needs to work?

I don't have any such evidence at this time.

> The off-the-main-thread parsing implementation in Minefield makes it necessary to know in advance which points in the network stream are eligible for document.write(). Since establishing a point eligible for document.write() is somewhat complex, it is only done at </script> and only for scripts that don't have defer or async specified in the source.

The insertion point, in this case, is in fact such a point.  It's the
same insertion point that we use for the external script itself
(assuming the spec intends to create an insertion point for external
scripts).

> Furthermore, to be able to perform multiple DOM modifications in a script-unsafe batch, the HTML5 parser limits script execution of any kind to well-defined points (</script> mainly plus a couple of other cases that may go away as soon as other parts of Gecko are changed not to expect that behavior).

I didn't understand this statement.  Script can execute at all kinds
of crazy times (especially in light of DOM mutation events and
plug-ins).  I don't understand how you can limit script execution to
points in time when you have such well-defined insertion points.

> For these reasons, I've made the parser forbid document.write() from all event handlers.

This is unlikely to be correct.  For example, what if a script
executing as a result of a <script> element (either external or
inline) dispatches a synchronous event, such as a DOM mutation event
or directly via dispatchEvent?  Surely a document.write call in such
an event handler should respect the current insertion point.  That's
certainly what the spec says for inline scripts.
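
To make the scenario concrete, here is a toy model in JavaScript.  It
is only a sketch under stated assumptions: ToyParser, parse, write,
dispatch, and runScript are hypothetical names standing in for the
parser, document.write, and dispatchEvent, and the "no insertion point"
branch simply discards the document the way an implicit document.open()
would.  It is not how WebKit or Gecko is implemented.

```javascript
// Toy model with hypothetical names; not a real browser API.
class ToyParser {
  constructor() {
    this.output = [];             // text the "document" has received so far
    this.insertionPoint = false;  // defined only while a script is executing
    this.listeners = [];          // handlers for a synchronous event
  }

  // Stand-in for the parser consuming markup from the network stream.
  parse(text) {
    this.output.push(text);
  }

  // Stand-in for document.write().
  write(text) {
    if (!this.insertionPoint) {
      // Assumption: with no insertion point, the write acts like an
      // implicit document.open() and discards the existing document.
      this.output = [];
    }
    this.output.push(text);
  }

  // Stand-in for dispatchEvent(): listeners run synchronously,
  // inside whatever script called dispatch().
  dispatch() {
    for (const listener of this.listeners) listener(this);
  }

  // Stand-in for the parser executing a <script> element.
  runScript(body) {
    const saved = this.insertionPoint;
    this.insertionPoint = true;
    try {
      body(this);
    } finally {
      this.insertionPoint = saved;
    }
  }
}

const parser = new ToyParser();
parser.parse("before ");
parser.listeners.push((p) => p.write("from-handler "));
parser.runScript((p) => {
  p.write("from-script ");
  p.dispatch(); // the handler writes while the insertion point is defined
});
parser.parse("after");

const rendered = parser.output.join("");
console.log(rendered); // before from-script from-handler after
```

In this model the handler's write lands right after the script's own
write, because the handler runs while the script's insertion point is
still defined.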

> When I did this, the event handler I particularly wanted to prevent from writing was the SVG load event handler. I wasn't thinking of <script onload>. While <script onload> of parser-inserted scripts is guaranteed to fire when the parser is at the </script> safe point (*if* the event fires synchronously!),

The spec is explicit about when to fire script@onload synchronously.
The behavior in the spec seems to match previous versions of Firefox
and WebKit.

> I'd rather not punch a special hole for that event handler without a compelling use case or site compat requirement. (Also, it seems inconsistent to make load on <script> fire synchronously when load events in general are async.)

I guess I don't fully understand the implementation constraints you're
operating under.  There are two issues:

1) Should the script load event fire synchronously?
2) Should synchronous events that call document.write use the current
insertion point?

There are a number of benefits to firing the script load event synchronously:

A) The behavior matches all the shipping browsers I've tested.
B) Firing the script load event synchronously is more predictable for
developers and less likely to lead to race conditions.
C) There are unknown compatibility implications for changing when the
event is fired.

There are a number of benefits to having synchronous events use the
current insertion point:

a) The behavior matches all the shipping browsers I've tested.
b) Reusing the current insertion point is more predictable for
developers and less likely to lead to race conditions.
c) Reusing the current insertion point better matches the mental model
for how HTML documents are processed (basically, it abstracts away how
much of the document input stream is buffered in the network layer and
how much is buffered inside the tokenizer).
d) There are unknown compatibility implications for changing how
document.write in synchronous events behaves.

The only "con" I see to (1) and (2) is a new implementation constraint
in Gecko that I don't quite understand given that we're not creating
any new insertion points (these events are just using insertion points
that already exist for the <script> elements themselves).
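
As a rough illustration of how (1) and (2) interact, the toy model
below contrasts firing the load handler synchronously with queuing it
until after parsing.  Everything in it is an assumption made for
illustration: loadExternalScript, pendingTasks, and the doc object are
hypothetical, the numerals are not the contents of the write-on-load
test, and the "discard the document" branch only approximates an
implicit document.open().

```javascript
// Toy model with hypothetical names; not real browser behavior.
function loadExternalScript(fireLoadSynchronously) {
  const doc = { text: "1 ", insertionPointDefined: false };
  const pendingTasks = []; // stand-in for the event loop's task queue

  // Stand-in for document.write().
  const write = (s) => {
    if (!doc.insertionPointDefined) {
      doc.text = ""; // assumption: acts like an implicit document.open()
    }
    doc.text += s;
  };

  // Stand-in for the script element's load handler.
  const onload = () => write("3 ");

  // The parser reaches </script> for an external script.
  doc.insertionPointDefined = true;
  write("2 "); // the external script itself writes
  if (fireLoadSynchronously) {
    onload(); // shares the script's insertion point
  } else {
    pendingTasks.push(onload); // deferred until after parsing
  }
  doc.insertionPointDefined = false;

  doc.text += "4 "; // the parser consumes the rest of the input
  for (const task of pendingTasks) task();
  return doc.text;
}

const syncResult = loadExternalScript(true);
const asyncResult = loadExternalScript(false);
console.log(JSON.stringify(syncResult));  // "1 2 3 4 "
console.log(JSON.stringify(asyncResult)); // "3 "
```

In the synchronous case the handler reuses the insertion point that
already exists for the script; in the deferred case it finds none and
the earlier content is lost.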

Adam
Received on Tuesday, 22 June 2010 16:53:06 UTC