[whatwg] Script-related feedback from Ian Hickson on 2010-03-17 (public-whatwg-archive@w3.org from March 2010)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 17 Mar 2010 00:05:16 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1003162201460.13402@ps20323.dreamhostps.com>
On Tue, 3 Nov 2009, Brian Kuhn wrote:
>
> In section 
> http://www.whatwg.org/specs/web-apps/current-work/#attr-script-async, it 
> says:
> 
> *Fetching an external script must delay the load event of the element's 
> document until the task that is queued by the networking task source 
> once the resource has been fetched (defined above) has been run.*
> 
> Has any thought been put into changing this for async scripts?  It seems 
> like it might be worthwhile to allow window.onload to fire while an 
> async script is still downloading if everything else is done.

On Fri, 6 Nov 2009, Brian Kuhn wrote:
> 
> It seems to me that the purpose of async scripts is to get out of the 
> way of user-visible functionality.  Many sites currently attach 
> user-visible functionality to window.onload, so it would be great if 
> async scripts at least had a way to not block that event.  It would help 
> minimize the effect that secondary functionality like ads and web 
> analytics have on the user experience.

On Wed, 10 Feb 2010, Jonas Sicking wrote:
> 
> I'm concerned that this is too big of a departure from how people are 
> used to <script>s behaving.
> 
> If we do want to do something like this, one possibility would be to 
> create a generic attribute that can go on things like <img>, <link 
> rel=stylesheet>, <script> etc that make the resource not block the 
> 'load' event.

On Thu, 11 Feb 2010, Steve Souders wrote:
>
> I just sent email last week proposing a POSTONLOAD attribute for 
> scripts.

On Thu, 11 Feb 2010, Jonas Sicking wrote:
>
> Though what we want here is a DONTDELAYLOAD attribute. I.e. we want
> load to start asap, but we don't want the load to hold up the load
> event if all other resources finish loading before this one.

On Fri, 12 Feb 2010, Brian Kuhn wrote:
>
> Right.  Async scripts aren't really asynchronous if they block all the 
> user-visible functionality that sites currently tie to window.onload.
> 
> I don't know if we need another attribute, or if we just need to change 
> the behavior for all async scripts.  But I think the best time to fix 
> this is now; before too many UAs implement async.

On Fri, 12 Feb 2010, Nicholas Zakas wrote:
>
> To me "asynchronous" fundamentally means "doesn't block other things 
> from happening," so if async currently does block the load event from 
> firing then that seems very wrong to me.

On Fri, 12 Feb 2010, Steve Souders wrote:
>
> ASYNC should not block the onload event. Thinking of the places where 
> ASYNC will be used, they would not want onload to be blocked.

On Sat, 13 Feb 2010, Darin Fisher wrote:
>
> I don't know... to me, "asynchronous" means completes later.  
> Precedent: XMLHttpRequest.

On Sat, 13 Feb 2010, Boris Zbarsky wrote:
>
> [...] my real worry about making any loads that don't block onload: 
> would web developers expect them to?

On Sat, 13 Feb 2010, Brian Kuhn wrote:
>
> FWIW, loading scripts asynchronously with the "Script DOM Element" 
> approach does not block window.onload in IE.  In Chrome and Safari, the 
> downloading blocks, but execution doesn't.  In Firefox and Opera, 
> downloading and execution blocks.
> 
> So, it's pretty hard to say what web developers would expect with async 
> scripts.  I know that they will like having things like ads and 
> analytics not block window.onload though.  At the very least, we need 
> the ability to make that happen.

On Sat, 13 Feb 2010, Jonas Sicking wrote:
> 
> Yeah, my big concern is "what do developers expect". Having an explicit 
> attribute for not blocking onload definitely follows the path of least 
> surprise. Though having an explicit attribute does give Steve more 
> things to evangelize, i.e. it'll probably lead to more pages firing 
> onload later than they could.

On Sat, 13 Feb 2010, Darin Fisher wrote:
>
> The thing is, almost all subresources load asynchronously.  The load 
> event exists to tell us when those asynchronous loads have finished.  
> So, I think it follows that an asynchronous resource load may reasonably 
> block the load event.  (That's the point of the load event after all!)

I've changed the spec to fire 'DOMContentLoaded' without waiting for the 
async scripts, so that if you need this you can just listen for that event 
instead of 'load'. 'load' still waits for all scripts. 'DOMContentLoaded' 
still waits for deferred scripts. As far as I can tell this handles all 
the above (still makes sense, still consistent with the way other 'load' 
events work, but still lets you do things without waiting).
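
As a rough illustration of the ordering described above, the following 
JavaScript sketch models when the two events could fire. It is a 
simplified model, not the spec's algorithm: the function name eventTimes, 
the script records, and the time units are all made up for the example, 
and it ignores script execution time and every other kind of subresource.

```javascript
// Hypothetical model: times are arbitrary units measured from navigation start.
// scripts: [{ kind: "defer" | "async", fetched: <time the fetch completes> }]
function eventTimes(parseDone, scripts) {
  const latest = (kind) =>
    scripts
      .filter((s) => s.kind === kind)
      .reduce((t, s) => Math.max(t, s.fetched), 0);

  // DOMContentLoaded waits for parsing and for deferred scripts,
  // but not for async scripts.
  const domContentLoaded = Math.max(parseDone, latest("defer"));

  // load still waits for every script, async ones included.
  const load = Math.max(domContentLoaded, latest("async"));

  return { domContentLoaded, load };
}

const times = eventTimes(100, [
  { kind: "defer", fetched: 80 },
  { kind: "async", fetched: 500 }, // e.g. a slow analytics script
]);
```

In this model a handler attached to 'DOMContentLoaded' can run at time 
100, while one attached to 'load' waits until time 500.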


On Wed, 30 Dec 2009, David Bruant wrote:
> 
> The "6.8.1 Client identification" starts with an explanation dealing 
> with browser-specific bugs and limitations ("browser-specific features" 
> are missing, aren't they?) that Web authors are forced to work around.

"Browser-specific features" should be feature-tested, not version-tested.


> A very interesting project dealing with these browser-specific
> implementations is TestSwarm : http://testswarm.com/
> 
> As you may notice, the web browsers are classified this way :
> 1) Operating system
> 2) Web Browser (the equivalent of the current "window . navigator . appName")
> 3) Version (the equivalent of the current "window . navigator . appVersion")
> 
> In my opinion, the TestSwarm approach is relevant because a user agent with
> the same "appName" and "appVersion" can have OS-specific bugs. However, in the
> NavigatorID interface, there is currently no way to detect the operating
> system. The current way to detect the operating system is to use the userAgent
> String. However, this can be freely overridden in some browsers. As a
> consequence, this string cannot be relied on at all for OS detection.
> 
> To allow detection of OS-specific bugs, a property could be added to the
> NavigatorID interface.
> window.navigator.operatingSystem? opSysName and opSysVersion? Name, SubName,
> Version, SubVersion? Something even more detailed?
> It would certainly also be the role of the spec to list the operating systems,
> specify the corresponding strings and/or give a rule for future and unknown
> versions and operating systems.

These strings could also be freely overridden in some browsers, so I don't 
see why it would be any more reliable.

I'd rather not add yet more features to this particular object. We should 
discourage this kind of thing, not encourage it.
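
As an illustration of feature-testing rather than reading strings off 
the navigator object, here is a minimal sketch. The function name 
supportsAsyncScripts and the two stand-in objects are hypothetical; the 
document is passed in as a parameter only so the check can be exercised 
outside a browser.

```javascript
// Feature test: does a <script> element expose an `async` property?
// `doc` is any Document-like object; in a page it would be `document`.
function supportsAsyncScripts(doc) {
  return "async" in doc.createElement("script");
}

// Hypothetical stand-ins for two browsers, used only to exercise the check.
const newerBrowser = {
  createElement: () => ({ src: "", defer: false, async: false }),
};
const olderBrowser = {
  createElement: () => ({ src: "", defer: false }),
};

const results = [
  supportsAsyncScripts(newerBrowser),
  supportsAsyncScripts(olderBrowser),
];
```

The check asks about the capability itself, so it keeps working no matter 
what the browser reports as its name, version, or operating system.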


On Wed, 20 Jan 2010, David Flanagan wrote:
>
> I'm trying to understand the async and defer attributes of the script 
> tag. Unfortunately, since script execution is so intimately tied up with 
> HTML parsing, section 4.3.1 is particularly hard to make sense of. I've 
> got 3 questions, and 3 suggested clarifications to the spec. Thanks to 
> anyone who can explain these!
> 
> First, my questions.  Are the following three statements correct?  (I'm 
> only concerned with <script> tags that actually appear in a document, 
> not those inserted or emitted (via document.write()) by another 
> script.):
> 
> 1) Scripts without async or defer attributes are executed in the order 
> in which they appear in the document.  They are executed synchronously, 
> which means that the parser must stop parsing the document while they 
> run.

Mostly. There are ways to make things happen out of apparent order, e.g. 
using document.write() and appendChild(). I'd rather not try to explain 
this in detail (in the spec, for authors) because it's likely to be wrong, 
and it's likely to cause implementors to ignore the real spec and instead 
read the author-facing description. We've already seen this happen 
multiple times for the green DOM intro boxes.


> 2) Scripts with the defer attribute, but without the async attribute are 
> executed in the order in which they appear in the document, but their 
> execution is deferred until the document has finished parsing. All these 
> scripts will execute before DOMContentLoaded and the load event are 
> fired.  A deferred script can assume that the entire DOM tree has been 
> constructed and is ready for manipulation--these scripts do not 
> generally need to register an onload event handler. A call to 
> document.write() within a deferred script will blow away the current 
> document and begin a new one.

Sounds right.


> 3) Scripts with the async attribute are executed as their script content 
> becomes available over the network, with no guarantee that they will be 
> executed in the order in which they appear in the document.  The only 
> guarantee is that these scripts will run before the DOMContentLoaded or 
> load events are fired. Document parsing may or may not have completed 
> when an async script is run, and a call to document.write() from an 
> async script will have unpredictable behavior. Though the order of 
> execution of async scripts is not predictable, the scripts will always 
> appear to run in some serial order without concurrent execution.

I just changed the DOMContentLoaded event to fire without waiting for the 
async scripts. Other than that it is correct.
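
The three cases can be summarized with a small model. This is a sketch 
under simplifying assumptions, not a restatement of the parsing 
algorithm: the function executionOrder and its record fields are invented 
here, and document.write() and script-inserted scripts are ignored.

```javascript
// scripts are listed in document order:
// { name, kind: "plain" | "defer" | "async", fetched: <time the fetch completes> }
function executionOrder(scripts) {
  const names = (list) => list.map((s) => s.name);
  const ofKind = (kind) => scripts.filter((s) => s.kind === kind);

  return {
    // Plain scripts run in document order and block the parser while they do.
    duringParsing: names(ofKind("plain")),
    // Deferred scripts run in document order once parsing has finished.
    afterParsing: names(ofKind("defer")),
    // Async scripts run as they arrive, regardless of document order.
    asFetched: names(ofKind("async").sort((a, b) => a.fetched - b.fetched)),
  };
}

const order = executionOrder([
  { name: "a.js", kind: "plain", fetched: 30 },
  { name: "b.js", kind: "async", fetched: 90 },
  { name: "c.js", kind: "defer", fetched: 20 },
  { name: "d.js", kind: "async", fetched: 40 },
  { name: "e.js", kind: "defer", fetched: 10 },
]);
```

The three lists are not ordered relative to one another: an async script 
may run at any point during or after parsing.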


> Next, I suggest that the following things in the spec be clarified:
> 
> 1) After describing the async and defer attributes, the spec promises: 
> "The exact processing details for these attributes are described below." 
> I take this to mean "below, somewhere in section 4.3".  In fact, 
> however, the exact processing details are scattered throughout the spec, 
> and understanding the attributes requires understanding section 9, I 
> think. It would be nice to note this.

Done.


> 2) The last sentence of this paragraph:
> 
> > The second is a flag indicating whether the element was 
> > "parser-inserted". Initially, script elements must have this flag 
> > unset. It is set by the HTML parser and is used to handle 
> > document.write() calls.
> 
> made me think that the "parser-inserted" flag would only be set to true 
> for scripts that were emitted through document.write() calls.  That is, 
> I thought that the parser-inserted flag would be set only in unusual 
> cases rather than in the most common case.  This section should explain 
> the meaning of the parser-inserted flag. Instead it describes one of the 
> purposes of the flag, but that purpose is different than the purpose for 
> which it is used in this section.

I've tried to make it less confusing.


> 3) The algorithm for "running a script" adds scripts to "the list of 
> scripts that will execute as soon as possible".  And 9.2.6 spins the 
> event loop until this list is empty.  But I don't see anything in the 
> spec that removes items from this list.  That seems like an error in the 
> spec, not just a confusing bit.

Oops. Fixed.


> Furthermore, the fact that this mechanism is specified as a "list" 
> rather than as a "set" implies some kind of sequential execution of the 
> scripts.  But I don't think any sequence is meant here.

Fixed.
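
To illustrate the difference, here is a sketch of such a set in 
JavaScript. The names pendingAsync, scriptFetched, and readyToContinue are 
hypothetical and only stand in for the spec's prose.

```javascript
// Scripts that will execute as soon as possible: a set, since order is irrelevant.
const pendingAsync = new Set(["ads.js", "analytics.js", "widget.js"]);
const executed = [];

// Called when a script's fetch completes, in whatever order the network delivers.
function scriptFetched(name) {
  // Removing the entry is the step that was missing; without it the set
  // would never become empty.
  if (pendingAsync.delete(name)) {
    executed.push(name); // stands in for actually running the script
  }
}

// The condition that spinning the event loop waits for.
const readyToContinue = () => pendingAsync.size === 0;

scriptFetched("widget.js");
scriptFetched("ads.js");
const before = readyToContinue();
scriptFetched("analytics.js");
const after = readyToContinue();
```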


On Mon, 8 Feb 2010, Steve Souders wrote:
>
> I have some comments and questions about the ASYNC and DEFER attributes 
> of the SCRIPT tag based on reading this document: 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html 
> <http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#script>
> 
> 1. "If neither attribute is present, then the script is fetched and executed
> immediately, before the user agent continues parsing the page."
>     Thankfully, newer browsers are downloading scripts in parallel with other
> resources. I presume the way this is done is they launch a request for a
> script and continue to do "speculative" parsing looking for other resources
> (images, stylesheets, other scripts, etc.) and launch those requests. But this
> nice feature seems to be in conflict with the above text because the browser
> continues parsing (albeit speculatively) before the script is executed. It
> would be good to mention this optional behavior here, something along the
> lines of browsers may want to do speculative parsing, but shouldn't create DOM
> elements, etc. - only kickoff HTTP requests.

The spec says:

# Conformance requirements phrased as algorithms or specific steps may be 
# implemented in any manner, so long as the end result is equivalent.

# For performance reasons, user agents may start fetching the script as 
# soon as the attribute is set, instead, in the hope that the element will 
# be inserted into the document. Either way, once the element is inserted 
# into the document, the load must have started. If the UA performs such 
# prefetching, but the element is never inserted in the document, or the 
# src attribute is dynamically changed, then the user agent will not 
# execute the script, and the fetching process will have been effectively 
# wasted.

...which is intended to allow this. I suppose I could change the last 
paragraph quoted above to allow downloads even before elements are 
created, if you think that would be better.
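
For illustration, a speculative scan of the kind discussed might look 
roughly like this. It is a toy: speculativeScan is a made-up name, and a 
regular expression is used only for brevity, whereas a real browser would 
run an actual tokenizer. The point is only that the scan collects URLs to 
fetch and creates no DOM nodes.

```javascript
// Collect src URLs of <script> and <img> tags from not-yet-parsed markup.
// Nothing is executed and no elements are created; the caller would only
// start HTTP requests for the returned URLs.
function speculativeScan(html) {
  const urls = [];
  const tag = /<(?:script|img)\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = tag.exec(html)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

const found = speculativeScan(
  '<script src="a.js"></script><p>text</p><img alt="" src="b.png">'
);
```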


> 2. "If one or both of the defer and async attributes are specified, the src
> attribute must also be specified."
>     It should be possible to specify DEFER without a SRC. The use case is a
> page that has a sequence of SCRIPTs (with and without a SRC attribute) all of
> which need to execute in order, but should do so without blocking the parser.
> This happens a lot with ads, widgets, and analytics. A workaround is to use
> callbacks to daisy-chain the calling sequence, but the complexity will lead
> most 3rd party snippet providers to default to a normal SCRIPT tag (without
> DEFER or ASYNC) resulting in blocking the parser and slow pages. It's
> especially annoying for web site owners to have 3rd party content slowing down
> their pages and blocking the content they've created.
>     This appears to be a recent change perhaps prompted by Jonas Sicking's
> comments that Mozilla found many web sites that specified DEFER without a SRC
> and then called document.write (which pretty clearly indicates the developer
> didn't mean to specify DEFER). If that's the motivation for this restriction,
> we need to either find an alternative syntax or go ahead and allow DEFER
> without SRC. Finding an alternative is the worse alternative (DEFER has the
> exact behavior we want, so creating something with a different name that
> behaves just like DEFER is confusing). If we do move forward with allowing
> DEFER without SRC, then we need to specify what happens if it contains
> document.write so that the entire document isn't overwritten. (I believe this
> is addressed in section 3.5.) There's no good way to make DEFER do what it
> should and have those pages that are using DEFER incorrectly work the way they
> do now. With this path, at least those pages will have their content appear at
> the bottom and not wipe out the entire page.

The reason for the spec being the way it is is indeed as you describe 
(well, IIRC it has more to do with innerHTML than document.write(), but 
it's similar). However, the problem is non-trivial. Getting compatibility 
with legacy content while supporting defer="" on <script> without src="" 
appears to me to be hugely complicated.

It seems pretty easy to work around this limitation by just having 
callbacks in the code, though, so I don't really see this as a huge 
problem.
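
A minimal sketch of that workaround, using nothing beyond plain 
callbacks. runInOrder and externalScript are hypothetical helper names, 
and error handling is omitted.

```javascript
// Run steps one after another; each step receives a `next` callback and
// calls it when it has finished.
function runInOrder(steps, done) {
  let i = 0;
  function next() {
    if (i === steps.length) {
      if (done) done();
      return;
    }
    steps[i++](next);
  }
  next();
}

// Browser-only step: load an external script without blocking the parser,
// then continue once it has executed.
function externalScript(url) {
  return (next) => {
    const s = document.createElement("script");
    s.src = url;
    s.onload = () => next();
    document.getElementsByTagName("head")[0].appendChild(s);
  };
}

// Inline steps stand in for <script> blocks without a src attribute.
const log = [];
runInOrder(
  [
    (next) => { log.push("first"); next(); },
    (next) => { log.push("second"); next(); },
  ],
  () => log.push("done")
);
```

In a page, the list of steps could mix externalScript(...) entries with 
inline functions, and they would still run in the order given.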


> 3. "[the 'parser-inserted' state] is set by the HTML parser and is used to
> handle document.write() calls."
>     In what way is this used to handle document.write() calls? Is it for
> handling additional SCRIPTs added via document.write, or to make
> document.write itself have different behavior? The answer should be added to
> the spec somewhere. I searched for .write in this document and didn't find an
> explanation.

I removed that comment. To answer your question, though, it's used to 
ensure that scripts run in the right order, which affects the insertion 
point, which in turn affects the way document.write() works.


> 4. "If the element has a src attribute, [snip] the specified resource must
> then be fetched, from the origin of the element's Document."
>     If the script has DEFER, the request should not start until after parsing
> is finished. Starting it earlier could block other (non-deferred) requests due
> to a connection limit or limited bandwidth.

Browsers can prioritise when scripts are loaded, but I think it would be a 
mistake to disallow browsers from fetching scripts earlier than load. (To 
start with, I think they'd ignore the requirement.)


> 5. I don't see any rules for the order of executing scripts added to the 
> "list of scripts that will execute when the document has finished 
> parsing" and the "list of scripts that will execute as soon as 
> possible". DEFER scripts should execute in the order they appear in the 
> list. ASYNC scripts should be executed as soon as the response is 
> received.

The rules are implicit in how the lists (the list and the set, now) are 
processed.


On Wed, 10 Feb 2010, Steve Souders wrote:
>
> In the current text, it says "must then be fetched". In my suggestion I 
> say "should not start until after parsing". Saying "should" instead of 
> "must" leaves the opening for browsers that feel they can fetch 
> immediately without negatively impacting performance.

"Must be fetched" means it runs the "Fetch" algorithm, which explicitly 
says that the download happens "at a time convenient to the user and the 
user agent".


On Wed, 10 Feb 2010, Jonas Sicking wrote:
> 
> Instead, if the use cases are strong enough, I think we need to 
> introduce another mechanism for delaying a <script> to get loaded until 
> after the 'load' event has fired. I think it's an interesting idea to 
> add a 'postonload' attribute to all resources, such as <script>, <img> 
> and <link rel=stylesheet> (though maybe the name could be better).

On Mon, 8 Feb 2010, Steve Souders wrote:
>
> I'd like to propose the addition of a POSTONLOAD attribute to the SCRIPT 
> tag.
> 
> The behavior would be similar to DEFER, but instead of delaying 
> downloads until after parsing they would be delayed until after the 
> window's load event. Similar to DEFER, this new attribute would ensure 
> scripts were executed in the order they appear in the document, although 
> it could be combined with ASYNC to have them execute as soon as the 
> response is received.

This idea is interesting, but I think it's better for us to wait until 
we've seen what browsers do with async="" before adding yet another 
feature to <script>. If we add too much at once, browsers will have no 
hope of implementing it all correctly. :-)

Given that it is possible to do this from script, how common is it for 
people to do it from script? If it's very common, that would be a good 
data point encouraging us to do this sooner rather than later.
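
For reference, doing this from script looks roughly like the following. 
loadAfterOnload is a hypothetical name, and the window object is passed 
in as a parameter only so that the sketch does not depend on a global.

```javascript
// Start fetching `url` only once the window's load event has fired.
function loadAfterOnload(win, url) {
  function inject() {
    const s = win.document.createElement("script");
    s.src = url;
    win.document.body.appendChild(s);
  }
  if (win.document.readyState === "complete") {
    inject(); // the document has already finished loading
  } else {
    win.addEventListener("load", inject, false);
  }
}
```

In a page this would be called as loadAfterOnload(window, url), with the 
URL of whatever secondary script should stay out of the way of 'load'.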


On Thu, 11 Feb 2010, Mathias Schäfer wrote:
> 
> In a JavaScript tutorial, I wanted to explain what DOMContentLoaded 
> actually does. But the tests I made revealed that there isn't a 
> consistent behavior across browsers with regard to stylesheets. In fact, 
> it's a total mess. These are the results of my tests:
> 
> http://molily.de/weblog/domcontentloaded
> 
> Please have a quick look at these findings (you can skip the 
> introduction part). My questions are:
> 
> 1. Am I right that HTML5 will standardize Opera's pure DOMContentLoaded 
> model, never waiting for stylesheets? My assumption is that this will 
> break compatibility with the current Gecko and Webkit implementations.

Scripts can block the whole parser until style sheets have loaded, which 
implicitly means the script will wait too, but other than that, yes, the 
spec doesn't wait for style sheets for DOMContentLoaded.


> 2. Does the HTML5 parser specify that external stylesheets defer 
> external script execution? As far as I understand the specs, it doesn't.

It does, unless I made a mistake.


> In Gecko and IE, the loading of stylesheets also defers the execution of 
> subsequent *inline* scripts. I haven't found a rule for that in the 
> HTML5 parsing algorithm either. Does it conform to the specs, is it 
> against the rules or a legitimate extension which is not covered by 
> HTML5?

The spec does require that. Search for "a style sheet blocking scripts".


On Wed, 10 Feb 2010, Boris Zbarsky wrote:
> 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/scripting-1.html#running-a-script 
> step 8 the cases that talk about "a style sheet blocking scripts" 
> specify this.
> 
> I really wish those steps had individual IDs, and so did the cases 
> inside them.  It'd make it a lot easier to link to them!

Added. Let me know if you think anything else needs them. (I don't want to 
automatically add them to every paragraph because the resulting bloat is 
excessive.)


On Thu, 11 Feb 2010, Mathias Schäfer wrote:
> 
> The question is: Is a normal external script “parser-inserted” or not?
> I assume the flag to be false, since that’s the default value and I
> found “parser-inserted” to be true for XML parsing only
> (#parsing-xhtml-documents). Correct?

I've changed the comment near the definition of the term to be clearer. I 
hope that helps.


> Just to translate from HTML5 speak into my own words. I’ve got ...
> 
> <link rel="stylesheet" href="...">
> <script src="..."></script>
> 
> ... and I would like to step through the parsing algorithm. This is my
> understanding so far:
> 
> 1. Run the script (#parsing-main-incdata, case “An end tag whose tag
> name is "script"”)

You missed the <script> start tag processing, which sets the 
"parser-inserted" flag:

   http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#scriptTag


> That means, inline script execution should also wait for stylesheets to 
> load. Am I right in this reading?

Yes.
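
Put as a model, and only as a sketch: earliestRunTimes and its record 
format are invented for this example, and it treats every earlier style 
sheet as blocking, which is coarser than the spec's definition of "a style 
sheet blocking scripts".

```javascript
// items are in document order:
//   { type: "stylesheet", loaded: <time> }
//   { type: "script", name, fetched: <time> }   (inline scripts use fetched: 0)
function earliestRunTimes(items) {
  let sheetsLoaded = 0; // when all style sheets seen so far have loaded
  let previousRun = 0;  // parser-blocking scripts run in document order
  const runAt = {};
  for (const item of items) {
    if (item.type === "stylesheet") {
      sheetsLoaded = Math.max(sheetsLoaded, item.loaded);
    } else {
      previousRun = Math.max(previousRun, sheetsLoaded, item.fetched);
      runAt[item.name] = previousRun;
    }
  }
  return runAt;
}

const runAt = earliestRunTimes([
  { type: "script", name: "early-inline", fetched: 0 },
  { type: "stylesheet", loaded: 200 },
  { type: "script", name: "inline", fetched: 0 },
  { type: "script", name: "external.js", fetched: 150 },
]);
```

In this model the inline script that follows the style sheet cannot run 
before time 200, even though it needs no fetch of its own, while the 
inline script that precedes the style sheet is unaffected.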

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 16 March 2010 17:05:16 UTC