RE: publish Last Call Working Draft of Web Workers; deadline March 7 from Travis Leithead on 2011-03-15 (public-webapps@w3.org from January to March 2011)

From: Travis Leithead <Travis.Leithead@microsoft.com>
Date: Tue, 15 Mar 2011 02:34:06 +0000
To: Adrian Bateman <adrianba@microsoft.com>, "atwilson@google.com" <atwilson@google.com>
CC: Travis Leithead <Travis.Leithead@microsoft.com>, Arthur Barstow <art.barstow@nokia.com>, public-webapps <public-webapps@w3.org>
Message-ID: <9768D477C67135458BF978A45BCF9B3817412410@TK5EX14MBXW602.wingroup.windeploy.ntde>

Drew Wilson (atwilson@google.com) wrote:

> I think this alternate lifetime model is practically unimplementable in a world where 
> workers and pages live in multiple processes. The reason is that the linkage between 
> nodes in your graph depends on reachability of ports which can't really be established 
> simultaneously across processes - I think you end up with cycles in the graph that can't 
> easily be resolved.

It sounds like we agree that this alternate lifetime model (based on graph reachability) is not desirable and should not be specified.

--

> I guess I don't understand the implementability concerns that would lead to adopting #2. 
> Your example below seems largely identical to:
> 
> Document creates two workers, W1 and W2
> Document creates a MessageChannel and gives one port to each worker
> Document removes all references to the worker objects, so those two ports are the only 
> thing keeping W1 and W2 alive.
>
> I don't understand why we would treat this case differently than your case (where a third 
> worker actually performs the creation).

Yes, these are effectively the same, except that with "terminate" I explicitly instruct the worker to close, rather than relying on the GC.

> Let me provide a useful real-life example:
>
> 1) Document creates worker W1 and immediately drops its reference to it so
> it is no longer reachable. W1 is still protected because it has not run its
> initial script yet.
> 2) In its startup script, W1 creates two workers, W2 and W3, and immediately
> drops references to them. W1 is permissible, but no longer protected, and
> could be closed at any time.
> 3) W2 and W3 run their startup script to do some quick processing, and when
> complete they will upload the results to the server.
> 4) Since W1 is no longer protected, the system could close it, so let's say
> W1 gets closed now.
>
> With the spec as it currently stands, W2 and W3 would continue to run as
> long as the parent document stays alive, and they would complete their
> processing and submit the results to the server.
> With your suggested change, W1 would close, and that would cascade to W2 and
> W3, cancelling their operation. So developers would be forced to find a way
> to keep W1 alive for the duration of the execution of W2 and W3 to prevent
> them from being prematurely closed, either by timers, or by holding
> references to resources (like protected workers W2 and W3) that they don't
> need.
>
> So I'm not certain that the lifetime specification you describe has the
> desired behavior in the case of fire-and-forget workers (workers that don't
> need to interact with their parents).

Correct. The model I describe would terminate the nested workers W2 and W3 when W1 was closed.

Approach #2 could be altered to accommodate this scenario. Rather than running the "terminate a worker" algorithm, which would effectively kill the operation in progress in W2 and W3, the closing worker would simply set the closing flag on all of its owned workers to true. Per the spec, this would let those workers finish any tasks they had already queued, but would not allow any new work to be added; they would terminate after that.
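
To make the difference concrete, here is a rough TypeScript sketch of the two cascade behaviors as I describe them above. It is a toy model and not spec text: SimWorker, spawn, enqueue, terminateCascade, closeCascade, and runPending are all hypothetical names, and the recursion into grandchildren is my assumption.

```typescript
// Toy model of the two cascade options; every name here is hypothetical.
class SimWorker {
  readonly name: string;
  readonly children: SimWorker[] = [];
  readonly completed: string[] = [];
  closing = false;
  private queue: string[] = [];

  constructor(name: string) {
    this.name = name;
  }

  // Create an owned (nested) worker.
  spawn(name: string): SimWorker {
    const child = new SimWorker(name);
    this.children.push(child);
    return child;
  }

  // Queue a task; returns false if the worker no longer accepts new work.
  enqueue(task: string): boolean {
    if (this.closing) return false;
    this.queue.push(task);
    return true;
  }

  // Approach #2 as first proposed: kill owned workers and drop pending tasks.
  terminateCascade(): void {
    this.closing = true;
    this.queue = [];
    for (const child of this.children) child.terminateCascade();
  }

  // Altered approach #2: only set the closing flag on owned workers.
  // Already-queued tasks survive; new tasks are refused.
  closeCascade(): void {
    this.closing = true;
    for (const child of this.children) child.closeCascade();
  }

  // Drain whatever is still queued, as the event loop would before exiting.
  runPending(): void {
    let task = this.queue.shift();
    while (task !== undefined) {
      this.completed.push(task);
      task = this.queue.shift();
    }
  }
}
```

In this model, if W1 spawns W2, W2 queues an upload, and W1 then runs closeCascade(), W2 still completes the upload but refuses anything queued afterwards. With terminateCascade() the upload is dropped.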

I simulated this in Opera 11 to see how it handles the scenario. Document D creates W1 (in the fire-and-forget model you describe). In its initial script, W1 fires-and-forgets W2, but then explicitly calls "close()" on itself. W2's initial script starts an XHR GET request for another resource; at readyState 4 it sets a timeout to XHR another file, and so on, four times over about ten seconds. Using a network monitoring tool, I observe that W2's script is downloaded, but no XHR request is ever made.

When I remove the explicit "close()" call, all four resources are requested from W2 (it stayed alive).
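
For anyone who wants to repeat the test, W2's request chain can be sketched roughly as below. This is an illustrative TypeScript reconstruction, not the script I actually ran: chainRequests, GetFn, and ScheduleFn are hypothetical names, and the file names and delay are placeholders. In a real worker, the get argument would wrap an XMLHttpRequest and call onDone at readyState 4, and schedule would be setTimeout.

```typescript
// Hypothetical reconstruction of W2's startup logic, with the network and
// timer calls passed in so the chain can be exercised outside a browser.
type GetFn = (url: string, onDone: () => void) => void;
type ScheduleFn = (fn: () => void, delayMs: number) => void;

function chainRequests(
  urls: string[],
  get: GetFn,
  schedule: ScheduleFn,
  delayMs: number
): void {
  const requestAt = (index: number): void => {
    const url = urls[index];
    if (url === undefined) return; // no more resources to fetch
    // Fetch this resource; once it completes, set a timer for the next one.
    get(url, () => schedule(() => requestAt(index + 1), delayMs));
  };
  requestAt(0);
}
```

Each request depends on the previous one completing and on its timer firing, so a worker that stops processing tasks part-way through shows up on the network monitor as a truncated sequence of requests.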

My conclusion is that Opera has a hybrid model where explicit "close()" or "terminate()" calls will cascade-close the nested workers, but otherwise (in fire-and-forget cases at least) the worker is either 1) not closed at all because no GC happens or 2) orphaned and continues to run as per spec. I can't tell which without knowing when the system decides to close W1.

In general, I acknowledge that fire-and-forget scenarios are important enough that they should be supported (as they are in Opera today). Perhaps this negates the need to pursue alternate approach #2?

Received on Tuesday, 15 March 2011 02:34:41 UTC