RE: publish Last Call Working Draft of Web Workers; deadline March 7 from Ian Hickson on 2011-06-14 (public-webapps@w3.org from April to June 2011)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 14 Jun 2011 19:19:23 +0000 (UTC)
To: Adrian Bateman <adrianba@microsoft.com>
cc: public-webapps <public-webapps@w3.org>, Travis Leithead <Travis.Leithead@microsoft.com>
Message-ID: <Pine.LNX.4.64.1106141843240.19153@ps20323.dreamhostps.com>

On Wed, 9 Mar 2011, Adrian Bateman wrote:
> 
> Based on our understanding of the web worker lifetime model (Section 
> 4.4), dedicated workers are allowed to enter into an "orphaned" state 
> where they have a message port that is keeping them alive (see example 
> at the end of this feedback).

I do not believe this is entirely accurate. It's the combination of having 
a document owner and having something that protects it (like a port) that 
keeps a worker alive.
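
A minimal sketch of that combination, for illustration only. The names
`WorkerState` and `isKeptAlive` are hypothetical and do not come from the
spec, and ports are assumed to be the only form of protection.

```typescript
// Hypothetical model for illustration; these names do not come from the spec.
interface WorkerState {
  documents: string[]; // the worker's document owners
  ports: string[];     // message ports currently protecting the worker
}

// A worker is kept alive only when both conditions hold: it has at least
// one document owner, and at least one thing (here, a port) protects it.
function isKeptAlive(worker: WorkerState): boolean {
  return worker.documents.length > 0 && worker.ports.length > 0;
}

const withOwnerAndPort: WorkerState = { documents: ["D"], ports: ["p1"] };
const portButNoOwner: WorkerState = { documents: [], ports: ["p1"] };
const ownerButNoPort: WorkerState = { documents: ["D"], ports: [] };

console.log(isKeptAlive(withOwnerAndPort)); // true
console.log(isKeptAlive(portButNoOwner));   // false
console.log(isKeptAlive(ownerButNoPort));   // false
```

Under this reading, a port on its own is not enough: the second case has a
port but no document owner, so nothing keeps it alive.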


> We can imagine scenarios where the 
> orphaned workers are still able to provide "results" to a document 
> (e.g., via connecting to a shared worker), however these use cases 1) 
> seem largely irrelevant, 2) can be handled by shared workers if needed 
> and 3) overly complicate the implementation (in our analysis) of 
> dedicated workers.

I strongly disagree with point (1). The great thing about the 
MessageChannel / Web Worker model is that you can create a worker, have it 
vend a port, and then forget about the worker but still have everything 
work. It is an absolutely key feature of the API. I don't see how shared 
workers would do this better.

Could you elaborate on why it complicates the implementation?


> We note that no browser appears to implement the lifetime model as 
> specified in the latest editor's draft (that we can test).

Do you have a test case I could examine to verify this?


> 1 - Lifetime based on a dedicated worker's document "reachability": This 
> alternate lifetime model keeps a dedicated worker alive until it can no 
> longer communicate with the document(s) to which it is associated 
> (through its implicit port or any other port). This proposed lifetime 
> model is based on graph reachability, where the nodes in the graph are 
> web workers and the arcs in the graph are implicit and explicit message 
> ports owned by a worker (i.e., "the worker's ports"). A dedicated 
> worker's lifetime is managed by whether the dedicated worker can "reach" 
> the document(s) in its list of "the worker's documents". See the example 
> at the end for how the currently specified lifetime model changes with this 
> approach.

It has to be more than just reachability of the original document, because 
otherwise if an iframe vends a port from a worker to its parent, and then 
drops all references, this would expose specifics about GC behaviour.
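
The plain reachability rule from the quoted proposal can be sketched as a
small graph search. This is only an illustration: `PortGraph`, `connect`,
`closeChannel`, and `canReach` are hypothetical names, documents and workers
are reduced to string labels, and the sketch covers only simple reachability,
not the extra conditions argued for above.

```typescript
// Hypothetical model for illustration; none of these names come from the spec.
type Owner = string; // label for a document or a worker

class PortGraph {
  // Each entry is one entangled port pair, recorded by the owners of its ends.
  private channels: Array<[Owner, Owner]> = [];

  connect(a: Owner, b: Owner): void {
    this.channels.push([a, b]);
  }

  // Removes every channel between a and b, as closing the ports would.
  closeChannel(a: Owner, b: Owner): void {
    this.channels = this.channels.filter(
      ([x, y]) => !((x === a && y === b) || (x === b && y === a))
    );
  }

  // Breadth-first search over channels, starting at `from`.
  canReach(from: Owner, target: Owner): boolean {
    const seen: Owner[] = [from];
    const queue: Owner[] = [from];
    while (queue.length > 0) {
      const current = queue.shift() as Owner;
      if (current === target) return true;
      for (const [a, b] of this.channels) {
        const next = a === current ? b : b === current ? a : null;
        if (next !== null && seen.indexOf(next) === -1) {
          seen.push(next);
          queue.push(next);
        }
      }
    }
    return false;
  }
}

const graph = new PortGraph();
graph.connect("D", "W1");  // W1's implicit port to its creator
graph.connect("W1", "W2"); // W2's implicit port to its creator
const reachableBefore = graph.canReach("W2", "D");
console.log(reachableBefore); // true: W2 -> W1 -> D

graph.closeChannel("D", "W1");
const reachableAfter = graph.canReach("W2", "D");
console.log(reachableAfter); // false: no path left to D
```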


> 2 - Lifetime that prevents orphaning dedicated workers: In this 
> alternate lifetime model, orphaned dedicated workers are never allowed, 
> and the lifetime of the worker is strictly controlled by its implicit 
> port. Therefore, whenever a worker creates another worker, if the 
> "parent" worker is terminated or closed, then the "child" worker will be 
> terminated or closed as well (preventing the child from becoming an 
> orphan). This model is enforced regardless of other message ports that 
> the child may have.

This doesn't seem significantly simpler than what we have now, for 
implementations (it's just keeping track of one port instead of a list), 
while being significantly less useful for authors (no "fire-and-forget" 
model is possible). Since "fire-and-forget" is an important use case, I do 
not believe we should do this.


> Example that creates orphaned dedicated workers:
>
> Steps:
>   1. Document 'D' creates dedicated worker 'W1' 
>   2. Dedicated worker W1 creates a dedicated worker 'W2' 
>   3. Document 'D' creates dedicated worker 'W3' 
>   4. Dedicated worker W3 creates a dedicated worker 'W4'
>      (At this point W1 and W3 are "parent" workers and W2 and W4 are "child" workers.)
>   5. W1 creates a message channel and passes the channel's ports to document 'D' and 'W2' 
>   6. W3 creates a message channel and passes the channel's ports to document 'D' and 'W4'
>      ('D' now has an independent message port for W2 and W4.)
>   7. Document 'D' creates a message channel and passes the channel's ports to 'W2' and 'W4'
>      (W2 and W4 now have a direct communication channel between themselves.)
>   8. Document 'D' terminates worker 'W1'
>      (Terminating W1 causes all W1's ports to be disentangled [step 15 of section 4.5
>      processing model] which affects W2's implicit port; however, W2 is not terminated
>      because it is still considered a "protected" worker, since its list of the worker's
>      ports is not empty.)
>   9. Document 'D' terminates worker 'W3'
>      ('D' still has communication ports with W2 and W4 and can test that they are still
>      alive. W2 and W4 are now "orphaned" from their original creator, but still have a
>      connection to the document 'D'.)
>   10. Document 'D' closes the port connected to 'W2'
>       (W2 is now only connected via a message port to W4, and can send information to
>       'D' via W4.)
>   11. Document 'D' closes the port connected to 'W4'
>       (Document 'D' now has *no* connections to W2 or W4; those workers are completely
>       orphaned from it. However, W2 and W4 are still alive because they are "protected"
>       since they have a message port connection to each other.)
>
> At this point, the only way (that we can think of) for W2 and W4 to "report back" to
> document 'D' is by connecting to a shared worker that can broker communications between
> these workers and document 'D' (if document 'D' connects to this same shared worker).

They can also communicate via the network, or via IndexedDB.

Note that there's a whole host of reasons why they might not need to ever 
communicate back to D, though.
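
The quoted steps can be replayed in a toy model to check the claim that W2
and W4 are still protected after step 11. This is a minimal sketch under
stated assumptions: protection depends only on a non-empty port list, and
terminating a worker disentangles all of its ports. The `Sim` class and its
methods are hypothetical names.

```typescript
// Hypothetical toy model; names and rules are simplifying assumptions.
type Id = string;

class Sim {
  // Each entry is one entangled port pair, recorded by its two owners.
  private channels: Array<[Id, Id]> = [];

  // Records one message channel between two owners.
  channel(a: Id, b: Id): void {
    this.channels.push([a, b]);
  }

  // Creating a dedicated worker gives it an implicit port to its creator.
  createWorker(creator: Id, worker: Id): void {
    this.channel(creator, worker);
  }

  // Terminating a worker disentangles every port it owns.
  terminate(worker: Id): void {
    this.channels = this.channels.filter(([a, b]) => a !== worker && b !== worker);
  }

  // Closing a port removes the channel between the two owners.
  close(a: Id, b: Id): void {
    this.channels = this.channels.filter(
      ([x, y]) => !((x === a && y === b) || (x === b && y === a))
    );
  }

  // Number of ports the given owner still holds.
  portCount(owner: Id): number {
    return this.channels.filter(([a, b]) => a === owner || b === owner).length;
  }

  // Assumed rule: a worker is protected while its port list is non-empty.
  isProtected(worker: Id): boolean {
    return this.portCount(worker) > 0;
  }
}

const sim = new Sim();
sim.createWorker("D", "W1");  // step 1
sim.createWorker("W1", "W2"); // step 2
sim.createWorker("D", "W3");  // step 3
sim.createWorker("W3", "W4"); // step 4
sim.channel("D", "W2");       // step 5: channel vended by W1
sim.channel("D", "W4");       // step 6: channel vended by W3
sim.channel("W2", "W4");      // step 7: channel vended by D
sim.terminate("W1");          // step 8
sim.terminate("W3");          // step 9
sim.close("D", "W2");         // step 10
sim.close("D", "W4");         // step 11

console.log(sim.isProtected("W2"), sim.isProtected("W4")); // true true
console.log(sim.portCount("D")); // 0
```

In this model D ends up holding no ports at all, while W2 and W4 each hold
one: the channel between them.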

Suppose that W1 is a Network API, and W3 is a User Contacts Database API.

D creates W1 and W3 because it needs a network and it needs contacts. It 
then asks W1 for a port so that it can give W3 access to the network; W1 
sends back to D that port and D sends it on to W3.

W3 then spawns a worker (W4) to handle network synchronisation, and passes 
it the network port. It can drop its direct connection to W4 because W4 is 
just going to be monitoring the database and the network (via its port to 
W1) and updating things.

W1 similarly offloads its low-level network duties to a separate worker, 
in this case W2.

At this point, everything is working just fine, with the contacts database 
being updated in the background. Why should D, W1, and W3 keep all the 
references to the objects they are never going to use again? It seems like 
it would make the API really confusing to authors if things broke unless 
they kept references around. Nothing in the Web platform has worked like 
that so far -- for example, you can create an XHR object, set up event 
handlers, and then forget all about it and it'll still contact the network 
and do its stuff.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 14 June 2011 19:20:02 UTC