[whatwg] MessagePorts in Web Workers: implementation feedback

I agree with Drew's assessment that MessagePorts in combination with  
Workers are extremely complicated to implement correctly, as currently  
specified. In fact, the design seems to push towards having lockable  
shared state, even though one potential advantage of the message  
passing design is to avoid locking and shared state.

Besides removing MessagePorts as a way to communicate with workers,  
another possibility is simplifying the life cycle requirements. For  
example, getting rid of the keepalive rule, whereby both MessagePorts  
remain live so long as either is otherwise live, would remove the  
majority of the complexity. I don't think the slight convenience of  
that rule is worth the extra implementation cost.

On May 7, 2009, at 1:39 PM, Drew Wilson wrote:

> Hi all,
>
> I've been hashing through a bunch of the design issues around using  
> MessagePorts within Workers with IanH and the Chrome/WebKit teams  
> and I wanted to follow up with the list with my progress.
>
> The problems we've encountered are all solvable, but I've been  
> surprised at the amount of work involved in implementing worker  
> MessagePorts (and the resulting implications that MessagePorts have  
> on worker lifecycles/reachability). My concern is that the amount of  
> work to implement MessagePorts within Worker context may be so high  
> that it will prevent vendors from implementing the SharedWorker API.  
> Have other implementers started working on this part of the spec yet?
>
> Let me quickly run down some of the implementation issues I've run  
> into - some of these may be WebKit/Chrome specific, but other  
> browsers may run into some of them as well:
>
> 1) MessagePort reachability is challenging in the context of  
> separate Worker heaps
>
> In WebKit, each worker has its own heap (in Chrome, they will have  
> their own process as well). The spec reads:
> User agents must act as if MessagePort objects have a strong  
> reference to their entangled MessagePort object.
>
> Thus, a message port can be received, given an event listener, and  
> then forgotten, and so long as that event listener could receive a  
> message, the channel will be maintained.
>
> Of course, if this was to occur on both sides of the channel, then  
> both ports would be garbage collected, since they would not be  
> reachable from live code, despite having a strong reference to each  
> other.
>
> Furthermore, a MessagePort object must not be garbage collected  
> while there exists a message in a task queue that is to be  
> dispatched on that MessagePort object, or while the MessagePort  
> object's port message queue is open and there exists a message event  
> in that queue.
>
> The end result of this is the need to track some common state across  
> an entangled MessagePort pair such as: number of outstanding  
> messages, open state of each end, and number of active references to  
> each port (zero or non-zero). Turns out this last bit will require  
> adding new hooks to the JavaScriptCore garbage collector to detect  
> transitioning between 1 and 0 references without actually freeing  
> the object - not that difficult, but possibly something that other  
> implementers should keep in mind.
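
To make the shared state described above concrete, here is a minimal sketch in Python. It is not WebKit code: the class and method names are hypothetical, the GC hook is assumed to call set_referenced(), and the two spec conditions about pending messages are simplified into a single per-port counter. Note that it needs a lock, which is exactly the kind of lockable shared state that message passing was supposed to avoid.

```python
import threading
from dataclasses import dataclass


@dataclass
class PortState:
    """Bookkeeping for one end of a channel (hypothetical field names)."""
    referenced: bool = True   # zero vs. non-zero references from script
    is_open: bool = False     # has the port message queue been opened?
    queued: int = 0           # messages waiting to be dispatched on this port


class EntangledPair:
    """State shared by both ports, guarded by a lock because the two
    ports may live on different threads (or heaps)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._ports = (PortState(), PortState())

    def set_referenced(self, side: int, referenced: bool) -> None:
        # Assumed to be called from a GC hook on the 1 <-> 0 transition.
        with self._lock:
            self._ports[side].referenced = referenced

    def open_port(self, side: int) -> None:
        with self._lock:
            self._ports[side].is_open = True

    def post(self, to_side: int) -> None:
        with self._lock:
            self._ports[to_side].queued += 1

    def dispatched(self, side: int) -> None:
        with self._lock:
            self._ports[side].queued -= 1

    def collectable(self) -> bool:
        """True only when neither end keeps the pair alive."""
        with self._lock:
            for port in self._ports:
                if port.referenced:
                    return False  # a live end protects its entangled peer
                if port.is_open and port.queued > 0:
                    return False  # pending message on an open queue
            return True
```

In this sketch, dropping the reference on only one side never makes the pair collectable, which is the keepalive behavior the quoted spec text asks for.
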
>
> 2) MessagePorts dramatically change the worker lifecycle
>
> Having MessagePorts in worker context means that Workers can outlive  
> their parent window(s) - I can create a worker, pass off an  
> entangled MessagePort to another window (say, to a different  
> domain), then close the original window, and the worker should stay  
> alive. In the case of WebKit, this causes some problems for things  
> like worker-initiated network requests - if workers can continue to  
> run even though there are no open windows for that origin, then it  
> becomes problematic to perform network requests (part of this is due  
> to the architecture of WebKit which requires proxying network  
> requests to window context, but part of this is just a general  
> problem of "how do you handle things like HTTP Auth when there are  
> no open windows for that origin?")
>
> Finally, the spec defines a fairly broad definition of what makes a  
> worker reachable - here's an excerpt from my WebKit Shared Worker  
> design doc, where I summarize the spec (possibly incorrectly - feel  
> free to correct any misconceptions):
>
> Permissible
>
> The spec specifies that a worker is permissible based on whether it  
> has a reachable MessagePort that has been entangled at some point in  
> the past with an active window (or with a worker who is itself  
> permissible). Basically, if a worker has ever been entangled with an  
> active window, or if it's ever been entangled with a worker who is  
> itself permissible (i.e. it's associated with an active window via a  
> chain of workers that have been entangled at some point in the past)  
> then it's permissible.
>
> The reason why the "at some point in the past" language is present  
> is to allow a page to create a fire-and-forget worker (for example,  
> a worker that does a set of long network operations) without having  
> to keep a reference to that worker around.
>
> Once the referent windows close, the worker should also close, as  
> being permissible is a necessary (but not sufficient) criterion for  
> being runnable.
>
> Active needed
>
> A permissible worker is active needed if:
> 1. it has pending timers/network requests/DB activity, or
> 2. it is currently entangled with an active window, or another active  
> needed worker.
>
> The intent behind #1 is to enable fire-and-forget workers that don't  
> exit until they are idle. The intent behind #2 is that an idle  
> worker shouldn't exit as long as it's reachable by an active window  
> (possibly chained through other workers).
>
> The end result is that for each worker we need to keep track of a  
> big list of every window it's ever been entangled with. As workers  
> become entangled with other workers, they each inherit the list of  
> entangled windows from the other worker. As windows become inactive,  
> we then walk the lists of every worker to remove references to the  
> window and properly shut down the worker as appropriate. All of this  
> with the appropriate cross-thread synchronization, of course :)
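
The window-list bookkeeping described above might look roughly like the following Python sketch. The names (LifetimeTracker and its methods) are hypothetical, workers and windows are identified by plain strings, and the cross-thread synchronization is left out entirely, so treat it as an illustration of the data structure rather than an implementation.

```python
from typing import Dict, List, Set


class LifetimeTracker:
    """Tracks, per worker, every window it has ever been entangled with."""

    def __init__(self) -> None:
        self._windows: Dict[str, Set[str]] = {}

    def add_worker(self, worker: str) -> None:
        self._windows.setdefault(worker, set())

    def entangle_with_window(self, worker: str, window: str) -> None:
        self._windows[worker].add(window)

    def entangle_workers(self, a: str, b: str) -> None:
        # Each worker inherits the other's window list at entangle time.
        merged = self._windows[a] | self._windows[b]
        self._windows[a] = set(merged)
        self._windows[b] = set(merged)

    def window_closed(self, window: str) -> List[str]:
        """Remove the window from every list and return the workers that
        are left with no windows, i.e. that should now shut down."""
        stopped = []
        for worker, windows in self._windows.items():
            windows.discard(window)
            if not windows:
                stopped.append(worker)
        for worker in stopped:
            del self._windows[worker]
        return sorted(stopped)
```

Every window close touches every worker's list, which is where the cost (and, in a real browser, the locking) comes from.
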
>
> Likewise, determining when a worker is active needed requires  
> tracking a graph of entangled message ports, and walking that graph  
> to determine whether a given worker is reachable by any active  
> window. Typically this is only needed when either a window closes,  
> or when a worker goes idle.
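
As a rough illustration of the graph walk described above, here is a breadth-first search in Python. It assumes the worker has already been found permissible, models current entanglements as a plain adjacency map keyed by worker name, and uses hypothetical names throughout; it follows the summary's two rules literally rather than any particular implementation.

```python
from collections import deque
from typing import Dict, Set


def is_active_needed(start: str,
                     links: Dict[str, Set[str]],
                     active_windows: Set[str],
                     busy_workers: Set[str]) -> bool:
    """links maps each worker to the contexts (windows or workers) it is
    currently entangled with; only workers appear as keys."""
    seen = {start}
    queue = deque([start])
    while queue:
        worker = queue.popleft()
        if worker in busy_workers:
            return True  # rule 1: pending timers/network/DB activity
        for peer in links.get(worker, ()):
            if peer in active_windows:
                return True  # rule 2: entangled with an active window
            # Keep walking through other workers; inactive windows
            # are dead ends because they never appear as keys.
            if peer in links and peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return False
```

For example, with links = {"w1": {"winA", "w2"}, "w2": {"w1"}} and winA active, w2 is kept alive through w1 even though it has no direct connection to a window.
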
>
> Again, none of these issues individually are insurmountable, but in  
> total they add up to a significant amount of work for what should be  
> a fairly incremental improvement (going from dedicated workers to  
> shared workers). Have other vendors started investigating what it  
> takes to implement SharedWorkers (and therefore MessagePorts in  
> workers)?
>
> Another approach for SharedWorkers would be to give them an implicit  
> MessagePort-esque API like dedicated Workers and not allow passing  
> in MessagePorts to postMessage(). This would mean that references to  
> workers can't really be passed around to other windows/workers, but  
> rather are kept per-origin. Dedicated workers could work as they do  
> now in Firefox/WebKit (with no MessagePorts). The SharedWorker  
> lifecycle could be significantly simplified such that a SharedWorker  
> is permissible as long as there's an active window under the same  
> origin (no more walking some distributed cross-thread dependency  
> graph).
>
> The thing we'd give up is the capabilities-based API that  
> MessagePorts provide, but I'd argue that the workaround is simple:  
> the creating window can just act as a proxy for the worker. IMO, the  
> implementation burden far outstrips the benefit of allowing direct  
> foreign access to workers. Literally 90% of the work on my plate for  
> SharedWorkers seems to derive from MessagePorts in one form or  
> another, which seems completely wrong.
>
> I'd like to hear your thoughts on this - are people open to removing  
> MessagePort support from Workers?
>
> -atw

Received on Thursday, 7 May 2009 15:28:46 UTC