Re: Sync API for workers from Glenn Maynard on 2012-09-02 (public-webapps@w3.org from July to September 2012)

From: Glenn Maynard <glenn@zewt.org>
Date: Sun, 2 Sep 2012 14:16:30 -0500
To: Andrew Wilson <atwilson@google.com>
Cc: olli@pettay.fi, Rick Waldron <waldron.rick@gmail.com>, David Bruant <bruant.d@gmail.com>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <CABirCh-binFHfBO3TBW_yvkRCTWUagOUpmPp7j6spk-uA2wXog@mail.gmail.com>

On Sun, Sep 2, 2012 at 12:24 PM, Andrew Wilson <atwilson@google.com> wrote:

> Just wanted to point out that all of the arguments for a wait-for-reply
> API in workers also apply to SharedWorkers. It's trickier for SharedWorkers
> since they use MessagePorts, and we probably don't want to expose this kind
> of API to pages (which also use MessagePorts). But I would strongly prefer
> a solution that would be applicable to all kinds of workers, not just
> dedicated workers.
>

You can do that by giving MessagePort another interface in workers, e.g.
MessagePortSync or MessagePortWorkers, which inherits from MessagePort and
adds e.g. getMessage().  It might need a bit of finessing to switch
interfaces during structured clone.

Alternatively--and as I type this I like it better--add a getMessage(port)
method to WorkerGlobalScope.  Simplicity aside, I like that it can
naturally support getMessage([port1, port2, port3], 100).  With
port.getMessage(), there's no way to wait for a message from multiple ports
(think select()/poll()).  This doesn't give any way to specify the worker's
implicit port, though; I guess that could be a special case, e.g. pass in
null.
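
To make the multi-port form concrete, here is a minimal sketch, not a real API. FakePort is a hypothetical stand-in for a port's incoming queue, and this getMessage is a plain function rather than a WorkerGlobalScope method. A single-threaded model can't actually block, so the timeout argument is accepted but ignored, and the null "implicit port" case is not modeled.

```typescript
// Hypothetical stand-in for a port's incoming message queue (not a real MessagePort).
class FakePort {
  readonly queue: unknown[] = [];
  constructor(readonly name: string) {}
  // Simulates a message arriving on this port.
  deliver(data: unknown): void {
    this.queue.push(data);
  }
}

interface Received {
  port: FakePort;
  data: unknown;
}

// Hypothetical getMessage(ports, timeout): scan the ports in order and consume
// the first waiting message, select()/poll() style. A real implementation would
// block for up to the timeout; this model checks once and returns null if
// nothing is waiting, so _timeoutMs is unused.
function getMessage(ports: FakePort | FakePort[], _timeoutMs: number = 0): Received | null {
  const list = Array.isArray(ports) ? ports : [ports];
  for (const port of list) {
    if (port.queue.length > 0) {
      return { port, data: port.queue.shift() };
    }
  }
  return null;
}
```

The result carries the port as well as the data, since a caller waiting on several ports needs to know which one the message came from.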

>
> I'm not entirely certain what the semantics of getMessage() are, though -
> if you grab a message via getMessage(), does this imply that normal
> onmessage event handlers are not run (or perhaps are run after we re-enter
> the event loop)?
>

It shouldn't still dispatch onmessage asynchronously.  That's confusing,
and also, it means the messages would build up in the queue until the
script returns.  Due to the nature of the feature, the script may not
return for a long time, or it may receive lots of messages before it does.
Also, if a message may be handled both during processing (via getMessage)
and when idle (via onmessage), it's hard to ensure you don't process
messages twice.

The two options that have come up are:

1: don't dispatch onmessage at all if getMessage returns a message.
getMessage() consumes the message from the queue.
2: dispatch onmessage synchronously, before returning from getMessage
(which can also return the message or not).

They're mostly equivalent; you can build either on top of the other.
(Implementing #2 on top of #1 would need createEvent, which isn't actually
exposed to WorkerGlobalScope, but that's a separate issue.)
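
As a rough illustration of building #2 on top of #1, assuming a hypothetical ModelPort class with string messages: option 1 just consumes from the queue, and a small wrapper then calls the handler synchronously. The wrapper calls onmessage directly with the data instead of constructing an event, which sidesteps the createEvent issue but is not how real dispatch works.

```typescript
// Hypothetical model of a port; none of these names are an existing API.
class ModelPort {
  private queue: string[] = [];
  onmessage: ((data: string) => void) | null = null;

  // Simulates a message arriving on this port.
  deliver(data: string): void {
    this.queue.push(data);
  }

  // Option 1: consume the next message; onmessage is never run for it.
  getMessage(): string | null {
    const data = this.queue.shift();
    return data === undefined ? null : data;
  }
}

// Option 2 layered on option 1: consume the message, run the handler
// synchronously, and only then return the message to the caller.
function getMessageAndDispatch(port: ModelPort): string | null {
  const data = port.getMessage();
  if (data !== null && port.onmessage !== null) {
    port.onmessage(data);
  }
  return data;
}
```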

I just noticed a strong argument against #2: it's recursive.  Without any
strong benefit, that seems like a good thing to avoid.
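
The recursion can be shown with a small model, assuming a hypothetical RecursivePort whose getMessage follows option 2. If the handler itself calls getMessage, each queued message adds another nested handler call:

```typescript
// Hypothetical model of option 2; depth tracking is added only for illustration.
class RecursivePort {
  private queue: string[] = [];
  onmessage: ((data: string) => void) | null = null;
  depth = 0;    // how many handler calls are currently on the stack
  maxDepth = 0; // deepest nesting observed

  deliver(data: string): void {
    this.queue.push(data);
  }

  // Option 2: run onmessage synchronously before returning the message.
  getMessage(): string | null {
    const data = this.queue.shift();
    if (data === undefined) return null;
    this.depth++;
    this.maxDepth = Math.max(this.maxDepth, this.depth);
    try {
      if (this.onmessage !== null) this.onmessage(data);
    } finally {
      this.depth--;
    }
    return data;
  }
}

const port = new RecursivePort();
const seen: string[] = [];
port.onmessage = (data: string) => {
  seen.push(data);
  port.getMessage(); // the handler asks for the next message, re-entering itself
};
port.deliver("a");
port.deliver("b");
port.deliver("c");
port.getMessage();
// port.maxDepth is now 3: the handler for "a" was still running while "b"
// and then "c" were handled inside it.
```

With option 1 the same loop stays flat, since getMessage never runs a handler.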

> I am not optimistic that we can do deadlock prevention in the general case
> with MessagePorts, for the same reason that it's prohibitively difficult to
> reliably garbage collect MessagePorts when they can be passed between
> processes.
>

Would you consider this an implementation-blocking problem?

By the way, another option is to remove the ability to block, so it always
behaves as getMessage(0)--return a waiting message, but don't wait for
one.  That would also make it impossible to deadlock.  Being able to wait
for a message would be a nice plus, but I don't think I've seen any use
cases that really require it.  (If this is done, there's no need to be able
to give multiple ports to getMessage, as I mentioned at the top, since you
can just call getMessage separately for each port.)
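
A sketch of the non-blocking variant, assuming a hypothetical PollPort whose getMessage always behaves like getMessage(0). Because no call ever waits, checking several ports is just a loop over them:

```typescript
// Hypothetical model of a port with a non-blocking getMessage.
class PollPort {
  private queue: string[] = [];

  // Simulates a message arriving on this port.
  deliver(data: string): void {
    this.queue.push(data);
  }

  // Return a waiting message if there is one, but never wait for one.
  getMessage(): string | null {
    const data = this.queue.shift();
    return data === undefined ? null : data;
  }
}

// Collect everything currently waiting by calling getMessage on each port
// separately; no multi-port form is needed when nothing blocks.
function drainAll(ports: PollPort[]): string[] {
  const out: string[] = [];
  for (const port of ports) {
    let msg = port.getMessage();
    while (msg !== null) {
      out.push(msg);
      msg = port.getMessage();
    }
  }
  return out;
}
```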

-- 
Glenn Maynard

Received on Sunday, 2 September 2012 19:16:58 UTC