Re: Sync API for workers from Jonas Sicking on 2012-09-06 (public-webapps@w3.org from July to September 2012)

From: Jonas Sicking <jonas@sicking.cc>
Date: Thu, 6 Sep 2012 00:21:07 -0700
To: olli@pettay.fi
Cc: Glenn Maynard <glenn@zewt.org>, Andrea Marchesini <amarchesini@mozilla.com>, David Bruant <bruant.d@gmail.com>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <CA+c2ei9YoAWitux8zv0mAWoFe3L+CZk7YCiX0g2oJRGY__Pztg@mail.gmail.com>
On Wed, Sep 5, 2012 at 11:56 PM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:
> On 09/06/2012 09:49 AM, Jonas Sicking wrote:
>>
>> On Wed, Sep 5, 2012 at 11:30 PM, Olli Pettay <Olli.Pettay@helsinki.fi>
>> wrote:
>>>
>>> On 09/06/2012 09:12 AM, Jonas Sicking wrote:
>>>>
>>>>
>>>> On Wed, Sep 5, 2012 at 11:02 PM, bugs@pettay.fi <bugs@pettay.fi> wrote:
>>>>>
>>>>>
>>>>> On 09/06/2012 08:31 AM, Jonas Sicking wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 5, 2012 at 8:07 PM, Glenn Maynard <glenn@zewt.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Sep 5, 2012 at 2:49 AM, Jonas Sicking <jonas@sicking.cc>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The problem with a "Only allow blocking on children, except that
>>>>>>>> window can't block on its children" is that you can never block on a
>>>>>>>> computation which is implemented in the main thread. I think that
>>>>>>>> cuts
>>>>>>>> out some major use cases since today's browsers have many APIs which
>>>>>>>> are only implemented in the main thread.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> You can't have both--you have to choose one of 1: allow blocking
>>>>>>> upwards,
>>>>>>> 2:
>>>>>>> allow blocking downwards, or 3: allow deadlocks.  (I believe #1 is
>>>>>>> more
>>>>>>> useful than #2, but each proposal can go both ways.  I'm ignoring
>>>>>>> more
>>>>>>> complex deadlock detection algorithms that can allow both #1 and #2,
>>>>>>> of
>>>>>>> course, since that's a lot harder.)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Indeed. But I believe #2 is more useful than #1. I wasn't proposing
>>>>>> having both, I was proposing only doing #2.
>>>>>>
>>>>>> It's actually technically possible to allow both #1 and #2 without
>>>>>> deadlock detection algorithms, but to keep things sane I'll leave that
>>>>>> as out of scope for this thread.
>>>>>>
>>>>>> [snip]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think that's by far the most
>>>>>>> interesting category of use cases raised for this feature so far, the
>>>>>>> ability to implement sync APIs from async APIs (or several async
>>>>>>> APIs).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> That is certainly an interesting use case. I think another interesting
>>>>>> use case is being able to write synchronous APIs in workers whose
>>>>>> implementation uses APIs that are only available on the main thread.
>>>>>>
>>>>>> That's why I'm not interested in only blocking on children, but rather
>>>>>> only blocking on parents.
>>>>>>
>>>>>>>> The fact that all the examples that people have used while we have
>>>>>>>> been discussing synchronous messaging have spun event loops in
>>>>>>>> attempts to deal with messages that couldn't be handled by the
>>>>>>>> synchronous poller makes me very much think that so will web
>>>>>>>> developers.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> getMessage doesn't spin the event loop.  "Spinning the event loop"
>>>>>>> means
>>>>>>> that tasks are run from task queues (such as asynchronous callbacks)
>>>>>>> which
>>>>>>> might not be expecting to run, and that tasks might be run
>>>>>>> recursively;
>>>>>>> none
>>>>>>> of that happens here.  All this does is block until a message is
>>>>>>> available on a specified port (or ports), and then returns it--it's
>>>>>>> just
>>>>>>> a
>>>>>>> blocking call, like sync XHR or FileReaderSync.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The example from Olli's proposal 3 does what effectively amounts to
>>>>>> "spinning an event loop". It pulls out a bunch of events from the
>>>>>> normal event loop and then manually dispatches them in a while loop.
>>>>>> The behavior is exactly the same as spinning the event loop (except
>>>>>> that non-message tasks don't get dispatched).
>>>>>
>>>>>
>>>>>
>>>>> It is just dispatching events.
>>>>> The problems we (Gecko) have had with event loop spinning in main
>>>>> thread
>>>>> relate mainly to the problems where "unexpected" events are dispatched
>>>>> while
>>>>> running the loop, as an example user input events or events coming from
>>>>> network.
>>>>> getMessage/waitForMessage does not have that problem.
>>>>
>>>>
>>>>
>>>> I'm not sure what you mean by "just dispatching events". That's
>>>> exactly what event loop spinning is.
>>>
>>>
>>>
>>> No. waitForMessage example I wrote down just dispatches DOM events in a
>>> loop.
>>> That is a synchronous operation and you know exactly which events you're
>>> about to dispatch.
>>> If you run the generic event loop, you also end up running
>>> timers and getting input from network and user etc. and you can't
>>> control those.
>>
>>
>> Just because they are message events doesn't mean that "you know
>> exactly which events you're about to dispatch". That's basically
>> equivalent to saying that it's safe to spin the event loop in Gecko as
>> long as you only dispatch nsIRunnables that were dispatched from Gecko
>> code, as opposed to native events from the native event loop.
>>
>> Note that messages can be sent to the worker in response to network
>> and UI events on the main thread.
>>
>>>> Why are the Gecko events any more "unexpected" than the message events
>>>> that the example dispatches?
>>>
>>>
>>> We don't want to block certain events in Gecko (like user input to
>>> chrome).
>>> Blocking events in worker code is ok.
>>
>>
>> I don't understand what you are saying here.
>>
>>>> If they at that point call into a library which
>>>> starts pulling messages off of the task queue and dispatches them,
>>>> they'll run into the same problems as we've had.
>>>
>>>
>>> ...but then it is up to the library to handle the case properly and
>>> dispatch events async.
>>
>>
>> But if it dispatches them asynchronously, they have lost their place
>> in the message queue. I.e. now they are placed after all other
>> incoming message events. Such event reordering is likely to break
>> application level logic.
>>
>> And like I said in my original email, you can fix that reordering by
>> completely redoing all your message event handling and using a
>> framework. But I'd like to find a solution that doesn't require that.
>> Especially since that still wouldn't fix the fact that messages get
>> reordered compared to other types of tasks on the task queue.
>>
>>> Though, dispatching events async so that other new message events don't
>>> get
>>> handled before them
>>> would require some new API.
>>
>>
>> Exactly. And you'd have to make sure that they get dispatched in the
>> correct order compared to the other events that you are pulling out
>> using waitForMessage. I.e. simply an API which inserts them in the
>> beginning of the task queue wouldn't be enough. You'd need something
>> like an iterator which lets you iterate the task queue and inspect
>> messages and only pull out selected ones. And allow that iterator to
>> block once it sees no more incoming messages.
>>
>> But that is a whole lot messier than the other proposals that have
>> been discussed. And what you would effectively have is something that
>> amounts to a separate communication channel which you can pull
>> messages out of while ignoring all other pending messages in other
>> channels. So why not simply use an actual separate channel instead?
>
>
> Because data sent from parent to child might arrive in different order than
> in which it was sent, and workers couldn't know that, nor the parent.

First off, that reordering is happening in proposal 3 just as much as
in the other proposals. Say that you have two message channels to a
worker, say the implicit channel through Worker and a "normal"
MessageChannel, and send messages through both of them. If the code in
the worker calls waitForMessage on the worker global scope, that will
reorder the message delivery such that the messages sent through
Worker are delivered before messages sent through the MessageChannel.
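
To make that reordering concrete, here is a toy model in TypeScript. It
is only a sketch under stated assumptions: the Task type and the
takeFirstFrom helper are hypothetical names invented for illustration,
and takeFirstFrom merely stands in for the proposed waitForMessage,
which is not a shipped API. Nothing here touches a real Worker or
MessagePort; it just models one task queue fed by two channels.

```typescript
// Toy model, not a real browser API: one task queue shared by two channels.
type Task = { channel: "worker" | "port"; data: string };

// Hypothetical stand-in for the proposed waitForMessage: removes and returns
// the first queued task for `channel`, skipping everything ahead of it.
function takeFirstFrom(queue: Task[], channel: Task["channel"]): Task | undefined {
  const index = queue.findIndex((task) => task.channel === channel);
  return index === -1 ? undefined : queue.splice(index, 1)[0];
}

// Messages in the order the parent sent them.
const queue: Task[] = [
  { channel: "port", data: "p1" },
  { channel: "worker", data: "w1" },
  { channel: "port", data: "p2" },
  { channel: "worker", data: "w2" },
];

const processed: string[] = [];

// The worker does two selective reads on the implicit Worker channel.
for (let i = 0; i < 2; i++) {
  const task = takeFirstFrom(queue, "worker");
  if (task) processed.push(task.data);
}

// Afterwards the normal event loop delivers whatever is left, in order.
for (const task of queue) processed.push(task.data);

console.log(processed.join(",")); // prints "w1,w2,p1,p2"
```

The send order was p1, w1, p2, w2, but the processing order is w1, w2,
p1, p2: both Worker messages jump ahead of MessageChannel messages that
were sent before them.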

Second, I'm not sure what you mean by "workers couldn't know that".
The whole point of the synchronous waitForMessage function is that it
stops all other messages in all other channels, and all other
tasks from all task sources, and waits for a message from the selected
channel. That is very explicit.

Third, they wouldn't arrive in a different order, they would just be
processed in a different order. Which is exactly what is happening if
you pull selected messages out of the event loop. And the parent
wouldn't know it had happened any more than in proposal 1 or 2.

I really don't see a way to make things sane if we try to send
synchronous and asynchronous messages through the same message
channel. Message passing between isolated threads is not a new
concept, neither is having blocking reads for those messages. But are
there any other systems out there that allow mixing synchronous and
asynchronous reading from the same message channel?

And more importantly, are there any benefits to it?

I think the only sane way to make this work is for message channels to
be dedicated sync or async. I think it's still up for debate what
syntax to use for setting these channels up. I.e. do you create it as
sync/async, or do you choose the mode of operation at the time when
you start the channel?

> (Also, proposal 1 has the rather major problem with multiple event listeners.)

If we go with the solution of having separate channels for sync
messages, but use waitForMessage for doing blocking reads from these
channels, then this problem would go away. The asynchronous side of
the sync channel would look like a normal async port, and so you'd
simply call postMessage any time you wanted to send a message.
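
As a rough illustration of dedicated channels, the following
TypeScript sketch models a channel whose mode is fixed when it is
created. Everything in it is an assumption made for illustration:
ModelChannel, its mode argument, and drain are hypothetical names, the
waitForMessage method only imitates the proposal discussed in this
thread, and a real implementation would block rather than throw when
no message is pending.

```typescript
// Toy model: each channel owns its own queue and is sync or async for life.
type Mode = "sync" | "async";

class ModelChannel {
  readonly mode: Mode;
  private pending: string[] = [];

  constructor(mode: Mode) {
    this.mode = mode;
  }

  // Sending looks the same on both kinds of channel.
  postMessage(data: string): void {
    this.pending.push(data);
  }

  // Stand-in for a blocking read; only legal on a sync channel.
  waitForMessage(): string {
    if (this.mode !== "sync") throw new Error("channel is async-only");
    const data = this.pending.shift();
    if (data === undefined) throw new Error("a real implementation would block here");
    return data;
  }

  // Stand-in for the event loop delivering message events; async channels only.
  drain(): string[] {
    if (this.mode !== "async") throw new Error("channel is sync-only");
    return this.pending.splice(0);
  }
}

const syncChannel = new ModelChannel("sync");
const asyncChannel = new ModelChannel("async");

asyncChannel.postMessage("a1");
syncChannel.postMessage("s1");
asyncChannel.postMessage("a2");

// The blocking read sees only the sync channel's queue.
const reply = syncChannel.waitForMessage();

// The async channel's messages are still delivered in their original order.
const delivered = asyncChannel.drain();

console.log(reply, delivered.join(","));
```

Because the blocking read never reaches into the async channel's
queue, messages within each channel keep their relative order, and
mixing the two reading styles on one channel is rejected outright.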

/ Jonas
Received on Thursday, 6 September 2012 07:22:10 UTC