Re: Sync API for workers from Glenn Maynard on 2012-09-07 (public-webapps@w3.org from July to September 2012)

From: Glenn Maynard <glenn@zewt.org>
Date: Thu, 6 Sep 2012 19:18:47 -0500
To: Jonas Sicking <jonas@sicking.cc>
Cc: Andrea Marchesini <amarchesini@mozilla.com>, David Bruant <bruant.d@gmail.com>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <CABirCh-wA-siZHgqere9otjzH=RaYJnVKJK00gng6qeuHDgqbA@mail.gmail.com>
Just to ping a detail, so it's not lost in history: it should also be
possible to peek at a

On Thu, Sep 6, 2012 at 12:31 AM, Jonas Sicking <jonas@sicking.cc> wrote:

> > 1: allow blocking upwards, 2: allow blocking downwards,
>
> Indeed. But I believe #2 is more useful than #1. I wasn't proposing
> having both, I was proposing only doing #2.
>

OK.  I think I disagree, but this is orthogonal to which API approach is
used, since it's easy to take every proposal and flip it either way.

> That is certainly an interesting use case. I think another interesting
> use case is being able to write synchronous APIs in workers whose
> implementation uses APIs that are only available on the main thread.
>

I understand the concept, but I'm having trouble coming up with useful
examples.  Can you give one?

The only case that comes to mind is blocking for user input; for example,
requesting the UI thread to ask the user his name, and then waiting for the
response.  (That might be useful, but I think it's far less useful than generic
sync worker APIs.)  It sounds like you're talking about something like
"construct a DOM tree, do some stuff to it and return a result", but I
can't think of a useful example for that.

The only DOM API I've *really* wanted in workers is, e.g., HTMLImageElement,
as part of getting WebGL into workers, but this wouldn't help there.

> The example from Olli's proposal 3 does what effectively amounts to
> "spinning an event loop". It pulls out a bunch of events from the
> normal event loop and then manually dispatches them in a while loop.
> The behavior is exactly the same as spinning the event loop (except
> that non-message tasks doesn't get dispatchet).
>

I don't like it either, but it was only needed in his proposal because it
didn't support MessagePorts, so he had to do his own ad hoc filtering on a
single port.

> Claiming that we don't need to explain all edge cases to authors and
> just give them a simplified version would, I think, be ignoring the
> complexity of software that people write using the web platform.
>

Most of the complexity of the algorithm is in the mechanics of
implementation, rather than its effects.  Users don't have to know that
it's the receiving thread handling the flag, or that an internal message is
sent to clear the flag on the other side of the channel; these are
algorithmic details.

> So it sounds like you are ok with not permitting the using both
> synchronous and asynchronous messages to the same port then? As long
> as some ports allow synchronous messages and others allow asynchronous
> messages. Leaving aside the issue of how and when it is determined
> that a port is sync vs. async.
>

I don't really like it or think it's necessary, but it doesn't seem
crippling and I'd live with it if needed to come to a resolution.  It does
lead to making people jump through some extra hoops, though.

The original use case that led me to this in the first place was an
autocomplete worker.  The worker receives an ordinary message with the
user's typed text, eg. {text: "intern"}, and starts searching.  The search
may take longer than it takes for the user to type the next letter, and I
wanted to be able to immediately stop the search if another message comes
along, so I can restart with the new text.  The simplest way to do that is
to occasionally poll for a new message (zero timeout), and when {text:
"interne"} comes along, restart the search.
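A rough sketch of that polling loop, to make the use case concrete.  Everything
here is hypothetical: `PollablePort` and its `poll()` method are an in-memory
stand-in for a port that can be checked with a zero timeout, not the proposed
API or anything MessagePort offers today, and `search`/`autocomplete` are
made-up names.

```typescript
// Hypothetical stand-in for a port that supports a non-blocking poll.
interface TextMessage { text: string; }

class PollablePort {
  private queue: TextMessage[] = [];

  // Simulates the other side posting a message to this port.
  post(msg: TextMessage): void {
    this.queue.push(msg);
  }

  // Zero-timeout poll: the oldest pending message, or null if there is none.
  poll(): TextMessage | null {
    const msg = this.queue.shift();
    return msg === undefined ? null : msg;
  }
}

type SearchResult = { matches: string[] } | { restartWith: TextMessage };

// Prefix-searches `words`, polling the port every `pollEvery` items.
// Returns the matches, or the newer message that interrupted the search.
function search(port: PollablePort, words: string[], text: string,
                pollEvery: number = 2): SearchResult {
  const matches: string[] = [];
  for (let i = 0; i < words.length; i++) {
    if (i % pollEvery === 0) {
      const next = port.poll();
      if (next !== null) return { restartWith: next };
    }
    if (words[i].indexOf(text) === 0) matches.push(words[i]);
  }
  return { matches };
}

// Keeps restarting the search until one run completes without interruption.
function autocomplete(port: PollablePort, words: string[],
                      first: TextMessage): string[] {
  let current = first;
  for (;;) {
    const result = search(port, words, current.text);
    if ("matches" in result) return result.matches;
    current = result.restartWith; // newer text arrived; restart with it
  }
}
```

The point is only that the search stays a plain synchronous loop; the worker
never has to return to its event loop to notice the newer text.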

This isn't impossible with this restriction, but you do have to jump through a
few extra hoops.  It'd need a separate "cancel" port which can be accessed
synchronously; when a message shows up on it, cancel the search so the next
{text} message (on a regular port) can be received.

Actually, I don't think that would work, since the order of messages across
ports is unspecified...

An aside: it should always be possible to poll a message port, regardless
of whether you're a parent or child.  (It doesn't cause deadlocks since
it's nonblocking, and it allows the above scenario even if blocking is only
allowed from the parent side.)

> Unless you structure you code such that it's the responsibility of the
> consumer of the API to create the channel. That way the consumer of
> the API can choose if it wants to use blocking or non-blocking.
>

But then you still end up with a port which is heavily restricted in where
and how it can be passed around.  If you want to pass it to your
great-great-grandfather thread, you have to post it to your parent, who
posts it to his parent, who posts it to his parent.  Posting it directly
there isn't possible.  Passing it to siblings or uncles isn't possible at
all.

The same is true on the worker side; if it wants to pass its port to its
grandchild, it has to pass it to its child, who passes it again to its
child.  You have to carefully structure your messaging to always do this.

> And do note that even your proposal requires the async side of a
> message channel to be aware of that the other side might be using
> synchronous polling. If it wants to support the other side polling
> messages synchronously, it needs to take care not to pass its end of
> the channel to the "wrong" places.
>

You can never be 100% agnostic, but it's very easy to write code that works
with both without careful arrangements.  For example, if you're a dedicated
worker tree, and you work entirely within that tree, you can pass a pipe
around within your black box however you want without affecting whether
your user--a parent thread holding the other side of the pipe--can block on
its side.


On Thu, Sep 6, 2012 at 3:57 AM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:

> I think I prefer your approach over the proposals 1 and 2 because it makes it
> clear that it is all about different communication channel so it should be
> more obvious to the API user that data ordering can't be guaranteed.
>

getMessage guarantees data ordering within a port.  You receive messages in
the order they were posted to the channel.

(Ordering isn't guaranteed if you pass multiple ports to getMessage, but
that's how MessagePort already works--the order of messages across different
MessagePorts isn't guaranteed, since each port creates its own task source.)
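
To illustrate the distinction, here is a toy model, not real MessagePort code.
It assumes each port is just a FIFO queue and uses a made-up `pick` callback
to stand in for whatever unspecified scheduling decides which port is read
next; `interleave` and `inOrder` are hypothetical helpers.

```typescript
// Toy model: a message tagged with the port it came from and its post order.
type Tagged = { port: string; seq: number };

// Produces one possible delivery order for two per-port FIFO queues.
// `pick(step)` returning true means "read from port A next" when both
// ports have pending messages.
function interleave(a: Tagged[], b: Tagged[],
                    pick: (step: number) => boolean): Tagged[] {
  const out: Tagged[] = [];
  let i = 0;
  let j = 0;
  let step = 0;
  while (i < a.length || j < b.length) {
    const takeA = j >= b.length || (i < a.length && pick(step));
    out.push(takeA ? a[i++] : b[j++]);
    step++;
  }
  return out;
}

// True if the messages from `port` appear in increasing `seq` order.
function inOrder(msgs: Tagged[], port: string): boolean {
  const seqs = msgs.filter(m => m.port === port).map(m => m.seq);
  return seqs.every((s, k) => k === 0 || seqs[k - 1] < s);
}
```

Whatever `pick` does, each port's own messages stay in posting order; only the
interleaving between ports changes.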

-- 
Glenn Maynard
Received on Friday, 7 September 2012 00:19:16 UTC