Re: Sync API for workers from Rick Waldron on 2012-09-01 (public-webapps@w3.org from July to September 2012)

From: Rick Waldron <waldron.rick@gmail.com>
Date: Sat, 1 Sep 2012 16:30:15 -0400
To: Glenn Maynard <glenn@zewt.org>
Cc: David Bruant <bruant.d@gmail.com>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <8F8DF1FD78CD448FBD9F62969EFB55A6@gmail.com>
On Saturday, September 1, 2012 at 4:02 PM, Glenn Maynard wrote:

> On Sat, Sep 1, 2012 at 11:49 AM, David Bruant <bruant.d@gmail.com> wrote:
> > A Sync API for workers is being implemented in Firefox [1].
> > I'd like to come back to the discussions mentioned in comment 4 of the bug.
> > 
> > A summary of points I find important and my comments, questions and concerns
> > 
> > # Discussion 1
> > ## Glenn Maynard [2] Use case exposed:
> > Ability to cancel long-running synchronous worker task
> > "Terminating the whole worker thread is the blunt way to do it; that's
> > no good since it requires starting a new thread for every keystroke, and
> > there may be significant startup costs (eg. loading search data)."
> > => It's a legitimate use case that has no good solution today other than
> > cutting the task in smaller tasks between which a cancellation message
> > can be interleaved.
> 
> The solution proposed in 783190 seems more complex and less useful than the one Sicking and I discussed.  To summarize that one: add a getMessage(timeout) method, which consumes and returns the next message (causing onmessage to not be called[1]).  If timeout is nonzero, wait for a message for up to that duration; if zero the function never blocks (eg. peek for a waiting message).  If the timeout expires, returns null.
> 
> This turns the first example in 783190 into:
>
> worker.js:
>     var res = getMessage(timeout);
>
> page.html:
>     worker = new Worker(...);
>     setTimeout(function() { worker.postMessage(data, transferrable); }, 1000);
>
> I think this has several advantages.
> 
> - Mozilla's proposal effectively creates a separate, parallel messaging channel on the MessagePort; synchronous vs. asynchronous messages.  This is simpler: messages are just messages, and no new API is exposed outside of workers.
> - User messaging protocols are much simpler.  For example, take a long-running processing task in a worker which wants to be able to receive a "stop what you're doing, I have new information that affects your processing task" message.  With this proposal, the UI thread (or whatever) simply sends a message with the new information.  With Mozilla's proposal, it would have to wait for the thread to periodically send a "do you have anything to tell me?" message, in order to be able to send a response that the thread can receive synchronously.
> - Polling is much cheaper.  With Mozilla's proposal, you have to send a message to another thread, then sit and wait until you get a response.  If it's the UI thread, that may take many milliseconds, since it may be busy doing other things.  With this proposal, polling for new messages in a processing loop should never block due to activity in the other thread.
> - The resulting message protocols are more robust.  With the "query/response" approach, if someone fails to send a response, the worker will wait forever or time out.
> 
> 
> [1] We didn't come to agreement on whether it's better to return the message or to call onmessage synchronously, but that's a detail; whichever approach is used, it's possible to implement the other in script.
> 
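A minimal sketch of the polling pattern described above, assuming an in-memory queue in place of a real worker. The `MessageQueue` class and `runTask` function are hypothetical names introduced for illustration; `getMessage` is a proposed method, not a shipped API, and only its non-blocking (zero timeout) form is modelled here, because the blocking form cannot be expressed in a single-threaded sketch.

```javascript
// Hypothetical in-memory stand-in for a worker's incoming message queue.
class MessageQueue {
  constructor() {
    this.pending = [];
  }

  // Plays the role of postMessage() called from the other thread.
  post(message) {
    this.pending.push(message);
  }

  // Models the proposed getMessage(0): consume and return the next
  // message, or return null without blocking if none is waiting.
  getMessage() {
    return this.pending.length > 0 ? this.pending.shift() : null;
  }
}

// Linear processing loop that polls for a "stop" message before each step.
function runTask(queue, totalSteps, onStep) {
  for (let step = 0; step < totalSteps; step++) {
    const message = queue.getMessage();
    if (message !== null && message.type === "stop") {
      return { completedSteps: step, cancelled: true };
    }
    onStep(step, queue);
  }
  return { completedSteps: totalSteps, cancelled: false };
}

const queue = new MessageQueue();

// Simulate the UI thread posting "stop" while step 2 is running.
const result = runTask(queue, 10, (step, q) => {
  if (step === 2) {
    q.post({ type: "stop" });
  }
});

console.log(result);
```

The loop stays linear: the sender just posts a message whenever it likes, and the task notices it at its next poll without a query/response round trip.
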
> > # Discussion 2
> > ## Joshua Bell [5]
> > "This can be done today using bidirectional postMessage, but of course
> > this requires the Worker to then be coded in now common asynchronous
> > JavaScript fashion, with either a tangled mess of callbacks or some sort
> > of Promises/Futures library, which removes some of the benefits of
> > introducing sync APIs to Workers in the first place."
> > => What are these benefits?
> 
> The benefit of being able to write linear code.  I don't think anyone who's written complex algorithms in JavaScript can seriously dispute this as anything but a huge win.

I can seriously dispute this, as someone who is involved in research and development of JavaScript programming for hardware. Processing high-volume serial port I/O is relatively simple with streams and data events. It's just a matter of thinking differently about the program.
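
A minimal sketch of that event-driven style, assuming a plain Node.js `EventEmitter` as a stand-in for a serial port stream and newline-delimited frames as the wire format. Both are assumptions for illustration; the `port` object and the sample readings are hypothetical.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for a serial port: any object that emits "data" chunks.
const port = new EventEmitter();

const frames = [];
let buffered = "";

// Chunks arrive at arbitrary boundaries, so accumulate them and
// split out complete newline-terminated frames as they appear.
port.on("data", (chunk) => {
  buffered += chunk;
  let newline;
  while ((newline = buffered.indexOf("\n")) !== -1) {
    frames.push(buffered.slice(0, newline));
    buffered = buffered.slice(newline + 1);
  }
});

// Simulate three chunks that split frames mid-value.
port.emit("data", "temp=21");
port.emit("data", ".5\nhum");
port.emit("data", "=40\n");

console.log(frames);
```

The handler holds the only state (the partial-frame buffer), and the program never waits on the port.
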


Rick

 
> 
> > ## Glenn Maynard [7]
> > "I think this is a fundamental missing piece to worker communication.  A
> > basic reason for having Workers in the first place is so you can write
> > linear code, instead of having to structure code to be able to return
> > regularly (often awkward and inconvenient), but currently in order to
> > receive messages in workers you still have to do that."
> > => A basic reason for having workers is to move computation away from
> > window to a concurrent and parallel computation unit so that the UI is
> > not blocked by computation. End of story. Nothing to do with writing
> > linear code.
> 
> That's another good reason; it doesn't in any way reduce the importance of being able to write linear code, which *is* an important use case of workers.  It's precisely why we have APIs like FileReaderSync.
> 
> > If JavaScript as it is doesn't allow people to write code
> > as they wish, once again, it's a language issue. Either ask for a change in
> > the language or create a language that looks the way you want and
> > compiles down to JavaScript.
> 
> This has nothing to do with JavaScript/ECMAScript as a language.  The ugliness of having to implement algorithms in an event-based way is caused by the way the Web uses the language, not the language itself.
> 
> > I wish to add that adding a sync API (even if the sync aspect is
> > asymmetrical as proposed in [1]) breaks the event-loop run-to-completion
> > model of in-browser JavaScript which is intended to be formalized at
> > [concurr]. This model is what prevents web pages from ever freezing from
> > a deadlock. The proposed API preserves this, but creates the threat of
> > deadlocks for workers.
> 
> The deadlock you're picturing is already possible.  Two threads deadlocking on getMessage is equivalent to two threads returning to the event loop expecting a message from the other.  In both cases, they'll both stop forever; it's just two ways of causing the same problem.
> 
> That said, it's possible to restrict the API in a way that prevents this: only expose getMessage on DedicatedWorkerGlobalScope, not on MessagePort itself.  (That is, you can only block on messages from your creating thread.)  That'd be a harsh limitation, but if it came down to it I'd take it over not having this feature at all.  It looks like this is also the approach described in 783190, at least from the example in the first comment.
> 
> > Besides programmer convenience, few arguments have been advanced to
> > justify the breakage of the current concurrency model (I don't even
> > think the breakage has been mentioned at all!).
> 
> I don't believe it breaks the concurrency model significantly more than workers do inherently.  Whether you're receiving a message with a function call or by returning to the event loop and resuming with a timer, the timing of the message vs. the interval in which you're waiting for it is nondeterministic.
> 
> For that matter, all of this is conceptually equivalent to sending the message to a server, and having the worker peek for messages using sync XHR, just without the expensive network communication in the middle.
> 
> -- 
> Glenn Maynard
>
Received on Saturday, 1 September 2012 20:30:58 UTC