[whatwg] WebWorkers vs. Threads

On Wed, Aug 13, 2008 at 11:50 AM, Shannon <shannon at arc.net.au> wrote:
> The WebWorkers implementation (scary! hide your children!!):
>
> --- worker.js ---
> updateGlobalLa = function (e) {
>  var localLa = someLongRunningFunction( e );
>  workerGlobalScope.port.postMessage("set la = "+ localLa);
> }
> workerGlobalScope.port.AddEventListener("onmessage", updateGlobalLa, false);
> workerGlobalScope.port.postMessage("get la");
>
> --- main.js ---
> // global object or variable
> var la = 0;
>
> handleMessage = function(e) {
>  if (typeof e.match("set la"))
>     la = parseInt(e.substr(3));
>  } else if (typeof e.match("get la")) {
>     worker.postMessage(la.toString());
>  }
> }
> var worker = new Worker("worker.js");
> worker.AddEventListener("onmessage", handleMessage, false);
>
>
> Unlike the one-line example above we increment the global value based on
> some long-running calculation on its original value (rather than just add
> 1). This shows a more realistic use case for threading. Unfortunately our
> potentially dangerous one-liner is now an equally dangerous 18-line monster
> spread over 2 files and we STILL haven't solved the issue of another worker
> or the main context updating 'la' between our original postMessage query and
> our response.

You're right that if you try to use workers like threads, you end up
with threads. A message-passing style of implementation is easier.
In particular, you would not want to let the worker 'get' and
'set' individual variables. Instead, pass it messages about the work you
would like it to perform, and have it pass back messages about the result.
This is less flexible than threading but easier to reason about.

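// main.js
var la = 0; // what is with this variable name?
var worker = createWorker("worker.js");
// Only the main page ever writes to 'la'; the worker just reports a result.
worker.port.addEventListener("message", function(e) {
  la = parseInt(e.message, 10);
  alert(la);
}, false);
// Kick off the work by sending the current value as a string.
worker.port.postMessage(String(la));

// worker.js
// Receive a value, do the long computation, and send the result back.
// (postMessage here matches the method name used in the quoted example.)
workerGlobalScope.port.addEventListener("message", function(e) {
  var result = someLongRunningFunction(parseInt(e.message, 10));
  workerGlobalScope.port.postMessage(String(result));
}, false);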

A more realistic example would have the worker also doing synchronous
IO between chunks of longRunningFunctions, and then finally passing a
result back to the UI.
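
As a rough sketch of that shape, and not anything taken from the spec:
the worker body below alternates a blocking read with a slice of
computation and only reports once at the end. readChunkSync,
processChunk, runJob and the in-memory fakeStore are all hypothetical
names invented for this illustration; the blocking read is simulated so
the snippet can run outside a browser.

```javascript
// Hypothetical stand-in for synchronous IO inside a worker (for example
// a blocking network or database read). Here it reads an in-memory table.
var fakeStore = { 0: [1, 2, 3], 1: [4, 5], 2: [6] };
function readChunkSync(index) {
  return fakeStore[index] || null; // null means "no more data"
}

// Hypothetical stand-in for one slice of the expensive computation:
// sum of squares of the values in the chunk.
function processChunk(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i] * values[i];
  }
  return sum;
}

// Worker body: block on IO, compute, repeat, then report exactly once.
// postResult stands in for the worker's outgoing message call.
function runJob(postResult) {
  var total = 0;
  var index = 0;
  var chunk;
  while ((chunk = readChunkSync(index)) !== null) {
    total += processChunk(chunk);
    index++;
  }
  postResult(String(total)); // single message back to the UI
}

// Simulated UI side: collect whatever the worker sends.
var received = [];
runJob(function (message) { received.push(message); });
```

The point of the design is that blocking inside the worker is harmless,
because the UI only ever sees the one final message.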

> I should also point out that even this simple, naive and probably incorrect
> example still took me nearly 2 hours to write - largely due to the
> complexity of the WebWorkers spec and the lack of any decent examples.
> Honestly anyone who thinks this interface is supposed to make things easier
> is kidding themselves.

In my experience with Gears, it has not been difficult for people to
get started with workers. Many programmers are already familiar with
message-passing style concurrency, so it is easy for them to pick up
workers. But even for those who are not, workers are conceptually simple.

I'm not sure what happened in your case. It is true that the spec is
not written as a tutorial for developers, but as a rulebook for
implementors, and this makes it hard to grok for new developers. That
may have contributed.

There are a bunch of examples that Ian has kindly written at the very
top of the document. What was unhelpful about them?

> Regardless of the kind of Getters/Setters/Managers/Whatever paradigm you use
> in your main thread you can never escape the possibility that 2 workers
> might want exclusive access to an essential global object (ie, DOM node or
> global setting). So far I have not found any real-world programming language
> or hardware that can do this without some kind of side-effect or programming
> construct (ie, locks, mutexes, semaphores, etc...). What WebWorkers is
> really doing is requiring the author to write their own.

You are thinking about this wrong. Don't try to give two chunks of
your program concurrent access to shared state; with workers that is
impossible. Instead realize there is no shared state and factor your
program into two pieces -- one to do the heavy lifting and one to
manipulate the UI. Then create a protocol for them to communicate with
message passing.
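
To make that concrete, here is a minimal sketch of such a protocol. The
message shapes ({type, id, input} and {type, id, value}) and the names
handleRequest, applyReply and send are assumptions made up for this
example, not part of the spec. The channel is simulated in-process so
the snippet runs on its own; the JSON round-trip mimics the fact that
messages are copied rather than shared.

```javascript
// Stand-in for the expensive work (an assumption for this sketch):
// sums the integers 1..n.
function someLongRunningFunction(n) {
  var total = 0;
  for (var i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

// Worker side: turns a request message into a reply message.
// It never sees or touches the page's variables.
function handleRequest(request) {
  if (request.type === "compute") {
    return {
      type: "result",
      id: request.id,
      value: someLongRunningFunction(request.input)
    };
  }
  return {
    type: "error",
    id: request.id,
    reason: "unknown request type: " + request.type
  };
}

// UI side: the only code that writes to the page state.
function applyReply(state, reply) {
  if (reply.type === "result") {
    state.la = reply.value;
  }
  return state;
}

// In-process stand-in for the message channel. Both directions are
// copied through JSON, so neither side holds a reference to the
// other's objects.
function send(handler, message) {
  var copyIn = JSON.parse(JSON.stringify(message));
  return JSON.parse(JSON.stringify(handler(copyIn)));
}

var state = { la: 0 };
var reply = send(handleRequest, { type: "compute", id: 1, input: 4 });
applyReply(state, reply);
var badReply = send(handleRequest, { type: "set la", id: 2 });
applyReply(state, badReply); // ignored: only "result" replies change state
```

Because only applyReply writes to the state, two workers finishing at
the same time still update it one message at a time, in the order the
UI receives their replies.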

> I don't think I can stress enough how many important properties and
> functions of a web page are ONLY available as globals. DOM nodes, style
> properties, event handlers, window.status ... the list goes on. These can't
> be duplicated because they are properties of the page all workers are
> sharing. Without direct access to these the only useful thing a worker can
> do is "computation" or more precisely string parsing and maths.

You're forgetting the ability to do synchronous IO and the ability to
share workers between pages. Both of these benefits have been
explained in previous messages.

At this point I suspect we will have to agree to disagree. Perhaps
keep an eye on the spec as it continues to evolve. Perhaps it will
start to grow on you.

- a

Received on Wednesday, 13 August 2008 13:03:54 UTC