[whatwg] WebWorkers vs. Threads

Aaron Boodman wrote:

> There are a bunch of examples that Ian has kindly written at the very
> top of the document. What was unhelpful about them?
>
>   
After reading this I went back to look for them. What happened 
originally was that I followed one of the links and seeing only a single 
line of text and minimal worker activation code assumed wrongly that the 
demo was a placeholder for code to be added later. This made sense to me 
at the time because the spec is still draft and no browsers implement 
workers (so a demo wouldn't work anyway). It didn't occur to me at the 
time to manually type the address of the worker.js file into the browser 
to retrieve the actual worker implementation (probably because it was 4am).

>
> You're right that if you try to use workers like threads, you end up
> with threads. A more message-passing style implementation is easier.
> In particular you would not want to allow the worker to 'get' and
> 'set' individual variables, but pass it messages about work you would
> like it to perform, and have it pass back messages about the result.
> This is less flexible than threading but easier to reason about.
>   
>> Regardless of the kind of Getters/Setters/Managers/Whatever paradigm you use
>> in your main thread you can never escape the possibility that 2 workers
>> might want exclusive access to an essential global object (ie, DOM node or
>> global setting). So far I have not found any real-world programming language
>> or hardware that can do this without some kind of side-effect or programming
>> construct (ie, locks, mutexes, semaphores, etc...). What WebWorkers is
>> really doing is requiring the author to write their own.
>>     
>
> You are thinking about this wrong. Don't try to give two chunks of
> your program concurrent access to shared state; that is impossible.
> Instead realize there is no shared state and factor your program into
> two pieces -- one to do the heavy lifting and one to manipulate the
> UI. Then create a protocol for them to communicate with message
> passing.
>   

I understand this and I probably confused things by writing such a naive 
example. The point was lost in the resulting fuss about the 
implementation details. The point I was trying to make is that
separating code into "doers" and "thinkers" or UI/processing models is a 
luxury afforded to a limited scope of applications that don't require 
tight synchronisation or direct access to limited resources. I know you 
are aware of this but it forms a valid argument for implementing real 
threads. Message-passing frameworks are fine for certain tasks but quite 
useless or annoying for others (which is what I was trying to demonstrate).
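
For reference, the "doer"/UI split under discussion can be sketched roughly 
as below. This is only an illustration: createFakeWorker and sumHandler are 
hypothetical names, the "worker" is an in-process stand-in rather than a 
real Worker, and JSON is assumed as the string encoding (the draft only 
requires that messages be strings).

```javascript
// Hypothetical in-process stand-in for a worker: it only ever sees strings,
// mirroring the string-only messaging discussed in this thread.
function createFakeWorker(handler) {
  const worker = {
    onmessage: null, // set by the "page" side to receive replies
    postMessage(text) {
      // The worker side receives a string and may reply with a string.
      handler(String(text), (reply) => {
        if (worker.onmessage) worker.onmessage({ data: String(reply) });
      });
    },
  };
  return worker;
}

// "Worker" side: owns no page state, only does computation on request.
function sumHandler(message, reply) {
  const request = JSON.parse(message);
  if (request.type === "sum") {
    const total = request.values.reduce((acc, n) => acc + n, 0);
    reply(JSON.stringify({ type: "sumResult", total }));
  }
}

// "Page" side: keeps all UI state and only exchanges messages.
const results = [];
const worker = createFakeWorker(sumHandler);
worker.onmessage = (event) => results.push(JSON.parse(event.data));
worker.postMessage(JSON.stringify({ type: "sum", values: [1, 2, 3, 4] }));
```

The pattern fits a request with a small input and a small result. Every 
piece of state the worker needs has to be packed into the message.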

A better example of the need for global access is a real-world issue I 
run into regularly. That is walking the DOM (such as when fixing browser 
shortcomings or performing actions based on a tag class or attribute). 
So far this is the only task I have ever performed in Javascript that 
had a noticeable impact on the UI responsiveness. If I was ever going to
move something to a worker thread this would be it. In the current 
WebWorkers this is not an option nor even available via workaround (I 
doubt you can marshal a copy of the DOM across to a worker with any sort 
of efficiency). On the other hand traditional thread or coroutine 
implementations would not be so constrained. It wouldn't even matter if 
the DOM was read only.
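
A rough sketch of the difference, using plain objects as a stand-in for DOM 
nodes (the node shape and collectByClass are hypothetical, not the real DOM 
API, and JSON is assumed as the copying mechanism): a walk with direct 
access returns the page's own nodes, while a walk over a marshalled copy 
returns detached ones.

```javascript
// Plain objects standing in for DOM nodes (hypothetical shape, not the real DOM API).
const page = {
  tag: "body", className: "", children: [
    { tag: "div", className: "menu", children: [
      { tag: "a", className: "fixme", children: [] },
    ] },
    { tag: "p", className: "fixme", children: [] },
  ],
};

// Depth-first walk collecting every node whose class matches.
function collectByClass(node, className, found = []) {
  if (node.className === className) found.push(node);
  for (const child of node.children) collectByClass(child, className, found);
  return found;
}

// Direct access: the walk returns the page's own nodes, so edits are visible.
const live = collectByClass(page, "fixme");
live[0].className = "fixed";

// Message-passing stand-in: the other side only gets a serialized copy.
const copy = JSON.parse(JSON.stringify(page));
const detached = collectByClass(copy, "fixme");
```

Changing a node in `detached` would not touch `page`. Any result has to be 
sent back and applied by the main thread, after the whole tree has already 
been serialized once to get it across.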

It's well and good to insist on message-passing as the sole method of 
interaction and I accept it has many benefits. What I'm trying (and 
failing) to get across is that the class of applications that use 
message-passing and isolation in the manner of WebWorkers are also the
same class of applications that are generally wasteful, slow or 
difficult to implement in Javascript. Think about the kind of 
applications that use parallel "compute nodes" and you'll realise that 
98% don't exist outside of academia and laboratories due to 
synchronisation, network latencies and other issues that implementing 
Javascript workers won't solve. More importantly though there is a lack 
of general computing software that requires this model.

In contrast there are literally thousands of desktop applications that 
could be ported from C, Perl or Python to a shared data Javascript 
environment but I don't see how without a closer emulation of the 
typical threading model adopted by 90% of programming languages.

>   
>> I don't think I can stress enough how many important properties and
>> functions of a web page are ONLY available as globals. DOM nodes, style
>> properties, event handlers, window.status ... the list goes on. These can't
>> be duplicated because they are properties of the page all workers are
>> sharing. Without direct access to these the only useful thing a worker can
>> do is "computation" or more precisely string parsing and maths.
>>     
>
> You're forgetting the ability to do synchronous IO and the ability to
> share workers between pages. Both of these benefits have been
> explained in previous messages.
>   

Once again someone mentions synchronous IO. I'm unfamiliar with any 
blocking Javascript IO operations except those explicitly created by the 
author (and I generally disagree with their logic for doing so). XHR is 
non-blocking. Even imageObject.src = 'pic.jpg' is non-blocking. I'm 
still waiting for somebody to tell me what Javascript operations 
actually block the UI except where the author has made a conscious 
decision to do so; ie:

longRunningFunction()
vs.
setTimeout(longRunningFunction,0)
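
Spelled out as a runnable sketch (the body of longRunningFunction and the 
order array are made up for illustration): the direct call finishes before 
the next statement runs, while the setTimeout form returns immediately and 
the function runs later on the same thread.

```javascript
const order = [];

function longRunningFunction() {
  // Stand-in for real work; a busy loop that occupies the thread.
  let total = 0;
  for (let i = 0; i < 1e6; i++) total += i;
  order.push("work done");
  return total;
}

order.push("before direct call");
longRunningFunction();              // runs to completion before the next line
order.push("after direct call");

order.push("before deferred call");
setTimeout(longRunningFunction, 0); // queued; runs once the current script has finished
order.push("after deferred call");
```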

As for sharing workers between pages, this is a property of 
MessagePorts, not WebWorkers. I could easily create a coroutine, thread 
or even a setTimeout loop to achieve the same thing provided I only send
primitive data rather than object references (which is all MessagePorts 
allows anyway). WebWorkers makes this easier, yes, but so would a better
proposal. This isn't a matter of WebWorkers vs. nothing. It's about
whether WebWorkers' limitations, no matter how well intentioned, leave it
useful at all to web developers.

This discussion has helped me understand your reasoning behind 
webworkers but truthfully I always knew the general 'why' of it. What 
I'm trying to find out is whether anybody has a genuine need for a 
Javascript compute node or whether authors would be better served by 
threads or coroutines that manage a shared DOM according to the rules of 
normal multitasking paradigms that have served us since the first SMP
machines were built.

I've trawled through many sources of information since starting this 
discussion and the overall impression I get is:

a.) Nobody has ever created a successful wait-free, lock-free system for 
x86 hardware.
b.) No one solution to this problem has ever been suitable to more than 
a subset of parallel applications.
c.) Despite its faults, simple locking is currently the most common and
successful paradigm for multi-core environments.

Which leaves me with:

a.) WebWorkers solves a specific class of problems (multiple 
compute/logic nodes for mathematical and scientific applications)
b.) Threads solves another set of problems (multiple action nodes on a 
large common dataset for general computing)
c.) WebWorkers and Threads may not be mutually exclusive. A thread could 
probably host or interact with a WebWorker and vice-versa.

Which leaves me thinking there is a good argument for having both 
paradigms at some point rather than one or the other. Any thoughts on this?

> At this point I suspect we will have to agree to disagree. Perhaps
> keep an eye on the spec as it continues to evolve. Perhaps it will
> start to grow on you.
>   

To do that it would have to at minimum allow the passing of Javascript 
primitives. Booleans, Integers, Floats, Strings, Nulls and Arrays should 
be passed by value (removing any custom properties they might have been 
given). Marshalling everything through Unicode strings is a terrible idea.
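
A small sketch of the difference (copyByValue is a hypothetical helper, and 
the JSON round trip is only a convenient stand-in for a by-value copy, not 
something the draft specifies): a by-value copy keeps the types and drops 
custom properties, whereas string-only marshalling leaves the receiver with 
text to re-parse.

```javascript
// Hypothetical helper: pass a value "by value" by round-tripping it through JSON.
function copyByValue(value) {
  return JSON.parse(JSON.stringify(value));
}

const tagged = [1, 2.5, "three", true, null];
tagged.note = "custom property"; // extra property on an array

// By-value copy: element types survive, the custom property does not.
const received = copyByValue(tagged);

// String-only marshalling: every value arrives as text; the receiver must re-parse.
const asStrings = tagged.map(String);
```

In the second form the receiver cannot tell the string "true" from the 
boolean true without an agreed convention on top of the message format.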

Shannon

Received on Thursday, 14 August 2008 02:54:26 UTC