Re: Opening discussion on StreamWorker from Charles Pritchard on 2011-11-19 (public-webapps@w3.org from October to December 2011)

From: Charles Pritchard <chuck@jumis.com>
Date: Fri, 18 Nov 2011 18:01:19 -0800
To: Andrew Wilson <atwilson@google.com>
CC: Charles Pritchard <chuck@visc.us>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <4EC70DEF.5070806@jumis.com>

On 11/18/11 5:35 PM, Andrew Wilson wrote:
> On Thu, Nov 17, 2011 at 7:30 PM, Charles Pritchard <chuck@jumis.com> wrote:
>
>     On 11/17/2011 4:52 PM, Charles Pritchard wrote:
>
>         Currently, Web Workers provides a "heavy" scope for
>         multithreaded Web Apps to handle heavy data processing.
>
>         I'd like to draw on those specs and create a new lightweight
>         scope useful for various data processing tasks typically
>         associated with stream processing and GPUs.
>
>     Pseudo-code:
>     onmessage(data) { for(... data) { data[i] *= fancyness; };
>     postMessage(data); };
>
>     In doing this, could attach to CSS such as:   img { filter:
>     custom(url('basicpixelworker.js')); }.
>
>     The worker may only use postMessage once, and it must send back an
>     array of the same size.
>     There are no other options, no ways to pass a message to other
>     contexts, no File or IDB or other APIs.
>     The concept here is to be very restrictive. That way, no data is
>     leaked, and it behaves more like a WebGL shader (think GPGPU) than
>     our existing web worker context.
>
>     If it's rigid, we can get very good performance, high parallelism,
>     and modularity. We can also get quick implementation from vendors.
>     And they can decide when they want to optimize.
>
>
> Can you clarify what optimizations are enabled by these workers? It's 
> not clear to me that removing APIs makes starting up a worker any more 
> efficient, and I don't think significant efficiencies are enabled by 
> restricting workers to only sending/receiving a single message per 
> execution.

For the image filtering use case -- a restricted worker process would be 
as secure as WebGL is for pixel shaders.
That's the main reason for removing the APIs.

I don't think significant efficiencies are enabled by single get/post 
messages either. But it may make implementation of optimizations easier.
This is a proposal for a new worker type, a subset of existing 
worker behavior. It's not meant to alter existing workers.
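
To make the restrictions concrete, here is a rough sketch of the contract 
described above: one incoming message, exactly one postMessage, and a 
reply of the same size. Nothing here is a real API. The names 
runStreamWorker and pixelWorker, the handler signature, and the 
fancyness value are all made up for illustration, and the harness only 
simulates on the main thread what a host might enforce.

```javascript
// Hypothetical sketch only: StreamWorker is a proposal, not an implemented API.
const fancyness = 2; // assumed per-channel gain, chosen for illustration

// Worker body: a pure per-element transform with no access to other APIs.
function pixelWorker(data, postMessage) {
  for (let i = 0; i < data.length; i++) {
    data[i] *= fancyness; // Uint8ClampedArray clamps the result to 0..255
  }
  postMessage(data);
}

// Hypothetical host-side harness that enforces the proposed restrictions.
function runStreamWorker(handler, input) {
  let reply = null;
  const postMessage = (data) => {
    if (reply !== null) {
      throw new Error("postMessage may only be called once");
    }
    if (data.length !== input.length) {
      throw new Error("reply must be the same size as the input");
    }
    reply = data;
  };
  // Hand the worker a copy so it cannot touch the caller's buffer.
  handler(input.slice(), postMessage);
  if (reply === null) {
    throw new Error("worker did not post a reply");
  }
  return reply;
}

const pixels = new Uint8ClampedArray([10, 100, 200]);
const filtered = runStreamWorker(pixelWorker, pixels);
console.log(Array.from(filtered)); // [ 20, 200, 255 ]
```

Because the handler sees only its input and a single-use postMessage, 
a host would be free to run many such calls in parallel over slices of 
an image, which is the shader-like behavior I'm after.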

I can't speak to performance optimizations with expertise. I'd need to 
hunt down some experts in the field to give useful information.

A sufficiently "simple" JS program could be optimized to run on a GPU 
array, but that's not a short-term goal.

For working with simple parallelism:

Intel went one route, see: ParallelArray Data Structure, Elemental 
Functions:
https://github.com/RiverTrail/RiverTrail/wiki/API-Design

W16 went another route using STM (software transactional memory), noting 
"most effective optimizations were disabled to simplify implementation":
https://github.com/sheremetyev/w16/blob/master/README.md

I'm focused on pixel shaders as a use case. RiverTrail makes 
optimization opportunities fairly explicit by introducing new data types;
W16 makes minimal changes to V8 to try to enhance parallelism.

-Charles

Received on Saturday, 19 November 2011 02:01:53 UTC