[whatwg] Workers from Ian Hickson on 2008-07-18 (public-whatwg-archive@w3.org from July 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 18 Jul 2008 10:59:00 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0807181027510.11948@hixie.dreamhostps.com>

I have written a new WHATWG specification for Workers:

   http://www.whatwg.org/specs/web-workers/current-work/


On Wed, 30 May 2007, Maciej Stachowiak wrote:
> 
> The Worker Pools module also seems fairly straightforward: 
> <http://code.google.com/apis/gears/api_workerpool.html>. I suggest just 
> adding this mostly as-is if it proves useful (with the obvious 
> difference that the message callbacks would become DOM events).

I changed the API quite a bit from what Gears had, but Workers in the 
style of Gears' Worker Pools are now in HTML5.


On Thu, 20 Sep 2007, Aaron Boodman wrote:
> 
> I have done some thought on how to simplify the worker api, and I think 
> it could be as easy as:
> 
> var w = createWorker("foo.js");
> w.sendMessage("messageName", jsonStyleObject);
> w.addEventListener("messageName", function(e) {
>   // e.args is a jsonStyleObject that was sent from the worker.
> }, false);
> 
> Inside the worker:
> 
> worker.addEventListener("messageName", function(e) {
>   // e.args is a jsonStyleObject that was sent from the page.
> }, false);
> worker.sendMessage("messageName", jsonStyleObject);

With the new API this would be:

   var w = createWorker("foo.js");
   w.onload = function () {
     w.postMessage("data");
   };
   w.onmessage = function (e) {
     // e.message has the data
   };

Inside the worker (foo.js):

   onconnect = function (o) {
     o.messagePort.onmessage = function (e) {
       // e.message has the data
     };
     o.messagePort.postMessage("data");
   };

Right now we don't have structured data, but that's marked as an open 
issue in the spec for all the postMessage() methods.
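
Until structured data is specified, a page and a worker could still 
exchange JSON-style objects by serializing them to strings on either side 
of postMessage(). The following is only a minimal sketch, assuming a JSON 
encoder such as JSON.stringify is available to the script; packMessage and 
unpackMessage are hypothetical helper names, not part of the spec:

```javascript
// Hypothetical helpers; not part of the Workers spec.
// Wrap a message name and a JSON-style payload into a single string.
function packMessage(name, payload) {
  return JSON.stringify({ name: name, payload: payload });
}

// Reverse of packMessage(); returns null if the string is not a packed message.
function unpackMessage(text) {
  var parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return null;
  }
  if (parsed === null || typeof parsed !== "object" ||
      typeof parsed.name !== "string") {
    return null;
  }
  return parsed;
}

// Usage with the port API described above:
//   port.postMessage(packMessage("resize", { width: 640, height: 480 }));
//   port.onmessage = function (e) { var msg = unpackMessage(e.message); };
```

This also recovers the named-message style of the earlier proposal on top 
of a string-only channel.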


On Tue, 11 Dec 2007, Jim Jewett wrote:
> >
> > When you enter JS the UI thread gets blocked,
> 
> This is the real problem.  Is there a way to specify a yielding API? 
> *This* script would block until the storage was completed, but the rest 
> of the browser (possibly even other scripts for this very page) would 
> not be blocked?
> 
> It looks like it *can* be done by breaking the function into smaller 
> pieces, and passing the next piece in as a successCallback ... but is 
> there a way to make this split easier?

This could now be done using workers.


On Thu, 14 Feb 2008, Dimitri Glazkov wrote:
>
> Since postMessage API is looking more and more like the Gears worker 
> messaging API (or better), can we go one step further and introduce 
> workers into the HTML5, defined as invisible windows with limited 
> capabilities:
> 
> WorkerWindow openWorker(in DOMString url);

Yes.

> with:
> 
> interface WorkerWindow {
> 
>    // for consistency with Window
>    readonly attribute Window window;
>    readonly attribute Window self;

Done.

>    // caps
>    readonly attribute ClientInformation navigator;

Haven't added this yet, but only because I haven't defined it properly for 
normal windows yet either.

>    // session/local storage
>    readonly attribute Storage sessionStorage;
>    ...

sessionStorage makes no sense when you don't have sessions. Added 
localStorage though.

>    // database stuff
>    Database openDatabase(...)

Yes.

>    // to open new worker windows
>    WorkerWindow openWorker(in DOMString url);

Yes.

>    // messaging
>    void postMessage(...)

Yes.

>    // some events
>    attribute EventListener onabort;
>    attribute EventListener onload;
>    attribute EventListener onunload;
> }

In the worker we have onconnect (for when the connection starts) and 
onunload (for when it is closed brutally). The ports have onload (for when 
the connection starts), onmessage, onunload (for when the connection is 
dropped) and onerror (for when the script couldn't be loaded).


On Thu, 14 Feb 2008, Aaron Boodman wrote:
>
> Well, as long as you've brought it up, I was working on a proposal too:
> 
> http://code.google.com/p/google-gears/wiki/HTML5WorkerProposal

I believe all the features available in this idea are available in the 
spec I wrote too.


On Thu, 14 Feb 2008, Geoffrey Garen wrote:
> 
> Why call these "windows" at all? They seem to have no relationship to 
> physical windows, or the JavaScript "window" object.

It makes it easier to take code intended for the main window and move it 
to a worker if we use the same terminology.


> > WorkerWindow openWorker(in DOMString url);
> 
> Can I supply a URL to an HTML file here? Does the file load and parse as 
> an HTML document? Is the document accessible to the worker?

Right now it ignores MIME types and treats it as pure JS.


> Since the whole point of the worker is to do JavaScript work, should 
> this string be a script instead of a URL?

Scripts are resources with URLs.


> How do I pass data to a worker?

postMessage on the channel to the worker.


> Is there an API contract regarding synchronization and/or order of 
> execution?

There's no shared state.


> When is a worker considered loaded? Unloaded? Aborted?

The spec hopefully defines this in enough detail to be implemented.


On Thu, 14 Feb 2008, Aaron Boodman wrote:
> 
> However, I think that developers should be able to start sending 
> messages to workers immediately, before the worker has loaded. These 
> messages should be queued and delivered when the worker loads.

As written, this isn't possible: you have to wait for the onload event on 
the port before sending messages.

But really there's no reason why this should be the case. I can just as 
easily set up the ports so that they are immediately present, and then, if 
the script fails to load, fire the error event (and discard any messages 
that were preemptively sent). Should we do this?
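
One way to picture this alternative: the port exists immediately, buffers 
anything posted before the worker has loaded, and then either flushes the 
buffer or discards it. This is a minimal sketch in plain JavaScript, not 
spec text; QueuedPort, deliver, markLoaded and markFailed are hypothetical 
names used only for illustration:

```javascript
// Hypothetical model of a port that accepts messages before the worker loads.
// `deliver` stands in for whatever actually hands a message to the worker.
function QueuedPort(deliver) {
  this.deliver = deliver;
  this.state = "loading"; // "loading" | "loaded" | "failed"
  this.pending = [];
  this.onerror = null;
}

QueuedPort.prototype.postMessage = function (message) {
  if (this.state === "loaded") {
    this.deliver(message);
  } else if (this.state === "loading") {
    this.pending.push(message); // hold until the worker is ready
  }
  // In the "failed" state the message is silently dropped.
};

// Called when the worker script has loaded: flush the queue in order.
QueuedPort.prototype.markLoaded = function () {
  if (this.state !== "loading") return;
  this.state = "loaded";
  var queued = this.pending;
  this.pending = [];
  for (var i = 0; i < queued.length; i++) {
    this.deliver(queued[i]);
  }
};

// Called when the script could not be loaded: discard the queue, report error.
QueuedPort.prototype.markFailed = function () {
  if (this.state !== "loading") return;
  this.state = "failed";
  this.pending = [];
  if (typeof this.onerror === "function") this.onerror();
};
```

The cost of this design is that a caller only learns that its early 
messages were lost if it listens for the error event.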


On Tue, 19 Feb 2008, Scott Hess wrote:
> 
> It seems to me that this is an area where if you give an inch, the 
> developer wants another inch.  If you have something called "window", 
> then you're just moving things around - instead of saying "Where is my 
> window?", developers get to say "Why can't my window do X?"  Since this 
> is all new ground, it might be more reasonable to define the set of 
> things you want to have in your worker context, and then contrive to add 
> those things to your UI context.  That way you're explaining what is 
> there, rather than excusing what is not there.

Send people my way when they ask you these questions.


On Tue, 19 Feb 2008, Aaron Boodman wrote:
> 
> I'm not necessarily sold on making the worker context be the global 
> object. I always thought having the Window object be the global object 
> was a bit unfortunate, myself.
> 
> What if we had separate objects:
> 
> - the global scope (with all the typical JS globals, and maybe XMLHttpRequest)
> - workerContext (with all the worker stuff, plus cookies, location, etc)

We'd need something on the global object to point to the worker context... 
why add the extra level of indirection?


On Tue, 19 Feb 2008, Maciej Stachowiak wrote:
> 
> If XMLHttpRequest is one of the APIs available on background threads, 
> does that include its XML parsing/serialization features (responseXML 
> and the ability to pass a Document as the post data)?

It is, and it does not.


On Tue, 19 Feb 2008, Aaron Boodman wrote:
> 
> I think it should be spec'd with those features required. Gears probably 
> won't implement the XML part initially.

Nobody wanted to implement the XML part, so it's not in.


On Wed, 9 Jul 2008, Aaron Boodman wrote:
> 
> - synchronous network access

XMLHttpRequest can be sync and is on the worker threads.


> - storage access in general

Done.


> - synchronous db access

Not done yet, but on the cards. (This is more work since it involves a new 
API, not just repurposing an old one.)


> - access to a subset of the capabilities from the window.location
> object, for example the "href" property and the "reload" method. We
> have found that some workers want to reload themselves when they find
> they can no longer communicate with their origin server

I don't follow. Could you elaborate? What happens to all the communication 
channels that were set up to that worker if it reloads itself?


> - access to read and write cookies. We have found that some workers
> want to be able to modify the cookies for their origins

Not yet specified, as it is non-trivial, but I've noted the request. Is 
providing the 'cookie' attribute of the Document interface enough?


> - access to some sort of printline/console debugging facility

Not yet in the spec but noted; I'll add it in due course.


> I still think you should be able to pass JSON-style objects between
> workers without needing to do the serialization yourself.

Agreed; we want this for frame postMessage() too. I've added a note to the 
spec to this effect. It's non-trivial to spec since there are so many ways 
to end up with something more than just dictionary/array/number/string/ 
boolean data.
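
To illustrate why this is non-trivial, here is a rough sketch of a check 
for "just dictionary/array/number/string/boolean data". It is not a 
proposed algorithm and is not exhaustive (it ignores getters, for 
instance); isPlainData is a hypothetical name, and treating null as 
acceptable is an assumption:

```javascript
// Hypothetical check: is `value` made only of dictionary/array/number/
// string/boolean (and null) data, with no cycles?
function isPlainData(value, seen) {
  seen = seen || [];
  var type = typeof value;
  if (value === null || type === "string" || type === "boolean") return true;
  if (type === "number") return isFinite(value); // NaN, Infinity have no JSON form
  if (type !== "object") return false; // functions, undefined, ...
  if (seen.indexOf(value) !== -1) return false; // cyclic reference
  var proto = Object.getPrototypeOf(value);
  if (!Array.isArray(value) && proto !== Object.prototype && proto !== null) {
    return false; // Date, RegExp, DOM nodes, other host or custom objects
  }
  seen.push(value);
  var ok = true;
  var keys = Object.keys(value);
  for (var i = 0; ok && i < keys.length; i++) {
    ok = isPlainData(value[keys[i]], seen);
  }
  seen.pop(); // only ancestors count; repeated siblings are not cycles
  return ok;
}
```

Even this short version has to make decisions about cycles, non-finite 
numbers, and objects with unusual prototypes, each of which a spec would 
need to pin down.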


> Another idea (and something that is present in Gears) is the ability to 
> load workers from another origin. This provides a way to do controlled 
> cross-origin communication that is more lightweight than loading an 
> iframe and doing postMessage() with it.

This isn't allowed right now. Running script from another origin that was 
expecting to not be run except in a worker from the same origin could be 
problematic. However, you could create an iframe to that host, have the 
iframe create a worker, and then have the iframe pass a channel to the 
worker back to you and then self-destruct, leaving the worker from another 
origin being the only thing around.


On Wed, 9 Jul 2008, ddailey wrote:
> 
> a.. URLs: Workers should be spawned from URLs, not from strings, since 
> script rarely has access to its own source.
> 
> could you elucidate a bit more? Doesn't JavaScript usually have access 
> to its own source? I'm not sure when it doesn't. and isn't JavaScript 
> still the primary client side scripting vehicle in HTML5?

An executing JavaScript script doesn't actually have access to its own 
source, typically, no. I mean, it could get a hold of it using XHR or 
something, but that's not really useful.

Experience with Gears has been that people just want to run scripts from 
URLs, so that's what is provided here. (At a pinch, you can always just 
use a data: URI.)
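
As a rough sketch of the data: URI approach, assuming the script source is 
already held in a string: scriptToDataUrl is a hypothetical helper, and 
the media type is an assumption (as noted above, MIME types are currently 
ignored for worker scripts).

```javascript
// Build a data: URL whose body is the given script source.
// The media type is an assumption; the draft currently ignores MIME types.
function scriptToDataUrl(source) {
  return "data:text/javascript," + encodeURIComponent(source);
}

var workerSource =
  "onconnect = function (o) { o.messagePort.postMessage('hi'); };";
var url = scriptToDataUrl(workerSource);
// `url` can then be passed wherever a worker script URL is expected,
// e.g. createWorker(url) in the API described earlier in this message.
```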

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 18 July 2008 03:59:00 UTC