RE: Workers from Ian Hickson on 2008-07-20 (public-html@w3.org from July 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Sun, 20 Jul 2008 06:29:06 +0000 (UTC)
To: Justin James <j_james@mindspring.com>
Cc: public-html@w3.org
Message-ID: <Pine.LNX.4.62.0807200613390.12994@hixie.dreamhostps.com>

On Sun, 20 Jul 2008, Justin James wrote:
> > 
> > How would you communicate with such a mechanism?
> 
> I suppose it could take a second argument for a thread-safe messaging 
> object.

That's basically what MessagePorts are, and basically how createWorker() 
works, except that it creates the ports for you as a convenience:

> >    var port = createWorker(url);
> 
> Yes, I am sure that if I saw the world from the eyes of the Gears team, 
> that might seem like the best way to do it. But I'm from a more 
> traditional background, and frankly, the idea of passing a URL to a 
> script seems incredibly backwards and clumsy. Offhand, I cannot recall 
> ever seeing a system of any sort where you basically say, "execute the 
> code located by this reference".

It's exactly how the Web works today:

   <script src="url"></script>


> > There's no new object for workers in this proposal, actually, from the 
> > caller side. In fact, as far as I can tell what the current Workers 
> > spec does and what you propose are identical modulo the method name, 
> > as shown above.
> 
> Throughout the draft, it makes references to the "WindowWorker object". 

Right, that's the ECMAScript global object from the point of view of code 
running in the worker.


> I want to see a *function* for executing the work in a thread (even if 
> it is a method of the Window object), not a "WindowWorker object" with a 
> hidden/invisible "Execute" method.

The WindowWorker object isn't how you execute code, it's just the global 
object. Whatever mechanism we use, we have to have a global object.


> You also brought up the possibility of using the data: URI scheme to 
> replicate this functionality. Examination of RFC 2397 
> (http://tools.ietf.org/html/rfc2397) shows that this is a wholly 
> inadequate approach:
> 
> "The "data:" URL scheme is only useful for short values. Note that some 
> applications that use URLs may impose a length limit; for example, URLs 
> embedded within <A> anchors in HTML have a length limit determined by 
> the SGML declaration for HTML [RFC1866]. The LITLEN (1024) limits the 
> number of characters which can appear in a single attribute value 
> literal, the ATTSPLEN (2100) limits the sum of all lengths of all 
> attribute value specifications which appear in a tag, and the TAGLEN 
> (2100) limits the overall length of a tag."
> 
> Thanks in part to a REALLY shoddy spec (it doesn't define "for sure" 
> just how long the data can be) and in part to uneven implementations of 
> URI/URL length maximums amongst browsers (last I checked, at least), it 
> is impossible for a developer to rely upon data: to carry a script 
> reliably.

As far as I can tell, data: URLs of megabytes in length work fine in all 
major shipping browsers that support data: URLs. Can you give an example 
of a major browser that supports data: URLs but doesn't support long 
enough data: URLs to handle the script you want to handle? (And why would 
you have that script in text form instead of accessible from a URL?)
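
As a minimal sketch of what "carrying a script in a data: URL" means 
mechanically (this is not text from the Workers draft): the helper names 
scriptToDataUrl and dataUrlToScript are hypothetical, and the 
text/javascript media type is an assumption. The code only shows the 
encoding round trip; it does not demonstrate any browser's length limit.

```javascript
// Hypothetical helper: wrap script source in a data: URL.
// encodeURIComponent percent-encodes characters that are unsafe in a URL.
function scriptToDataUrl(source) {
  return "data:text/javascript;charset=utf-8," + encodeURIComponent(source);
}

// Hypothetical helper: recover the source from a URL built above.
// Everything after the first comma is the percent-encoded payload.
function dataUrlToScript(url) {
  const payload = url.slice(url.indexOf(",") + 1);
  return decodeURIComponent(payload);
}
```

Under the proposal discussed here, such a URL would then be passed where 
a normal script URL is expected, e.g. createWorker(scriptToDataUrl(src)).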



> Therefore, if you want to support any use cases (the ones I list below 
> are just a sampling of potential use cases) of 
> self-store/self-generating code (and why not, since ECMAScript is quite 
> clearly geared towards *precisely* that kind of work!), doing it the way 
> the Gears team suggests is not the right approach.

I respect your opinion, but practical experience from actual Web authors 
writing code with experimental Workers implementations has more weight. :-)


> Here's a few use cases for a proper asynchronous eval():
> 
> * Connections to the HTTP server are expensive. Why go back to the 
> server to request a script when it could have been delivered on the 
> initial document load to begin with?

data: URLs handle this fine.


> * Maybe the script/code to be executed is a result of user input. Why 
> post that code to a server, only to get it back to execute it (I show 
> below why the data: URL approach is not a good one)?

Actually modifying the code on the fly based on input is a terrible 
programming practice. That's what variables are for. Just invoke the 
script normally and pass a message.
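
A rough sketch of the "pass a message" alternative, under stated 
assumptions: the message shape ({op, values}), the operation names, and 
the handleMessage function are all hypothetical and not part of any 
draft. The point is only that the script stays fixed and the user input 
travels as data.

```javascript
// Fixed set of operations; a Map avoids accidental matches against
// inherited object properties such as "constructor".
const operations = new Map([
  ["sum", (values) => values.reduce((a, b) => a + b, 0)],
  ["max", (values) => Math.max(...values)],
]);

// Hypothetical worker-side handler: behaviour is selected by the
// message contents, never by generating or modifying source code.
function handleMessage(message) {
  const op = operations.get(message.op);
  if (op === undefined) {
    return { ok: false, error: "unknown operation: " + message.op };
  }
  return { ok: true, result: op(message.values) };
}
```

In a real worker this function would be called from the port's message 
handler and its return value posted back to the page.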


> * Anyone performing complex calculations (say, image editing; as 
> repellent as Web-based image editors are to me, a lot of folks are 
> working on them, it seems like).

Why would you not just have a computation worker running already and just 
pass it the computations you want it to do?


> * Anyone writing something to be really adaptive to user behavior will 
> be improved by self-generated code.

Do you have an example?


> * Anyone looking to leverage the idle CPUs of their clients, instead of 
> burning server CPU cycles with HTTP session initiations.

Just create the worker ahead of time and then start it and stop it using 
messages.
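
One way this could look inside the worker, as a sketch only: the 
createController function and the "start", "stop" and "work" message 
types are hypothetical names chosen for illustration. The controller is 
created once, and later messages switch it on and off without fetching 
or creating anything new.

```javascript
// Hypothetical worker-side controller, created ahead of time.
// `step` is the unit of work to run while the worker is started.
function createController(step) {
  let running = false;
  let completed = 0;
  return {
    onMessage(message) {
      if (message.type === "start") {
        running = true;
      } else if (message.type === "stop") {
        running = false;
      } else if (message.type === "work" && running) {
        step();           // only do work while started
        completed += 1;
      }
      // Report the current state back to the caller.
      return { running, completed };
    },
  };
}
```

A "work" message that arrives while the controller is stopped is simply 
ignored in this sketch; queueing it instead would be an equally 
reasonable choice.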


> One final note on the existing draft: I also find it problematic that 
> the locating and procurement of the script located with the URL 
> parameter does *not* occur in a separate thread. Considering that in 
> many cases, the HTTP connect/transmit/close cycle takes a huge portion 
> (if not the majority) of the total execution time, I think you lose most 
> of the intended benefit by having that be in the main thread.

I think you're misreading the spec. The fetching of the resource happens 
asynchronously.


> Also, the spec may need to have a cap/throttle built in, so developers 
> don't try to do things that force browser vendors to install a 
> cap/throttle which then breaks code.

Browsers are allowed to throttle the code as much as they like. We can't 
really do anything else since user agents run on such varied hardware that 
there's no way to really guarantee particular performance characteristics 
anyway.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Sunday, 20 July 2008 06:29:43 UTC