From: Ian Hickson <ian@hixie.ch>
Date: Wed, 27 Aug 2008 10:18:19 +0000 (UTC)
To: Justin James <j_james@mindspring.com>
Cc: public-html@w3.org
On Sun, 10 Aug 2008, Justin James wrote:
> >
> > I was going to add a note, but then I noticed that it actually already
> > says it twice -- once in the "create a worker" algorithm, and once in
> > the "run a worker" algorithm. What should I add to make it clearer?
>
> I think that making it step #1 in the enumerated list would do the
> trick. The last time I looked at it, I realized that the reason I
> kept missing it is that I was looking at the list to see what was
> happening, but it is in the paragraph before the list. Since it *is* a
> step in creating the worker, I think that adding it to the list would
> be reasonable.
Done.
> > > I agree that different platforms will have different cap/throttle
> > > levels. But the code authors need to be able to check to see if they
> > > hit it!
> >
> > Why?
>
> Because it is *very* common to take an "alternate" route if a thread
> will not run immediately. Some use cases:
>
> * For a critical task, if I've hit the limit, I may choose to *not*
> create a separate thread, and instead choose to run it in the primary
> thread:
>
> if (Window.WorkerLimitMet) {
> eval(GetURL(url));
> } else {
> createWorker(url);
> }
I don't really buy that example (you'll hit network limits long before CPU
limits for I/O tasks), and I can't really think of any realistic ones, so
I'm not convinced of this use case.
> * For a time-sensitive but unimportant task (say, putting up a
> "please wait" graphic in response to user input that will only be on
> the screen for a second or so), it is better to bypass the logic
> altogether than to wait on it:
>
> if (!Window.WorkerLimitMet) {
> createWorker(url);
> }
You'd never use a worker for UI-related tasks, since the workers can't get
to the UI. What realistic cases would there be for worker-level tasks that
are unimportant enough that you could just not do them?
> * Some applications may very well wish to limit or restrict user input
> until the queue can accept more work. For example:
>
> while (Window.WorkerLimitMet) {
> Form1.SubmitButton.Enabled = false;
> sleep(100);
> }
Users are quite capable of noticing when their computer is under load;
I don't think it makes sense to artificially limit how much work the
computer can do like this.
> If we can't dictate how many workers may run at once due to platform
> limits, then developers need to know when they are at those limits.
We don't provide a way for applications to know when they hit other
limits, and I don't really see this as special.
> Doing something onMouseOver() is a good example. If someone is wildly
> waving their mouse, better to start dropping it than to queue up
> workers. Think about this kind of code for a moment:
>
> onMouseOver = "createWorker(urlToScript)"
>
> user starts waving their mouse wildly...
I can't see _any_ valid reason to _ever_ create a worker from mouse
movements. What possible use case could that have? Just create one worker
and queue work up with it.
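The "one worker plus a queue" pattern suggested here can be sketched roughly as follows. This is illustrative only: in a browser the consumer would be a single Worker fed via postMessage(), but here the consumer is simulated as a plain function so the queueing logic itself is runnable, and all names are invented.

```javascript
// Sketch: rapid events append work to one queue; a single consumer
// drains it. No new worker is ever created per event.
const queue = [];

function enqueue(job) {
  queue.push(job);
}

function drain(handle) {
  // Process jobs in arrival order; returns the results for inspection.
  const out = [];
  while (queue.length > 0) {
    out.push(handle(queue.shift()));
  }
  return out;
}

// Simulated burst of mouse events:
for (let i = 0; i < 5; i++) enqueue({ x: i, y: i * 2 });
const results = drain(job => job.x + job.y);
// results: [0, 3, 6, 9, 12]
```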
> > It could also create a worker, but run it slowly.
>
> It *could*, but that would be supremely dumb behavior; each thread takes
> up space in memory, regardless of whether or not it is running.
Workers aren't _that_ expensive. If a worker is using 100% CPU on a core,
you'll run out of cores long before you run out of memory. Running workers
slowly (sharing cores) seems much more reasonable than not running them at
all.
> > I don't know how we would even go about testing such requirements.
>
> That's why I suggest we define what a throttling mechanism is allowed
> to do and what it is not allowed to do, and provide a mechanism for
> detecting throttling and an overload of createWorker() that accepts a
> timeout value. There is a reason why implementations of various
> "thread pool" type objects provide this functionality, and it isn't
> for the sake of needing extra documentation. :)
This may be something we'll have to add in future, but for now I really
don't see this as something critical enough for the first version.
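Pending any such API, a page can approximate throttle awareness in user space by tracking how many workers it has started against a self-chosen cap. This is a purely hypothetical sketch, not anything the draft defines; all names are invented:

```javascript
// Application-level worker cap (hypothetical): the page tracks its own
// outstanding workers against a limit it picks itself, approximating a
// "WorkerLimitMet" check without any UA support.
function makePool(limit) {
  let active = 0;
  return {
    tryAcquire() {            // returns false once the cap is reached
      if (active >= limit) return false;
      active++;
      return true;
    },
    release() {               // call when a worker reports completion
      if (active > 0) active--;
    }
  };
}

const pool = makePool(2);
const granted = [pool.tryAcquire(), pool.tryAcquire(), pool.tryAcquire()];
// granted: [true, true, false] — the third request is refused
pool.release();
// after release(), tryAcquire() succeeds again
```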
> > > For example:
> > >
> > > for (i = 0; i < 1000000; i++) {
> > > arrayOfMessagePorts[i] = createWorker(arrayOfURLs[i]);
> > > }
> > >
> > > Yes, I know that it is an extreme example (not really, if you want
> > > to do something to the individual pixels of an image in
> > > parallel...), but it illustrates the problem well.
> >
> > How is this different to, say:
> >
> > for (i = 0; i < 1000000; i++) {
> > arrayOfDocuments[i] = document.implementation.createDocument(null,
> > null, null);
> > }
> >
> > ...?
>
> It's the same at a technical level, but quite different from a
> programmer's viewpoint. A programmer, writing what you wrote, has the
> expectation that they are creating 1,000,000 objects, and knows it
> before the code even runs, and can make the decision to do it based on
> that information up front. A programmer writing what I wrote does not
> know in advance how many objects they are creating (they know that
> eventually 1,000,000 objects will have been created, but have no idea how
> many will be in scope at any given time), and depending on the UA, it
> may or may not run well. So it's a matter of perception, not a
> technical one.
I don't buy that. If you are firing 1000000 workers back to back, you
don't expect them to complete quickly enough that you only have 10 or so
active at a time. The whole point of workers is that you use them for
long computation; if they could return that quickly, then using workers
would just add unnecessary overhead.
> I'm stating that the spec needs to explicitly state that this is
> *undefined* and up to the UA.
It already does:
# User agents may impose implementation-specific limits on otherwise
# unconstrained inputs, e.g. to prevent denial of service attacks, to
# guard against running out of memory, or to work around
# platform-specific limitations.
-- http://www.whatwg.org/specs/web-workers/current-work/#conformance
> > This seems unlikely. All the use cases I can think of for running
> > many scripts will be running the same one (or few) scripts, not many
> > different ones.
>
> Since as far as I can tell, the only way to pass parameters to these
> scripts is via the URL itself, I think that you are missing out.
You can pass parameters using postMessage().
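Since message data at this stage is string-only (structured data passing came later, as noted at the end of this message), parameter passing via postMessage() amounts to serializing on one side and parsing on the other. A sketch, with both helper names invented for illustration — in a browser the page would call worker.postMessage(encodeParams(p)) and the worker's onmessage handler would call decodeParams(e.data):

```javascript
// Sketch: parameters travel in a postMessage() payload instead of the
// worker's URL. With string-only messages, JSON is one workable framing.
function encodeParams(params) {
  return JSON.stringify(params);
}

function decodeParams(message) {
  return JSON.parse(message);
}

const wire = encodeParams({ colorCode: '#ff0000', threshold: 10 });
const params = decodeParams(wire);
// params.colorCode: '#ff0000', params.threshold: 10
```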
> Let's say you want to do some image processing, so you're going through
> the pixels of an image:
>
> var sBaseURL = 'http://www.domain.com/scripts/pixelprocess.aspx?colorCode=';
>
> for (x = 0; x < image.width; x++) {
> for (y = 0; y < image.height; y++) {
> messagePorts[x][y] = createWorker(sBaseURL +
> image.pixels[x][y].color);
> }
> }
Good lord, don't do that.
Just shard the image into a few pieces and postMessage() the data from
each shard to a worker. Creating one worker per pixel of an image is
completely ridiculous.
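The sharding idea can be sketched as a band-splitting step. shardRows() is an invented helper, not part of any API; in a browser, each band's pixel rows would then go to one Worker via postMessage():

```javascript
// Sketch: split an image's rows into a handful of contiguous bands,
// one band per worker, instead of one worker per pixel.
function shardRows(height, shards) {
  const bands = [];
  const base = Math.floor(height / shards);
  let start = 0;
  for (let i = 0; i < shards; i++) {
    // Spread the remainder across the first (height % shards) bands.
    const size = base + (i < height % shards ? 1 : 0);
    bands.push({ start: start, end: start + size });
    start += size;
  }
  return bands;
}

const bands = shardRows(10, 4);
// bands: [{start:0,end:3}, {start:3,end:6}, {start:6,end:8}, {start:8,end:10}]
```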
> > This again is just a limitation of IE's implementation. (Though one
> > has to wonder, why would you generate a URL of more than 32KB?
> > Wouldn't it make more sense to generate the part that changes, and
> > then fetch the rest as part of an importScripts() call?)
>
> You wouldn't want to generate an *URL* of more than 32 KB, but you quite
> often have a *script* of more than 32 KB!
You wouldn't have 32KB of script that changes each time. You'd just have a
small bit of code changing each time, and the rest could be imported, and
not part of the URL.
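One way to read this: only the values that change need to be generated, while the bulk of the script stays static and is pulled in with importScripts() (a browser-side worker API). A sketch of generating such a tiny bootstrap — the file name 'pixelprocess-lib.js' and the process() function are invented for illustration:

```javascript
// Sketch: the generated worker script stays two short lines no matter
// how large the shared library grows; the 32 KB of shared logic lives
// in a static file fetched once via importScripts() inside the worker.
function makeBootstrap(params) {
  return [
    "importScripts('pixelprocess-lib.js');",   // the static shared code
    'process(' + JSON.stringify(params) + ');' // the only changing part
  ].join('\n');
}

const src = makeBootstrap({ colorCode: '#00ff00' });
// src is two short lines regardless of library size
```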
> I'm finding that an absolutely huge hole in this implementation is in
> passing initial parameters. The only way I am seeing to pass parameters
> in, is with the message port system. The example in the current draft
> involves a whopping TEN (10), yes, TEN (10) lines of code in order to
> extract TWO (2) parameters as initial input. That is simply
> unacceptable.
This will be solved when we allow structured data passing later.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 27 August 2008 10:18:34 UTC