- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 27 Aug 2008 10:18:19 +0000 (UTC)
- To: Justin James <j_james@mindspring.com>
- Cc: public-html@w3.org
On Sun, 10 Aug 2008, Justin James wrote:
> > I was going to add a note, but then I noticed that it actually already
> > says it twice -- once in the "create a worker" algorithm, and once in
> > the "run a worker" algorithm. What should I add to make it clearer?
>
> I think that making it step #1 in the enumerated list would do the
> trick. The last time I looked at it, I realized that the reason I kept
> missing it is that I was looking at the list to see what was happening,
> but it is in the paragraph before the list. Since it *is* a step in
> creating the worker, I think that adding it to the list would be
> reasonable.

Done.

> > > I agree that different platforms will have different cap/throttle
> > > levels. But the code authors need to be able to check to see if
> > > they hit it!
> >
> > Why?
>
> Because it is *very* common to take an "alternate" route if a thread
> will not run immediately. Some use cases:
>
> * For a critical task, if I've hit the limit, I may choose to *not*
>   create a separate thread, and instead choose to run it in the
>   primary thread:
>
>      if (Window.WorkerLimitMet) {
>         eval(GetURL(url));
>      } else {
>         createWorker(url);
>      }

I don't really buy that example (you'll hit network limits long before
CPU limits for I/O tasks), and I can't really think of any realistic
ones, so I'm not convinced of this use case.

> * For a time-sensitive, but unimportant task (say, putting up a
>   "please wait" graphic in response to user input that will only be on
>   the screen for a second or so), it is better to just bypass the
>   logic altogether than to wait on it:
>
>      if (!Window.WorkerLimitMet) {
>         createWorker(url);
>      }

You'd never use a worker for UI-related tasks, since the workers can't
get to the UI. What realistic cases would there be for worker-level
tasks that are unimportant enough that you could just not do them?

> * Some applications may very well wish to limit or restrict user input
>   until the queue can accept more work.
>   For example:
>
>      while (Window.WorkerLimitMet) {
>         Form1.SubmitButton.Enabled = false;
>         sleep(100);
>      }

Users are quite capable of noticing when their computer is under load;
I don't think it makes sense to artificially limit how much work the
computer can do like this.

> If we can't dictate how many workers may run at once due to platform
> limits, then developers need to know when they are at those limits.

We don't provide a way for applications to know when they hit other
limits, and I don't really see this as special.

> Doing something onMouseOver() is a good example. If someone is wildly
> waving their mouse, better to start dropping it than to queue up
> workers. Think about this kind of code for a moment:
>
>    onMouseOver = "createWorker(urlToScript)"
>
> ...and the user starts waving their mouse wildly...

I can't see _any_ valid reason to _ever_ create a worker from mouse
movements. What possible use case could that have? Just create one
worker and queue work up with it.

> > It could also create a worker, but run it slowly.
>
> It *could*, but that would be supremely dumb behavior; each thread
> takes up space in memory, regardless of whether or not it is running.

Workers aren't _that_ expensive. If a worker is using 100% CPU on a
core, you'll run out of cores long before you run out of memory.
Running workers slowly (sharing cores) seems much more reasonable than
not running them at all.

> > I don't know how we would even go about testing such requirements.
>
> That's why I suggest we define what a throttling mechanism is allowed
> to do and what it is not allowed to do, and provide a mechanism for
> detecting throttling, plus an overload of createWorker() that accepts
> a timeout value. There is a reason why implementations of various
> "thread pool" type objects provide this functionality, and it isn't
> for the sake of needing extra documentation. :)
This may be something we'll have to add in future, but for now I really
don't see this as something critical enough for the first version.

> > > For example:
> > >
> > >    for (i = 0; i <= 1000000; i++) {
> > >       arrayOfMessagePorts[i] = createWorker(arrayOfURLs[i]);
> > >    }
> > >
> > > Yes, I know that it is an extreme example (not really, if you want
> > > to do something to the individual pixels of an image in
> > > parallel...), but it illustrates the problem well.
> >
> > How is this different to, say:
> >
> >    for (i = 0; i <= 1000000; i++) {
> >       arrayOfDocuments[i] =
> >           document.implementation.createDocument(null, null, null);
> >    }
> >
> > ...?
>
> It's the same at a technical level, but quite different from a
> programmer's viewpoint. A programmer writing what you wrote has the
> expectation that they are creating 1,000,000 objects, knows it before
> the code even runs, and can make the decision to do it based on that
> information up front. A programmer writing what I wrote does not know
> in advance how many objects they are creating (they know that
> eventually 1,000,000 objects will have been created, but have no idea
> how many will be in scope at any given time), and depending on the UA,
> it may or may not run well. So it's a matter of perception, not a
> technical one.

I don't buy that. If you are firing 1,000,000 workers back to back, you
don't expect them to complete quickly enough that you only have 10 or
so active at a time. The whole point of workers is that you use them
for long computation; if they could return that quickly, then using
workers would just be adding unnecessary overhead.

> I'm stating that the spec needs to explicitly state that this is
> *undefined* and up to the UA.

It already does:

# User agents may impose implementation-specific limits on otherwise
# unconstrained inputs, e.g. to prevent denial of service attacks, to
# guard against running out of memory, or to work around
# platform-specific limitations.
 -- http://www.whatwg.org/specs/web-workers/current-work/#conformance

> > This seems unlikely. All use cases I can think of for running many
> > scripts will all be running the same one (or few) scripts, not many
> > many different ones.
>
> Since, as far as I can tell, the only way to pass parameters to these
> scripts is via the URL itself, I think that you are missing out.

You can pass parameters using postMessage().

> Let's say you want to do some image processing, so you're going
> through the pixels of an image:
>
>    var sBaseURL = 'http://www.domain.com/scripts/pixelprocess.aspx?colorCode=';
>
>    for (x = 0; x < image.width; x++) {
>       for (y = 0; y < image.height; y++) {
>          messagePorts[x, y] = createWorker(sBaseURL + image.pixels[x, y].color);
>       }
>    }

Good lord, don't do that. Just shard the image into a few pieces and
postMessage() the data from each shard to a worker. Creating one worker
per pixel of an image is completely ridiculous.

> > This again is just a limitation of IE's implementation. (Though one
> > has to wonder, why would you generate a URL of more than 32KB?
> > Wouldn't it make more sense to generate the part that changes, and
> > then fetch the rest as part of an importScripts() call?)
>
> You wouldn't want to generate a *URL* of more than 32 KB, but you
> quite often have a *script* of more than 32 KB!

You wouldn't have 32KB of script that changes each time. You'd just
have a small bit of code changing each time, and the rest could be
imported rather than being part of the URL.

> I'm finding that an absolutely huge hole in this implementation is in
> passing initial parameters. The only way I am seeing to pass
> parameters in is with the message port system. The example in the
> current draft involves a whopping TEN (10), yes, TEN (10) lines of
> code in order to extract TWO (2) parameters as initial input. That is
> simply unacceptable.

This will be solved when we allow structured data passing later.
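[To make the "shard the image and postMessage() the data" suggestion concrete, here is a rough, hypothetical sketch. The names (shardPixels, makeStubWorker, process.js) are invented for illustration, and the stub object stands in for a real `new Worker('process.js')` so the snippet runs outside a browser; a real worker would reply asynchronously from its own onmessage handler.]

```javascript
// Hypothetical sketch: split pixel data into a few contiguous shards
// and postMessage() each shard to one worker from a small, fixed pool,
// instead of creating one worker per pixel.

// Split an array of pixel values into at most nShards contiguous pieces.
function shardPixels(pixels, nShards) {
  var shardSize = Math.ceil(pixels.length / nShards);
  var shards = [];
  for (var start = 0; start < pixels.length; start += shardSize) {
    shards.push(pixels.slice(start, start + shardSize));
  }
  return shards;
}

// Stand-in for a real worker: in a browser this would be
// new Worker('process.js'), whose script would do this work in its own
// onmessage handler and post the result back. The stub replies
// synchronously so the sketch is runnable anywhere.
function makeStubWorker() {
  return {
    postMessage: function (msg) {
      // Parameters arrive as structured data, not baked into a URL.
      var processed = msg.pixels.map(function (p) { return 255 - p; });
      this.onmessage({ data: { start: msg.start, pixels: processed } });
    },
  };
}

var pixels = [0, 10, 20, 30, 40, 50, 60, 70]; // toy 8-"pixel" image
var pool = [makeStubWorker(), makeStubWorker(),
            makeStubWorker(), makeStubWorker()];
var shardSize = Math.ceil(pixels.length / pool.length);
var result = new Array(pixels.length);

shardPixels(pixels, pool.length).forEach(function (shard, i) {
  var worker = pool[i];
  worker.onmessage = function (e) {
    // Copy each processed shard back into place as replies arrive.
    for (var j = 0; j < e.data.pixels.length; j++) {
      result[e.data.start + j] = e.data.pixels[j];
    }
  };
  worker.postMessage({ start: i * shardSize, pixels: shard });
});
```

[The point of the sketch: the pool size stays fixed no matter how large the image is, and each shard's parameters travel with the message rather than in a per-worker URL.]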
--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 27 August 2008 10:18:34 UTC