[Efficient Script Yielding] - Clamping from Jason Weber on 2011-07-01 (public-web-perf@w3.org from July 2011)

From: Jason Weber <jweber@microsoft.com>
Date: Fri, 1 Jul 2011 22:27:13 +0000
To: James Robinson <jamesr@google.com>, "public-web-perf@w3.org" <public-web-perf@w3.org>
CC: Jatinder Mann <jmann@microsoft.com>
Message-ID: <8442F4DCA0FE304198740526F60E8D8B07050875@TK5EX14MBXC243.redmond.corp.microsoft.>
Of course, we can't get all authors to write ideal javascript code.  After all, we only had to add a clamp to setTimeout() and setInterval() because people were creating tight loops with timeouts and using 100% of available CPU (see https://bugzilla.mozilla.org/show_bug.cgi?id=123273, for example).  This new proposal provides another way for bad authors to recreate the problems that led to the clamp being necessary for setTimeout()/setInterval() but it doesn't seem to allow any new use cases that a good author could achieve today.


I agree that in an ideal world we wouldn't need to clamp setTimeout(0) to 4ms and could use that to address this pattern rather than needing to add a new API. Between 1990 and 2009 setTimeout was implicitly clamped (by either the browser or the operating system) to at least 10ms on Mac OS X and Windows. When a developer wrote, debugged, and tested code on their machine using setTimeout(0), they would receive at most 100 callbacks a second.

This led to a lot of incorrect assumptions. For example, on Windows setTimeout(0) would result in 64 callbacks a second, which is ~60fps. That meant developers could successfully use setTimeout(0) for script-based animations, which is of course a bug in the script but worked because of the underlying clamps. The previous clamps helped developers write a lot of buggy code that essentially ran at the refresh rate.

It's challenging to change the implicit rules around a core API like setTimeout after 20 years with so much buggy code out there. That's essentially what we're doing by reducing the clamps to 4ms. If the legacy throttles hadn't existed, I doubt that we would need to clamp setTimeout(0), which would make this discussion moot.

We don't believe that setImmediate should be clamped. User agents need to remain responsive and prioritize work accordingly. We shouldn't allow DoS scenarios and should be responsible citizens on the operating system. However, if a developer would like to use the entire core of a machine for computation on the UI thread and that webpage is in the foreground (implying user engagement), that should be allowed.




From: James Robinson [mailto:jamesr@google.com]
Sent: Wednesday, June 29, 2011 6:00 PM
To: Jason Weber
Cc: public-web-perf@w3.org; Jatinder Mann
Subject: Re: Efficient Script Yielding - First Editors Draft

Hi Jason,

Thanks for posting this draft.  Comments inline.
On Tue, Jun 28, 2011 at 3:17 PM, Jason Weber <jweber@microsoft.com> wrote:
One of the deliverables that we took on as part of the expanded Web Performance Working Group was to find a way to allow JavaScript applications to more efficiently yield control to the host (browser) and receive immediate callbacks when the host has completed processing pending work (for example, handling user input or document layouts).

We had the action item to summarize the motivations for the Efficient Script Yielding deliverable, which we're doing through this email, and to publish the first editors draft which can be found here: http://dvcs.w3.org/hg/webperf/raw-file/tip/specs/setImmediate/Overview.html

One important note to consider is that the setTimeout() clamp only applies to nested timeouts (see http://www.whatwg.org/specs/web-apps/current-work/multipage/timers.html#dom-windowtimers-settimeout step 4), so in the examples on that page the clamp does not apply and setTimeout(..., 0) actually has the exact same behavior that I think you are going for with setImmediate().  It's probably worth coming up with examples where there is a difference in behavior between setTimeout() and setImmediate() to illustrate the use cases more clearly.
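
To illustrate the nested-timeout rule referred to above, here is a minimal sketch of how a requested delay might be adjusted. The name effectiveTimeoutMillis and the boolean isNested flag are assumptions made for illustration, not specification text, and real user agents may apply the clamp only past a certain nesting depth.

```javascript
// Simplified model (an assumption for illustration): a timeout requested from
// inside another timer callback is raised to at least 4ms; others are not.
var NESTED_CLAMP_MILLIS = 4;

function effectiveTimeoutMillis(requestedMillis, isNested) {
  if (isNested && requestedMillis < NESTED_CLAMP_MILLIS) {
    return NESTED_CLAMP_MILLIS;
  }
  return requestedMillis;
}
```

Under this model a top-level setTimeout(..., 0) keeps its 0ms delay, while the same call made from inside a timer callback is treated as 4ms.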



As the working group has discussed, we believe there's an opportunity for a new API that allows web developers to (1) efficiently use the CPU without wasted cycles, (2) efficiently use the CPU in bursts to conserve power, (3) improve performance for end user scenarios, and (4) feels familiar to current APIs and programming patterns.

Today we see setTimeout and setInterval used for three primary patterns:


1)      Scheduling distant future callbacks (at least 500ms)

2)      JavaScript Based Animation

3)      Breaking apart long running scripts.

We think about #1 as the cases where the current setTimeout pattern works well. For example, you may want to update stock quotes or check for new email on a regular schedule. The problems with #2 are well understood and the working group has a great proposal in place with requestAnimationFrame. The "efficient yielding" deliverable is intended to more efficiently solve scenario #3.

Today, browsers don't process events while long running scripts are executing. This includes everything from UI updates, to user input, to end user features like spell checking. Even though the JavaScript may be manipulating the DOM or updating styles, these updates aren't presented to the user until after the script yields. To allow applications to remain responsive and to process visual changes, web developers are forced to sprinkle setTimeouts throughout their code allowing the browser to process pending work and then call script back at a future time.

The setTimeout callback frequency on Windows has traditionally been around 64 callbacks a second, which aligns with the 15.6ms timer interval. The HTML5 specification recommends 250 callbacks a second, which means a 4ms timer interval. This improves the perceived performance around pattern #3; however, it comes at the cost of actual performance (interference) and, more importantly, power consumption. We've measured this extensively on Windows, and decreasing the timer interval from 15.6ms to 4ms impacts battery life by around 22% for common customer scenarios. This is a hardware factor and not specific to Windows. And as we consider forward-looking hardware trends, we expect this to become more of an issue.
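
The relationship between the timer interval and the callback rate quoted above is simple arithmetic. As a small sketch (the function name callbacksPerSecond is an assumption for illustration):

```javascript
// Callbacks per second for a timer that fires once every intervalMillis.
function callbacksPerSecond(intervalMillis) {
  return 1000 / intervalMillis;
}

var legacyWindowsRate = callbacksPerSecond(15.6); // roughly 64 per second
var html5Rate = callbacksPerSecond(4);            // 250 per second
```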

There's an important distinction between the overall rate at which timers fire and the side effects of changing the WM_TIMER frequency.  Using more CPU will have an impact on battery life and power consumption equally in all operating systems, of course.  The effect of changing WM_TIMER's interval from 15.6ms to 4ms (or any other value) on battery life, etc, is a misfeature of Windows since changing this setting affects every process in the system.  This is not true for any other operating system.  I'd like to focus on the former issue alone.

Decreasing timer intervals may help with some patterns; however, it doesn't fully solve the underlying problem. If you think about the bubble sorting example, a developer doesn't actually know how long a single pass will take. To keep the browser responsive they yield frequently, often during each sorting pass. If a modern script engine can perform that pass in 1ms, that means 3ms or 75% of the CPU time is wasted and not available to the web developer.
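
The 75% figure follows from assuming one pass of work per 4ms timer tick. A minimal sketch of that arithmetic (idleFraction is a hypothetical helper, and the one-pass-per-tick model is a simplification):

```javascript
// Fraction of each clamp interval spent idle when one unit of work runs per tick.
function idleFraction(workMillis, clampMillis) {
  if (workMillis >= clampMillis) {
    return 0; // the work fills the whole interval, so nothing is wasted
  }
  return (clampMillis - workMillis) / clampMillis;
}

var wasted = idleFraction(1, 4); // 1ms of work per 4ms tick leaves 3ms idle
```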

I believe you're referring to this example by 'the bubble sorting example': http://ie.microsoft.com/testdrive/Performance/setImmediateSorting/Default.html - right?  This test is poorly authored since the individual steps it is running take significantly less time to run than the timer it is setting.  For example on my box processing_min at the end of the run is 1 indicating that no piece of work took more than 1 millisecond to complete.  It is, however, easy to tell how long a piece of script has run in javascript, which allows for far more efficient scheduling if the author wants to complete a piece of work in the shortest possible amount of time without starving all other tasks in the system.  Consider this pseudocode:

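var timesliceMillis = 10;

function doWorkWithPauses()
{
  // Schedule the next slice up front, so it is already due when this one yields.
  // Pass the function itself; calling it here would recurse without ever yielding.
  var timerId = window.setTimeout(doWorkWithPauses, timesliceMillis);
  var start = Date.now();
  // Work until nothing is left or this slice's time budget is used up.
  while (haveMoreWork() && Date.now() - start < timesliceMillis)
    doSomeWork();
  // Nothing left to do: cancel the slice scheduled above.
  if (!haveMoreWork())
    window.clearTimeout(timerId);
}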

This snippet will continue to execute chunks of work until the timeslice has been exceeded, and then yield.  Since the timer is set before work starts executing, when the script yields the timer is immediately eligible to fire and so the only pause is waiting for other tasks in the various task queues to be dispatched.  There's no need to yield to the operating system at all, in fact.
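
For readers who want to try the time-slicing pattern, here is a self-contained sketch that applies it to a plain array of work items. The names createSlicedRunner, processItem, and sliceCount are assumptions for illustration; the timers and now parameters exist only so that the scheduler and clock can be swapped out, and they default to the host's setTimeout/clearTimeout and Date.now.

```javascript
// Runs processItem over queue in slices of roughly timesliceMillis each,
// yielding to the host between slices.
function createSlicedRunner(queue, processItem, timesliceMillis, timers, now) {
  timers = timers || globalThis; // anything with setTimeout/clearTimeout
  now = now || Date.now;
  var slices = 0;

  function doWorkWithPauses() {
    // Arm the next slice first, so it is already due when this one yields.
    var timerId = timers.setTimeout(doWorkWithPauses, timesliceMillis);
    var start = now();
    slices++;
    while (queue.length > 0 && now() - start < timesliceMillis) {
      processItem(queue.shift());
    }
    if (queue.length === 0) {
      timers.clearTimeout(timerId); // all done: cancel the pending slice
    }
  }

  return {
    start: doWorkWithPauses,
    sliceCount: function () { return slices; }
  };
}
```

A page might call createSlicedRunner(items, handleItem, 10).start(), where items and handleItem are the caller's own data and callback. A single item that runs longer than the timeslice still blocks for its full duration, since the elapsed time is only checked between items.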

I've uploaded a version of the sorting demo with an option to properly schedule work here:
http://webstuff.nfshost.com/setimmediate/setImmediate%20API.html

I think you'll find that this method compares favorably to the setImmediate() proposal since it avoids jumping in and out of the javascript VM as often, but the page remains responsive since it does not block the task queue for longer than 10ms + the time to do one step of work (which is extremely short for this demo on competent JS engines) modulo garbage collection pauses, etc.  On my box, the test completes after yielding roughly 4 times, giving the animation a chance to update.  I picked 10ms timeslices arbitrarily; any timeslice value >=4ms would work fine in an HTML5 compliant browser.  Larger timeslices are slightly more efficient, but smaller timeslices allow more opportunities for animations and other script to run.

Of course, we can't get all authors to write ideal javascript code.  After all, we only had to add a clamp to setTimeout() and setInterval() because people were creating tight loops with timeouts and using 100% of available CPU (see https://bugzilla.mozilla.org/show_bug.cgi?id=123273, for example).  This new proposal provides another way for bad authors to recreate the problems that led to the clamp being necessary for setTimeout()/setInterval() but it doesn't seem to allow any new use cases that a good author could achieve today.

Are there any valid use cases that cannot be satisfied by existing techniques?  As it currently exists, it seems that if setImmediate() were to be implemented it is very likely that user agents would have to introduce a clamp on it for the same reason that there is a clamp on setTimeout(), at which point there's little reason to have the API at all.

- James


That's why we believe there's an opportunity for an API that allows web developers to (1) efficiently use the CPU without wasted cycles, (2) efficiently use the CPU in bursts to conserve power, (3) improve performance for end user scenarios, and (4) feels familiar to current APIs and programming patterns.

There has been a lot of discussion in the web community and ECMA working groups around the future of the JavaScript language and the possibility of moving the event queue into the JavaScript runtime itself. Those are interesting discussions, but they are forward-looking and outside the purview of this deliverable. We would like to leave the larger discussion for the experts on the ECMA side and focus this discussion around a targeted API that will solve the immediate problem and fit well into the HTML4/HTML5 patterns of today.

Here's the first draft of what a "setImmediate" API may look like. We know a few people have expressed concerns around the API name. The "set" portion of the name follows the setTimeout and setInterval naming conventions, and the "Immediate" portion was intended to communicate the immediate nature of the callback. This feels like a good initial name which we validated doesn't have compatibility implications across the top 1 million sites. We expect to iterate on the name based on feedback as the design evolves.
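
As a rough sketch of how a script might yield with such an API where it is available, the helper below uses setImmediate when the host provides it and otherwise falls back to setTimeout(fn, 0). The helper name yieldThen and the fallback strategy are assumptions for illustration and are not part of the draft.

```javascript
// Yield to the host, then run callback once pending work has been processed.
var yieldThen = (typeof setImmediate === "function")
  ? function (callback) { return setImmediate(callback); }
  : function (callback) { return setTimeout(callback, 0); };

// Example: finish the current step, let the host catch up, then continue.
var continued = false;
yieldThen(function () {
  continued = true; // runs in a later task, never synchronously
});
```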

We're looking forward to your thoughts on the first draft.

As an aside, we now have drafts for all three of the new APIs we brought into the performance working group charter this spring. It's cool to see Page Visibility, Request Animation Frame, and Efficient Script Yielding all starting to come together. Congratulations everyone.

Thanks,
Jatinder and Jason
Received on Friday, 1 July 2011 22:27:56 UTC