Re: QSA, the problem with ":scope", and naming

On 10/19/11 5:41 AM, Yehuda Katz wrote:
> Right. I'm representing the position of jQuery. Sizzle (John's selector
> engine, used by jQuery) chose to optimize certain common selectors after
> an analysis of selectors used by jQuery found that a large percentage of
> all selectors used were a few simple forms that were amenable to
> getElement(s)By* optimizations. I provided a link to the code that
> implements this earlier in this thread.

OK, so the code is at 
https://github.com/jquery/sizzle/blob/master/sizzle.js#L1150-1233 and 
does the following optimizations:

1)  If the selector is a vanilla tag, use getElementsByTagName instead. 
  This works, at least in Gecko, from userland due to the fact that 
getElementsByTagName results are cached until GCed.  If I test with a 
different argument for every call, the performance of querySelectorAll 
and getElementsByTagName is basically identical.[1]  Doing a similar 
optimization in the C++ code would be somewhat difficult, since you 
don't know when to drop your cache, but would be doable.  I would expect 
the common case here to be repeated queries for the same tag name.... 
In any case, in WebKit the caching effect is not present and the 
getElementsByTagName codepath is maybe 10% faster than querySelector.

2)  As above, for class names.  Works for the same reasons.  WebKit 
seems to have a getElementsByClassName cache too.

3)  Mapping Sizzle("body") to document.body.  This isn't a valid 
optimization for querySelector, since there can in fact be multiple 
<body> tags and since furthermore document.body can be a <frameset>.  A 
UA could try to optimize this case by keeping track of the <body> tags 
and such, at some cost on every DOM mutation.

4)  Mapping Sizzle("#id") with document a context to 
getElementById("id").  This isn't a valid optimization for querySelector 
because there can be multiple elements with the same id; in fact that's 
pretty common.  A UA can work around this in various ways (e.g. WebKit 
only makes the case when the id is unique take a fast path last I 
checked), though.

That sound about right?

-Boris

[1] I tested by running the script below against 
http://www.whatwg.org/specs/web-apps/current-work/ and got these results 
in Chrome:

querySelectorAll('div'): 915
Array.prototype.slice.call(getElementsByTagName('div'), 0): 948
querySelectorAll('div'+i): 938
Array.prototype.slice.call(getElementsByTagName('div' + i), 0): 847
querySelectorAll('.div'): 889
Array.prototype.slice.call(getElementsByClassName('div'), 0): 8
querySelectorAll('.div'+i): 910
Array.prototype.slice.call(getElementsByClassName('div' + i), 0): 824

and these in Firefox:

querySelectorAll('div'): 767
Array.prototype.slice.call(getElementsByTagName('div'), 0): 20
querySelectorAll('div'+i): 773
Array.prototype.slice.call(getElementsByTagName('div' + i), 0): 749
querySelectorAll('.div'): 817
Array.prototype.slice.call(getElementsByClassName('div'), 0): 8
querySelectorAll('.div'+i): 812
Array.prototype.slice.call(getElementsByClassName('div' + i), 0): 809

Script is:

     // Prevent dead-code optimizations
     var holder;
     function time(f) {
       var start = new Date;
       for (var i = 0; i < 100; ++i)
         holder = f(i);
       return (new Date) - start;
     }
     tests = [
       { description: "querySelectorAll('div')",
         func: function() { return document.querySelectorAll("div") } },
       { description: 
"Array.prototype.slice.call(getElementsByTagName('div'), 0)",
         func: function() { return 
Array.prototype.slice.call(document.getElementsByTagName("div"), 0); } },
       { description: "querySelectorAll('div'+i)",
         func: function(i) { return document.querySelectorAll("div" + i) 
} },
       { description: 
"Array.prototype.slice.call(getElementsByTagName('div' + i), 0)",
         func: function(i) { return 
Array.prototype.slice.call(document.getElementsByTagName("div"+i), 0); } },
       { description: "querySelectorAll('.div')",
         func: function() { return document.querySelectorAll(".div") } },
       { description: 
"Array.prototype.slice.call(getElementsByClassName('div'), 0)",
         func: function() { return 
Array.prototype.slice.call(document.getElementsByClassName("div"), 0); } },
       { description: "querySelectorAll('.div'+i)",
         func: function(i) { return document.querySelectorAll(".div" + 
i) } },
       { description: 
"Array.prototype.slice.call(getElementsByClassName('div' + i), 0)",
         func: function(i) { return 
Array.prototype.slice.call(document.getElementsByClassName("div"+i), 0); } }
     ];
     function runTest() {
       var results = []
       for (var i = 0; i < tests.length; ++i)
         results.push(tests[i].description + ": " + time(tests[i].func));
       document.write(results.join("<br>"));
     }

Received on Wednesday, 19 October 2011 15:18:17 UTC