- From: Boris Zbarsky <bzbarsky@mit.edu>
- Date: Tue, 28 Apr 2015 00:44:22 -0400
- To: Glen Huang <curvedmark@gmail.com>
- CC: public-webapps <public-webapps@w3.org>
On 4/27/15 11:27 PM, Glen Huang wrote: > When you say "var node = list[0];" walks the DOM until the first item is found, do you mean it only happens under the condition that some previous code has changed the DOM structure? Or that no one has ever asked the list for its [0] element before, yes. > If not, the returned list object will be marked as up-to-day, and accessing the first element is very cheap? In Gecko, yes. > I ask because in the first paragraph you said the returned list and returned first element is probably precomputed. In the case of the microbenchmark where you just ask for it repeatedly without changing the DOM, yes. > After UA has parsed html, it caches a hash table of elements with class names (also all element with ids, all elements with tag names, etc in different hash tables), keyed under the class names. At least in Gecko, that's not how it works. There _is_ a hashtable mapping ids to element lists used for getElementById that is populated eagerly. There is also hashtable mapping the pair (class string, root) to an element list that's used by getElementsByClassName and is populated lazily. Likewise, a hashtable mapping the pair (tagname, root) to an element list that's used by getElementsByClassName; this one is also populated lazily. > When getElementsByClassName() is called, and the DOM hasn't been modified, it simply creates a list of elements with that class name from the hash table. No, it just gets the list pointer from the hashtable (if any) and returns it. That is, getElementsByClasName("foo") === getElementsByClassName("foo") tests true. If there is no list pointer in the hashtable, an empty list is created, stored in the hashtable, and returned. > When the first element is accessed from that list, and the DOM still isn't modified, the element is returned directly. If the list has computed it before. If not, it walks the DOM until it finds its first element, then adds that one element to the list and returns it. > The hash table is kept in sync with the DOM when it's modified. The id hashtable is. The class/tagname hashtables aren't kept in sync per se, since they're lazily populated anyway, but the lists stored in them may need to be marked dirty. > And if the DOM is changed after the list is returned but before it's accessed Or before the list is returned. The order really doesn't matter here; what matters is whether the DOM is changed after the previous access, if any. > Why can't querySelector benefit from these hash tables? It could, somewhat. querySelector with an id selector does in fact benefit from the id hashtable. For the specific case of querySelector with a class selector, we _could_ internally try to optimize a bit, especially for the class case (the tag name case is a bit harder because, for example, querySelector("foo") and getElementsByTagName("foo")[0] can return different nodes depending on the value of the string "foo" and whether it contains any ':' characters and whatnot). > I currently feel the urge to optimize it myself by overriding it with a custom function which will parse the passed selector, and if it's a simple selector like "div", ".class", "#id", call the corresponding getElement*() function instead. Then you'll end up with incorrect behavior in some cases, which you may of course not care about. > Why can't UAs perform this for us? To some extent they could. In practice this hasn't come up as a bottleneck in anything we've profiled so far, so people have avoided adding what seems like unnecessary complexity, but if there's a real-life example (not a microbenchmark) where this is being a problem that would certainly help a lot with getting this sort of thing on the radar. > If my mental model is correct It's not quite. > The only price it pays is parsing the selector. Not even that, possibly; at least Gecko has a cache of parsed selectors. > Is it because authors don't use querySelector often enough that UAs aren't interested in optimizing it? Or more precisely don't use it in ways that make it the performance bottleneck. -Boris
Received on Tuesday, 28 April 2015 04:44:52 UTC