Re: Asymptotic computational complexity bounds for DOM methods

Maciej Stachowiak wrote:
>>>> Mozilla's implementation of Node.nextSibling is O(N) in the number of
>>>> (prior) siblings of the element.
>>>
>>> Are you sure? I tried this test, with varying numbers of spans in the
>>> div. With an O(N) implementation of nextSibling, it should be taking O
>>> (N^2) time in the number of spans, but the scaling seems to be linear
>>> in Firefox 1.5 as expected. With ~2000 spans I get a time of 134,
>>
>> So my knowledge turns out to be a little out of date.  There are some
>> caching optimizations to make it faster in some cases:
>>   https://bugzilla.mozilla.org/show_bug.cgi?id=240884
>> I haven't looked at those in detail, though, so I'm not quite sure
>> what's needed to defeat them, but it's still O(N) for at least some use
>> cases.
>>
>> Also, the coefficient for the O(N) term could be really small, since
>> searching an array is a very simple operation; I suggest trying bigger
>> numbers.
> 
> These numbers are big enough to show a significant difference in the
> times, and the difference shows linear scaling, not quadratic, so the
> underlying nextSibling operation is definitely O(1), at least in this case.
> 
> I also tried the "nextToLast" case with 160,000 child nodes and I still 
> got constant time.

The code in Mozilla is O(N), but we have a caching mechanism that makes
it O(1) in the cases you should hopefully hit most of the time.
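
Roughly, the idea is something like the following (a sketch in JS rather
than the real C++, with made-up names like ChildArray and cachedIndex):

// Sketch of a cached-index child array (illustration only, not the
// actual Gecko code). A plain scan is O(N), but remembering the index
// of the last hit makes a forward walk, i.e. repeated .nextSibling
// lookups, O(1).
function ChildArray() {
  this.children = [];
  this.cachedChild = null;
  this.cachedIndex = -1;
}

ChildArray.prototype.indexOfChild = function (child) {
  // Fast paths: same child as last time, or the one right after it.
  if (child === this.cachedChild) {
    return this.cachedIndex;
  }
  if (this.cachedIndex >= 0 &&
      this.children[this.cachedIndex + 1] === child) {
    this.cachedChild = child;
    return ++this.cachedIndex;
  }
  // Slow path: O(N) scan, then remember where the child was found.
  for (var i = 0; i < this.children.length; i++) {
    if (this.children[i] === child) {
      this.cachedChild = child;
      this.cachedIndex = i;
      return i;
    }
  }
  return -1;
};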

I can't really explain why you don't get a slowdown if you insert
thousands of nodes and then do

document.getElementById("nextToLast").nextSibling

What happens if you first do

getElementsByTagName('div')[0].firstChild.nextSibling

and then do the timing? If that's still fast, then my guess is that it's
still O(N) but the constant factor is really small (it's just a linear
search in an array).
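
In script form, the idea would be something like this (just a sketch,
reusing the "nextToLast" id from your test; alternating the two lookups
keeps the cached index away from the end on every iteration, not just
the first one):

// Hypothetical test: move the cache to the front of the child list,
// then immediately look up a sibling near the end, and repeat.
var div = document.getElementsByTagName('div')[0];
var nearEnd = document.getElementById("nextToLast");

var startTime = new Date();
for (var i = 0; i < 1000; i++) {
  div.firstChild.nextSibling;  // pulls a single-index cache to the front
  nearEnd.nextSibling;         // then forces a lookup near the end
}
var endTime = new Date();
alert(endTime - startTime);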

> I tried reading the code for nsIContent and I think the cache could be 
> defeated by doing an insert but I couldn't get this to happen either. I 
> will admit to being somewhat baffled by the code.

Hehe, we have to keep up our reputation for having complex code, right :)
If you're interested, the code lives in nsAttrAndChildArray::IndexOfChild:

http://lxr.mozilla.org/mozilla/source/content/base/src/nsAttrAndChildArray.cpp#214

Inserting child nodes will not defeat the cache. Getting .nextSibling
will be a bit slower, but it's still O(1). You can defeat the cache by
getting the nextSibling of a random child node. Something like:

var nl = document.getElementsByTagName('div')[0].childNodes;
var startTime = new Date();
for (var i = 0; i < 1000; i++) {
   // Random access defeats the "last index" cache on every iteration.
   nl[Math.floor(Math.random() * nl.length)].nextSibling;
}
var endTime = new Date();
alert(endTime - startTime);

You can also defeat the cache by walking the nextSibling chains of 6
long (> 15 children) child lists at the same time. However, that will be
fixed in Firefox 3, where we'll use an improved caching algorithm (you'd
have to walk 129 child lists at once, and not even then would you
completely defeat it).
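
For example, something like this (just a sketch, assuming a page full of
long <ul> lists with more than 15 children each) should hit the slow
path on every step with the current cache:

// Sketch: round-robin a forward walk over many long child lists at
// once. With only a handful of cache slots, touching more lists than
// there are slots evicts the cached index before it can be reused,
// so every .nextSibling falls back to the O(N) scan.
var lists = document.getElementsByTagName('ul');
var cursors = [];
for (var i = 0; i < lists.length; i++) {
  cursors.push(lists[i].firstChild);
}
var moved = true;
while (moved) {
  moved = false;
  for (var j = 0; j < cursors.length; j++) {
    if (cursors[j]) {
      cursors[j] = cursors[j].nextSibling;  // one step per list per round
      moved = true;
    }
  }
}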

> Anyway, it's obvious there is a lot of variation here in design 
> approaches, so maybe there is no good guarantee to give content authors.

One thing we should do is make sure that no operation has unreasonably
bad complexity given a reasonably simple implementation. O(N^2)
complexity should be avoided wherever possible, and anything slower than
that should be forbidden.

And for methods with bad complexity, we could mention this in the spec
and even give optimization hints for authors.
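
For example, a hint could point out that walking the sibling chain
directly avoids whatever indexed lookup childNodes[i] costs in a given
implementation (parent and process below are just placeholders):

// Potentially O(N^2) overall if childNodes[i] is O(i) in the engine:
for (var i = 0; i < parent.childNodes.length; i++) {
  process(parent.childNodes[i]);
}

// O(N) no matter how childNodes is implemented:
for (var child = parent.firstChild; child; child = child.nextSibling) {
  process(child);
}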

/ Jonas

PS. I'd be curious to know how Safari avoids O(N) in either .nextSibling
or .childNodes[x].
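
One way to get O(1) for .nextSibling (just a guess at the general
approach, not a claim about what Safari actually does) is to store
siblings as a doubly linked list, so that reading .nextSibling is a
single pointer read; .childNodes[x] would then need its own cache to
avoid being O(x):

// Sketch of a linked-list child representation (made-up names, not
// WebKit code). nextSibling/previousSibling are plain fields, so
// reading them never involves a search. Indexed access such as
// childNodes[x] would still need a walk or a separate index cache.
function LinkedNode(name) {
  this.name = name;
  this.firstChild = null;
  this.lastChild = null;
  this.nextSibling = null;
  this.previousSibling = null;
}

LinkedNode.prototype.appendChild = function (child) {
  if (this.lastChild) {
    this.lastChild.nextSibling = child;
    child.previousSibling = this.lastChild;
  } else {
    this.firstChild = child;
  }
  this.lastChild = child;
};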

Received on Tuesday, 28 February 2006 21:29:44 UTC