[Bug 10801] Limit the number of iterations in the loops in the AAA

http://www.w3.org/Bugs/Public/show_bug.cgi?id=10801

Henri Sivonen <hsivonen@iki.fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |jgraham@opera.com,
                   |                            |jonas@sicking.cc,
                   |                            |w3c@adambarth.com
         Resolution|NEEDSINFO                   |

--- Comment #2 from Henri Sivonen <hsivonen@iki.fi> 2010-10-13 12:32:22 UTC ---
(In reply to comment #1)
> I considered this when first writing the parser (it's not just the AAA where
> you really want limits — it's actually reasonable to put limits on the depth of
> the stack in general), but I was of the opinion then that that would fall under
> the "hardware limitations" clause and be left up to the UAs, since the limits
> would grow over time.

It's not clear that we'd want to increase the limits over time in order to get
diminishing returns from supporting even crazier legacy content as the time
goes on.

> Also, there are different ways to handle limits. The most clever way to handle
> an infinite depth of <font>s, for example, is probably to drop some of the
> <font>s from the middle of the stack, preferably those with no attributes, or
> at least with no event handlers and no IDs.

The way both Gecko and WebKit currently handle AAA limits is that the inner
loop and the outer loop have a cap on max iterations per loop entry. If the max
is hit, the code exists the loop without any fancy cleanup.

> It might make sense to suggest places for limits and how to handle them,
> though, especially if it is important for interop. If you have good suggestions
> for what to do there, please reopen the bug. I'd be happy to consider them.

Unless there's a good reason to do something more fancy than what Gecko and
WebKit do, I suggest picking a number for the max iterations of the outer loop
per entry to the loop and another number for the max iterations of the inner
loop per loop entry.

Philip ran an instrumented parser over 422814 pages that parsed successfully.
Here's an analysis of that data:

maxInnerLoopEntries (cutoff: 0.999900)
0.9110: <= 0
0.9940: <= 1
0.9989: <= 2
0.9995: <= 3
0.9997: <= 4
0.9998: <= 5
0.9998: <= 6
0.9999: <= 7
0.9999: <= 8
Max: 114

maxOuterLoopEntries (cutoff: 0.999900)
0.0506: <= 0
0.9110: <= 1
0.9677: <= 2
0.9813: <= 3
0.9889: <= 4
0.9953: <= 5
0.9978: <= 6
0.9989: <= 7
0.9994: <= 8
0.9996: <= 9
0.9998: <= 10
0.9998: <= 11
0.9999: <= 12
0.9999: <= 13
Max: 88

This means that 5% of the pages didn't run the AAA at all. 99.53% of the pages
ran the outer loop 5 or fewer times. The largest number of outer loop
iterations was 88.

In Gecko and WebKit, the limits are 10 iterations for the inner loop and 10
iterations for the outer loop. It seems to me it would be safe to pick e.g. 8
iterations for the outer loop and and 3 for the inner.

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.

Received on Wednesday, 13 October 2010 12:32:27 UTC