- From: <bugzilla@jessica.w3.org>
- Date: Wed, 13 Oct 2010 12:32:22 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10801 Henri Sivonen <hsivonen@iki.fi> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED CC| |jgraham@opera.com, | |jonas@sicking.cc, | |w3c@adambarth.com Resolution|NEEDSINFO | --- Comment #2 from Henri Sivonen <hsivonen@iki.fi> 2010-10-13 12:32:22 UTC --- (In reply to comment #1) > I considered this when first writing the parser (it's not just the AAA where > you really want limits — it's actually reasonable to put limits on the depth of > the stack in general), but I was of the opinion then that that would fall under > the "hardware limitations" clause and be left up to the UAs, since the limits > would grow over time. It's not clear that we'd want to increase the limits over time in order to get diminishing returns from supporting even crazier legacy content as the time goes on. > Also, there are different ways to handle limits. The most clever way to handle > an infinite depth of <font>s, for example, is probably to drop some of the > <font>s from the middle of the stack, preferably those with no attributes, or > at least with no event handlers and no IDs. The way both Gecko and WebKit currently handle AAA limits is that the inner loop and the outer loop have a cap on max iterations per loop entry. If the max is hit, the code exists the loop without any fancy cleanup. > It might make sense to suggest places for limits and how to handle them, > though, especially if it is important for interop. If you have good suggestions > for what to do there, please reopen the bug. I'd be happy to consider them. Unless there's a good reason to do something more fancy than what Gecko and WebKit do, I suggest picking a number for the max iterations of the outer loop per entry to the loop and another number for the max iterations of the inner loop per loop entry. Philip ran an instrumented parser over 422814 pages that parsed successfully. Here's an analysis of that data: maxInnerLoopEntries (cutoff: 0.999900) 0.9110: <= 0 0.9940: <= 1 0.9989: <= 2 0.9995: <= 3 0.9997: <= 4 0.9998: <= 5 0.9998: <= 6 0.9999: <= 7 0.9999: <= 8 Max: 114 maxOuterLoopEntries (cutoff: 0.999900) 0.0506: <= 0 0.9110: <= 1 0.9677: <= 2 0.9813: <= 3 0.9889: <= 4 0.9953: <= 5 0.9978: <= 6 0.9989: <= 7 0.9994: <= 8 0.9996: <= 9 0.9998: <= 10 0.9998: <= 11 0.9999: <= 12 0.9999: <= 13 Max: 88 This means that 5% of the pages didn't run the AAA at all. 99.53% of the pages ran the outer loop 5 or fewer times. The largest number of outer loop iterations was 88. In Gecko and WebKit, the limits are 10 iterations for the inner loop and 10 iterations for the outer loop. It seems to me it would be safe to pick e.g. 8 iterations for the outer loop and and 3 for the inner. -- Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
Received on Wednesday, 13 October 2010 12:32:27 UTC