- From: <bugzilla@jessica.w3.org>
- Date: Wed, 13 Oct 2010 12:50:26 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10802

Henri Sivonen <hsivonen@iki.fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |jgraham@opera.com,
                   |                            |jonas@sicking.cc,
                   |                            |w3c@adambarth.com
         Resolution|NEEDSINFO                   |

--- Comment #3 from Henri Sivonen <hsivonen@iki.fi> 2010-10-13 12:50:25 UTC ---
Philip ran an instrumented parser over 422814 pages that parsed successfully.
Here's an analysis of that data:

maxNonFontDuplicates (cutoff: 0.999000)
0.9422: <= 0
0.9868: <= 1
0.9928: <= 2
0.9953: <= 3
0.9965: <= 4
0.9971: <= 5
0.9975: <= 6
0.9980: <= 7
0.9983: <= 8
0.9986: <= 9
0.9987: <= 10
0.9989: <= 11
Max: 7687

maxFontDuplicates (cutoff: 0.999000)
0.9468: <= 0
0.9826: <= 1
0.9890: <= 2
0.9918: <= 3
0.9933: <= 4
0.9943: <= 5
0.9950: <= 6
0.9956: <= 7
0.9960: <= 8
0.9966: <= 9
0.9969: <= 10
0.9973: <= 11
0.9975: <= 12
0.9977: <= 13
0.9978: <= 14
0.9980: <= 15
0.9981: <= 16
0.9982: <= 17
0.9982: <= 18
0.9985: <= 19
0.9986: <= 20
0.9986: <= 21
0.9987: <= 22
0.9987: <= 23
0.9988: <= 24
0.9988: <= 25
0.9988: <= 26
0.9989: <= 27
0.9989: <= 28
0.9990: <= 29
Max: 6829

This means that when adding a non-<font> formatting element to the list of
formatting elements, on 94% of pages there was no identical element (element
name and all attribute names and values matching) already on the list *after
the latest marker, if any*. On 99% of pages, there were 2 or fewer duplicates
already on the list (after the latest marker, if any). The worst case seen was
7687 duplicates.

In the case of <font> duplicates, on 99% of pages, there were 3 or fewer
duplicates already on the list (after the latest marker, if any). The worst
case seen was 6829 duplicates.

The worst cases are really crazy, so it makes sense to pick some limits.
Furthermore, very low limits take care of the vast majority of cases.
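To make the counting concrete, here is a minimal sketch of how duplicates after the latest marker could be counted and how a small limit could be enforced when adding an element. Everything in it is an assumption for illustration only: the names `MARKER`, `make_entry`, `count_duplicates` and `push_with_limit` are hypothetical, list entries are modeled as plain tuples rather than real DOM elements, and dropping the earliest duplicate is just one possible way to enforce the limit.

```python
MARKER = None  # hypothetical stand-in for a marker entry on the list


def make_entry(name, attrs=None):
    """Model a formatting element as (name, sorted attribute pairs)."""
    return (name, tuple(sorted((attrs or {}).items())))


def count_duplicates(formatting_list, entry):
    """Count entries identical to `entry` after the latest marker, if any."""
    count = 0
    for existing in reversed(formatting_list):
        if existing is MARKER:
            break  # entries before the latest marker are not considered
        if existing == entry:
            count += 1
    return count


def push_with_limit(formatting_list, entry, max_existing=2):
    """Append `entry`, keeping at most `max_existing` identical entries
    already on the list after the latest marker (assumed policy: drop
    the earliest duplicates first)."""
    # Find where the segment after the latest marker starts.
    start = 0
    for i in range(len(formatting_list) - 1, -1, -1):
        if formatting_list[i] is MARKER:
            start = i + 1
            break
    dup_indices = [
        i for i in range(start, len(formatting_list))
        if formatting_list[i] == entry
    ]
    excess = max(len(dup_indices) - max_existing, 0)
    # Delete from the back of the excess range so earlier indices stay valid.
    for i in reversed(dup_indices[:excess]):
        del formatting_list[i]
    formatting_list.append(entry)


if __name__ == "__main__":
    entries = []
    bold = make_entry("b")
    for _ in range(5):
        push_with_limit(entries, bold)
    print(len(entries))
```

With `max_existing=2`, a third identical element is added freely, while a fourth causes the earliest one to be dropped first. Entries with the same name but different attribute values are not treated as duplicates.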
I'd be inclined not to differentiate between <font> and non-<font>, and to
simply allow a maximum of two identical elements already on the list when
adding a third.

Again, please see
http://lists.w3.org/Archives/Public/public-html/2010Sep/0163.html for how to
deal with removing duplicates.

I think it would make sense to put the limit in the spec, because it would
suck if an HTML5-compliance scoring site like http://html5test.com/ put 4
identical formatting start tags in a test case and called an implementation
non-conforming.

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Wednesday, 13 October 2010 12:50:32 UTC