- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 25 Feb 2010 22:17:42 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9071

--- Comment #16 from Simon Pieters <simonp@opera.com> 2010-02-25 22:17:42 ---
(In reply to comment #10)
> I changed my regexps to only look at pages that match this:
>
>    /<!doctype\s+html\s+public\s+"[^"]+"\s*\[/i
>    /<!doctype\s+html\s+public\s+'[^']+'\s*\[/i
>
> The results, looking for this data in the Google index, found about 0.000125%
> of pages have this particular DOCTYPE pattern. Note, though, that this
> doesn't include DOCTYPEs that are simply bogus, e.g. that have a missing quote
> in the system identifier part, which Philip's data _does_ catch.
>
> Here's a random selection of some of the matching pages:
>
> http://www.austinwyatt.co.uk/property-details-rpsMSE-AWE090138

doesn't matter

> http://www.bairstow-eves.co.uk/content/011_Legal_Information

doesn't matter (doesn't seem to have css applied)

> http://www.bairstoweves.co.uk/content/008_Offices/

doesn't matter

> http://www.boekbesprekingen.nl/cgi-bin/auteur.cgi?auteur=311737&type=biografie

N/A

> http://www.chappellandmatthews.co.uk/content/006_Information/001_HIPs/

doesn't matter

> http://www.countrywidescotland.co.uk/property-details-rpsCWN-ADR080332

doesn't matter

> http://www.daiwaint.co.jp/stock/Sheets/SK1.htm

doesn't matter

> http://www.diolla.ru/catalog/pharmacy/preparates/anticought/nose-drops/p_103129

N/A

> http://www.entwistlegreen.co.uk/property-details-rpsBAD-RUN090794

doesn't matter

> http://europroject.pl/index.php?pid=3:15:44

slight layout change, but doesn't matter

> http://www.frankinnes.co.uk/content/001_Contact_Us/001_Sales/

doesn't matter

> http://www.gallex.ch/gallex/1/141.41.html

doesn't matter

> http://www.gpees.co.uk/content/001_Search/004_New_Homes/

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
[url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
has different spacing in standards mode and quirks mode. Can't tell which is
intended.

> http://www.jazz-network.com/kumpf/p-lyrik.html

doesn't matter

> http://www.manncountrywide.co.uk/property-details-rpsMSE-CWS090264

doesn't matter

> http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>
needs quirks

> http://oestjyllandsflyt.dk/privatflytning/flyttetilbud/

doesn't matter

> http://www.palmersnell.co.uk/content/002_To_Let/002_Lettings/

doesn't matter

> http://www.spencers.co.uk/content/002_To_Let/001_Lettings_Area_Search/

doesn't matter

> http://www.strattoncreber.co.uk/property-details-rpsSTC-REH090342

doesn't matter

> http://www.sugano-foods.co.jp/products2.html

N/A

> http://runker_room.tripod.com/tiestalk/japped.htm

doesn't matter

> http://www.lpl.univ-aix.fr/projects/multext/CES/CES1.Annex7.html

doesn't matter

> http://www.winncom.com/moreinfo/item/5054-BSUR-LR-US/index.html

N/A

Many of these seem to be based on the same template (the ones that have [url=
in the doctype).

> I'm leaning towards not changing the spec, based on the rarity of this and
> based on Simon's findings earlier in this bug.

We could make "[" after the public identifier go into bogus doctype without
setting force-quirks, while letting any other garbage character set
force-quirks ("S" and "/" needed force-quirks from my earlier findings), and
not regress compat. However, only one of the pages analyzed is affected by it
(and for that page I couldn't tell whether it would actually be helped or
not), so I would suggest to Avoid Needless Complexity.

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
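For anyone wanting to reproduce the match on the two DOCTYPEs quoted in this comment, here is a minimal Python sketch of the regexps from comment #10. It is an illustration under stated assumptions, not the script that was run against the Google index: the function name and sample labels are hypothetical, the second pattern is taken to end in `\[` like the first (an assumption about the intended regexp), and the well-formed XHTML 1.0 Strict DOCTYPE is added only for contrast.

```python
import re

# Patterns from comment #10, translated to Python. The trailing "\[" on the
# single-quote variant is an assumption about the intended regexp.
PUBLIC_ID_THEN_BRACKET = [
    re.compile(r'<!doctype\s+html\s+public\s+"[^"]+"\s*\[', re.IGNORECASE),
    re.compile(r"<!doctype\s+html\s+public\s+'[^']+'\s*\[", re.IGNORECASE),
]


def has_bracket_after_public_id(markup: str) -> bool:
    """Return True if a "[" follows the quoted public identifier."""
    return any(p.search(markup) for p in PUBLIC_ID_THEN_BRACKET)


# Labels are hypothetical. The first two DOCTYPEs are quoted in the message;
# the third is a standard well-formed DOCTYPE added for contrast.
samples = {
    "url_template": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '[url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    ),
    "empty_subset": (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>'
    ),
    "well_formed": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    ),
}

for label, doctype in samples.items():
    print(label, has_bracket_after_public_id(doctype))
```

Both DOCTYPEs quoted in the message match, while the well-formed one does not, since a quote rather than "[" follows its public identifier.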
Received on Thursday, 25 February 2010 22:17:44 UTC