[Bug 9071] Handling of "[" in between-doctype-public-and-system-identifiers-state may not be ideal

http://www.w3.org/Bugs/Public/show_bug.cgi?id=9071





--- Comment #10 from Ian 'Hixie' Hickson <ian@hixie.ch>  2010-02-25 02:57:41 ---
I changed my regexps to only look at pages that match this:

   /<!doctype\s+html\s+public\s+"[^"]+"\s*\[/i
   /<!doctype\s+html\s+public\s+'[^']+'\s*\[/i

Looking for this data in the Google index, I found that about 0.000125%
of pages have this particular DOCTYPE pattern. Note, though, that this
doesn't include DOCTYPEs that are simply bogus, e.g. that have a missing quote
in the system identifier part, which Philip's data _does_ catch.
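
For anyone who wants to try the same check on their own pages, here is a
minimal sketch of the two patterns in Python. It assumes the single-quoted
pattern is meant to end in \[ just like the double-quoted one; the function
name and the sample strings are made up for illustration and are not taken
from the Google index data.

```python
import re

# Python translations of the two patterns; re.IGNORECASE mirrors the /i flag.
# Assumption: both patterns end with a literal "[" after optional whitespace.
DOUBLE_QUOTED = re.compile(
    r'<!doctype\s+html\s+public\s+"[^"]+"\s*\[', re.IGNORECASE
)
SINGLE_QUOTED = re.compile(
    r"<!doctype\s+html\s+public\s+'[^']+'\s*\[", re.IGNORECASE
)


def has_bracket_after_public_id(page: str) -> bool:
    """Return True if a "[" directly follows the DOCTYPE public identifier."""
    return bool(DOUBLE_QUOTED.search(page) or SINGLE_QUOTED.search(page))


# Hypothetical sample inputs, for illustration only.
with_subset = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" [ <!ENTITY x "y"> ]>'
without_subset = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)

print(has_bracket_after_public_id(with_subset))
print(has_bracket_after_public_id(without_subset))
```

As with the original regexps, this only catches a "[" that comes right after
a properly quoted public identifier, so DOCTYPEs that are bogus in other ways
are not counted.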

Here's a random selection of some of the matching pages:

http://www.austinwyatt.co.uk/property-details-rpsMSE-AWE090138
http://www.bairstow-eves.co.uk/content/011_Legal_Information
http://www.bairstoweves.co.uk/content/008_Offices/
http://www.boekbesprekingen.nl/cgi-bin/auteur.cgi?auteur=311737&type=biografie
http://www.chappellandmatthews.co.uk/content/006_Information/001_HIPs/
http://www.countrywidescotland.co.uk/property-details-rpsCWN-ADR080332
http://www.daiwaint.co.jp/stock/Sheets/SK1.htm
http://www.diolla.ru/catalog/pharmacy/preparates/anticought/nose-drops/p_103129
http://www.entwistlegreen.co.uk/property-details-rpsBAD-RUN090794
http://europroject.pl/index.php?pid=3:15:44
http://www.frankinnes.co.uk/content/001_Contact_Us/001_Sales/
http://www.gallex.ch/gallex/1/141.41.html
http://www.gpees.co.uk/content/001_Search/004_New_Homes/
http://www.jazz-network.com/kumpf/p-lyrik.html
http://www.manncountrywide.co.uk/property-details-rpsMSE-CWS090264
http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm
http://oestjyllandsflyt.dk/privatflytning/flyttetilbud/
http://www.palmersnell.co.uk/content/002_To_Let/002_Lettings/
http://www.spencers.co.uk/content/002_To_Let/001_Lettings_Area_Search/
http://www.strattoncreber.co.uk/property-details-rpsSTC-REH090342
http://www.sugano-foods.co.jp/products2.html
http://runker_room.tripod.com/tiestalk/japped.htm
http://www.lpl.univ-aix.fr/projects/multext/CES/CES1.Annex7.html
http://www.winncom.com/moreinfo/item/5054-BSUR-LR-US/index.html

I'm leaning towards not changing the spec, based on the rarity of this and
based on Simon's findings earlier in this bug.



Received on Thursday, 25 February 2010 02:57:43 UTC