- From: Simon Pieters <simonp@opera.com>
- Date: Mon, 03 Mar 2008 09:48:23 +0100
- To: "Philip Taylor" <pjt47@cam.ac.uk>, "HTML WG" <public-html@w3.org>
On Thu, 28 Feb 2008 02:58:53 +0100, Philip Taylor <pjt47@cam.ac.uk> wrote:

> I've got some data about doctypes at
> http://philip.html5.org/data/doctypes.html (125K pages from dmoz.org)
> and http://philip.html5.org/data/doctypes-alexa.html (about 400 from
> Alexa's list). I'm not entirely sure what this could be useful for, but
> I'll point out a couple of things here.

This is very useful information for Opera. We can determine what would break when implementing HTML5 doctype switching. Thank you for this data.

> 0.1% replaced the "...//EN" with their own language code, e.g.
> http://www.edelweiss-reizen.nl has <!DOCTYPE html PUBLIC "-//W3C//DTD
> XHTML 1.0 Strict//NL"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

What's interesting to look at is doctypes that would be quirky if they ended in //EN (these are quirky in IE and Opera but not in Firefox or Safari).

http://www.nic.funet.fi/~magi/metsola/
http://www.pinocchioarredi.it/
http://www.ultimahora.es/
http://www.cinarstvi.cz/
http://5w40.de/
http://www.campingplatz-reinhardshagen.de/
http://www.deutsche-fachwerkstrasse.de/
http://www.grasdorf.de/
http://www.protz-werder.de/
http://www.schacher-immobilien.de/
http://www.cameratasantcugat.com/
http://www.hagiva.co.il/
http://www.vargagabor.hu/
http://www.ilserbatoio.it/
http://www.cab.it/
http://kgit.amu.edu.pl/
http://www.osp.os.pl/
http://stdk.narod.ru/
http://www.palkin.ru/
http://www.slm-nsk.ru/
http://www.sunwaytours.ru/
http://www.losnuevostangos.com.ar/
http://www.elesis.com.tr/
http://usinfo.state.gov/esp/home/topics/us_society_values/geografia.html
http://www.minotel.com/home.asp?xlanguage=DE
http://www.vs-aigen.salzburg.at/ (ends in "//EN conova")
http://www.eng-joheco.com/
http://www.caissepoplevis.com/
http://www.judo-store.com/
http://aziende.lab4.net/
http://powermetal.altervista.org/
http://www.architettopalladini.it/
http://deamicis-spa.com/
http://www.balparaplan.webm.ru/
http://www.taxi-office.ru/ (ends in "//RUS")
http://www.serdardenktas.com/

The pages above render better in quirks mode than in standards mode in Opera and Firefox (I didn't test all of them in Firefox, though).

http://www.quintomiglio.com/

The page above renders better in standards mode than in quirks mode in Opera and Firefox.

http://www.gedankenblicke.net/

This one renders better in standards mode than in almost standards mode in Opera, but the same in Firefox, so it's probably a bug in Opera's almost standards mode.

The rest of the about 60 pages I looked at looked OK in either quirks mode or standards mode.

This means that Opera would break about 0.05% of the pages in this sample if we implemented HTML5 doctype switching, assuming that the remaining pages I didn't look at behave the same way.

I think this is pretty convincing evidence that HTML5 needs to ignore what is in place of the "EN" at the end of the FPIs. That is, instead of checking that the FPI is exactly, e.g., -//W3C//DTD HTML 3.2//EN, check that it starts with -//W3C//DTD HTML 3.2//.

For the FPIs that end in //EN//2.0 and the like, I'd suggest just dropping them from the list, since there are equivalent FPIs that end in //EN, and the //2.0 would then be treated as trailing garbage.

-- 
Simon Pieters
Opera Software
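
P.S. To make the suggested prefix check concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the three FPIs are a small subset chosen for the example, not the full HTML5 list, the function names are made up, and comparing case-insensitively is an assumption rather than something settled above.

```python
# Illustrative subset only; NOT the full list of quirks-mode FPIs.
QUIRKY_FPIS_EXACT = [
    "-//W3C//DTD HTML 3.2//EN",
    "-//IETF//DTD HTML 2.0//EN",
    "-//IETF//DTD HTML//EN",
]


def language_prefix(fpi: str) -> str:
    """Drop the last '//'-separated field, keeping the trailing '//'.

    '-//W3C//DTD HTML 3.2//EN' becomes '-//W3C//DTD HTML 3.2//'.
    """
    head, sep, _tail = fpi.rpartition("//")
    return head + sep


# Prefixes derived from the exact list (lowercased: an assumption).
QUIRKY_FPI_PREFIXES = tuple(language_prefix(f).lower() for f in QUIRKY_FPIS_EXACT)


def is_quirky_exact(fpi: str) -> bool:
    """Exact matching: the whole FPI, including '//EN', must match."""
    return fpi.lower() in {f.lower() for f in QUIRKY_FPIS_EXACT}


def is_quirky_prefix(fpi: str) -> bool:
    """Prefix matching: whatever follows the last listed '//' is ignored."""
    return fpi.lower().startswith(QUIRKY_FPI_PREFIXES)
```

With prefix matching, hypothetical FPIs such as "-//W3C//DTD HTML 3.2//NL" or "-//W3C//DTD HTML 3.2//EN conova" are treated like the //EN version, and "-//IETF//DTD HTML//EN//2.0" is already covered by the prefix of "-//IETF//DTD HTML//EN", so the //EN//2.0 variant needs no entry of its own. An XHTML 1.0 Strict FPI ending in //NL still does not match.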
Received on Monday, 3 March 2008 08:48:37 UTC