Re: Updated DOCTYPE versioning change proposal (ISSUE-4)

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Wed, 17 Feb 2010 14:56:58 +0000
Message-ID: <4B7C03BA.4050903@cam.ac.uk>
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: Henri Sivonen <hsivonen@iki.fi>, "public-html@w3.org" <public-html@w3.org>

Leif Halvard Silli wrote:
> Henri Sivonen, Wed, 17 Feb 2010 15:59:48 +0200:
>> On Feb 17, 2010, at 15:47, Leif Halvard Silli wrote:
>>> And until now I believed that anything that triggered strict parsing 
>>> up until now, would continue to do so.
>> There are simpler cases that show your belief wasn't true:
>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//FR">
> http://hsivonen.iki.fi/doctype/test-quirks.php?doctype=<!DOCTYPE+HTML+PUBLIC+"-%2F%2FW3C%2F%2FDTD+HTML+3.2+Final%2F%2FFR">
> My belief is shared: 
> http://www.w3.org/mid/685C7306-C689-4A04-8987-05670EF3B053@apple.com

Maciej said "If a doctype is present, then the document is in strict 
mode unless it is one of the doctypes that specifically triggers quirks 
mode". This one is in the list ("The public identifier starts with: 
"-//W3C//DTD HTML 3.2 Final//""), so it specifically triggers quirks mode.
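
As a rough illustration of that rule, the list lookup amounts to a prefix check on the public identifier. The sketch below is only that: the function name is made up, the list holds just the one entry quoted above (the real list is much longer), and the case-insensitive comparison is an assumption of the sketch.

```python
# Partial list: only the entry quoted in this message. The real list is longer.
QUIRKS_PUBLIC_ID_PREFIXES = [
    "-//W3C//DTD HTML 3.2 Final//",
]


def is_quirks_by_public_id(public_id):
    """Return True if the public identifier starts with a listed prefix.

    Hypothetical helper: a doctype without a public identifier (None)
    is never matched by the list.
    """
    if public_id is None:
        return False
    lowered = public_id.lower()
    # Assumption: prefixes are compared without regard to case.
    return any(lowered.startswith(prefix.lower())
               for prefix in QUIRKS_PUBLIC_ID_PREFIXES)
```

With this, the "//FR" variant from Henri's example matches the prefix just as the "//EN" one would, because only the start of the identifier is compared.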

The exception Maciej didn't mention is that if the doctype is 
syntactically invalid in certain ways, then it will trigger quirks mode 
without looking in the list. Unexpected characters (like "[") *before* 
the system identifier trigger it; unexpected characters *after* the 
system identifier don't, which means pages like 
http://www.marlowe.co.uk/epages/Store2_Shop1549.sf will still be 
non-quirks. I see the latter case on around a dozen pages (out of half a 
million), and see the former case only on 
http://symptomresearch.nih.gov/preface/index.htm which actually looks 
*better* in quirks mode.
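
The before/after distinction can be sketched roughly as follows. This is not the HTML5 tokenizer: the regular expression and the function name are invented for illustration, and the sketch only handles double-quoted identifiers introduced by the PUBLIC keyword.

```python
import re

# Hypothetical, simplified pattern for:
#   <!DOCTYPE name PUBLIC "public id" "system id" trailing-junk>
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+(?P<name>[^\s>]+)'
    r'(?:\s+PUBLIC\s+"(?P<public>[^">]*)")?'
    r'(?P<between>[^">]*)'                        # text before the system id
    r'(?:"(?P<system>[^">]*)"(?P<after>[^>]*))?'  # system id, then trailing text
    r'>',
    re.IGNORECASE,
)


def forces_quirks(doctype):
    """Return True if unexpected text appears before the system identifier."""
    match = _DOCTYPE.fullmatch(doctype.strip())
    if match is None:
        # Outside what this sketch understands: treat as quirks.
        return True
    # Anything other than whitespace before the system identifier counts.
    # Text after the system identifier (the "after" group) is ignored.
    return match.group("between").strip() != ""
```

For example, a stray "[" between the two quoted identifiers makes `forces_quirks` return True, while the same "[" placed after the system identifier leaves it False, so the list lookup still decides the mode.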

> What is the purpose of starting to treat that DOCTYPE as a quirks mode 
> trigger?

Browsers treat many doctypes differently from one another, so HTML5 will 
inevitably treat some of them differently from some existing browsers.

IE uses rules similar to http://philip.html5.org/docs/quirks.txt, so 
anything containing the string " HTML 3" (and not containing some other 
magic strings) will be quirks mode. HTML5 is much closer to what other 
browsers do, but modified based on data (large-scale surveys, bug 
reports, etc) to attempt to maximise the number of legacy pages that 
render as the author expected. The way to change the list is to provide 
data showing that compatibility would be improved by the change.
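
For comparison, the IE-style rule described above is closer to a substring test than a list lookup. A minimal sketch, assuming a case-sensitive match; the function name is hypothetical, and the "other magic strings" are not listed in this message, so they are left as a caller-supplied parameter rather than guessed.

```python
def ie_like_is_quirks(doctype, exempt_strings=()):
    """Substring-style check, loosely modelled on the rule described above.

    exempt_strings is a placeholder for the "other magic strings";
    their actual values are not given here, so the default is empty.
    """
    if any(marker in doctype for marker in exempt_strings):
        return False
    # Assumption: the match is case-sensitive and includes the leading space.
    return " HTML 3" in doctype
```

Under this sketch, any doctype mentioning " HTML 3" lands in quirks mode regardless of the rest of the identifier, which is why the two approaches can disagree on individual doctypes.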

Philip Taylor
Received on Wednesday, 17 February 2010 14:57:32 UTC