- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Thu, 18 Feb 2010 16:47:13 +0000
- To: Boris Zbarsky <bzbarsky@MIT.EDU>
- CC: "public-html@w3.org" <public-html@w3.org>
Boris Zbarsky wrote:
> On 2/17/10 4:29 AM, Philip Taylor wrote:
>> Yes, but in pre-HTML5 browsers (IE, Firefox 3.6 without html5.enable,
>> etc) doctypes will still only be parsed up to the *first* ">", so you
>> will get the characters "]>" inserted as text into the body of the
>> document
>
> That's the case with the HTML5 parser as well, no?

Yes - that aspect of the parsing hasn't changed. (I think the only browser that attempts to parse this differently is Opera, which seems to ignore any ">" unless it has previously seen an equal number of "[" and "]" characters (in any order).)

> I agree with Julian's concern: going from treating a doctype as
> standards to treating a doctype as quirks seems like a bad idea to me.

As a first approximation, changes are bad. As a second approximation, changes are bad if they break existing content. It's not clear what behaviour here will break least.

The specific case is "[" after the public identifier, and before the system identifier. This can't happen in well-formed XML (the system identifier is required, and the internal subset comes after it), though I've heard that SGML allows it. It's handled in HTML5 (http://whatwg.org/html#between-doctype-public-and-system-identifiers-state) exactly like any other bogus character (i.e. forcing quirks mode), but Firefox appears to have a special case for "[" in this location (preventing quirks).
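To make the two ways of finding the end of a doctype concrete, here is a rough Python sketch. It is not taken from any browser or from the HTML5 spec: the function names and the sample markup are invented for illustration, and the "balanced" variant only models the Opera behaviour as guessed at above, which has not been verified against Opera itself.

```python
# Hypothetical sample: a doctype with an internal subset, followed by body markup.
SAMPLE = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" [<!ENTITY x "y">]><p>hi'


def doctype_end_first_gt(markup: str) -> int:
    """Return the index just past the first '>' in the markup.

    Models "doctypes are only parsed up to the first '>'".
    Raises ValueError if there is no '>' at all.
    """
    return markup.index(">") + 1


def doctype_end_balanced(markup: str) -> int:
    """Return the index just past the first '>' reached while the number of
    '[' seen so far equals the number of ']' seen so far.

    This is an assumption-based model of the Opera behaviour described in
    the message; it only compares counts, so bracket order does not matter.
    Returns len(markup) if no such '>' is found.
    """
    opens = closes = 0
    for i, ch in enumerate(markup):
        if ch == "[":
            opens += 1
        elif ch == "]":
            closes += 1
        elif ch == ">" and opens == closes:
            return i + 1
    return len(markup)


# Text left over after the doctype is what ends up in the document body.
leftover_first_gt = SAMPLE[doctype_end_first_gt(SAMPLE):]
leftover_balanced = SAMPLE[doctype_end_balanced(SAMPLE):]
```

With the first rule the leftover text starts with "]>", which is the stray text that ends up in the body; with the count-based rule the whole internal subset is consumed as part of the doctype.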
Looking through half a million pages for the pattern

  (?i)<!doctype\s+html\s+public\s+"[^"]+"\s*\[

results in two sites:

  http://www.freemanforman.co.uk/
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" [url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

  http://symptomresearch.nih.gov/
  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>

Looking for interesting pages on those sites:

  http://www.freemanforman.co.uk/content/001_Area_Search/ - in Firefox 3.6, the map renders incorrectly (it's positioned too far up/right and clipped) if html5.enable is *on* (which triggers quirks mode).

  http://symptomresearch.nih.gov/grantopportunities.htm - the menu items are too widely spaced and the skip link underlines are visible when html5.enable is *off*.

So something breaks in Firefox either way. Possible options:

* Ignore this, under the belief that minor breakage of 0.001% of sites (which have bogus doctypes and are already broken in some browsers) is not worth spending more time on.

* Collect more data about whether special-casing "[" would cause more breakage or less breakage, and adjust the spec accordingly. (Probably need to look at tens or hundreds of millions of pages to get a good idea, since it's so rare.)

* Make additional changes to the doctype logic so both of these pages can render correctly.

Filed as http://www.w3.org/Bugs/Public/show_bug.cgi?id=9071

> -Boris

-- 
Philip Taylor
pjt47@cam.ac.uk
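For anyone wanting to repeat the search on a larger corpus, the pattern quoted above can be applied along these lines. This is only a sketch: the first two strings are the doctypes quoted in this message, the third is an ordinary XHTML 1.0 Strict doctype added for contrast, and the names PATTERN and has_bracket_after_public_id are made up here.

```python
import re

# The pattern from the message: "[" right after the public identifier,
# optionally separated from it by whitespace.
PATTERN = re.compile(r'(?i)<!doctype\s+html\s+public\s+"[^"]+"\s*\[')

DOCTYPES = [
    # The two doctypes found in the survey.
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '[url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>',
    # A conventional doctype with a system identifier, for contrast.
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
]


def has_bracket_after_public_id(markup: str) -> bool:
    """True if the markup contains a doctype with "[" after the public identifier."""
    return PATTERN.search(markup) is not None


results = [has_bracket_after_public_id(d) for d in DOCTYPES]
```

Only the first two entries match; the conventional doctype has a quoted system identifier where the pattern requires "[", so it is not counted.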
Received on Thursday, 18 February 2010 16:47:41 UTC