
Re: Parsing: Trailing garbage in doctype FPI (was: Re: Doctype usage data)

From: Simon Pieters <simonp@opera.com>
Date: Wed, 28 May 2008 19:46:51 +0200
To: "L. David Baron" <dbaron@dbaron.org>, "Ian Hickson" <ian@hixie.ch>, "Philip Taylor" <pjt47@cam.ac.uk>, hyatt@apple.com, "HTML WG" <public-html@w3.org>
Cc: www-style <www-style@w3.org>
Message-ID: <op.ublmfv1lidj3kv@hp-a0a83fcd39d2.belkin>

+www-style

On Fri, 23 May 2008 08:50:55 +0200, L. David Baron <dbaron@dbaron.org>  
wrote:

> On Thursday 2008-05-22 20:28 -0700, L. David Baron wrote:
>> On Friday 2008-05-23 03:19 +0000, Ian Hickson wrote:
>> > On Mon, 3 Mar 2008, Simon Pieters wrote:
>> > > >
>> > > > I've got some data about doctypes at
>> > > > http://philip.html5.org/data/doctypes.html (125K pages from
>> > > > dmoz.org) and http://philip.html5.org/data/doctypes-alexa.html
>> > > > (about 400 from Alexa's list). I'm not entirely sure what this
>> > > > could be useful for, but I'll point out a couple of things here.
>> > >
>> > > [...] This means that Opera would break about 0.05% of pages of this
>> > > sample if we implemented HTML5 doctype switching, assuming that the
>> > > remaining pages I didn't look at were the same.
>>
>> It looks (from the limited context in the email) that you're talking
>> about making quirks-mode detection handle pages where the author has
>> manually changed the "EN" in the FPI to match the language of the
>> page content, or similar.
>>
>> Are the data you present showing that pages with these broken
>> DOCTYPEs break if they're not in quirks mode, or simply that pages
>> have these broken doctypes?  It's a pretty significant difference.
>
> Ah, it wasn't in the URLs quoted, but it was clear in
> http://lists.w3.org/Archives/Public/public-html/2008Mar/0013.html
> that the finding was really the former.
>
> Given that, I don't object to this change, although I would
> encourage being very hesitant to expand quirks mode to more pages.
> I suppose it's a pretty small set, though.
>
> (Does anybody have any data on which quirks pages (these, or quirks
> mode pages in general) actually depend on?)

Of the pages analyzed, dependencies on these quirks were quite common, IIRC:

   * Pages used 100% height tables and expected them to fill up the whole
     page.
   * Pages put images in table cells and expected the cell to shrink-wrap
     the image.

I think it would be nice if CSS were changed so that at least the latter
Just Worked, as it does in Quirks mode and Almost Standards mode. Almost
Standards mode and Standards mode could then be merged into a single mode.
Doing so would probably break some Standards mode pages that expect boxes
not to shrink-wrap, but in the long term I think it would be worth it.
(Also note that as implemented, the shrink-wrap quirk doesn't just apply to
cells; it applies to normal blocks too.)
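
For anyone joining from www-style, here is a rough sketch of the doctype matching the subject line refers to: if the FPI is compared by prefix, a doctype whose trailing "EN" was changed to another language code still selects the same mode. The function name `rendering_mode` and the three short prefix tuples are my own illustrative assumptions; the real lists are much longer, and this is not any browser's actual code.

```python
# Hypothetical sketch: choose a rendering mode by prefix-matching the doctype
# public identifier (FPI), so trailing text such as "//FR" is ignored.
# The prefix tuples are a small illustrative subset, not a complete table.

# Always quirks mode.
QUIRKS_PREFIXES = (
    "-//w3c//dtd html 3.2 final//",
    "-//w3c//dtd html 4.0 transitional//",
)

# Quirks mode only when the doctype has no system identifier.
QUIRKS_IF_NO_SYSTEM_ID_PREFIXES = (
    "-//w3c//dtd html 4.01 transitional//",
    "-//w3c//dtd html 4.01 frameset//",
)

# Almost Standards mode.
ALMOST_STANDARDS_PREFIXES = (
    "-//w3c//dtd xhtml 1.0 transitional//",
    "-//w3c//dtd xhtml 1.0 frameset//",
)


def rendering_mode(public_id, system_id=None):
    """Return 'quirks', 'almost standards', or 'standards' for a doctype."""
    fpi = public_id.lower()  # compare case-insensitively
    if fpi.startswith(QUIRKS_PREFIXES):
        return "quirks"
    if fpi.startswith(QUIRKS_IF_NO_SYSTEM_ID_PREFIXES):
        return "quirks" if system_id is None else "almost standards"
    if fpi.startswith(ALMOST_STANDARDS_PREFIXES):
        return "almost standards"
    return "standards"
```

With this sketch, `rendering_mode("-//W3C//DTD HTML 4.01 Transitional//FR")` lands in the same mode as the "//EN" variant, which is the behavior being discussed for these pages.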


    In the light of today’s HTML5 design principles, CSS2 failed to Support
    Existing Content with its default behavior [of images in table cells].
    And that’s a bug in CSS2.
     -- http://hsivonen.iki.fi/almost-precedent/

-- 
Simon Pieters
Opera Software
Received on Wednesday, 28 May 2008 17:47:50 GMT
