Re: several messages about content sniffing in HTML from Ian Hickson on 2008-02-29 (public-html@w3.org from February 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 29 Feb 2008 20:17:48 +0000 (UTC)
To: Julian Reschke <julian.reschke@gmx.de>, Geoffrey Sneddon <foolistbar@googlemail.com>, James Graham <jg307@cam.ac.uk>, Boris Zbarsky <bzbarsky@MIT.EDU>, Anne van Kesteren <annevk@opera.com>
Cc: "public-html@w3.org" <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0802291953150.6407@hixie.dreamhostps.com>

Thanks for the quick feedback on this. I've made the requested changes to 
improve interoperability while keeping the sniffing to a minimum 
(though of course as long as any vendor -- in particular IE -- keeps 
shipping browsers that sniff beyond what the spec allows, the sniffing 
might continue to grow).

On Fri, 29 Feb 2008, Julian Reschke wrote:
> > > 
> > > [Multiple Content-Type headers]
> > 
> > It seems like the HTTP spec should define how to handle that, but the 
> > HTTP working group has indicated a desire to not specify error 
> > handling behaviour, so I guess it's up to us.
> > 
> > IE and Safari use the first one, Firefox and Opera use the last one. I 
> > guess we'll use the first one.
> 
> Isn't the fact that FF and IE disagree here an indication that this 
> doesn't need to be specified?

It's an indication that it doesn't hugely matter what we specify at the 
moment, but it's certainly not an indication that we should leave it 
unspecified. We shouldn't leave _anything_ unspecified if it can affect 
interoperability, including error handling.

> Where does it stop? Are you planning to add new special cases any time 
> some Linux distro screws things up in a new way?

Only if they get enough market share that it affects the Web.

On Fri, 29 Feb 2008, Julian Reschke wrote:
> 
> What I wanted to point out is that it is *not* necessary for HTML5 to 
> pick one specific behavior, and for FF3 to change what it does right 
> now.

It's only necessary because the HTTP working group refuses to act in a 
responsible manner and actually specify how to write an interoperable user 
agent that is both compatible with the Web and handles invalid content. If 
HTTP defined how to do this, we wouldn't be stuck with defining it 
ourselves.

You are right that we could have picked the FF/Opera behaviour instead of 
the IE/Safari behaviour, but you are wrong that it is ok for all the 
browsers to do their own thing here. What we want is interoperability.

On Fri, 29 Feb 2008, Boris Zbarsky wrote:
> 
> Or because the header parser uses the first header that actually looks 
> like a valid content-type (e.g. contains a '/').  Specifying this _is_ 
> needed.

I agree. HTML5 covers this for now; we can remove it once HTTP has been 
updated to describe how to handle all these error cases. (There has been 
some talk of making an "HTTP5" specification in the vein of HTML5, taking 
HTTP and making a more comprehensive spec out of it, but so far the only 
work in this direction has been gsnedders' parsing draft.)
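
For illustration only, here is a minimal sketch of the three strategies 
mentioned in this thread for a response with several Content-Type headers: 
take the first, take the last, or take the first that looks valid (contains 
a '/'). The function names and the sample values are hypothetical, and this 
is not text from any spec or browser; it just shows why the choice affects 
interoperability.

```python
from typing import Optional, Sequence


def pick_first(values: Sequence[str]) -> Optional[str]:
    """Use the first Content-Type header value (the behaviour the thread
    attributes to IE and Safari)."""
    return values[0] if values else None


def pick_last(values: Sequence[str]) -> Optional[str]:
    """Use the last Content-Type header value (the behaviour the thread
    attributes to Firefox and Opera)."""
    return values[-1] if values else None


def pick_first_plausible(values: Sequence[str]) -> Optional[str]:
    """Use the first value that looks like a media type, i.e. contains '/'.
    This is only the rough heuristic described in the thread, not a full
    media-type parser."""
    for value in values:
        if "/" in value:
            return value
    return None


# Hypothetical header values, in the order they arrived on the wire.
headers = ["garbage", "text/html", "text/plain"]

print(pick_first(headers))            # garbage
print(pick_last(headers))             # text/plain
print(pick_first_plausible(headers))  # text/html
```

The same response yields three different types depending on the strategy, 
which is the interoperability problem being discussed.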

On Fri, 29 Feb 2008, Boris Zbarsky wrote:
> >
> > Actually the spec right now requires that there be no content sniffing 
> > if the Content-Encoding header is set... are you running into cases 
> > where that is a problem?
> 
> There are some bugs in Bugzilla, yes.  I don't have the bug numbers off 
> the top of my head.

Ok. I'll remove that requirement.

> > > Oh, one more note.  Gecko's sniffing behavior actually had to be 
> > > changed recently.  Unfortunately, the more recent Apache installs 
> > > changed from ISO-8859-1 to UTF-8 as the default encoding
> >
> > Uppercase only?
> 
> Yes.

Changed.

On Fri, 29 Feb 2008, Boris Zbarsky wrote:
> Julian Reschke wrote:
> > Roy pointed out (I think) that Apache's defaults did not change; so it 
> > must be some distributor/vendor causing this.
> 
> Yep.

Any idea who?

On Fri, 29 Feb 2008, Anne van Kesteren wrote:
> 
> Ok... Could you also define the quirks mode behavior while you're at it? 
> And maybe limited quirks mode if that's affected as well (I forgot).

Added it, quirks only.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 29 February 2008 20:18:06 UTC