Re: several messages about content sniffing in HTML from Geoffrey Sneddon on 2008-02-29 (public-html@w3.org from February 2008)

From: Geoffrey Sneddon <foolistbar@googlemail.com>
Date: Fri, 29 Feb 2008 16:43:42 +0000
To: Julian Reschke <julian.reschke@gmx.de>
Cc: James Graham <jg307@cam.ac.uk>, WHATWG List <whatwg@whatwg.org>, "public-html@w3.org WG" <public-html@w3.org>
Message-Id: <3F542C02-9888-43F2-A4BD-88A8865A19BE@googlemail.com>

On 29 Feb 2008, at 16:33, Julian Reschke wrote:

> Geoffrey Sneddon wrote:
>>>> It seems like the HTTP spec should define how to handle that, but  
>>>> the HTTP working group has indicated a desire to not specify  
>>>> error handling behaviour, so I guess it's up to us.
>>>> IE and Safari use the first one, Firefox and Opera use the last  
>>>> one. I guess we'll use the first one.
>>>
>>> Isn't the fact that FF and IE disagree here an indication that  
>>> this doesn't need to be specified?
>> Things aren't specified well enough until I can write an HTTP UA  
>> that can work in the real world (which, as someone dealing with  
>> feeds, I can tell you need without question support for content- 
>> type sniffing) from reading specifications without having to  
>> reverse-engineer anything.
>> ...
>
> Doesn't seem to apply to this case.
>
> A duplicate Content-Type header response indicates that the response  
> is invalid.

And guess what? Users don't like error messages. I want to know how to
deal with it without having to look anywhere other than the spec.

> Apparently, most browsers accept the response anyway, some of which  
> picking the first value, others the second. Both behaviors seem to  
> be acceptable to users.
>
> So there's nothing you *need* to reverse engineer in this case.

A page (<http://www.toledoblade.com/apps/pbcs.dll/section?Category=RSS01&mime=XML>)
that I came across recently had:

Content-Type: XML
Content-Type: text/XML

Using the first would break badly. I guess it seems to work because of
content-type sniffing on an unknown (and invalid) value (or because,
as many feed readers do, the header is ignored entirely apart from any
charset parameter). Without content-type sniffing, which HTML 5 now
allows, you need the last.
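
To make the difference concrete, here is a rough Python sketch of the
two policies applied to those headers. The function name
pick_content_type, the "first"/"last" policy argument and the loose
type/subtype regex are all made up for illustration; this is not what
any browser or spec actually does, only a way to show why the choice
matters for a UA that does not sniff.

```python
import re

# Loose approximation of "type/subtype" (an assumption, not the real
# HTTP token grammar); parameters such as charset are ignored here.
_MEDIA_TYPE = re.compile(r"^[^\s/;,]+/[^\s/;,]+$")


def pick_content_type(values, policy="first"):
    """Return (media_type, needs_sniffing) for repeated Content-Type values.

    `values` holds the header values in the order they were received.
    `policy` is "first" or "last" (hypothetical names for this sketch).
    """
    if not values:
        return None, True
    raw = values[0] if policy == "first" else values[-1]
    # Drop any parameters and normalise case before validating.
    media_type = raw.split(";", 1)[0].strip().lower()
    if _MEDIA_TYPE.match(media_type):
        return media_type, False
    # Not a usable type: the UA has to sniff (or fail).
    return None, True


headers = ["XML", "text/XML"]
print(pick_content_type(headers, "first"))  # (None, True)
print(pick_content_type(headers, "last"))   # ('text/xml', False)
```

With the first header the UA is left with nothing usable and has to
fall back on sniffing; with the last it gets a type it can act on.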

But as James says: how do I know that it doesn't matter which
behaviour I choose until I reverse engineer browsers to discover that?


--
Geoffrey Sneddon
<http://gsnedders.com/>

Received on Friday, 29 February 2008 16:44:09 UTC