- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 9 Aug 2006 01:14:06 +0000 (UTC)
- To: noah_mendelsohn@us.ibm.com
- Cc: Anne van Kesteren <annevk@opera.com>, www-tag@w3.org
On Tue, 8 Aug 2006 noah_mendelsohn@us.ibm.com wrote:
>
> This thread has referenced the very interesting blog entry at [1], and
> makes the case that the TAG is off base in pushing the Web community [2]
> quite strongly to give precedence to the HTTP Content-Type header over
> content sniffing, keying on the URI suffix, etc. [...]
>
> [1] http://ln.hixie.ch/?start=1154950069&count=1
> [2] http://www.w3.org/2001/tag/doc/mime-respect-20060412

To give some context for my blog post: I've been pushing for Web browser
vendors to fix this for approximately eight years, roughly half the
lifetime of HTTP so far. I've worked hard in the Mozilla community, at
Netscape, at Opera, in the Webkit community, in talks with Microsoft, at
W3C meetings, in specifications, in writing test cases, and in writing
documentation, over those eight years, trying to get this issue fixed.
When Microsoft asked me for my list of top ten bugs that I'd like fixed
in IE7, I listed just one: HTTP content sniffing. I even included nearly
a hundred tests to help them do this.

Over the years I have tried to get browsers to stop assuming that
anything at the end of an <img src=""> was an image, and tried to get
them to use the MIME type sent with the image instead of
content-sniffing to determine the image type. I have tried to get
browsers to use the Content-Type header when following hyperlinks, to
stop them automatically downloading and showing videos that are marked
as text/plain. I have tried to get browsers to obey the Content-Type
headers when downloading files that <script> elements point to, even
going as far as pointing out the security implications of allowing
authors to download any random file that happens to be in JS-like format
(e.g. any JSON data), and reading it as if it was on their domain. I
have tried to get browsers to obey the Content-Type of files when <link>
elements are used to point to stylesheets.
I have tried to get browsers to obey the Content-Type headers when they
handle <object> elements, going as far as including tests for this in
the Acid2 test. I have tried to get browsers to rely on MIME types for
detecting RSS and Atom feeds, instead of sniffing every HTML page before
displaying it.

Here is the sum total of what all the above, and all the other advocacy
that I have done over the eight years I've been working on this, has
done:

 1. Mozilla, in standards mode, ignores CSS files that don't have
    Content-Type set to 'text/css'. This took many years, and I've had
    to fight to keep this in several times. It only affects a small
    minority of sites that use standards mode, but even those sites
    sometimes fail to render correctly in Mozilla because of this,
    while rendering fine in other browsers. We get bugs filed on this
    regularly.

 2. Mozilla and Opera have limited their sniffing of content sent as
    text/plain so that instead of sniffing on all text/plain content,
    they only sniff on the majority of text/plain content.

 3. That's all. Only two minor things.

This isn't because of laziness. This is because ANY BROWSER THAT
ACTUALLY TRIES TO IMPLEMENT THESE THINGS WOULD LOSE ALL MARKET SHARE.

You simply cannot make a vendor do something that will make them lose
market share. It won't work. Even vendors that have the best of
intentions will immediately revert to "buggy" behaviour when
implementing the "correct" behaviour causes thousands of customer
support calls asking why their new browser broke the Web.

>> I think it may be time to retire the Content-Type header, putting to
>> sleep the myth that it is in any way authoritative, and instead have
>> well-defined content-sniffing rules for Web content.
>
> I'm afraid I just don't get that. I would think the right answer would
> be: let's not perpetuate these mistakes as new types spring up on the
> Web.
> Let's work hard to get them sourced with proper media types, so
> that we can have a pretty clean Web that scales well, albeit with a few
> historical warts, rather than a free for all in which there's no
> reliable way to establish a new type, or to reliably signal its use from
> the server. So, the normative rule is: use Content-Type. The
> accommodation is: cheat where already deployed content requires you to.

I've done the "work hard" part. In fact, I think I might have done more
work to get browsers to obey content types than anyone else on the
planet. I've done the work when it comes to new types (e.g. RSS, Atom);
I've done the work when it comes to old types (e.g. HTML, text/plain);
I've done it for many vendors both internally and externally; I have
some 97 or so test cases publicly available for anyone to use to test
their behaviour.

There has to come a point where we realise that it doesn't work.

My intent is to make the HTML5 spec define how browsers should content
sniff, and when they should do so, so that we can get interoperable and
reliable content sniffing in well-defined cases. The Content-Type header
is still useful for certain things, e.g. specifying the encoding of
text/plain content, or making the difference between text/plain,
text/html, and text/xml resources, and therefore won't be completely
thrown out. It just wouldn't be the only variable any more; in the
majority of cases, it would be largely ignored.

I believe this is a significantly more realistic way forward for the
Web.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 9 August 2006 01:14:16 UTC