Re: review of content type rules by IETF/HTTP community

Julian Reschke wrote:
> Ian Hickson wrote:
>> Users wouldn't understand why the UA kept saying it, especially since 
>> it would say it for most pages on the Web. Making the errors only 
>> appear in error consoles is already done in many cases, and could be 
>> done in more, but that's a UA issue, not an interoperability issue, 
>> and thus out of scope for a specification. (Mozilla already reports 
>> Content-Type errors for stylesheets, but nobody cares.)
> 
> How do you know that nobody cares? What's the percentage of CSS served 
> with the wrong mime type?

As an extremely rough indication, I had a look at the Alexa Top 500 
sites (just searching with regexps for relevant <link>s and @imports) 
and found about five hundred referenced stylesheets.
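
For anyone wanting to repeat this, a rough sketch of that kind of regexp search might look like the following. This is not the script I ran; the helper names (attr, stylesheet_urls) are made up for illustration, and the patterns are deliberately loose, so they will be fooled by comments, odd quoting, and attributes such as data-href.

```python
import re
from urllib.parse import urljoin

# Loose patterns: good enough for a rough survey, not a real HTML/CSS parser.
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)
IMPORT_RULE = re.compile(r"""@import\s+(?:url\(\s*)?["']?([^"')\s;]+)""", re.IGNORECASE)


def attr(tag, name):
    """Return the value of attribute `name` in `tag`, or None if it is absent."""
    pattern = r"""\b%s\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""" % re.escape(name)
    match = re.search(pattern, tag, re.IGNORECASE)
    if match is None:
        return None
    # Exactly one of the three groups (double-quoted, single-quoted, bare) matched.
    return next(group for group in match.groups() if group is not None)


def stylesheet_urls(html, base_url):
    """List absolute URLs of stylesheets referenced by <link> tags and @import rules."""
    urls = []
    for tag in LINK_TAG.findall(html):
        rel = attr(tag, "rel") or ""
        href = attr(tag, "href")
        if "stylesheet" in rel.lower().split() and href is not None:
            # An empty href resolves to the page itself.
            urls.append(urljoin(base_url, href))
    for target in IMPORT_RULE.findall(html):
        urls.append(urljoin(base_url, target))
    return urls
```

Note that an empty href="" resolves to the URL of the page itself, which is how a page ends up including itself as a stylesheet.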

One was incorrectly served: http://www.bebo.com/ uses 
http://www.bebo.com/css/flags.css which is served as text/html (and also 
uses http://www.bebo.com/css/flags.js which is served as text/html).

One wasn't a stylesheet: http://www.webshots.com/ has a <link 
rel="stylesheet" type="text/css" href=""> so the page is including 
itself (as text/html).

I also noticed (though I wasn't specifically looking for it) that 
http://www.hi5.com/ has a <link rel="shortcut icon" ...> for 
http://images.hi5.com/images/favicon.ico which is text/plain.


The Top 500 list is significantly different to more 'normal' pages (e.g. 
94% have <script> and 84% have <form> in those 500, compared to 66% and 
29% respectively for pages on dmoz.org), so I looked at 500 random 
dmoz.org pages and found 165 stylesheets.

One was incorrectly served: http://iaaa.nl/ has 
http://iaaa.nl/styles/blacktext.html which is text/html (but is actually 
a CSS file despite its name).

One wasn't a stylesheet: 
http://encarta.msn.com/encyclopedia_761579147/William_I_(of_England).html 
has a <link rel="stylesheet" type="text/css" id="eot" href="" /> (twice) 
and includes itself (as text/html).

One was just weird: http://www.louvre.fr/llv/commun/home.jsp has a 
half-random string of bytes like "�:l/css" if you do a HEAD request, 
but "text/css" if you do GET.
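
A HEAD/GET mismatch like this can be checked with something along these lines. It is only a sketch: the function names are hypothetical, the opener parameter exists just so the check can be exercised without a network, and a real survey would also need error handling and redirect policy.

```python
from urllib.request import Request, urlopen


def media_type(header_value):
    """Return the lowercased type/subtype part of a Content-Type value, or None."""
    if not header_value:
        return None
    return header_value.split(";", 1)[0].strip().lower() or None


def served_content_type(url, method="GET", opener=urlopen):
    """Fetch `url` with the given HTTP method and return its raw Content-Type header."""
    request = Request(url, method=method)
    with opener(request, timeout=10) as response:
        return response.headers.get("Content-Type")


def compare_head_and_get(url, opener=urlopen):
    """Return (HEAD media type, GET media type, whether the two agree)."""
    head = media_type(served_content_type(url, "HEAD", opener))
    get = media_type(served_content_type(url, "GET", opener))
    return head, get, head == get
```

Calling compare_head_and_get(url) on a stylesheet URL returns both media types, so a server that answers the two methods differently shows up as a False in the third position.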


I can't easily look at a larger number of pages to get more accurate 
information, and I didn't look for incorrect scripts or images (but 
noticed quite a few problems in those when looking by hand), so this 
isn't particularly useful except to suggest that numbers in the 0-1% 
range would (if someone collected better data) sound reasonable for the 
number of sites that will break unless stylesheet content-types are 
ignored.
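
For reference, the arithmetic behind that range is roughly as below. The Top 500 total is "about five hundred", so 500 is an approximation, and I've left the louvre.fr case out of the counts since GET did return text/css there. Whether the self-including pages count as "incorrect" is a judgment call, so both figures are shown.

```python
def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Counts taken from the two samples above; 500 is approximate.
samples = {
    "Alexa Top 500": {"stylesheets": 500, "wrong_type": 1, "self_include": 1},
    "dmoz.org random 500": {"stylesheets": 165, "wrong_type": 1, "self_include": 1},
}

for name, counts in samples.items():
    total = counts["stylesheets"]
    strict = percent(counts["wrong_type"], total)
    broad = percent(counts["wrong_type"] + counts["self_include"], total)
    print(f"{name}: {strict:.1f}% mislabelled, {broad:.1f}% including self-references")
```

With samples this small the error bars are wide, which is why I'd only commit to an order of magnitude.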

-- 
Philip Taylor
philip@zaynar.demon.co.uk

Received on Friday, 24 August 2007 13:21:06 UTC