Re: Feedback on Internet Media Types and the Web

From: Karl Dubost <karld@opera.com>
Date: Mon, 8 Nov 2010 16:15:41 -0500
Message-Id: <98EA895B-F646-4306-BA32-D75292B2DEC4@opera.com>
Cc: "www-tag@w3.org WG" <www-tag@w3.org>
To: Larry Masinter <masinter@adobe.com>

On 8 Nov 2010, at 15:46, Larry Masinter wrote:
>> The document doesn't sufficiently acknowledge 
>> that for most binary file formats 
>> (particularly image files), the "magic number" 
>> of the file format is a much more
>> reliable indicator of the format than an out-of-band MIME type,
> 
> First: I'm not sure this is true. I know there are circumstances where the
> content-type label is wrong and sniffing gives the right answer, but there
> are also circumstances where the label is right and sniffing gives the
> wrong answer. So which is more prevalent, really? Do we have more data
> than scattered anecdotes?

I was looking for data on that too. However, the two well-known large public surveys of HTML are now a bit dated, and neither mentions possible mismatches between the actual image format and the declared Content-Type.

(That aside, a new public survey could be interesting.)
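
To make concrete what such a survey would need to measure, here is a minimal sketch in Python. It assumes the declared Content-Type and the first few bytes of each image have already been collected. The names `MAGIC_NUMBERS`, `sniff_image_type` and `classify` are hypothetical, and the signature table covers only a handful of well-known image formats.

```python
# Hypothetical sketch: compare a declared Content-Type with the format
# suggested by the leading bytes ("magic number") of the payload.

# Well-known signatures for a few common image formats (not exhaustive).
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"BM", "image/bmp"),
]


def sniff_image_type(head):
    """Return the media type implied by the leading bytes, or None."""
    for signature, media_type in MAGIC_NUMBERS:
        if head.startswith(signature):
            return media_type
    return None


def classify(declared_content_type, head):
    """Classify one sample as 'match', 'mismatch', or 'unknown'."""
    # Drop parameters such as "; charset=..." and normalise case.
    declared = declared_content_type.split(";")[0].strip().lower()
    sniffed = sniff_image_type(head)
    if sniffed is None:
        return "unknown"
    return "match" if sniffed == declared else "mismatch"
```

A "mismatch" count on its own would not say which side is wrong. A short signature such as "BM" can also appear at the start of a plain text file, which is exactly the case where the label is right and sniffing gives the wrong answer.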


# MAMA (Opera) - 2008 [3]

It does not conclude anything significant on this point, and the analysis might have led to some improper conclusions in this specific case. Brian Wilson says [2]:

        "If it could not determine the format from
        the file extension, MAMA would then
        download the HTTP HEAD of the referenced
        image and proceed to examine the image's
        MIME type to detect the format. This
        policy was a useful shortcut that really
        helped with the analysis script's overall
        performance."

And later on the same page:

        In all, 372,895 MAMA URLs contained
        images in this group—over 10% of all
        pages analyzed! This seems like a much
        higher number than one would expect for
        image formats "on the fringe".


# Web Authoring Statistics (Google) - 2005 [4]

The page on images [5] does not give much information on this either.



[1]: http://devfiles.myopera.com/articles/554/mamaurlset-mimehistogram.htm
[2]: http://dev.opera.com/articles/view/mama-images-elements-and-formats/#formats
[3]: http://dev.opera.com/articles/view/mama/
[4]: http://code.google.com/intl/fr-FR/webstats/
[5]: http://code.google.com/intl/fr-FR/webstats/2005-12/element-img.html


-- 
Karl Dubost - http://dev.opera.com/
Developer Relations & Tools, Opera Software
Received on Monday, 8 November 2010 21:16:22 GMT
