Re: NEW ISSUE: content sniffing from Adam Barth on 2009-03-31 (ietf-http-wg@w3.org from January to March 2009)

From: Adam Barth <w3c@adambarth.com>
Date: Tue, 31 Mar 2009 15:02:27 -0700
To: Adrien de Croy <adrien@qbik.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, ietf-http-wg@w3.org
Message-ID: <7789133a0903311502x2d0a59bbv2f9078626ee3c863@mail.gmail.com>

On Tue, Mar 31, 2009 at 2:54 PM, Adrien de Croy <adrien@qbik.com> wrote:
> So then surely the last word on what type of content something is, should be
> the actual content itself?

Such an algorithm would maximize compatibility, but at the cost of security.

Suppose we had an oracle that told us the "true" MIME type for a given
HTTP response.  The Content-Type header would still be an important
security feature.  For example, consider a server that replies with
the following:

Content-Type: image/gif

<html><body>I am an HTML document</body></html>

If a user agent treats this response as text/html (supposing the
oracle agrees with our intuition that this response is, in fact,
HTML), then the user agent has likely opened the server up to a
cross-site scripting attack: any script in the body would run in the
server's origin, even though the server labeled the response as an
image.  Instead, the user agent should treat this response as an image.
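
To make the point concrete, here is a minimal sketch in Python.  The
names sniff_oracle and effective_type are hypothetical, the "oracle" is
a toy that only looks at a few leading bytes, and falling back to
sniffing when no Content-Type is present is an assumption of this
sketch rather than something stated above.  It is not how any
particular browser is implemented.

```python
from typing import Optional


def sniff_oracle(body: bytes) -> str:
    """Hypothetical stand-in for the oracle: guess a type from the bytes alone."""
    head = body.lstrip()[:512].lower()
    if head.startswith(b"gif87a") or head.startswith(b"gif89a"):
        return "image/gif"
    if head.startswith(b"<html") or head.startswith(b"<!doctype html"):
        return "text/html"
    return "application/octet-stream"


def effective_type(content_type: Optional[str], body: bytes) -> str:
    """Return the type the user agent should act on.

    A declared Content-Type wins even when the oracle disagrees, so a
    response labeled image/gif is never rendered as script-capable HTML.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" and normalize case.
        return content_type.split(";", 1)[0].strip().lower()
    # Assumption of this sketch: sniff only when the server declared nothing.
    return sniff_oracle(body)


body = b"<html><body>I am an HTML document</body></html>"
print(sniff_oracle(body))                 # what the oracle believes
print(effective_type("image/gif", body))  # what the user agent acts on
```

For the response above, the oracle answers text/html, but the user
agent still handles the body as image/gif (and so fails to decode it
as an image instead of running it as HTML).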

> So if any sniffing is to be done, surely it should only be the client?  In
> which case why don't clients just ignore the Content-Type header always and
> always try and determine the type themselves.  Some seem to do this already.

None of the major browsers do this anymore because of these security issues.

Adam

Received on Tuesday, 31 March 2009 22:03:22 UTC