Re: Sniffing and HTTP-bis (ACTION-309) from Henry S. Thompson on 2009-12-09 (www-tag@w3.org from December 2009)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Wed, 09 Dec 2009 13:55:03 +0000
To: www-tag@w3.org
Message-ID: <f5biqcguua0.fsf@hildegard.inf.ed.ac.uk>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

ht writes:

> At the TAG f2f in September, we discussed [1] Content-Type sniffing
> and the then-current state of the HTTPbis draft [2] insofar as it
> addresses this question (see section 3.2.1 *Type*).
>
> As it stands the draft only indirectly alludes to sniffing, in the
> following paragraph:
>
>   Content-Type specifies the media type of the underlying data. Any
>   HTTP/1.1 message containing an entity-body SHOULD include a
>   Content-Type header field defining the media type of that body,
>   unless that information is unknown. If the Content-Type header field
>   is not present, it indicates that the sender does not know the media
>   type of the data; recipients MAY either assume that it is
>   "application/octet-stream" ([RFC2046], Section 4.5.1) or examine the
>   content to determine its type.
> . . .
> I think we should in fact request the HTTPbis editors to reopen their
> Ticket #155 [4] with a suggestion that something along the following
> lines be added after the above-quoted paragraph in section 3.2.1:
>
>   If the Content-Type header field _is_ present, recipients SHOULD NOT
>   examine the content and override the specified type if the change
>   would significantly alter the security exposure ('privilege
>   escalation').
>
> This change is compatible with _Content-Type Processing Model_, a
> draft "responsible sniffing" Internet-Draft [5].

After discussion at the in-progress TAG f2f, here's revised
suggested text for inclusion in HTTPbis section 3.2.1:

  If the Content-Type header field _is_ present, a recipient which
  interprets the underlying data in a way inconsistent with the
  specified media type risks drawing incorrect conclusions.

  In practice, however, currently-deployed servers do not always
  provide correct Content-Type headers, with the result that some
  recipients examine the content and override the specified type.

  Such 'sniffing' SHOULD NOT be done unless there is evidence that the
  specified media type is in error (for example, because it is
  'text/plain' but there are bytes in the data which are not legal for
  the specified or defaulted charset).  In any case recipients SHOULD
  NOT override the specified type if the change would significantly
  alter the security exposure ('privilege escalation').

  Deploying any heuristic for detecting mistaken Content-Types risks
  overriding user intentions and misrepresenting data.  Accordingly,
  recipients SHOULD provide a way for users to disable sniffing, in
  general and/or in particular cases.
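
To make the intended behaviour concrete, here is a rough Python sketch
of a recipient following the rules above. It is an illustration only,
not text proposed for the draft: the function names, the toy sniffer,
the PRIVILEGE ranking and the "ascii" default charset are all
assumptions made for the example.

```python
from typing import Optional

# Assumed ranking of how much privilege a recipient grants each type
# (hypothetical; a real recipient would define its own policy).
PRIVILEGE = {
    "application/octet-stream": 0,
    "text/plain": 0,
    "image/png": 0,
    "text/html": 1,  # may cause script to run
}


def charset_is_violated(body: bytes, charset: str) -> bool:
    """Return True if body contains bytes that are not legal in charset."""
    try:
        body.decode(charset)
    except UnicodeDecodeError:
        return True
    except LookupError:
        # Unknown charset name: we cannot check, so this is not evidence.
        return False
    return False


def guess_type(body: bytes) -> Optional[str]:
    """Toy sniffer (an assumption): recognises only PNG and HTML."""
    if body.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if body[:64].lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "text/html"
    return None


def effective_type(declared: Optional[str], body: bytes,
                   charset: str = "ascii",
                   sniffing_enabled: bool = True) -> str:
    """Pick the media type to act on, following the suggested rules."""
    if declared is None:
        # No Content-Type: MAY assume octet-stream or examine the content.
        guessed = guess_type(body) if sniffing_enabled else None
        return guessed or "application/octet-stream"
    if not sniffing_enabled:
        # The user has disabled sniffing: always trust the header.
        return declared
    # Sniff only when there is evidence that the declared type is wrong.
    # The only evidence this sketch knows about is the text/plain example.
    evidence = declared == "text/plain" and charset_is_violated(body, charset)
    if not evidence:
        return declared
    guessed = guess_type(body)
    if guessed is None:
        return declared
    # Never override if that would raise the privilege level.
    if PRIVILEGE.get(guessed, 1) > PRIVILEGE.get(declared, 0):
        return declared
    return guessed
```

With these assumptions, a PNG served as 'text/plain' is re-typed as
'image/png' (the bytes are not legal ASCII, and no privilege is
gained), whereas HTML served as 'text/plain' stays 'text/plain',
either for lack of evidence or because the override would be a
privilege escalation.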

This probably goes too far in the opposite direction from my previous
offering -- comments and suggestions welcome.

ht

[1] http://www.w3.org/2001/tag/2009/09/24-minutes#item03
[2] http://trac.tools.ietf.org/wg/httpbis/trac/export/663/draft-ietf-httpbis/latest/p3-payload.html#rfc.section.3.2.1
[3] http://www.w3.org/2001/tag/group/track/actions/309
[4] http://trac.tools.ietf.org/wg/httpbis/trac/ticket/155
[5] http://ietfreport.isoc.org/idref/draft-abarth-mime-sniff/
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFLH6w3kjnJixAXWBoRAko7AJ4kz20aHG0rx08d1VOus4KgVX5TugCeIH30
GP2jczehoa79lNdxKs4QF40=
=Sgym
-----END PGP SIGNATURE-----
Received on Wednesday, 9 December 2009 13:55:38 UTC