Re: security impact of dropping charset default

From: Adrien de Croy <adrien@qbik.com>
Date: Thu, 24 Jan 2008 11:50:21 +1300
Message-ID: <4797C4AD.4030906@qbik.com>
To: "Roy T. Fielding" <fielding@gbiv.com>
CC: Julian Reschke <julian.reschke@gmx.de>, David Morris <dwm@xpasc.com>, HTTP Working Group <ietf-http-wg@w3.org>

Forgive me if I'm barking up the wrong tree, but why is it the
responsibility of HTTP (remembering what the second 'T' stands for) to be
concerned about the content of any entity being transferred?

That would be like the postal service having to be concerned about the 
contents of envelopes, rather than just the addresses on the outside.

Security issues in these cases seem to be related to what happens to 
content once it is received by a browser.  HTTP is already out of play 
at that stage.

There are a zillion things that browsers do that aren't specified in 
HTTP.  Like rendering HTML for starters.

The content descriptions (i.e. Content-Type, charset, etc.) are
advertised by content providers and interpreted by content consumers.
As long as HTTP isn't changing these, it's up to the server to provide
them and the client to accept them or not. The transport is only
responsible for faithfully copying the data from one end of the
communication to the other.  It can't be required to police it.

Roy T. Fielding wrote:
> On Jan 23, 2008, at 11:38 AM, Julian Reschke wrote:
>> Roy T. Fielding wrote:
>>> Because the only known way to avoid the security holes in existing
>>> browsers that sniff UTF-7 is to add a charset parameter even when
>>> the exact charset is not known to the server.  That is specific to
>>> HTTP and is a known problem due to browsers ignoring the existing
>>> requirements of HTTP that this thread intends to remove.
>> Hm.
>> 1) MIME says: default for text/* is US-ASCII.
>> 2) RFC2616 says: default for text/* is ISO-8859-1.
>> 3) Browsers do content sniffing, thus they ignore both 1) and 2).
>> So if we remove 2), how does this change the situation WRT sniffing?
> It doesn't.  It eventually allows for servers to be compliant with
> the new requirement (yes, removing a default is setting a new
> requirement because right now sniffing is non-compliant).
>> I'm not totally opposed to mentioning this, but I'd really like to 
>> understand how the intended change changes the situation...
> If a server sends generated representations in any text type today
> without adding a charset parameter, it will be accused of having
> "XSS security holes" which are mostly just a byproduct of stupid
> browsers that guess the UTF-7 charset.  That situation will
> not change until the browsers remove guessing of UTF-7 (and any other
> charsets of the same ilk).  In the past, servers could point to this
> section and say "this blatant non-compliant behavior on the part of
> the browser is responsible for any loss as a result of the content",
> whereas removing that paragraph leaves responsibility in doubt.
> Therefore, if the change is made as suggested, a corresponding
> entry in the Security Considerations must define this problem as
> a vulnerability in browsers and exclude such charsets from the
> guessing algorithm.  If not, then we are better served by not
> changing the paragraph because then at least the compliant behavior
> is safe (even if no browser chooses to be compliant).
> ....Roy
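
For anyone following along, here is a rough sketch of the problem Roy
describes and of the server-side workaround, as I understand it.  The
helper name with_explicit_charset and the ISO-8859-1 fallback are my own
assumptions for illustration, not anything taken from a spec or from a
real server.  The payload contains no literal '<' on the wire, yet a
client that guesses UTF-7 ends up with markup.

```python
def with_explicit_charset(content_type: str, fallback: str = "ISO-8859-1") -> str:
    """Hypothetical helper: add a charset parameter to text/* types that lack one.

    The fallback value is an assumption; a server would pick whatever
    charset it is prepared to declare for generated content.
    """
    media_type, _, params = content_type.partition(";")
    if not media_type.strip().lower().startswith("text/"):
        return content_type  # not a text type, leave untouched
    for param in params.split(";"):
        if param.strip().lower().startswith("charset="):
            return content_type  # charset already declared
    return f"{content_type}; charset={fallback}"


# Bytes as they might appear in a generated page: plain ASCII, no '<' or '>'.
wire_bytes = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# Read with the declared charset, the bytes stay inert text.
as_latin1 = wire_bytes.decode("iso-8859-1")

# Read by a client that guesses UTF-7, the same bytes become markup.
as_utf7 = wire_bytes.decode("utf-7")

print(with_explicit_charset("text/html"))
print("<" in as_latin1)
print(as_utf7)
```

With an explicit charset on the response, a client that honours the
header has no reason to guess, so the bytes above stay inert text.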

Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Wednesday, 23 January 2008 22:49:31 UTC