
Re: security impact of dropping charset default

From: Mark Nottingham <mnot@mnot.net>
Date: Thu, 24 Jan 2008 17:57:12 +1100
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <CCEBD089-E45A-4AF5-90CD-4589362F9483@mnot.net>
To: Adrien de Croy <adrien@qbik.com>

On 24/01/2008, at 9:50 AM, Adrien de Croy wrote:

> Forgive me if I'm barking up the wrong tree, but why is it the
> responsibility of HTTP (remembering what the 2nd 'T' stands for) to  
> be concerned about the content of any entity being transferred?
> That would be like the postal service having to be concerned about  
> the contents of envelopes, rather than just the addresses on the  
> outside.

Oh, but they do, at least last time I was in the US...

> Security issues in these cases seem to be related to what happens to  
> content once it is received by a browser.  HTTP is already out of  
> play at that stage.
> There are a zillion things that browsers do that aren't specified in  
> HTTP.  Like rendering HTML for starters.
> Apart from the content descriptions (i.e. Content-Type, charset etc)  
> advertised by content providers and interpreted by content  
> consumers, as long as HTTP isn't changing these, it's up to the  
> server to provide it, and the client to accept it or otherwise, but  
> the transport only has the responsibility to faithfully copy this  
> data from one end of the communication to the other.  It can't be  
> required to police it.

Are you saying that you're against adding a sentence or two to  
Security Considerations about this issue? So far, I've seen pretty  
strong support for doing so from a variety of people.


> Roy T. Fielding wrote:
>> On Jan 23, 2008, at 11:38 AM, Julian Reschke wrote:
>>> Roy T. Fielding wrote:
>>>> Because the only known way to avoid the security holes in existing
>>>> browsers that sniff UTF-7 is to add a charset parameter even when
>>>> the exact charset is not known to the server.  That is specific to
>>>> HTTP and is a known problem due to browsers ignoring the existing
>>>> requirements of HTTP that this thread intends to remove.
>>> Hm.
>>> 1) MIME says: default for text/* is US-ASCII.
>>> 2) RFC2616 says: default for text/* is ISO-8859-1.
>>> 3) Browsers do content sniffing, thus they ignore both 1) and 2).
>>> So if we remove 2), how does this change the situation WRT sniffing?
>> It doesn't.  It eventually allows for servers to be compliant with
>> the new requirement (yes, removing a default is setting a new
>> requirement because right now sniffing is non-compliant).
>>> I'm not totally opposed to mentioning this, but I'd really like to  
>>> understand how the intended change changes the situation...
>> If a server sends generated representations in any text type today
>> without adding a charset parameter, it will be accused of having
>> "XSS security holes" which are mostly just a byproduct of stupid
>> browsers that guess the UTF-7 charset.  That situation will
>> not change until the browsers remove guessing of UTF-7 (and any other
>> charsets of the same ilk).  In the past, servers could point to this
>> section and say "this blatant non-compliant behavior on the part of
>> the browser is responsible for any loss as a result of the content",
>> whereas removing that paragraph leaves responsibility in doubt.
>> Therefore, if the change is made as suggested, a corresponding
>> entry in the Security Considerations must define this problem as
>> a vulnerability in browsers and exclude such charsets from the
>> guessing algorithm.  If not, then we are better served by not
>> changing the paragraph because then at least the compliant behavior
>> is safe (even if no browser chooses to be compliant).
>> ....Roy
> -- 
> Adrien de Croy - WinGate Proxy Server - http://www.wingate.com

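To make the UTF-7 point in the quoted text concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions: the
payload, the helper name with_explicit_charset, and its fallback
parameter are hypothetical and come from no spec or server. It shows
that a body containing no '<' byte at all turns into markup once a
client guesses UTF-7, and one way a server might always label text/*
responses with a charset, as Roy describes.

```python
# Hypothetical payload: plain ASCII bytes with no '<' or '>' in them.
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# Read with the RFC 2616 default (ISO-8859-1), the text stays inert.
as_latin1 = payload.decode("iso-8859-1")

# Read by a client that sniffs UTF-7, the same bytes become a script tag.
sniffed = payload.decode("utf-7")


def with_explicit_charset(content_type: str, fallback: str = "ISO-8859-1") -> str:
    """Return content_type with a charset parameter added to text/* types.

    Assumption: the caller accepts labelling text with `fallback` when the
    real charset is unknown, which is the workaround described above.
    Non-text types and values that already carry a charset are unchanged.
    """
    parts = [p.strip() for p in content_type.split(";") if p.strip()]
    media_type, params = parts[0], parts[1:]
    if not media_type.lower().startswith("text/"):
        return content_type
    if any(p.lower().startswith("charset=") for p in params):
        return content_type
    return "; ".join([media_type, *params, f"charset={fallback}"])


if __name__ == "__main__":
    print(repr(as_latin1))
    print(repr(sniffed))
    print(with_explicit_charset("text/html"))
```

The helper only covers the server side. As the quoted text notes, the
underlying problem remains until clients stop guessing UTF-7.
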
Mark Nottingham     http://www.mnot.net/
Received on Thursday, 24 January 2008 06:57:28 UTC
