Re: [integrity] What should we hash? from Mark Nottingham on 2014-03-13 (public-webappsec@w3.org from March 2014)

From: Mark Nottingham <mnot@mnot.net>
Date: Thu, 13 Mar 2014 16:28:52 +1100
To: Devdatta Akhawe <dev.akhawe@gmail.com>
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>, Boris Zbarsky <bzbarsky@mit.edu>
Message-Id: <C1971F56-0FAE-4DD3-8603-BB988D511264@mnot.net>

That would be the representation - see:
  http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-26#section-3.1.1.5

You should still probably have a few words around it to clarify; this sort of thing is easy to mess up.
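
To make the distinction concrete, here is a minimal sketch, not taken from the HTTPbis drafts or from any integrity spec text. It assumes Python 3.8+ and handles only the gzip content coding; the helper names (sha256_hex, decode_content_codings) and the sample script body are hypothetical. It shows that two gzip-coded payloads of the same file can hash differently, while the hash taken after undoing Content-Encoding is the same for both.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def decode_content_codings(payload: bytes, content_encoding: str) -> bytes:
    """Undo the codings named in a Content-Encoding value.

    Assumption: only gzip/x-gzip and identity are supported in this sketch.
    """
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Content-Encoding lists codings in the order they were applied,
    # so they are removed in reverse order.
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            payload = gzip.decompress(payload)
        elif coding != "identity":
            raise ValueError(f"unsupported content coding: {coding}")
    return payload


# Hypothetical resource: the same script served by two differently
# configured servers (different compression level and gzip timestamp).
original = b"alert('hello');\n" * 50
payload_a = gzip.compress(original, compresslevel=1, mtime=1)
payload_b = gzip.compress(original, compresslevel=9, mtime=2)

# Hashing the payload (content codings still applied) depends on the server.
hash_payload_a = sha256_hex(payload_a)
hash_payload_b = sha256_hex(payload_b)

# Hashing after the content codings are removed depends only on the file.
hash_decoded_a = sha256_hex(decode_content_codings(payload_a, "gzip"))
hash_decoded_b = sha256_hex(decode_content_codings(payload_b, "gzip"))

print("payload hashes equal:", hash_payload_a == hash_payload_b)
print("decoded hashes equal:", hash_decoded_a == hash_decoded_b)
```

Transfer-codings such as chunked are assumed to have been removed already by the HTTP layer before either hash is computed; the sketch only covers the content-coding step.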

Cheers,


On 13 Mar 2014, at 4:24 pm, Devdatta Akhawe <dev.akhawe@gmail.com> wrote:

> Thanks! Intuitively, it seems there should be a simpler way to refer
> to "the message payload before content codings are applied".
> 
> What does the content-type refer to? Type of what? For example, if I
> am not completely wrong, content-type of a text file with gzip
> content-encoding is still text/plain and so presumably talks about
> "before content codings are applied".
> 
> thanks
> Dev
> 
> On 12 March 2014 17:47, Mark Nottingham <mnot@mnot.net> wrote:
>> The HTTPbis docs are going to obsolete RFC2616 in a few weeks, so it's best to look at them.
>> 
>> I think you want to say that integrity operates upon the payload of the message - see
>>  http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-26#section-3.3
>> 
>> This is after chunked encoding, gzip transfer-codings, etc. have been removed. However, content-codings are still there, e.g., gzip or deflate in the Content-Encoding header.
>> 
>> That's because Content-Encoding is considered a property of the representation in HTTP, even though many people implement it as a separate layer.
>> 
>> If you want to do it before content-encoding, you'd need to specify it explicitly; e.g., as "the message payload before content codings are applied."
>> 
>> Hope this helps,
>> 
>> On 12 Mar 2014, at 3:58 am, Devdatta Akhawe <dev.akhawe@gmail.com> wrote:
>> 
>>> Hi
>>> 
>>> One key question for the integrity spec is "What should the browser hash?"
>>> Boris mentioned this previously
>>> http://lists.w3.org/Archives/Public/public-webappsec/2013Dec/0048.html
>>> 
>>> Informally, I am leaning towards hashing content after undoing stuff
>>> like gzip, deflate, chunked-encodings etc. Does that sound reasonable?
>>> 
>>> Next, how do we formalize (spec) this? In an ideal world, just saying
>>> "undo transfer-encoding" would be enough (i.e., spec would say "hash
>>> entity body"). But, common behavior is to apply gzip via
>>> Content-Encoding not transfer-encoding. And we want to hash after
>>> undoing gzip. (see Boris' email above)
>>> 
>>> Mark: Do you know good specification text for this? After looking at
>>> the HTTP RFC, one wording that springs to my mind is: "After decoding
>>> the entity to the media-type referenced by the content-type header"
>>> 
>>> Thanks
>>> Dev
>> 
>> --
>> Mark Nottingham   http://www.mnot.net/
>> 
>> 
>> 

--
Mark Nottingham   http://www.mnot.net/

Received on Thursday, 13 March 2014 05:29:10 UTC