Re: Content-MD5 and partial responses from Yves Lafon on 2009-07-24 (ietf-http-wg@w3.org from July to September 2009)

From: Yves Lafon <ylafon@w3.org>
Date: Fri, 24 Jul 2009 08:41:36 -0400 (EDT)
To: Adrien de Croy <adrien@qbik.com>
cc: Henrik Nordstrom <henrik@henriknordstrom.net>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>, Larry Masinter <LMM@acm.org>
Message-ID: <alpine.DEB.1.10.0907240840300.24912@wnl.j3.bet>

On Fri, 24 Jul 2009, Adrien de Croy wrote:

>
> I also would have assumed that MD5 would cover the whole entity.  For the 
> reason that it's used as a signature on the entity.
>
> If you get a file in N parts, you can be fairly certain they are all parts of 
> the same entity if the C-MD5 is the same for each.
>
> But this doubles up with ETag.
>
> If the MD5 were calculated on only the transferred partial body it would need 
> to be calculated each time a part were served.
>
> So I think it comes down to what is the intended purpose of the header in the 
> first place.  My assumption would have been to cover potential corruption in 
> transit or detect modifications.  But obviously not secure since any agent in 
> the chain can recalculate it.  So in the end I feel it's of little value 
> which is probably why it's seldom used.

Another reason is that for big content, on servers that do not cache the
computed metadata, computing the MD5 on every request is a good way to
bring a server almost to a halt, so the server architecture may mandate
not using it at all.


> Yves Lafon wrote:
>> On Thu, 23 Jul 2009, Henrik Nordstrom wrote:
>> 
>>> Mon 2009-06-29 at 12:00 +1000, Mark Nottingham wrote:
>>>> After a quick look, my reading is that a Content-MD5 header on a
>>>> partial response reflects the bytes in that message, rather than the
>>>> whole (non-partial) response:
>>> 
>>> RFC2616 can apparently be read both ways depending on which parts of the
>>> specs you read, which is a bit of a problem for Content-MD5.
>>> 
>>> My reading is that Content-MD5 is computed on the variant and not the
>>> message-body. The reasons behind this are:
>>>
>>>      * 206 is described as containing only ranges of the entity-body
>>>        (which btw conflicts with the general messaging format
>>>        definition of entity-body, making 206 a special case). p4 4.
>>>        Combining Ranges
>> 206 is indeed a very special case.
>>
>>>      * How partial responses including their headers may be combined.
>>>        p4 4. Combining Ranges
>> Same for CL (which can be extracted from Content-Range)
>> There is indeed a story to be told about combining partial responses when 
>> Content-MD5 is there (or we can forbid C-MD5 in partial responses)
>>
>>>      * It being an Entity-Header. p3 5.8 Content-MD5
>> 
>> Well, Content-Length is also an entity header, however it applies to the 
>> transferred bytes in case of 206.
>> What would be the use of C-MD5 if it applies to the whole bag of bytes when 
>> you only get a part of it? It can't serve its purpose which is
>> integrity verification, so it makes far more sense if C-MD5 is applied to 
>> the transferred bytes, like C-Length
>>
>>>      * That sending Entity-Headers is forbidden in a conditional 206
>>>        response (MUST/SHOULD NOT) and required to be included in
>>>        unconditional 206 responses if it would have been sent in a 200
>>>        response.
>>> 
>>> 
>>> 
>> 
>
>
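
To make the two readings discussed in the quoted thread concrete, here is a
minimal Python sketch. The entity bytes and the range 10-19 are made up for
illustration, and content_md5 is a hypothetical helper; the only thing taken
from the specs is that the header value is the base64 encoding of the
128-bit MD5 digest (RFC 1864).

```python
import base64
import hashlib

def content_md5(data: bytes) -> str:
    """Content-MD5 header value: base64 of the raw 16-byte MD5 digest."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# Made-up 40-byte entity and a made-up request for "Range: bytes=10-19".
entity = b"0123456789" * 4
first, last = 10, 19
part = entity[first:last + 1]  # the bytes actually sent in the 206

# Reading 1: C-MD5 describes the whole entity (same value in every part).
whole_reading = content_md5(entity)

# Reading 2: C-MD5 describes the transferred bytes, like Content-Length.
part_reading = content_md5(part)

print("Content-Range: bytes %d-%d/%d" % (first, last, len(entity)))
print("Content-Length:", len(part))
print("Content-MD5 (whole entity):     ", whole_reading)
print("Content-MD5 (transferred bytes):", part_reading)
```

Under the first reading every 206 for this entity carries the same value,
but a client holding only the part cannot check it. Under the second the
value changes with each range and can be verified on receipt.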

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Friday, 24 July 2009 12:41:48 UTC