Re: Using HTTP Trailers [was: Content-Integrity header] from Zhong Yu on 2012-07-11 (ietf-http-wg@w3.org from July to September 2012)

From: Zhong Yu <zhong.j.yu@gmail.com>
Date: Wed, 11 Jul 2012 16:59:44 -0500
To: "Ludin, Stephen" <sludin@akamai.com>
Cc: Wenbo Zhu <wenboz@google.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <CACuKZqF7+TFwXDWg0_43isN9rR8R1SHSZ-fZucAo9xBSFOZa0w@mail.gmail.com>
Transfer-Encoding and Content-Encoding overlap in functionality. For
example, gzip can be done either way. But in reality, gzip is only
implemented with Content-Encoding, for whatever reason. It might be
safer to introduce a new content-coding than a new transfer-coding.

On Wed, Jul 11, 2012 at 4:27 PM, Ludin, Stephen <sludin@akamai.com> wrote:
> On the "Expression of Interest" side, I would be very interested in
> functionality that allows flexible piece-wise integrity checking that
> enables the user-agent to detect an issue and re-request a specific part
> of the object in question.  If forced into 1.1 land, chunk trailers seem
> like a logical place to implement this.  As suggested below however, a
> separate transfer-encoding entirely may be a better way to go. I'll need
> to tool on that a bit.
>
> Akamai will certainly support something like this.  To Mark's point about
> intermediaries potentially removing the trailers, I think that is and
> always will be true for anything under the 'Transfer-Encoding' umbrella.
> This is what differentiates this feature from the Content-MD5 header.  It
> does not take away from the functionality necessarily.  I expect that with
> a feature like this smart surrogates may even ADD the trailers on behalf
> of an origin to provide middle/last mile integrity.  Many ways to use this
> one.
>
> -stephen
>
> On 7/11/12 11:30 AM, "Zhong Yu" <zhong.j.yu@gmail.com> wrote:
>
>>If the value of a trailer is a function of the message body, and the
>>function is app specific, it's a little hard for filters (who
>>intercept and transform responses) to correctly transform the value of
>>the trailer.
>>
>>
>>On Wed, Jul 11, 2012 at 3:01 AM, Wenbo Zhu <wenboz@google.com> wrote:
>>>
>>>
>>> On Tue, Jul 10, 2012 at 7:50 PM, James M Snell <jasnell@gmail.com> wrote:
>>>>
>>>> Yeah, this has been my experience as well. Unfortunately, I'm not sure
>>>> if this is something that can be effectively fixed. Off the top of my
>>>> head, I don't know of a single server or client side http stack that
>>>> is capable of easily setting Trailers in the stream. I just checked
>>>> and support for trailers is still missing in the Servlet API, for
>>>> instance.
>>>
>>> There is no new API required, necessarily, to support trailers:
>>> 1. app code needs to set the header Trailer: <header_name> before the
>>> response is committed;
>>> 2. container will ignore the header <header_name> when the response is
>>> committed;
>>> 3. container will output the trailer when the response is finished.
>>>
>>> The request header "TE: trailers" may or may not be respected.
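
The three steps above can be sketched outside any container. This is a
minimal sketch, not a Servlet API example: the function name
chunked_with_digest_trailer is hypothetical, and the Content-Integrity
header name and sha-256:<hex> value format are borrowed from the strawman
in this thread, not from a published spec. It assumes the response head
already carries "Transfer-Encoding: chunked" and "Trailer: Content-Integrity".

```python
import hashlib
from typing import Iterable, Iterator


def chunked_with_digest_trailer(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield an HTTP/1.1 chunked body followed by a digest trailer field."""
    digest = hashlib.sha256()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would terminate the body early
        digest.update(chunk)  # hash incrementally, no full-body buffering
        # each chunk: size in hex, CRLF, data, CRLF
        yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    # last-chunk, then the trailer field, then the closing empty line
    yield b"0\r\n"
    yield (b"Content-Integrity: sha-256:"
           + digest.hexdigest().encode("ascii") + b"\r\n")
    yield b"\r\n"


wire = b"".join(chunked_with_digest_trailer([b"foo", b"bar"]))
```

The digest is only known after the last chunk has been hashed, which is
why it can go in a trailer but not in a leading header without buffering.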
>>>
>>>
>>>>
>>>> I know this kind of thing is generally frowned upon, but one possible
>>>> approach is to make use of a special content-type... for instance...
>>>>
>>>>   HTTP/1.1 200 OK
>>>>   Trailer: Content-Integrity
>>>>   Content-Type: application/http-trailers; type="text/plain"
>>>>
>>>>   foobar
>>>>
>>>>   Content-Integrity: sha-256:aec070645fe53ee3b3763059376134f058cc337247c978add178b6ccdfb0019f
>>>>
>>>> This is just a strawman, but essentially, it would append a checksum
>>>> to the end of the content stream. It's not a trailer in the
>>>> traditional sense, it's actually encoded as part of the payload. In
>>>> lieu of proper Trailer support this is the only other way I can see it
>>>> working effectively in HTTP/1.1.
>>>>
>>>> For HTTP/2.0, I suppose it would be possible for us to take a more
>>>> appropriate approach either with trailers or by building the integrity
>>>> mechanism into the framing somehow.
>>>>
>>>> (Warning: thinking out loud *before* having consumed any beer... the
>>>> ideas expressed above may not make any sense. May need to apply
>>>> alcohol and try again)
>>>>
>>>> - James
>>>>
>>>> On Tue, Jul 10, 2012 at 4:59 PM, Mark Nottingham <mnot@mnot.net> wrote:
>>>> > chunk-extensions aren't widely supported in APIs, and usually dropped
>>>> > hop-by-hop (then again, trailers can be too).
>>>> >
>>>> > Some discussion of trailer support in Apache:
>>>> >
>>>> >
>>>> > http://apache-http-server.18135.n6.nabble.com/HTTP-trailers-td4796242.html
>>>> >
>>>> > ... and using Trailers to surface debug information in Firebug:
>>>> >
>>>> > https://groups.google.com/forum/?fromgroups#!topic/firebug/v5ldjoVThH8
>>>> >
>>>> > (yes, this is a bit of a project for me)
>>>> >
>>>> > Finally, I know some server-side implementers who would LOVE to be
>>>> > able to set things like ETag and Last-Modified in trailers, and have
>>>> > them used by caches...
>>>> >
>>>> > Cheers,
>>>> >
>>>> >
>>>> > On 11/07/2012, at 4:33 AM, Ludin, Stephen wrote:
>>>> >
>>>> >> I really like the idea of placing the Digest in the chunk trailers.
>>>> >> Being able to calculate these digests on the fly and not buffer the
>>>> >> entire message is critical in my opinion.
>>>> >>
>>>> >> Another concept that I have been playing with is providing digests
>>>> >> on individual chunks using chunk-extension.  The rationale for this
>>>> >> is very large objects.  With per-chunk digests the client would have
>>>> >> the ability to re-request a specific corrupted section of an object
>>>> >> using a range request rather than the entire object.  This can have
>>>> >> enormous perceived performance and reliability benefits for
>>>> >> consumers of things such as software downloads and large media files.
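
A minimal sketch of the per-chunk idea, assuming fixed-size pieces and
SHA-256; how the digests would actually travel (chunk-extension syntax) is
not specified in this thread and is left out. The function names
chunk_digests and corrupted_ranges are hypothetical.

```python
import hashlib
from typing import List


def chunk_digests(body: bytes, chunk_size: int) -> List[str]:
    """Sender side: SHA-256 hex digest of each fixed-size piece of body."""
    return [
        hashlib.sha256(body[i:i + chunk_size]).hexdigest()
        for i in range(0, len(body), chunk_size)
    ]


def corrupted_ranges(received: bytes, digests: List[str],
                     chunk_size: int) -> List[str]:
    """Receiver side: Range header values for pieces whose digest mismatches."""
    ranges = []
    for index, expected in enumerate(digests):
        start = index * chunk_size
        piece = received[start:start + chunk_size]
        if hashlib.sha256(piece).hexdigest() != expected:
            # HTTP byte ranges are inclusive at both ends; a last-byte-pos
            # past the end of the representation is clamped by the server
            end = start + chunk_size - 1
            ranges.append("bytes=%d-%d" % (start, end))
    return ranges


original = b"0123456789abcdef"
digests = chunk_digests(original, 4)
damaged = original[:5] + b"X" + original[6:]   # corrupt one byte at offset 5
to_refetch = corrupted_ranges(damaged, digests, 4)  # ["bytes=4-7"]
```

Only the 4-byte piece containing the damaged byte needs to be fetched
again, instead of the whole object.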
>>>> >>
>>>> >> I was working on a draft to propose this, but I didn't feel it was
>>>> >> well baked enough to share.  If there is interest in this type of
>>>> >> functionality I will polish it up and post it.
>>>> >>
>>>> >> One issue to point out is that for these types of "frame" based
>>>> >> integrity checks I generally feel like we are reinventing the content
>>>> >> integrity portion of SSL/TLS.  Though I see the value in being able
>>>> >> to do this apart from SSL, it forces the question: at what point do
>>>> >> you just switch over to SSL to get the desired functionality?
>>>> >>
>>>> >> -stephen
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 7/9/12 4:00 PM, "Amos Jeffries" <squid3@treenet.co.nz> wrote:
>>>> >>
>>>> >>> On 10.07.2012 07:08, HAYASHI, Tatsuya wrote:
>>>> >>>> +1
>>>> >>>>
>>>> >>>> I know that this is demanded.
>>>> >>>> When I discuss http-authentication and phishing,
>>>> >>>> it is requested by many people.
>>>> >>>> It is a difficult problem.
>>>> >>>> ex) proxy...
>>>> >>>>
>>>> >>>> I think that it is good to do this discussion now.
>>>> >>>>
>>>> >>>> ---
>>>> >>>> Tatsuya
>>>> >>>>
>>>> >>>> On Sat, Jul 7, 2012 at 8:23 AM, James M Snell wrote:
>>>> >>>>> In general, I'm +1 on the general idea albeit with a few caveats...
>>>> >>>>>
>>>> >>>>> 1. To minimize complexity, only a single Content-Integrity header
>>>> >>>>> should be used. I don't want, as Roy points out, to have to iterate
>>>> >>>>> through a bunch of unsupported header values looking for the one I
>>>> >>>>> want. Just as it makes very little sense for an implementor to
>>>> >>>>> provide multiple Last-Modified, ETag and Content-Type headers in a
>>>> >>>>> single message, there should be only a single Content-Integrity
>>>> >>>>> statement and I either understand it or I don't.
>>>> >>>
>>>> >>> Either the client advertises what it supports (opening itself to
>>>> >>> middleware erasing options they can't modify), or the server uses
>>>> >>> multiple algorithms in hopes that the middleware cannot violate them
>>>> >>> all.
>>>> >>> It makes perfect sense to have several levels of integrity check:
>>>> >>> MD5, SHA1, AES in one response, allowing the client to validate the
>>>> >>> strongest it can handle.
>>>> >>>
>>>> >>> There is also an arguable case for middleware wanting to add its own
>>>> >>> hash to inform the client essentially "this is what I got given", so
>>>> >>> the point of manipulation can be back-traced when the more secure
>>>> >>> end-to-end checks fail.
>>>> >>>
>>>> >>> If you want end-to-end integrity, don't stop at half measures.
>>>> >>> Particularly at half measures which can be corrupted.
>>>> >>>
>>>> >>>
>>>> >>>>>
>>>> >>>>> 2. The performance impact of calculating the digest needs to be
>>>> >>>>> carefully considered. I'd rather not be required to buffer a full
>>>> >>>>> representation in memory all the time just to calculate a header
>>>> >>>>> value. I know it's largely unavoidable, but perhaps there's some
>>>> >>>>> currently elusive solution that can be considered. For instance..
>>>> >>>>> allowing Content-Integrity to appear as a trailer at the end of a
>>>> >>>>> chunked response.
>>>> >>>
>>>> >>> As has been said Trailers happen here.
>>>> >>>
>>>> >>>>>
>>>> >>>>> 3. Something needs to be said about what happens if the
>>>> >>>>> Content-Integrity check fails. For instance, if a request containing
>>>> >>>>> Content-Integrity is sent to the server and the server detects that
>>>> >>>>> the signature is invalid, what should happen? What must happen?
>>>> >>>>> Likewise, how are intermediaries expected to treat the
>>>> >>>>> Content-Integrity header given that any intermediary is able to
>>>> >>>>> modify the payload at any time?
>>>> >>>
>>>> >>> This is going to be most useful on request/responses sent with
>>>> >>> "no-transform" of course.
>>>> >>> If the integrity was only an MD5 or SHA1 hash which middleware can
>>>> >>> edit easily, there is no end-to-end integrity, just hop-by-hop
>>>> >>> integrity.
>>>> >>>
>>>> >>>
>>>> >>> Also, there has to be a mutual secret between origin server and
>>>> >>> client. Without that, when integrity is compromised the transforming
>>>> >>> hop will simply erase or replace the Content-Integrity header value.
>>>> >>> A secret key unknown to that middleware is required to make the
>>>> >>> integrity hash break when it tries this.
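
One way to read the "mutual secret" point is a keyed hash (HMAC). The
following is a sketch under that assumption only; the thread does not name
HMAC, the key value is a placeholder, and how origin and client would agree
on the key is out of scope here. The helper names sign_body and verify_body
are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Keyed SHA-256 tag over body; only holders of secret can produce it."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_body(secret: bytes, body: bytes, tag: str) -> bool:
    """Constant-time comparison of the expected tag against the received one."""
    return hmac.compare_digest(sign_body(secret, body), tag)


secret = b"shared-between-origin-and-client"  # hypothetical key
body = b"foobar"
tag = sign_body(secret, body)

# A transforming hop can rewrite the body and recompute an unkeyed hash...
tampered = b"foobaz"
forged_plain = hashlib.sha256(tampered).hexdigest()

# ...but without the secret it cannot produce a tag the client will accept.
ok_original = verify_body(secret, body, tag)
ok_forged = verify_body(secret, tampered, forged_plain)
```

With a plain MD5 or SHA1 value the hop could replace both body and hash
consistently; with the keyed tag, replacing the body breaks verification.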
>>>> >>>
>>>> >>> AYJ
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>> > --
>>>> > Mark Nottingham
>>>> > http://www.mnot.net/
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>
>>
>
Received on Wednesday, 11 July 2012 22:00:16 UTC