Re: MACing HTTP requests/responses (Re: Content-Integrity header) from James M Snell on 2012-07-12 (ietf-http-wg@w3.org from July to September 2012)

From: James M Snell <jasnell@gmail.com>
Date: Thu, 12 Jul 2012 10:00:35 -0700
To: Phillip Hallam-Baker <hallam@gmail.com>
Cc: Nico Williams <nico@cryptonector.com>, ietf-http-wg@w3.org
Message-ID: <CABP7Rbd77_dLbFTkQ1zHUzFGQcbE=X5TJtCQ+BQtBnzuNcnF_A@mail.gmail.com>
To provide one example: using the servlet API within Apache Tomcat 7,
I can easily access trailers included in a POST, but there is no
obvious means of including a trailer in the response.

Given the input...

  POST /Testing/test HTTP/1.1
  Host: localhost
  Transfer-Encoding: chunked
  Content-Type: text/plain
  Trailer: x-foo
  TE: chunked

  4
  ABCD
  6
  EFGHIJ
  0
  x-foo: test

On the server side...

  protected void doPost(
    HttpServletRequest request,
    HttpServletResponse response)
      throws IOException {

    // outputs null since the trailer hasn't been parsed yet...
    System.out.println(request.getHeader("x-foo"));

    // consume the input, reusing a single buffer
    InputStream in = request.getInputStream();
    byte[] buf = new byte[100];
    while (in.read(buf) > -1) {}

    // outputs the value of the x-foo trailer... so far so good...
    System.out.println(request.getHeader("x-foo"));

    // let's do a chunked response and try that...
    response.addHeader("Trailer", "x-foo");
    OutputStream out = response.getOutputStream();
    for (int n = 0; n < 10000; n++) {
      out.write('a');
    }
    // x-foo is never included in the response...
    // there is no obvious means of including trailers
    // in chunked responses using the servlet api...
    response.addHeader("x-foo", "response");
  }

To make matters worse, whether or not the x-foo header is included in
the response depends entirely on the number of bytes written to
the output stream. If the number of bytes is small enough, Tomcat
buffers everything and sends a non-chunked response, emitting x-foo
as a regular header rather than as a trailer, despite the explicit
addition of the "Trailer: x-foo" header. So far, I've been unable to
uncover any simple way of forcing chunked responses.

Now, to be fair, I could be overlooking something in my analysis, but
if Tomcat and the servlet API do have a mechanism for sending trailers
in the response, it is, at best, quite non-obvious.
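
For reference, here is a minimal sketch of the bytes a chunked response
with a trailer would have to put on the wire, written directly rather
than through the servlet API. The class name ChunkedTrailerSketch and
the encode method are hypothetical and exist only for illustration;
nothing here is Tomcat's or the servlet API's own mechanism.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat or the servlet API.
final class ChunkedTrailerSketch {

  // Encodes body as an HTTP/1.1 chunked message body followed by a
  // single trailer field. Status line and headers are not produced.
  static byte[] encode(byte[] body, int chunkSize,
                       String trailerName, String trailerValue) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int off = 0; off < body.length; off += chunkSize) {
      int len = Math.min(chunkSize, body.length - off);
      // chunk size in hex, CRLF, chunk data, CRLF
      writeAscii(out, Integer.toHexString(len) + "\r\n");
      out.write(body, off, len);
      writeAscii(out, "\r\n");
    }
    // zero-length last chunk, then the trailer, then the final CRLF
    writeAscii(out, "0\r\n");
    writeAscii(out, trailerName + ": " + trailerValue + "\r\n");
    writeAscii(out, "\r\n");
    return out.toByteArray();
  }

  private static void writeAscii(ByteArrayOutputStream out, String s) {
    byte[] b = s.getBytes(StandardCharsets.US_ASCII);
    out.write(b, 0, b.length);
  }

  public static void main(String[] args) {
    byte[] wire = encode(
        "ABCDEFGHIJ".getBytes(StandardCharsets.US_ASCII), 4, "x-foo", "test");
    System.out.print(new String(wire, StandardCharsets.US_ASCII));
  }
}
```

The response headers would still need to carry "Transfer-Encoding:
chunked" and "Trailer: x-foo" for a recipient to expect the trailer;
the sketch covers only the message body.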

- James

On Thu, Jul 12, 2012 at 9:19 AM, James M Snell <jasnell@gmail.com> wrote:
> On Wed, Jul 11, 2012 at 6:51 PM, Phillip Hallam-Baker <hallam@gmail.com> wrote:
>> What are the problems inherent in using Content-Integrity as a Trailer?
>>
>> Do the legacy clients botch chunked encoding?
>>
>>
>
> Well, it appears that many existing 1.1 client, server and
> intermediary implementations have taken the "trailers are optional"
> idea to the extreme and either (a) provide no means of appending
> trailers to the output, (b) provide no means of accessing trailers
> included in the input or make it quite a bit more difficult to get to
> them or (c) do not properly forward trailers on as a chunked message
> flows from hop-to-hop. It's not clear at this point exactly how
> widespread this problem is but it appears pervasive enough to make a
> trailer-based content-integrity option problematic at best.
>
> - James
>
>> On Wed, Jul 11, 2012 at 7:44 PM, James M Snell <jasnell@gmail.com> wrote:
>>> Phillip, just want to make sure that I'm keeping up with the
>>> conversation thus far... Because of the problems inherent in using
>>> Content-Integrity as a Trailer, is the idea then that
>>> Content-Integrity would be a standard Header and that a new Transfer
>>> or Content Encoding would be defined that supports an incremental
>>> integrity check as a component of the encoding?
>>>
>>>   GET /some/uri HTTP/1.1
>>>   Host: example.org
>>>   TE: integrity
>>>
>>> For instance... something like...
>>>
>>>   HTTP/1.1 200 OK
>>>   Content-Type: application/octet-stream
>>>   Content-Integrity: SHA-256; modifier=123...; param="..."
>>>   Transfer-Encoding: integrity
>>>
>>>   10
>>>   {chunk of bytes}
>>>   {digest}
>>>   10
>>>   {chunk of bytes}
>>>   {digest}
>>>   0
>>>
>>> - James
>>>
>>> On Wed, Jul 11, 2012 at 4:07 PM, Phillip Hallam-Baker <hallam@gmail.com> wrote:
>>>> On Wed, Jul 11, 2012 at 2:43 PM, Nico Williams <nico@cryptonector.com> wrote:
>>>>> I agree that we need something better, and in particular that we ought
>>>>> to have a MAC instead of a plain hash.  The problem with a MAC is:
>>>>> whence the key?  Also, what should the MAC be applied to?
>>>>
>>>> The MAC should be applied to the 8-bit clean message content (i.e.
>>>> precisely that which is bounded by Content-Length)
>>>>
>>>> If we are talking about Web Services then the key would be established
>>>> through some application layer key exchange (TBS).
>>>>
>>>> The key requirement from a performance standpoint as I see it is that
>>>> the server has to be able to operate in a stateless fashion which
>>>> means using a ticket like approach.
>>>>
>>>>
>>>>> Using a MAC, having a shared session key, ties into HTTP
>>>>> authentication.  We can definitely have a generic MAC for HTTP and say
>>>>> that HTTP authentication mechanisms that can should output session
>>>>> keys.  And the HTTP authentication would also have to take care of MAC
>>>>> algorithm negotiation.  I'd be quite happy with this approach.
>>>>
>>>> I think that there is definitely an opportunity to make use of a
>>>> ticket mode to tie the HTTP authentication to the HTTP channel.
>>>>
>>>>
>>>>> One issue with this approach is: if we always use TLS (but we might
>>>>> not), why do the extra session crypto?  What do we gain?  Do we need
>>>>> to worry about content re-writing proxies, say, as in some 3G
>>>>> networks?  If we always use TLS then it suffices to ensure that a) TLS
>>>>> provides confidentiality protection, b) the server cert remains the
>>>>> same for the length of the login session, c) we have a unique,
>>>>> unpredictable session ID in the headers (what we might call a cookie,
>>>>> though we don't want it to be a cookie as such).
>>>>
>>>> TLS is very large, very complex and was engineered from the assumption
>>>> that there would be public key credentials on the client. Yes, people
>>>> can train it to do other tricks, but doing that is a lot more complex
>>>> than doing what we need in HTTP.
>>>>
>>>> In my particular Web Service I am using TLS but still want to have a
>>>> transport layer authentication protection because I don't want to do a
>>>> TLS public key negotiation on each transaction and I don't want to be
>>>> bound to TLS session expiry.
>>>>
>>>> In a large commercial environment the TLS processing is often
>>>> completely offloaded onto an accelerator that strips out the TLS and
>>>> hands clean IP packets to the Web Service. Another frequent screw case
>>>> is TLS proxies like bluecoat devices.
>>>>
>>>>
>>>> But even in the simplest TLS use case, the TLS security context is
>>>> really not exposed to the Web Server or the client in the way you
>>>> would need to use it for Web Services authentication in the commonly
>>>> used APIs. The problem is that TLS is designed to conceal all the
>>>> complexity of crypto from the application. That is why it was called
>>>> SSL at the start.
>>>>
>>>>
>>>>> In one post you talked about sequencing and replay protection for
>>>>> chunks.  Adding that to the MAC really gets us close to the MIC token
>>>>> features/design from RFCs 1964/4121 (Kerberos GSS mech).  We're
>>>>> talking about having a sequence number.  As you say: this isn't
>>>>> difficult; we've been doing this for a loooong time in Kerberos land.
>>>>
>>>> Heh, you could use a Kerberos token in my Omnibroker protocol if you
>>>> wanted to. But since it is an opaque string of bytes as far as the
>>>> client is concerned, well there is no reason to tie it to any one
>>>> approach.
>>>>
>>>> People have been using kerberized cookies for years. The problem being
>>>> that the cookie is not at all bound to the requests or responses.
>>>>
>>>>> Note that there's no need for sequence numbers to be randomized given
>>>>> that we have session keys, but sequence number windows add to the
>>>>> state to be kept on the server side -- can we tolerate that? Note that
>>>>> while session key state might be kept on the client in an encrypted
>>>>> state ticket, sequence number windows cannot safely be kept that way --
>>>>> they must be kept locally.  I tend to think that sequencing and replay
>>>>> protection are the responsibility of the application -- all it needs
>>>>> to do is add a sequence number to the chunks and manage its own
>>>>> [per-resource] sequence number windows.
>>>>
>>>> The way I was thinking of helping the application was to provide a
>>>> feature that allows Content-Integrity header to specify a key modifier
>>>> as well as a ticket. The key used to calculate the MAC would then be
>>>> the XOR of the modifier and the authentication key associated with the
>>>> ticket.
>>>>
>>>>
>>>>> Altogether we need: a session key identifier in the headers (this
>>>>> should imply algorithm selection), a direction identifier (or separate
>>>>> keys for each direction), a sequence number if we need sequencing
>>>>> and/or replay detection, what content to MAC, and the MAC itself.
>>>>
>>>> Direction is already implicit in HTTP requests/responses.
>>>>
>>>> I don't think we need sequencing, we do need a modifier capability though.
>>>>
>>>>> Regarding what to MAC: the direction flag (unless we have diff keys
>>>>> for each direction), the channel binding for the TLS channel (if we
>>>>> have it), the URL?, some subset of headers? and the body.  Note that
>>>>> applying the MAC to any headers requires that we say something about
>>>>> canonicalization (e.g., "use the headers exactly as sent") and
>>>>> canonical order (if a subset of headers) (e.g., "in the relative order
>>>>> of appearance").  Header and body content need to be unambiguously
>>>>> separated in the MAC input.  Obviously we can't MAC all headers: some
>>>>> might be added by proxies, for example.
>>>>
>>>> I don't see the need to MAC any headers for a Web Service application.
>>>> Put all the information in the content block.
>>>>
>>>> Otherwise we would have to do the sort of thing that DKIM does to sign
>>>> headers and copy them.
>>>>
>>>>
>>>> --
>>>> Website: http://hallambaker.com/
>>>>
>>
>>
>>
>> --
>> Website: http://hallambaker.com/
Received on Thursday, 12 July 2012 17:01:29 UTC