Re: h2 header field names from James M Snell on 2014-09-04 (ietf-http-wg@w3.org from July to September 2014)

From: James M Snell <jasnell@gmail.com>
Date: Thu, 4 Sep 2014 08:46:15 -0700
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Amos Jeffries <squid3@treenet.co.nz>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <CABP7RbcRCt_=897RUQNnRLnMGgbo3Pg5DAKH77VsUb79+U-5YQ@mail.gmail.com>

On Thu, Sep 4, 2014 at 8:03 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 2014-09-04 16:40, Amos Jeffries wrote:
>>
>> ...
>> ...
>>
>>> Regardless, what are you trying to accomplish with binary header
>>> values?
>>>
>>
>> Good question. In a nutshell, the gain is simplicity for
>> implementations no longer having to include base-64 encoder/decoders
>> or spend CPU cycles doing the coding.
>> ...
>
>
> I think the question is: why do you need binary data in header field values
> in the first place?
>

Just some background... This was debated back and forth quite a
while ago. In a range of experiments I conducted a while back, I found
that by adopting efficient binary encodings for certain types of
header field values (dates, numbers, cookies, etc), we could actually
realize a significant reduction in bytes-on-the-wire. Dates, for
instance, could be encoded using as few as 4 or 5 bytes rather than
the current average of 29. At the time I was conducting those
experiments, I posted several iterations of a "binary optimized header
encoding" that defined one approach to handling the binary encoding in
a more-or-less backwards compatible way.
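
To make the date numbers concrete, here is a rough Python sketch of
the idea. It is only an illustration, not the encoding from the
drafts: it assumes a fixed 4-byte unsigned count of seconds since the
Unix epoch, and the function names are made up for this example.

```python
import struct
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def encode_http_date(value: str) -> bytes:
    """Pack an HTTP-date string into 4 bytes (assumed layout:
    big-endian unsigned seconds since the Unix epoch)."""
    dt = parsedate_to_datetime(value)
    return struct.pack("!I", int(dt.timestamp()))


def decode_http_date(data: bytes) -> str:
    """Expand the 4-byte form back into the usual textual HTTP-date."""
    (seconds,) = struct.unpack("!I", data)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return format_datetime(dt, usegmt=True)


text = "Thu, 04 Sep 2014 15:46:15 GMT"
packed = encode_http_date(text)
print(len(text), len(packed))
print(decode_http_date(packed))
```

The textual form above is 29 bytes; the packed form is 4, and it
converts back to the same string, which is what makes translation to
h1 straightforward.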

At the same time, the WG was starting to consider HPACK. At one of the
face-to-face get-togethers (I forget which one), a decision was made
that HPACK could be simplified if it simply ignored the actual
encoding of header field values and instead just treated them as a
sequence of opaque bytes. It was also decided that efficient binary
encodings like what I had proposed would not be considered. This meant
that the existing header field value definitions remain untouched, but
that -- as far as the http/2 framing and header processing is
concerned -- header field values are just bags of bits, and it's up to
the application layer to figure it out from there.  The side effect of
this is that it's perfectly legal for me, at the framing layer, to
send any random sequence of bytes as the value of any header field. It
would be at least theoretically possible for a malicious actor to take
advantage of this fact by attempting to send an invalid sequence of
bytes to an intermediary/server that performs h2->h1 translation. That
translation code would need to be written defensively to ensure that
invalid or dangerous sequences are rejected.
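
As a rough illustration of what "written defensively" could mean, here
is a minimal Python sketch. The function name and the exact set of
rejected bytes (CR, LF, NUL) are my assumptions for the example, not
something taken from the spec; a real translator would also need to
validate field names and handle pseudo-header fields.

```python
# Bytes that would let a field value break out of an HTTP/1.x header
# line (assumed minimal set for this sketch).
FORBIDDEN = frozenset(b"\r\n\x00")


def to_h1_header_line(name: bytes, value: bytes) -> bytes:
    """Serialize one header field for HTTP/1.x, rejecting values that
    contain CR, LF, or NUL instead of passing them through."""
    if any(byte in FORBIDDEN for byte in value):
        raise ValueError("header field value contains a forbidden byte")
    return name + b": " + value + b"\r\n"


print(to_h1_header_line(b"x-demo", b"ok"))

try:
    # An opaque h2 value that tries to smuggle a second h1 header.
    to_h1_header_line(b"x-demo", b"a\r\nSet-Cookie: evil=1")
except ValueError as err:
    print("rejected:", err)
```

The point is simply that the check has to live in the translation
code, since the framing layer will carry those bytes without
complaint.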

While I'm certainly not going to argue it any further, I believe the
approach I proposed in the bohe drafts provided the most sensible and
measured approach as far as allowing binary header field values is
concerned -- particularly in that it did not attempt to do too much
and provided a clear approach to translating back to h1. I'm certainly
not a fan of what's currently documented in the h2 spec, but the time
for that debate has long since passed.

- James

> Best regards, Julian
>

Received on Thursday, 4 September 2014 15:47:03 UTC