Re: HTTP/2: allow binary data in header field values from James M Snell on 2017-08-29 (ietf-http-wg@w3.org from July to September 2017)

From: James M Snell <jasnell@gmail.com>
Date: Mon, 28 Aug 2017 22:35:21 -0700
To: Willy Tarreau <w@1wt.eu>
Cc: Piotr Sikora <piotrsikora@google.com>, HTTP Working Group <ietf-http-wg@w3.org>, Craig Tiller <ctiller@google.com>
Message-ID: <CABP7Rbdru1iQZRMRAeSed+oNuKsL0ZNvWCcUunXRN1_9jSnDFA@mail.gmail.com>
>From an implementer point of view, this could be problematic from a
security perspective but with a setting it would be easily mitigated. I for
one would definitely like to see this.

On Mon, Aug 28, 2017 at 9:10 PM, Willy Tarreau <w@1wt.eu> wrote:

> Hi,
>
> On Mon, Aug 28, 2017 at 06:34:31PM -0700, Piotr Sikora wrote:
> > The proposal I have in mind is based on what gRPC is already doing [1],
> i.e.:
> >
> > 1. Each peer announces that it accepts binary data via HTTP/2 SETTINGS
> option,
>
> This sounds reasonable.
>
> > 2. Binary header field values are prefixed with NUL byte (0x00), so
> > that binary value 0xFF is encoded as a header field value 0x00 0xFF.
> > This allows binary-aware peers to differentiate between binary headers
> > and VCHAR headers. In theory, this should also protect peers unaware
> > of this extension from ever accepting such headers, since RFC7540
> > requires that requests/responses with headers containing NUL byte
> > (0x00) MUST be treated as malformed and rejected, but I'm not sure if
> > that's really enforced.
>
> I've just checked my experimental implementation in haproxy and I'm
> indeed missing this check. There are two reasons to this, which I think
> could indicate others might have done the same mistake :
>   - the header values are prefixed with their length so there's no need
>     to parse their contents before copying ;
>
>   - the reminder of the list of forbidden characters is only present in
>     the security consideration section of 7540 and never mentionned in
>     7541, while it's mostly when processing hpack that such characters
>     have a chance to be detected.
>
> So at least some check would be needed on other implementations (well,
> I might very well have produced the only bogus one, but I think that
> overlooking this is easy).
>
> > 3. Binary-aware peers MUST base64 encode binary header field values
> > when forwarding them to peers unaware of this extension and/or when
> > converting to HTTP/1.1.
>
> I really don't like this, because it becomes possible for a sender to
> produce special values on the other side by benefitting from the
> decoding, thus evading certain filtering measures. For example, an
> agent may send :
>
>    Content-Encoding: \x00\x83\x38\xa9
>
> and the binary-aware recipient having to "encode" it as gzip would
> apply base64 to "\x83\x38\xa9" and would produce :
>
>    Content-Encoding: gzip
>
> You can apply the same principle to other fields like content-length or
> other and easily cause some trouble. If you need this, at the very least
> it's important to enclose the encoded value between some "rare enough"
> characters to prevent such risks. Something like this for example :
>
>    Binary-Header: $b64<Y2h1bmtlZA==>$
>
> The example above would produce  :
>
>    Content-Encoding: $b64<gzip>$
>
> That's just an example of course.
>
> > 4. Binary header field values cannot be concatenated, because there is
> > no delimiter that we can use.
>
> Good point.
>
> > NOTE: This proposal implies that endpoints SHOULD NOT use binary
> > header field values before receiving HTTP/2 SETTINGS from the peer.
> > However, since, at least in theory, all RFC7540-compliant peers
> > unaware of this extension MUST reject requests with headers containing
> > NUL byte (0x00) with a stream error, endpoints could opportunistically
> > use binary header field values on the first flight and assume that if
> > peer isn't aware of this extension, then it will reject the request,
>
> This is really the part which needs to be verified I think.
>
> I'm now thinking that prefixing the value with <CR><NUL> instead of <NUL>
> could be more efficient : while <NUL> may translate to an end of string
> on many implementations and simply result in a field silently being
> processed as empty, a <CR> in the middle of a string is always an
> error, unless the next character is <LF> in which case what follows is
> the next header field. In case a gateway would inappropriately forward
> the <CR> to HTTP/1, it would produce an invalid message, even if
> truncated on the <NUL> since the line would end in <CR><CR><LF>.
>
> > which can be subsequently retried with base64 encoded header field
> > values.
> >
> > I'd like to hear if anyone strongly disagrees with this proposal
> > and/or the binary data in header field values in general.
>
> I don't disagree and think that some fields could later become binary
> (dates, etc). I'm just extra careful for knowing that most HTTP/1 parsers
> are extremely lazy and that at least one H2 parser (the one I wrote) does
> not notice the presence of <CR>, <LF> or <NUL> in values. But this one is
> not yet deployed so it can be fixed. I don't know for other ones.
>
> Cheers,
> Willy
>
>
Received on Tuesday, 29 August 2017 05:36:04 UTC