HTTP/2: allow binary data in header field values

Hi,
as discussed with some of you in Prague, I'd like to remove the
restriction on CR, LF & NUL characters and allow binary data in header
field values in HTTP/2.

Both HTTP/2 and HPACK can pass binary data in header field values
without any issues, but RFC7540 put an artificial restriction on those
characters in order to protect clients and intermediaries converting
requests/responses between HTTP/2 and HTTP/1.1.

Unfortunately, this restriction forces endpoints to use base64
encoding when passing binary data in header field values, which can
easily become a CPU bottleneck.

This is especially true in multi-tier proxy deployments, like CDNs,
which are connected over high-speed networks and often pass metadata
via HTTP headers.

The proposal I have in mind is based on what gRPC is already doing [1], i.e.:

1. Each peer announces that it accepts binary data via an HTTP/2
SETTINGS option,

2. Binary header field values are prefixed with a NUL byte (0x00), so
that binary value 0xFF is encoded as a header field value 0x00 0xFF.
This allows binary-aware peers to differentiate between binary headers
and VCHAR headers. In theory, this should also protect peers unaware
of this extension from ever accepting such headers, since RFC7540
requires that requests/responses with headers containing a NUL byte
(0x00) MUST be treated as malformed and rejected, but I'm not sure if
that's really enforced in practice.

3. Binary-aware peers MUST base64 encode binary header field values
when forwarding them to peers unaware of this extension and/or when
converting to HTTP/1.1.

4. Binary header field values cannot be concatenated, because there is
no delimiter that we can use.
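
To make points 2 and 3 concrete, here is a minimal sketch of the
encoding rules in Python. The function names and the
peer_accepts_binary flag are hypothetical, chosen only for
illustration; they are not taken from gRPC or from any HTTP/2 library.
The sketch also does not address how a receiver would tell a base64
fallback value apart from ordinary text.

```python
import base64
from typing import Tuple

NUL = b"\x00"  # marker byte from point 2


def encode_value(value: bytes, peer_accepts_binary: bool) -> bytes:
    """Encode a binary header field value for the wire (hypothetical helper)."""
    if peer_accepts_binary:
        # Point 2: raw bytes, prefixed with NUL so the peer can tell
        # them apart from VCHAR values.
        return NUL + value
    # Point 3: fall back to base64 for peers unaware of the extension
    # (or when converting to HTTP/1.1).
    return base64.b64encode(value)


def decode_value(wire: bytes) -> Tuple[bool, bytes]:
    """Return (is_binary, payload) for a received header field value."""
    if wire.startswith(NUL):
        return True, wire[1:]
    return False, wire
```

With this, the example from point 2 works out as
encode_value(b"\xff", True) == b"\x00\xff", and decode_value() strips
the prefix again on the receiving side.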

NOTE: This proposal implies that endpoints SHOULD NOT use binary
header field values before receiving HTTP/2 SETTINGS from the peer.
However, since, at least in theory, all RFC7540-compliant peers
unaware of this extension MUST reject requests with headers containing
a NUL byte (0x00) with a stream error, endpoints could
opportunistically use binary header field values on the first flight
and assume that if the peer isn't aware of this extension, then it
will reject the request, which can subsequently be retried with
base64 encoded header field values.
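
The opportunistic first-flight behaviour described in the note could
look roughly like the sketch below. Everything here is an assumption
made for illustration: StreamRejected stands in for whatever signal an
HTTP/2 stack gives when the peer resets the stream (RFC7540 treats
malformed requests as a stream error of type PROTOCOL_ERROR), and
send is a hypothetical callable that transmits one request with the
given header field value.

```python
import base64
from typing import Callable


class StreamRejected(Exception):
    """Hypothetical: the peer reset the stream, e.g. with PROTOCOL_ERROR."""


def send_opportunistically(send: Callable[[bytes], str], value: bytes) -> str:
    """Try the NUL-prefixed binary form first, then retry with base64."""
    try:
        # First flight: SETTINGS from the peer not seen yet, so gamble
        # on the binary form.
        return send(b"\x00" + value)
    except StreamRejected:
        # A peer unaware of the extension rejected the request as
        # malformed; retry with a base64 encoded value.
        return send(base64.b64encode(value))


def legacy_peer(header_value: bytes) -> str:
    """Stand-in for a peer that enforces the RFC7540 restriction."""
    if b"\x00" in header_value:
        raise StreamRejected()
    return "accepted " + header_value.decode("ascii")
```

Note that this only works for requests that are safe to retry, and
only if the peer actually enforces the restriction.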

I'd like to hear if anyone strongly disagrees with this proposal
and/or with binary data in header field values in general. Otherwise,
I'm going to write a draft and hopefully we can standardize this
before HTTP/2-over-QUIC, so that binary header field values can be
supported there natively and not via an extension.

[1] https://github.com/grpc/proposal/blob/master/G1-true-binary-metadata.md

Best regards,
Piotr Sikora

Received on Tuesday, 29 August 2017 01:34:55 UTC