Re: HTTP/2: allow binary data in header field values

On 29/08/17 13:34, Piotr Sikora wrote:
> Hi,
> as discussed with some of you in Prague, I'd like to remove the
> restriction on CR, LF & NUL characters and allow binary data in header
> field values in HTTP/2.
> 
> Both HTTP/2 and HPACK can pass binary data in header field values
> without any issues, but RFC7540 put an artificial restriction on those
> characters in order to protect clients and intermediaries converting
> requests/responses between HTTP/2 and HTTP/1.1.
> 

The prohibition of these characters is a Security Considerations 
requirement in HTTP. It would be best to keep that fact clearly 
up-front. It was not a casual / arbitrary design decision, so the 
reasons for it cannot just be ignored when implementing or negotiating 
extended behaviour.

So long as HTTP/1 <-> HTTP/2 gateways exist the security attacks will 
remain a problem. This is not a theoretical problem either: 
intermediaries are still fending off active attacks and malformed agent 
messages involving these three characters in HTTP/1.x environments before 
HTTP/2 mapping even gets involved.

The simple problem is that one cannot guarantee the absence of a mapping 
gateway in any transaction. So it HAS to be considered by every agent 
involved.
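
To make the risk concrete, here is a minimal sketch of the check a mapping 
gateway has to perform before writing an HTTP/2 field value out as an 
HTTP/1.1 header line. The function names are hypothetical and not taken 
from any real gateway; the point is only that a value carrying CR LF 
would otherwise become an extra, attacker-controlled header line.

```python
# Hypothetical helpers for illustration; not from any real implementation.
FORBIDDEN = frozenset(b"\r\n\x00")  # the octets CR, LF and NUL


def is_safe_for_http1(value: bytes) -> bool:
    """Return True if the value contains none of CR, LF or NUL."""
    return FORBIDDEN.isdisjoint(value)


def to_http1_header_line(name: str, value: bytes) -> bytes:
    """Serialise one header field as an HTTP/1.1 line, refusing unsafe values."""
    if not is_safe_for_http1(value):
        raise ValueError("header value contains CR, LF or NUL")
    return name.encode("ascii") + b": " + value + b"\r\n"


# Written out naively, this value would inject a second header line.
attack = b"1\r\nX-Injected: yes"
print(is_safe_for_http1(attack))
```

Here the final call reports that the value is unsafe, and 
to_http1_header_line() would raise rather than emit it.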


> Unfortunately, this restriction forces endpoints to use base64
> encoding when passing binary data in header field values, which can
> easily become the CPU bottleneck.

It is worth noting that base64-encoded data is more efficiently encoded by 
HPACK. So avoiding it is a con, not a pro.

> 
> This is especially true in multi-tier proxy deployments, like CDNs,
> which are connected over high-speed networks and often pass metadata
> via HTTP headers.

It is worth noting that RFC 7540 offers some benefits here. Any of 
their internal traffic using the extended ability that gets leaked will 
be actively rejected by RFC 7540 compliant agents outside.


> 
> The proposal I have in mind is based on what gRPC is already doing [1], i.e.:
> 
> 1. Each peer announces that it accepts binary data via HTTP/2 SETTINGS option,
> 
> 2. Binary header field values are prefixed with NUL byte (0x00), so
> that binary value 0xFF is encoded as a header field value 0x00 0xFF.
> This allows binary-aware peers to differentiate between binary headers
> and VCHAR headers. In theory, this should also protect peers unaware
> of this extension from ever accepting such headers, since RFC7540
> requires that requests/responses with headers containing NUL byte
> (0x00) MUST be treated as malformed and rejected, but I'm not sure if
> that's really enforced.

There is no need for this. With the SETTINGS value already negotiating 
the ability, HPACK simply needs to decode the wire syntax into a binary 
'string' header.

Agents that comply and reject the headers will not be negotiating to 
accept them. If the binary value sent in any particular message header 
does not use these trouble characters there is no harm in letting it 
through, so artificially forcing rejection is not beneficial here like 
the RFC 7540 requirement was for default / general use.


> 
> 3. Binary-aware peers MUST base64 encode binary header field values
> when forwarding them to peers unaware of this extension and/or when
> converting to HTTP/1.1.

As written that would violate RFC 7540. This requirement needs to take 
the form of prohibiting sending binary headers to any peer which has 
not explicitly negotiated the extension being defined.
  i.e., comply with RFC 7540 on all connections unless explicitly 
negotiated otherwise on a per-connection basis.
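
A rough sketch of that per-connection rule, assuming a hypothetical 
Connection object whose flag is set only once the peer's SETTINGS has 
advertised the extension (no such setting is defined yet, so every name 
here is illustrative):

```python
import base64
from dataclasses import dataclass


@dataclass
class Connection:
    # Hypothetical per-connection state. Defaults to False, so RFC 7540
    # behaviour applies until the peer's SETTINGS says otherwise.
    peer_accepts_binary: bool = False


def encode_binary_value(conn: Connection, raw: bytes) -> bytes:
    """Return the octets to put on the wire for a binary header value."""
    if conn.peer_accepts_binary:
        return raw  # extension negotiated on this connection
    return base64.b64encode(raw)  # otherwise stay within RFC 7540


conn = Connection()
print(encode_binary_value(conn, b"\x00\xff"))  # b'AP8='
conn.peer_accepts_binary = True
print(encode_binary_value(conn, b"\x00\xff"))  # b'\x00\xff'
```

Because the flag starts out False, the same check also covers the period 
before any SETTINGS has been received from the peer.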


> 
> 4. Binary header field values cannot be concatenated, because there is
> no delimiter that we can use.

Of course they can. Every coding language has some form of array or 
linked-list structure available.

To display this type of header in *ASCII MIME format* on the other hand 
requires encoding by the display agent. HTTP/2 does not change any 
requirements around display, it is concerned only with the on-wire delivery.
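
As a small illustration of the difference, the sketch below keeps 
repeated binary values in a list and only encodes them when producing a 
printable form. The helper names and the choice of base64 for display 
are assumptions made for the example, not anything HTTP/2 mandates.

```python
import base64
from typing import Dict, List

Headers = Dict[str, List[bytes]]


def add_value(headers: Headers, name: str, value: bytes) -> None:
    """Store each value separately; no in-band delimiter is needed."""
    headers.setdefault(name.lower(), []).append(value)


def display(headers: Headers) -> str:
    """Render for humans; encoding happens only at display time."""
    lines = []
    for name, values in headers.items():
        shown = ", ".join(base64.b64encode(v).decode("ascii") for v in values)
        lines.append(f"{name}: {shown}")
    return "\n".join(lines)


h: Headers = {}
add_value(h, "X-Token", b"\x00\xff")
add_value(h, "x-token", b"\x01")
print(display(h))  # x-token: AP8=, AQ==
```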



> 
> NOTE: This proposal implies that endpoints SHOULD NOT use binary

No. MUST NOT. RFC 7540 still applies during this pre-negotiation period.
Agents which assume capabilities not specified in HTTP/2 *will* get into 
trouble eventually.


> header field values before receiving HTTP/2 SETTINGS from the peer.
> However, since, at least in theory, all RFC7540-compliant peers
> unaware of this extension MUST reject requests with headers containing
> NUL byte (0x00) with a stream error, endpoints could opportunistically
> use binary header field values on the first flight and assume that if
> peer isn't aware of this extension, then it will reject the request,
> which can be subsequently retried with base64 encoded header field
> values.
> 
> I'd like to hear if anyone strongly disagrees with this proposal
> and/or the binary data in header field values in general. Otherwise,
> I'm going to write a draft and hopefully we can standardize this
> before HTTP/2-over-QUIC, so that binary header field values can be
> supported there natively and not via extension.

How do you plan on making it "native HTTP/2" without replacing the whole 
HTTP/2 RFC *and* getting that new specification rolled out to the 
non-QUIC world?

(I am seriously interested in that answer. Many of us middleware 
implementers have been pushing for UTF-8 / binary support in headers for 
around 10 years already and progress has been painfully slow).

It seems to me you [like several of us] are dreaming of 
HTTP/3-over-QUIC. Not HTTP/2-over-QUIC, extended or otherwise. I am very 
doubtful that getting all this done before QUIC rolls out is going to be 
possible - a negotiable extension is far more realistic and will allow a 
testing rollout to happen before everybody in the HTTP world has to 
change code for it.

Amos

Received on Tuesday, 29 August 2017 05:28:54 UTC