Re: HTTP/2: allow binary data in header field values

On 29/08/17 16:10, Willy Tarreau wrote:
> Hi,
> 
> On Mon, Aug 28, 2017 at 06:34:31PM -0700, Piotr Sikora wrote:
>> The proposal I have in mind is based on what gRPC is already doing [1], i.e.:
>>
>> 1. Each peer announces that it accepts binary data via an HTTP/2 SETTINGS option,
> 
> This sounds reasonable.
> 
>> 2. Binary header field values are prefixed with a NUL byte (0x00),
>> so that the binary value 0xFF is encoded as the header field value
>> 0x00 0xFF. This allows binary-aware peers to differentiate between
>> binary headers and VCHAR headers. In theory, this should also protect
>> peers unaware of this extension from ever accepting such headers,
>> since RFC7540 requires that requests/responses with headers containing
>> a NUL byte (0x00) MUST be treated as malformed and rejected, but I'm
>> not sure if that's really enforced.
> 
> I've just checked my experimental implementation in haproxy and I'm
> indeed missing this check. There are two reasons for this, which I think
> could indicate others might have made the same mistake:
>    - the header values are prefixed with their length so there's no need
>      to parse their contents before copying;
> 
>    - the list of forbidden characters is only mentioned in the Security
>      Considerations section of 7540 and never in 7541, while it's mostly
>      when processing HPACK that such characters have a chance to be
>      detected.
> 
> So at least some check would be needed in other implementations (well,
> I might very well have produced the only bogus one, but I think that
> overlooking this is easy).
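
For illustration, here is a minimal sketch (in Go; the helper name and
structure are mine, not from any existing implementation) of the two
checks being discussed: the baseline RFC 7540 Section 10.3 rule that a
field value containing NUL, CR or LF must make the message malformed,
and, only once the proposed SETTINGS option has been negotiated,
treating a leading 0x00 as the marker for a binary value:

    package main

    import (
        "bytes"
        "errors"
        "fmt"
    )

    var errMalformed = errors.New("malformed: forbidden octet in field value")

    // classifyValue is hypothetical. binaryOK reflects whether the peer
    // announced the proposed SETTINGS option.
    func classifyValue(v []byte, binaryOK bool) (binary bool, err error) {
        if binaryOK && len(v) > 0 && v[0] == 0x00 {
            // Extension negotiated: the leading NUL marks a binary
            // value; the real payload is v[1:].
            return true, nil
        }
        // Baseline RFC 7540 rule: NUL, CR and LF must not appear at all.
        if bytes.ContainsAny(v, "\x00\r\n") {
            return false, errMalformed
        }
        return false, nil
    }

    func main() {
        // An unaware peer receiving the 0x00 0xFF example must reject it.
        _, err := classifyValue([]byte{0x00, 0xff}, false)
        fmt.Println(err)
    }
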
> 
>> 3. Binary-aware peers MUST base64 encode binary header field values
>> when forwarding them to peers unaware of this extension and/or when
>> converting to HTTP/1.1.
> 
> I really don't like this, because it becomes possible for a sender to
> produce special values on the other side by exploiting the decoding,
> thus evading certain filtering measures. For example, an agent may
> send:
> 
>     Content-Encoding: \x00\x83\x38\xa9
> 
> and the binary-aware recipient, having to "encode" the value before
> forwarding it, would apply base64 to "\x83\x38\xa9" and produce:
> 
>     Content-Encoding: gzip
> 
> You can apply the same principle to other fields like content-length
> and easily cause some trouble. If you need this, at the very least it's
> important to enclose the encoded value between some "rare enough"
> characters to prevent such risks. Something like this, for example:
> 
>     Binary-Header: $b64<Y2h1bmtlZA==>$
> 
> The example above would produce:
> 
>     Content-Encoding: $b64<gzip>$
> 
> That's just an example of course.
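
As a rough sketch of that wrapper idea (the $b64<...>$ framing and the
helper name are purely illustrative, not a defined syntax), the
downgrade step at a gateway could look like this in Go:

    package main

    import (
        "encoding/base64"
        "fmt"
    )

    // wrapBinaryValue is a hypothetical downgrade step for a gateway that
    // must forward a binary value to a hop that did not negotiate the
    // extension.
    func wrapBinaryValue(raw []byte) string {
        return "$b64<" + base64.StdEncoding.EncodeToString(raw) + ">$"
    }

    func main() {
        // The example from above: base64 of \x83\x38\xa9 happens to be
        // "gzip", but inside the wrapper it can no longer impersonate a
        // plain token.
        fmt.Println(wrapBinaryValue([]byte{0x83, 0x38, 0xa9})) // $b64<gzip>$
    }
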
> 
>> 4. Binary header field values cannot be concatenated, because there is
>> no delimiter that we can use.
> 
> Good point.
> 
>> NOTE: This proposal implies that endpoints SHOULD NOT use binary
>> header field values before receiving HTTP/2 SETTINGS from the peer.
>> However, since, at least in theory, all RFC7540-compliant peers
>> unaware of this extension MUST reject requests with headers containing
>> a NUL byte (0x00) with a stream error, endpoints could opportunistically
>> use binary header field values on the first flight and assume that if
>> the peer isn't aware of this extension, then it will reject the request,
> 
> This is really the part which needs to be verified I think.
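
For what it's worth, a minimal sketch of the negotiation constraint in
the NOTE above (the setting is entirely hypothetical; no SETTINGS
identifier has been assigned for it):

    package main

    import "fmt"

    // binaryHeaderState tracks whether the peer's SETTINGS have been seen
    // and whether the (hypothetical) "accept binary headers" option was set.
    type binaryHeaderState struct {
        peerSettingsSeen  bool
        peerAcceptsBinary bool
    }

    // maySendBinary: only send binary values once the peer announced
    // support, unless we deliberately gamble on the first flight and rely
    // on a compliant-but-unaware peer rejecting the NUL as malformed.
    func (s *binaryHeaderState) maySendBinary(opportunistic bool) bool {
        if s.peerSettingsSeen {
            return s.peerAcceptsBinary
        }
        return opportunistic
    }

    func main() {
        s := &binaryHeaderState{}
        fmt.Println(s.maySendBinary(false)) // false: wait for SETTINGS
    }
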
> 
> I'm now thinking that prefixing the value with <CR><NUL> instead of <NUL>
> could be more effective: while <NUL> may translate to an end of string
> in many implementations and simply result in the field silently being
> processed as empty, a <CR> in the middle of a string is always an
> error,

No. Think of TLS certificates and crypto keys in their binary form. Both CR
and NUL are valid "string" characters there. I suspect delivery of such
keys in headers is part of where this proposal is coming from.


> unless the next character is <LF>, in which case what follows is
> the next header field. If a gateway were to inappropriately forward
> the <CR> to HTTP/1, it would produce an invalid message, even if
> truncated at the <NUL>, since the line would end in <CR><CR><LF>.
> 

I don't think so. A bare CR in RFC 723x is no longer an end-of-line
character. When placed at the beginning or end of a field-value it is
just part of the BWS / OWS whitespace construction, and when placed
mid-value it is part of the value string [albeit with malformed encoding].

The only invalid part here is the NUL. And ...

My experience from past discussions with server admins is that, when
they see a CR NUL sequence where they expect HTTP/1 MIME-format headers,
people immediately assume the NUL was supposed to be an LF and do
something to 'fix' the problem by adding one as the message transits the
Apache / IIS server's CGI gateway layer.


Amos

Received on Tuesday, 29 August 2017 05:45:09 UTC