Re: HTTP/2: allow binary data in header field values

Just to add to this, a semantic definition would prevent the encoding optimization from being lost across multiple hops (e.g. h2->h1->h2). Ideally the definition persists, so that any intermediary, and any layer of the stack, understands how to interact with the value.
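
To make the hop problem concrete, here is a rough Python sketch. It assumes the NUL-prefix scheme from the proposal quoted below; the function names, the `x-meta` field name, and the `binary_fields` set (standing in for a semantic definition of which fields are binary) are all hypothetical and only for illustration.

```python
import base64

def h2_to_h1(value: bytes) -> bytes:
    # Binary-aware hop converting to HTTP/1.1: drop the NUL marker
    # and base64-encode the payload (step 3 of the quoted proposal).
    if value.startswith(b"\x00"):
        return base64.b64encode(value[1:])
    return value

def h1_to_h2(name: str, value: bytes, binary_fields=frozenset()) -> bytes:
    # Going back to HTTP/2. Unless the field is known to be binary
    # (hypothetical semantic definition), base64 text looks like any other text.
    if name in binary_fields:
        return b"\x00" + base64.b64decode(value)
    return value

original = b"\x00" + bytes(range(200, 230))   # NUL marker + 30 binary octets
via_h1 = h2_to_h1(original)

untyped = h1_to_h2("x-meta", via_h1)                       # stays base64
typed = h1_to_h2("x-meta", via_h1, frozenset({"x-meta"}))  # binary restored

print(len(original), len(untyped))  # 31 40
print(typed == original)            # True
```

Without the type information, the second h2 hop keeps carrying the larger base64 form; with it, the original binary value comes back.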

-Jason

> On Aug 29, 2017, at 12:52 PM, Mike Bishop <Michael.Bishop@microsoft.com> wrote:
> 
> As with other typed header fields (and let's be clear, binary blob is just another type), this isn't about changing HTTP/2, it's about changing HTTP.  Currently, header fields in HTTP are, by definition, sequences of octets with a scoped range of valid values.  If you change the allowed values, that's a change at the semantic layer, not to any given transport mapping.  This is the HTTP WG; we can do that, but let's be clear what we're talking about.  But we'd need to have reasonable ways of ensuring that the values are sanitized before they're passed to "legacy" HTTP consumers.
> 
> As you note, HTTP/2 and HPACK are already perfectly capable of transporting these octets.  You can even Huffman-encode a binary blob if you want -- all possible values are listed in the table, though non-ASCII octets are severely disadvantaged.  That's precisely what the Security Considerations says -- HTTP/2 (i.e. the TCP mapping) is capable of transporting header values that aren't valid HTTP, and it's the HTTP layer's responsibility to validate that.  Obviously, if you rev HTTP to make those valid values, those checks would be modified.  The HTTP/QUIC mapping is no different -- it's capable of transporting these values already, but the HTTP layer knows they're not valid.
> 
> On the whole, I can see niche situations where this might be useful, but I think it will be difficult to deploy generally.  Our stacks essentially act as HTTP/1.1-to-2 intermediaries within client and server; we don't assume that the apps above our layer are HTTP/2-aware, though obviously we expose ways to take advantage of extra features.  Unless we wanted to add additional header set/get APIs that supported typing, I suspect we would initially opt not to advertise this extension rather than base64-encode headers upon arrival.  That's just extra work for no apparent benefit.
> 
> And if we're going to go this route and modify HTTP itself, let's have a reasonable set of types instead of just adding one at a time.  😊
> 
> -----Original Message-----
> From: Piotr Sikora [mailto:piotrsikora@google.com] 
> Sent: Monday, August 28, 2017 6:35 PM
> To: HTTP Working Group <ietf-http-wg@w3.org>
> Cc: Craig Tiller <ctiller@google.com>
> Subject: HTTP/2: allow binary data in header field values
> 
> Hi,
> as discussed with some of you in Prague, I'd like to remove the restriction on CR, LF & NUL characters and allow binary data in header field values in HTTP/2.
> 
> Both HTTP/2 and HPACK can pass binary data in header field values without any issues, but RFC7540 put an artificial restriction on those characters in order to protect clients and intermediaries converting requests/responses between HTTP/2 and HTTP/1.1.
> 
> Unfortunately, this restriction forces endpoints to use base64 encoding when passing binary data in header field values, which can easily become a CPU bottleneck.
> 
> This is especially true in multi-tier proxy deployments, like CDNs, which are connected over high-speed networks and often pass metadata via HTTP headers.
> 
> The proposal I have in mind is based on what gRPC is already doing [1], i.e.:
> 
> 1. Each peer announces that it accepts binary data via an HTTP/2 SETTINGS option,
> 
> 2. Binary header field values are prefixed with a NUL byte (0x00), so that the binary value 0xFF is encoded as the header field value 0x00 0xFF.
> This allows binary-aware peers to differentiate between binary headers and VCHAR headers. In theory, this should also protect peers unaware of this extension from ever accepting such headers, since RFC7540 requires that requests/responses with headers containing a NUL byte (0x00) MUST be treated as malformed and rejected, but I'm not sure if that's really enforced.
> 
> 3. Binary-aware peers MUST base64 encode binary header field values when forwarding them to peers unaware of this extension and/or when converting to HTTP/1.1.
> 
> 4. Binary header field values cannot be concatenated, because there is no delimiter that we can use.
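
A minimal Python sketch of steps 2 and 3 above. The helper names and the `peer_accepts_binary` flag (standing in for whatever the peer announced in SETTINGS) are assumptions for illustration, not part of the proposal or of any library.

```python
import base64
from typing import Tuple

NUL = b"\x00"

def encode_value(raw: bytes, peer_accepts_binary: bool) -> bytes:
    """Encode a binary header field value for the wire."""
    if peer_accepts_binary:
        # Step 2: a NUL prefix marks the value as binary.
        return NUL + raw
    # Step 3: base64-encode for peers unaware of the extension.
    return base64.b64encode(raw)

def decode_value(wire: bytes) -> Tuple[bytes, bool]:
    """Return (value, is_binary) as a binary-aware receiver would see it."""
    if wire.startswith(NUL):
        return wire[1:], True
    return wire, False

print(encode_value(b"\xff", True))    # b'\x00\xff'
print(encode_value(b"\xff", False))   # b'/w=='
print(decode_value(b"\x00\xff"))      # (b'\xff', True)
```

Note that the receiver only learns "binary or not" from the prefix; a base64 value arriving without the prefix is indistinguishable from ordinary text at this layer.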
> 
> NOTE: This proposal implies that endpoints SHOULD NOT use binary header field values before receiving HTTP/2 SETTINGS from the peer.
> However, since, at least in theory, all RFC7540-compliant peers unaware of this extension MUST reject requests with headers containing a NUL byte (0x00) with a stream error, endpoints could opportunistically use binary header field values on the first flight and assume that if the peer isn't aware of this extension, it will reject the request, which can subsequently be retried with base64-encoded header field values.
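
The opportunistic first flight described in the note could look roughly like this in Python. `StreamError`, `legacy_peer`, and `send_with_fallback` are hypothetical stand-ins; a real client would surface the stream error through its own HTTP/2 stack, and this assumes the legacy peer really does enforce the NUL check.

```python
import base64

class StreamError(Exception):
    """Stands in for an HTTP/2 stream error returned by the peer."""

def legacy_peer(headers):
    # Simulated RFC 7540 peer unaware of the extension:
    # any value containing NUL makes the request malformed.
    for name, value in headers.items():
        if b"\x00" in value:
            raise StreamError(name)
    return "ok"

def binary_aware_peer(headers):
    # Simulated peer that accepts NUL-prefixed binary values.
    return "ok"

def send_with_fallback(send, name, raw):
    """Try the binary form first; retry with base64 on a stream error."""
    try:
        return send({name: b"\x00" + raw}), "binary"
    except StreamError:
        return send({name: base64.b64encode(raw)}), "base64"

print(send_with_fallback(binary_aware_peer, "x-meta", b"\xff"))  # ('ok', 'binary')
print(send_with_fallback(legacy_peer, "x-meta", b"\xff"))        # ('ok', 'base64')
```

The fallback path costs an extra round trip, which is the trade-off against waiting for the peer's SETTINGS.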
> 
> I'd like to hear if anyone strongly disagrees with this proposal and/or the binary data in header field values in general. Otherwise, I'm going to write a draft and hopefully we can standardize this before HTTP/2-over-QUIC, so that binary header field values can be supported there natively and not via extension.
> 
> [1] https://github.com/grpc/proposal/blob/master/G1-true-binary-metadata.md
> 
> Best regards,
> Piotr Sikora
> 

--
Jason T. Greene
Chief Architect, JBoss EAP
Red Hat

Received on Tuesday, 29 August 2017 19:49:49 UTC