Re: More on allowed field characters

On Mon, Aug 23, 2021 at 04:20:29PM +1000, Martin Thomson wrote:
> Hey Willy,
> 
> Take another look at the text I've added to https://github.com/httpwg/http2-spec/pull/936
> 
> It now goes:
> 
> * HPACK can carry anything.
> * HTTP defines what is valid.
> * Recipients can treat anything that is not valid as malformed (using our formal definition).
> * This section includes additional validation requirements.
> * <those additional requirements>

OK, looking at it in the whole context, I think it better addresses the problem:
  - http framing mentions malformed messages
  - malformed messages refer to [HttpHeaders] for the definition of
    invalid field names or values
  - [HttpHeaders] has a Field Validity subsection which starts by defining
    malformed as anything that doesn't strictly follow HTTP.

It is also mentioned that intermediaries must not forward malformed messages.

However, the next paragraph on this topic still leaves some confusion:

  A field name MUST NOT contain characters in the ranges 0x00-0x20, 0x41-0x5A, or 0x7F-0xFF
  (all ranges inclusive).  This limits field names to visible ASCII characters, other than
  ASCII SP (0x20) and uppercase characters ('A' to 'Z', ASCII 0x41 to 0x5a).

The last sentence in particular *seems* to endorse the use of such other
characters in header field names, which contradicts the first points.
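
As an illustration (not part of the draft text), the gap can be made
concrete with a small Python sketch that compares the quoted range rule
with HTTP's token syntax for field names, restricted to lowercase. The
function names are hypothetical, and the tchar list is taken from the
HTTP token grammar as an assumption about what "valid" means here.

```python
import string

# tchar per HTTP's token grammar, minus uppercase (HTTP/2 forbids it in names).
TCHAR_LOWER = frozenset(
    ("!#$%&'*+-.^_`|~" + string.digits + string.ascii_lowercase).encode("ascii")
)


def passes_range_check(name: bytes) -> bool:
    """Quoted rule: no bytes in 0x00-0x20, 0x41-0x5A, or 0x7F-0xFF."""
    return not any(b <= 0x20 or 0x41 <= b <= 0x5A or b >= 0x7F for b in name)


def is_lowercase_token(name: bytes) -> bool:
    """Stricter check: a non-empty sequence of lowercase tchar bytes."""
    return len(name) > 0 and all(b in TCHAR_LOWER for b in name)


# Bytes that the range rule lets through but the token grammar rejects.
GAP = bytes(
    b for b in range(0x100)
    if passes_range_check(bytes([b])) and not is_lowercase_token(bytes([b]))
)

print(GAP.decode("ascii"))
```

The characters left in GAP are the HTTP delimiters, so a name such as
"foo(bar)" passes the range rule while still being invalid per HTTP.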

Why not instead merge these two paragraphs into a single one:

  All HTTP/2 implementations MUST validate fields in messages they receive as follows:
  ...
  A field name MUST NOT contain characters outside the range permitted
  by [HTTP-5.1]. A field value MUST NOT contain characters outside the
  range permitted by [HTTP-5.5].

This would even allow removing the last few points about special cases for
names vs values.

I would personally be fine with going further, in the spirit of the strictness
of the initial work on H2, by declaring as dangerously non-compliant any sender
of some of the invalid characters you wanted to block above, and by protecting
against their inclusion into the HPACK dictionary:

   An HTTP/2 recipient that receives a field name containing a character
   in the range 0x00-0x20, 0x41-0x5A, or 0x7F-0xFF (all ranges inclusive),
   or a field value containing a character 0x00 (NUL), 0x0A (LF), or 0x0D (CR)
   MAY treat this as a connection error of type PROTOCOL_ERROR.

But I also understand that this could punish blind H2->H2 coalescing gateways
that have no idea what they're dealing with, and might be a bit extreme.
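
For illustration only, a recipient applying the two levels discussed above
might look like the following Python sketch. It assumes the uppercase range
is meant to be 0x41-0x5A, as in the quoted spec text. The names Verdict,
classify and the strict flag are hypothetical, and the value check is a
simplified reading of the HTTP field-value grammar (it ignores
leading/trailing whitespace rules).

```python
import string
from enum import Enum

# Assumed validity sets: lowercase tchar for names; HTAB, SP, VCHAR and
# obs-text (0x80-0xFF) for values.
NAME_OK = frozenset(
    ("!#$%&'*+-.^_`|~" + string.digits + string.ascii_lowercase).encode("ascii")
)
VALUE_OK = frozenset([0x09]) | frozenset(range(0x20, 0x7F)) | frozenset(range(0x80, 0x100))


class Verdict(Enum):
    OK = "ok"
    MALFORMED = "malformed"            # reject this message only
    CONNECTION_ERROR = "connection"    # MAY be treated as PROTOCOL_ERROR


def name_is_dangerous(name: bytes) -> bool:
    return any(b <= 0x20 or 0x41 <= b <= 0x5A or b >= 0x7F for b in name)


def value_is_dangerous(value: bytes) -> bool:
    return any(b in (0x00, 0x0A, 0x0D) for b in value)


def classify(name: bytes, value: bytes, strict: bool = False) -> Verdict:
    """strict=True means the recipient exercises the proposed MAY."""
    if strict and (name_is_dangerous(name) or value_is_dangerous(value)):
        return Verdict.CONNECTION_ERROR
    name_valid = len(name) > 0 and all(b in NAME_OK for b in name)
    value_valid = all(b in VALUE_OK for b in value)
    if not (name_valid and value_valid):
        return Verdict.MALFORMED
    return Verdict.OK
```

With strict left off, everything invalid is only malformed, which is the
behaviour a blind H2->H2 gateway would probably want to keep.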

Willy

Received on Monday, 23 August 2021 07:30:50 UTC