Re: More on allowed field characters

From: Willy Tarreau <w@1wt.eu>
Date: Mon, 23 Aug 2021 10:53:54 +0200
To: Martin Thomson <mt@lowentropy.net>
Cc: ietf-http-wg@w3.org
Message-ID: <20210823085354.GA15502@1wt.eu>
On Mon, Aug 23, 2021 at 05:49:23PM +1000, Martin Thomson wrote:
> Thanks Willy,
> 
> > However the next paragraph mentioning this does still maintain some confusion:
> > 
> >   A field name MUST NOT contain characters in the ranges 0x00-0x20,
> >   0x41-0x5A, or 0x7F-0xFF (all ranges inclusive).  This limits field
> >   names to visible ASCII characters, other than ASCII SP (0x20) and
> >   uppercase characters ('A' to 'Z', ASCII 0x41 to 0x5a).
> > 
> > Especially this last sentence which *seems* to endorse the use of such
> > other characters for header field names, thus contradicting the first points.
> 
> Yeah, that's a mistake.  How about:
> 
> > This specifically excludes all non-visible ASCII characters, ASCII SP (0x20), and uppercase characters ('A' to 'Z', ASCII 0x41 to 0x5a).
> 
> I don't think that we need to combine paragraphs as you suggest.  The point
> is that these are *extra* checks that apply even if you are only forwarding
> messages.

This is interesting because your expressed intent here is particularly
important but it doesn't translate well into the text. I mean, once the
intent is known it makes sense, but the text alone is not sufficient to
figure out the intent and we still risk lazy checks there. Then why not
completely reformulate the introductory line around this:

  Field Validity

  HTTP/2 and HPACK are length-delimited and technically permit any
  character to be encoded. Attacks targeting HTTP/2 implementations
  exploit ambiguous header component delimiters before the message is
  reassembled and submitted to HTTP semantics checks, or forwarded. For
  this reason, immediately after HPACK decoding and before trying to
  process or forward an HTTP message, all HTTP/2 endpoints MUST
  validate fields in the messages they receive as follows:

  (... then we can go on with the current enumeration ...)

And at the end of this enumeration:

  These checks represent the strict minimum needed to safely
  reconstruct an HTTP message, and apply in addition to the HTTP
  semantics checks on header field syntax described in HTTP#5.1 and
  HTTP#5.5.

>  (RFC 7540 basically said what you suggest and it turned out that
> no one paid any attention to it.)

Yeah I understand. It was so easy to overlook parts of 7540 :-)

> >    An HTTP/2 recipient that receives a field name containing a character
> >    in the range 0x00-0x20, 0x41-0x5A, or 0x7F-0xFF (all ranges inclusive),
> >    or a field value containing a character 0x00 (NUL), 0x0A (LF), 0x0D (CR)
> >    MAY treat this as a connection error of type PROTOCOL_ERROR.
> 
> A connection error is something we already permit.  It's just indirect.  You
> have to follow the chain from here to the definition of malformed and then to
> stream errors, but it's there.

This is a good point indeed.
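
As a rough illustration only, the minimal checks discussed in this thread
could be sketched in Python as below. The byte ranges are the ones quoted
above (0x00-0x20, 0x41-0x5A and 0x7F-0xFF for names; NUL, LF and CR for
values). The function names are hypothetical, rejecting an empty field name
is an extra assumption not taken from the quoted text, and this is not meant
to replace the HTTP semantics checks.

```python
# Minimal sketch of the post-HPACK field checks discussed in this thread.
# All names here are hypothetical; only the byte ranges come from the
# quoted draft text.

# Bytes a field name must not contain: 0x00-0x20, 0x41-0x5A, 0x7F-0xFF.
FORBIDDEN_NAME_BYTES = frozenset(
    list(range(0x00, 0x21)) + list(range(0x41, 0x5B)) + list(range(0x7F, 0x100))
)

# Bytes a field value must not contain: NUL, LF, CR.
FORBIDDEN_VALUE_BYTES = frozenset({0x00, 0x0A, 0x0D})


def field_name_is_valid(name: bytes) -> bool:
    """Return True if the name contains no forbidden byte.

    Rejecting an empty name is an assumption of this sketch.
    """
    return len(name) > 0 and not any(b in FORBIDDEN_NAME_BYTES for b in name)


def field_value_is_valid(value: bytes) -> bool:
    """Return True if the value contains no NUL, LF or CR."""
    return not any(b in FORBIDDEN_VALUE_BYTES for b in value)


def first_invalid_field(fields):
    """Return the first (name, value) pair failing either check, or None."""
    for name, value in fields:
        if not field_name_is_valid(name) or not field_value_is_valid(value):
            return (name, value)
    return None
```

A caller would run first_invalid_field() on the decoded field list before
processing or forwarding, and treat the message as malformed if it returns
anything other than None.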

Willy
Received on Monday, 23 August 2021 08:54:14 UTC
