Re: Permitted characters in HTTP/2 fields

This conversation started with:

> At our last interim, we discussed potential ways in which HTTP/2 was
> probably too strict about characters (octets really) in field names and
> values.
> The conclusion then was to loosen the restriction and mandate only a small
> set of checks.  This should match what implementations already do.


Any chance of describing exactly what those reasons are? It's lost on me
exactly what problem is being solved here. If we don't have a full brief
for these changes, then how are we meant to evaluate them, or indeed record
the reason for posterity? Neither #815 nor #846 explains the problem other
than to say the text is confusing. There is no motivation given for why
validation should be any weaker than what is needed to carry HTTP fields
plus pseudo fields.

I don't mind the current text so much, as it says I can validate against
HTTP semantic fields as defined by
https://www.ietf.org/archive/id/draft-ietf-httpbis-semantics-15.html#section-5,
so I will. I'm just going to reject any other fields, and I'm allowed to,
so I'm happy. But I have no idea why we want to allow implementations to
send non-compliant fields around. Isn't that just asking for problems?
If it is because some existing implementations are already sending invalid
fields, then they are doing so regardless, and unless you say an impl must
accept them, any impl may reject them as invalid. So changing the spec
to be less strict makes no difference so long as impls are allowed to
actually enforce correct validation.
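
As a rough illustration of the gap between the two approaches, here is a
minimal Python sketch comparing a strict check against the field-name
grammar (quoted further down in this thread) with the simpler range check
c >= 0x21 && c <= 0x7e && c != ':' that was suggested. The function names
are hypothetical and not taken from any implementation or from the draft,
and rejecting empty names in the relaxed check is an assumption.

```python
import string

# Octets allowed by the field-name grammar quoted in this thread:
# "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
# "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
TCHAR = frozenset(
    (string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~").encode("ascii")
)


def is_strict_field_name(name: bytes) -> bool:
    """Hypothetical strict check: non-empty and every octet is in TCHAR."""
    return len(name) > 0 and all(c in TCHAR for c in name)


def is_relaxed_field_name(name: bytes) -> bool:
    """Hypothetical relaxed check: c >= 0x21 && c <= 0x7e && c != ':'.

    Rejecting the empty name is an assumption of this sketch.
    """
    return len(name) > 0 and all(
        0x21 <= c <= 0x7E and c != ord(":") for c in name
    )


if __name__ == "__main__":
    for candidate in (b"content-type", b"x(y)", b"a:b", b"caf\xc3\xa9"):
        print(
            candidate,
            is_strict_field_name(candidate),
            is_relaxed_field_name(candidate),
        )
```

A name such as "x(y)" passes the relaxed check but fails the strict one,
which is exactly the class of field an impl doing full validation would
still reject.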

Finally, when the "Brief" says we should match what implementations already
do, then the question is which implementations are to be matched? If
there are some implementations that already enforce the precise spec for
HTTP headers, then should we match those impls, or are some implementations
more match-worthy than others?



On Fri, 21 May 2021 at 11:22, Martin Thomson <mt@lowentropy.net> wrote:

> Hey Willy,
>
> On Fri, May 21, 2021, at 02:59, Willy Tarreau wrote:
> > I really agree. I don't remember if 0x80 and above are forbidden in H2 but
> > I'd personally prefer to block them so that we don't needlessly introduce
> > the risk of aliasing due to different codings being used. Protocol elements
> > that define how messages should be delimited/routed/etc must be strictly
> > defined and easy to enforce in implementations and applications.
>
> We never really said before.  I'm happy to extend the 0x7f to 0x7f-0xff if
> that is what others want.  It's not quite the same as limiting the grammar
> to what is permitted for field names, but it might be OK.
>
> field-name is "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
> "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
>
> That amounts to a whole bunch of characters less than %x21-7E (minus
> ':').  A simpler check for c >= 0x21 && c <= 0x7e && c != ':' seems
> reasonable to me.  Then we don't have to worry about Unicode field names.
> That's not a whole lot different than c >= 0x21 && c != 0x7e && c != ':' as
> the current PR has.
>
> I had the distinct impression that we DID see Unicode field names in some
> cases though.
>
> We wanted to avoid backward incompatibility issues that might result from
> tighter constraints on field *values*, which is why we never said anything
> before, but names might be easier.
>
>

-- 
Greg Wilkins <gregw@webtide.com> CTO http://webtide.com

Received on Friday, 21 May 2021 02:36:40 UTC