Re: draft-ietf-httpbis-jfv: what's next from Willy Tarreau on 2016-10-15 (ietf-http-wg@w3.org from October to December 2016)

From: Willy Tarreau <w@1wt.eu>
Date: Sat, 15 Oct 2016 23:35:19 +0200
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Martin Thomson <martin.thomson@gmail.com>, Matt Menke <mmenke@google.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20161015213519.GA28177@1wt.eu>

On Sat, Oct 15, 2016 at 03:49:04PM +0000, Poul-Henning Kamp wrote:
> --------
> In message <CABkgnnXw7WacnMf4Nsx-drktn__V4afK61G67A5bT5SSdqaucQ@mail.gmail.com>
> , Martin Thomson writes:
> >On 15 October 2016 at 20:41, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> >> Looking forward, if we want to be able to use CS to build H3
> >> compression, we cannot allow CS headers with format errors.
> >
> >I tend to agree with this, though there are levels of format errors.
> >For instance, if you use the >< notation and the < is absent, that's a
> >flat parse error (I would argue that the < is redundant actually, save
> >an octet).
> 
> It is redundant, but it might still be a good idea.
> 
> Truncation of headers happens a lot more than it should in the wild,
> so apart from the recursive role of the '<' I do like that it also
> tells you that you are not missing half the header.

And it also helps when headers are duplicated. In the examples below,
it is obvious what type of duplication happened, and where:

     Foo1: >blah<, >bar<
     Foo2: >blah, bar<
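
To illustrate, here is a rough Python sketch of how a receiver could use
the closing '<' to tell the two cases apart and to spot truncation. It is
not taken from the draft: the function name split_groups is made up, and
it assumes that '>' and '<' only ever appear as delimiters (no quoting or
escaping is handled).

```python
def split_groups(value):
    """Split a field value into its top-level >...< groups.

    Raises ValueError when a '>' is never closed (likely truncation)
    or when a '<' shows up without a matching '>'.
    Assumption: '>' and '<' are never quoted or escaped.
    """
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(value):
        if ch == ">":
            if depth == 0:
                start = i + 1          # content starts after the opener
            depth += 1
        elif ch == "<":
            if depth == 0:
                raise ValueError("'<' without matching '>' at offset %d" % i)
            depth -= 1
            if depth == 0:
                groups.append(value[start:i])
    if depth != 0:
        raise ValueError("truncated value: missing closing '<'")
    return groups


if __name__ == "__main__":
    print(split_groups(">blah<, >bar<"))   # two separate groups
    print(split_groups(">blah, bar<"))     # one group containing a comma
```

With Foo1 the function returns two groups, and with Foo2 a single group
whose content contains the comma. A value cut short, such as ">blah, ba",
raises an error instead of being silently accepted.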

> >But what I think that Matt is looking for is a grammar that supports
> >an in-band signal about type so that syntax checking can be done by
> >the parser (and not by the semantics layer).  That - to me - seems
> >like a pretty reasonable request.
> 
> Yes, I agree, but it runs into the very inclusive definition of
> RFC7230::token.
> 
> We need three markers: '(h1_)number', 'h1_timestamp' and 'h1_blob',
> which are all valid 'identifier' (= RFC7230::token) today.
> 
> We have three options:
> 
> 1. Keep using RFC7230::token for 'identifier'
(...)
> 2. Restrict 'identifier'
(...)
> 3. Let the semantic layer sort it out.
(...)

> I picked 3 based on 'minimum intrusiveness', but I can live with
> all three.

And what about a #4: indicating the encoding in the header field *name*
instead, since we're suggesting using this for new fields? It would be
possible to prepend something like "-1", "-2", etc. to the field name to
indicate its expected type and how it is supposed to be decoded, and to
reserve that field-name syntax for these typed header fields only.
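
A rough Python sketch of what this could look like on the receiving side
follows. Everything specific in it is an assumption made for illustration:
the exact name shape ("-<digits>-<name>"), the prefix-to-type table, and
the helper name field_type are all hypothetical.

```python
import re

# Hypothetical prefix-to-type assignment, for illustration only.
TYPE_BY_PREFIX = {"1": "number", "2": "timestamp", "3": "blob"}

# Assumed shape of a typed field name: "-<digits>-<name>".
_TYPED_NAME = re.compile(r"^-([0-9]+)-(.+)$")


def field_type(name):
    """Return (type, bare_name) for a header field name.

    type is None when the name carries no type prefix, and "unknown"
    when the prefix is well-formed but not in the table.
    """
    m = _TYPED_NAME.match(name)
    if m is None:
        return None, name
    return TYPE_BY_PREFIX.get(m.group(1), "unknown"), m.group(2)


if __name__ == "__main__":
    print(field_type("-2-Expires-At"))
    print(field_type("Content-Length"))
```

The point is that the parser can select the decoder from the name alone,
before looking at the value, and that names without the prefix keep
their current untyped handling.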

Just my two cents,
Willy

Received on Saturday, 15 October 2016 21:35:49 UTC