Re: identifier in draft-ietf-httpbis-header-structure-01.txt from Poul-Henning Kamp on 2017-05-01 (ietf-http-wg@w3.org from April to June 2017)

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Mon, 01 May 2017 17:41:58 +0000
To: Alex Rousskov <rousskov@measurement-factory.com>
cc: HTTP working group mailing list <ietf-http-wg@w3.org>
Message-ID: <12558.1493660518@critter.freebsd.dk>

--------
In message <a72ec50c-a19d-37ae-be82-a37d6977cbf6@measurement-factory.com>, Alex Rousskov writes:

>>> [...] and we should focus on a small set of unambiguous elements [...]
>
>> That is simply not possible for existing headers.
>
>Can we define all existing headers using a small set of unambiguous
>elements? Probably not, but that should not be our goal. We only need to
>cover common usage of common fields plus (nearly) all future ones.

We can come very close, but you will always need to know what
header you are parsing to know how to parse it.

These are the "bespoke parsers" the draft talks about.

It would be a giant leap forward if we could respecify the
existing headers in a format which would allow those parsers
to be machine generated directly from the spec.  (See the strawman
in my first reply to you.)
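
To make the idea concrete, here is a rough sketch (not the strawman
itself) of what "parsers driven by the spec" could look like.  The
SPEC table, its type names "integer" and "token-list", and the helper
functions are all invented for illustration; none of them come from
the draft.  It assumes Content-Length is 1*DIGIT and treats Vary as a
plain comma-separated list of tokens.

```python
import re

# tchar set from RFC 7230 section 3.2.6.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# Hypothetical machine-readable spec: header name -> element type.
# The type names are made up for this sketch, not taken from the draft.
SPEC = {
    "content-length": "integer",
    "vary": "token-list",
}


def parse_integer(value):
    """Parse a field value consisting only of ASCII digits."""
    value = value.strip()
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"not an integer: {value!r}")
    return int(value)


def parse_token_list(value):
    """Parse a comma-separated list of tokens, skipping empty elements."""
    items = [item.strip() for item in value.split(",")]
    items = [item for item in items if item]
    for item in items:
        if not TOKEN.fullmatch(item):
            raise ValueError(f"not a token: {item!r}")
    return items


# One generic parser per element type, selected via the spec table.
PARSERS = {"integer": parse_integer, "token-list": parse_token_list}


def parse_header(name, value):
    """Look up the header in SPEC and apply the matching generic parser."""
    kind = SPEC.get(name.lower())
    if kind is None:
        raise KeyError(f"no spec entry for header {name!r}")
    return PARSERS[kind](value)
```

You still need the header name to pick the parser, but the per-header
knowledge shrinks to one table entry instead of hand-written code.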

>> Current header definitions make 3.14159265 a valid "token":
>
>Yes, but _we_ do not have to define 3.14159265 as both "number" and
>"identifier".

For future headers:  Absolutely.

For existing headers: No, we don't get to narrow their definition
that way.
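
The overlap is easy to demonstrate.  A minimal sketch, assuming the
RFC 7230 tchar set for "token" and a deliberately simplified decimal
pattern for "number" (the NUMBER pattern and classify() are my own
illustration, not the draft's grammar):

```python
import re

# tchar set from RFC 7230 section 3.2.6; "." and digits are both tchars.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# Simplified, hypothetical "number" shape; the draft's grammar may differ.
NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def classify(value):
    """Return every element type whose pattern matches the whole value."""
    kinds = []
    if TOKEN.fullmatch(value):
        kinds.append("token")
    if NUMBER.fullmatch(value):
        kinds.append("number")
    return kinds


print(classify("3.14159265"))  # ['token', 'number']
print(classify("gzip"))        # ['token']
```

Without knowing which header the value came from, a generic parser
cannot tell which of the two the sender meant.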

>Yes, if I am building a generic parser that wants to understand senders
>intent in _all_ legacy cases, then my parse tree cannot use some of the
>draft elements "as is", but since understanding senders intent in all
>cases is an unsolvable problem in legacy HTTP, we should not focus on
>that esoteric use case too much!

I don't think it is unsolvable; after all, the current headers have
ABNF definitions, and they may be flawed, but they are not random.

And if we want decent compression from semantic(ally informed)
HPACK2/QUIC/whatever serializers, they, ipso facto, have to start
with a generic parser for the most common headers to be worth
the effort.

>A generic parser that might confuse a number with identifier is possible
>to build for legacy HTTP while perfect parsers are possible for new
>>fields<, and that is good enough. Same for generators.

Well, that really depends on what the goal is, doesn't it?  :-)

And yes, I'm open to input on that.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Received on Monday, 1 May 2017 17:42:28 UTC