Re: Header Structure: goals from Poul-Henning Kamp on 2017-06-16 (ietf-http-wg@w3.org from April to June 2017)

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Fri, 16 Jun 2017 07:42:32 +0000
To: Julian Reschke <julian.reschke@gmx.de>
cc: Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <25235.1497598952@critter.freebsd.dk>

--------
In message <4d7d2510-ceae-dbe4-39df-eef16e744431@gmx.de>, Julian Reschke writes:

>We are clearly not making sufficient progress solving the problem we 
>wanted to solve - to make it easier to define new header fields with a 
>syntax that is not "broken" and where processors can use off-the-shelf 
>components to do the parsing.

There are two possible overall approaches to this:

1) The RFC which defines a new header includes a machine-readable
   specification you can copy&paste into your project, where some
   mechanism of your choice turns it into parser code.

   This makes the draft harder to produce, because we have to invent
   and specify that machine-readable specification language and
   provide reference implementations for a parser-generator and
   a syntax-checker tool for RFC writers.

   Advantages:

   * The parser and the representation of the parsed result can be
     optimized for your programming environment, for instance with
     respect to datatype for numbers etc.

   * Runtime performance.

   * HPACKng can use the same machine-readable specification(s) to
     optimize compression.

   * The parser can detect headers with RFC-extraneous fields.

   Disadvantages:

   * You get none of these advantages until you've read the RFC and
     copy&pasted the machine-readable spec into your project.

2) The RFC which defines a new header merely stays inside the dotted
   lines of HS, and a generic parser builds an internal representation
   of each instance of the header, which your code can query, based on
   your reading of the RFC.

   That is basically the draft we have today, after scrutiny and
   text-processing.

   Advantages:

   * Your parser also supports RFCs you have not read.

   Disadvantages:

   * The representation of the parse-tree will be bulkier and less
     efficient.

   * Probing of the parse-tree from the application is clumsy and
     more error-prone.

   * RFC-extraneous fields in a header will not be detected unless
     the application programmer does it manually.

Number two is close to impossible to avoid:  It will be reimplemented
by application programmers using regexps to take headers apart if
we don't provide for it.
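
To make the contrast concrete, here is a rough sketch of both approaches
in Python.  Everything in it is an assumption made for illustration: the
syntax (comma-separated items with ';'-separated key=value parameters) is
a simplified stand-in and not the HS draft grammar, and the
"Example-Pref" header and its spec table are hypothetical.

```python
# Sketch only: simplified stand-in syntax, not the HS draft grammar.

def parse_generic(value):
    """Approach 2: generic parser that knows nothing about the header."""
    items = []
    for raw_item in value.split(","):
        parts = [p.strip() for p in raw_item.split(";")]
        name, params = parts[0], {}
        for p in parts[1:]:
            key, _, val = p.partition("=")
            params[key.strip()] = val.strip()
        items.append((name, params))  # every value stays a string
    return items

# Hypothetical machine-readable spec for an imaginary "Example-Pref"
# header: allowed parameter names and the type each value converts to.
EXAMPLE_PREF_SPEC = {"q": float, "level": int}

def parse_with_spec(value, spec):
    """Approach 1: typed result; RFC-extraneous fields are rejected."""
    typed = []
    for name, params in parse_generic(value):
        extraneous = set(params) - set(spec)
        if extraneous:
            raise ValueError("extraneous fields: %s" % sorted(extraneous))
        typed.append((name, {k: spec[k](v) for k, v in params.items()}))
    return typed

header = "foo;q=0.5;level=2, bar;q=1"
tree = parse_generic(header)                        # application must probe
typed = parse_with_spec(header, EXAMPLE_PREF_SPEC)  # numbers already typed
```

With only parse_generic(), the application has to convert "0.5" itself
and notice unknown parameters by hand; with the spec table both happen in
the parser, but only for headers whose spec you have pasted in.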

So by and large, the question before us is whether number one is
worth the effort, and I think that hinges on the HPACKng advantage.

The alternative take on HPACKng would be a speculative semantic
compressor, which hunts for timestamps, numbers, and base64-encoded
substrings which can be transmitted more efficiently semantically
than by text-compression.

Again, the advantage of that model is that you don't need to read
the RFCs first to reap the advantage.
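
A minimal sketch of what "speculative" could mean here, in Python.  It
is an illustration under stated assumptions, not a proposed wire format:
the function name compact(), the 4-byte timestamp encoding and the
8-character minimum for base64 are all made up for the example, and for
brevity it classifies a whole header value rather than hunting for
substrings inside it.

```python
import base64
import re
from datetime import datetime, timezone

HTTP_DATE = "%a, %d %b %Y %H:%M:%S GMT"
B64_RE = re.compile(r"[A-Za-z0-9+/]{8,}={0,2}")

def compact(value):
    """Guess a denser encoding for one header value: (kind, payload)."""
    # HTTP date -> 4-byte big-endian seconds since the epoch.
    try:
        dt = datetime.strptime(value, HTTP_DATE).replace(tzinfo=timezone.utc)
        return ("timestamp", int(dt.timestamp()).to_bytes(4, "big"))
    except (ValueError, OverflowError):
        pass
    # Decimal integer -> shortest big-endian byte string.
    if value.isascii() and value.isdigit():
        n = int(value)
        size = max(1, (n.bit_length() + 7) // 8)
        return ("number", n.to_bytes(size, "big"))
    # Looks like base64 -> the raw bytes it encodes.  This is a guess:
    # an ordinary word of the right length is treated as base64 too.
    if B64_RE.fullmatch(value) and len(value) % 4 == 0:
        try:
            return ("base64", base64.b64decode(value, validate=True))
        except ValueError:
            pass
    # Nothing recognised: leave it to ordinary text compression.
    return ("text", value.encode("utf-8"))

for v in ("Fri, 16 Jun 2017 07:42:32 GMT", "1497598952",
          "aGVsbG8gd29ybGQ=", "no-cache"):
    kind, payload = compact(v)
    print(kind, len(v), "->", len(payload))
```

A real compressor would also have to tag each value with its kind so the
receiver can reproduce the original text exactly, which this sketch does
not attempt.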

So summa summarum:  It probably depends more than anything on how
many new headers we expect and what their transmission volume and
semantic compression potential will be.

I'm still not sure where I personally land on this, but given the
WG activity on this draft, I guess going for number one would fall
almost entirely on me, in which case it won't happen...

>Note that this is not a complaint vs PHK - 

No worries.


-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Received on Friday, 16 June 2017 07:43:02 UTC