Re: Header Structure: goals

Personally -- I'd be fine with #2; a machine-readable schema can come later. Schema languages have their own set of pitfalls, and defining one now seems... ambitious.

What I really want is a vocabulary that spec authors can use unambiguously, and a processing model that takes as many of the nasty questions about parsing headers off their shoulders as possible.
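
For concreteness, here is roughly what I picture a #2-style generic parser looking like from the application's side. This is only a sketch in Python, under loud assumptions: the toy grammar (comma-separated items, semicolon-separated key=value parameters, no quoted strings) and the names parse_generic and extraneous_params are made up for illustration and are not taken from the HS draft.

```python
from typing import Dict, List, Optional, Set, Tuple

# One parsed item: (name, {parameter: value, or None if it had no "="}).
Item = Tuple[str, Dict[str, Optional[str]]]


def parse_generic(value: str) -> List[Item]:
    """Parse 'a;k=v, b' into a generic list of (name, params) items.

    Hypothetical toy grammar: quoted strings are not supported, so ','
    and ';' always act as separators.
    """
    items: List[Item] = []
    for raw_item in value.split(","):
        parts = [p.strip() for p in raw_item.split(";")]
        name = parts[0]
        if not name:
            raise ValueError(f"empty item in {value!r}")
        params: Dict[str, Optional[str]] = {}
        for part in parts[1:]:
            key, sep, val = part.partition("=")
            key = key.strip()
            if not key:
                raise ValueError(f"empty parameter name in {value!r}")
            params[key] = val.strip() if sep else None
        items.append((name, params))
    return items


def extraneous_params(items: List[Item], known: Set[str]) -> List[str]:
    """Return parameter names the application does not know about.

    This is the manual check for RFC-extraneous fields that a generic
    parser cannot do on its own.
    """
    return sorted({k for _, params in items for k in params if k not in known})


# The parser knows nothing about this header; the application supplies
# the meaning by querying the generic structure.
parsed = parse_generic("gzip;q=0.8, br;q=1.0;foo=bar, identity")
unknown = extraneous_params(parsed, known={"q"})
```

The parsing step needs no knowledge of any particular header, which is the appeal; the cost is that checks like extraneous_params stay with whoever reads the RFC.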

Cheers,


> On 16 Jun 2017, at 5:42 pm, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> 
> --------
> In message <4d7d2510-ceae-dbe4-39df-eef16e744431@gmx.de>, Julian Reschke writes:
> 
>> We are clearly not making sufficient progress solving the problem we 
>> wanted to solve - to make it easier to define new header fields with a 
>> syntax that is not "broken" and where processors can use off-the-shelf 
>> components to do the parsing.
> 
> There are two possible overall approaches to this:
> 
> 1) The RFC which defines a new header includes a machine-readable
>   specification you can copy&paste into your project, where it is
>   turned into parser code through some mechanism of your choice.
> 
>   This makes the draft harder to produce, because we have to invent
>   and specify that machine-readable specification language and
>   provide reference implementations for a parser-generator and
>   a syntax-checker tool for RFC writers.
> 
>   Advantages:
> 
>   * The parser and the representation of the parsed result can be
>     optimized for your programming environment, for instance with
>     respect to datatype for numbers etc.
> 
>   * Runtime performance.
> 
>   * HPACKng can use the same machine-readable specification(s) to
>     optimize compression.
> 
>   * The parser can detect headers with RFC-extraneous fields.
> 
>   Disadvantages:
> 
>   * You get none of these advantages until you've read the RFC and
>     copy&pasted the machine-readable spec into your project.
> 
> 2) The RFC which defines a new header merely stays inside the dotted
>   lines of HS, and a generic parser builds an internal representation
>   of each instance of the header, which your code can query, based on
>   your reading of the RFC.
> 
>   That is basically the draft we have today, after scrutiny and
>   text-processing.
> 
>   Advantages:
> 
>   * Your parser also supports RFCs you have not read.
> 
>   Disadvantages:
> 
>   * The representation of the parse-tree will be bulkier and less
>     efficient.
> 
>   * Probing of the parse-tree from the application is clumsy and
>     more error-prone.
> 
>   * RFC-extraneous fields in a header will not be detected unless
>     the application programmer does it manually.
> 
> Number two is close to impossible to avoid:  It will be reimplemented
> by application programmers using regexps to take headers apart if
> we don't provide for it.
> 
> So by and large, the question before us is if number one is worth
> the effort, and I think that hinges on the HPACKng advantage.
> 
> The alternative take on HPACKng, would be a speculative semantic
> compressor, which hunts for timestamps, numbers, and base64-encoded
> substrings which can be transmitted more efficiently semantically
> than by text-compression.
> 
> Again, the advantage of that model is that you don't need to read
> the RFCs first to reap the advantage.
> 
> So summa summarum:  It probably depends more than anything on how
> many new headers we expect and what their transmission volume and
> semantic compression potential will be.
> 
> I'm still not sure where I personally land on this, but given the
> WG activity on this draft, I guess going for number one would fall
> almost entirely on me, in which case it won't happen...
> 
>> Note that this is not a complaint vs PHK - 
> 
> No worries.
> 
> 
> -- 
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe    
> Never attribute to malice what can adequately be explained by incompetence.

--
Mark Nottingham   https://www.mnot.net/

Received on Tuesday, 20 June 2017 02:21:48 UTC