Re: identifier in draft-ietf-httpbis-header-structure-01.txt from Poul-Henning Kamp on 2017-04-30 (ietf-http-wg@w3.org from April to June 2017)

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Sun, 30 Apr 2017 18:50:12 +0000
To: Alex Rousskov <rousskov@measurement-factory.com>
cc: HTTP working group mailing list <ietf-http-wg@w3.org>
Message-ID: <8399.1493578212@critter.freebsd.dk>

--------
In message <45136a00-8550-a6f8-640d-7d5d2a6a12b1@measurement-factory.com>, Alex
 Rousskov writes:

>> Header Structure is not a syntactical specification, 
>
>You have fooled several people into thinking that the draft specifies
>syntax rules (among other things). 
>[...]
>IMHO, you have to decide where to stop:

Exactly!

That has been an issue with this document from the start.

The only syntax ("parser level") rule is "The data model of Common
Structure is an ordered sequence of named dictionaries."

I've often been wondering if that is a bridge too far for this
document.

Maybe it should stop instead at the "lexical" level, and only define
a family of "tokens", and leave it to the individual header-specifying
documents to define their precise order?

One of the tokens would absolutely be 'dictionary', but the requirement
that the *only* compliant top-level structure of a header be 'list
of dictionaries' would disappear.

The argument "contra" is mainly that you will not be able to use the
same generic data structure to represent an arbitrary HS header.

The argument for the generic data structure is not very strong:
"IETF doesn't do APIs", and all the headers listed in A.5 would need
special-casing anyway.

The best argument "pro", apart from document clarity, is that we
can grandfather even more headers.  Not quite all of A.5, but it
brings some rather interesting headers into view of semantic
compression, notably all the ones with timestamps.

That way we could lift the art of HTTP header specifications a
couple of very usable steps above ABNF:

	"Range:" (
		"bytes" "=" 1# (
			<POSINT first-byte-pos> "-"
			/
			<POSINT first-byte-pos> "-" <POSINT last-byte-pos>
			/
			"-" <POSINT suffix-length>
		)
		/
		<TOKEN other-range-unit> "=" <1* VCHAR other-ranges-specifier>
	)

	"Retry-After:" ( <DATE when> / <POSINT delay-seconds> )

	"User-Agent:" * (
		RWS (
			<TOKEN product> [ "/" <TOKEN product-version> ]
			/
			<COMMENT comment>
		)
	)

I would *love* it if the result were machine-readable.

The value of fuzzing-generators working directly from spec has been
documented for 40 years and needs no further argumentation.
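
As a rough illustration of a spec-driven fuzzing generator, here is a
minimal Python sketch for the "bytes" branch of the Range example above.
The list-of-alternatives encoding and every function name are
hypothetical, invented for this illustration; POSINT is assumed to be a
plain run of ASCII digits, and "1#" is assumed to mean one or more
comma-separated items.

```python
import random

# Hypothetical in-memory form of the "bytes" branch of the Range rule:
# each alternative is a sequence of literals and POSINT placeholders.
BYTE_RANGE_ALTERNATIVES = [
    ["POSINT", "-"],            # first-byte-pos "-"
    ["POSINT", "-", "POSINT"],  # first-byte-pos "-" last-byte-pos
    ["-", "POSINT"],            # "-" suffix-length
]


def gen_posint(rng):
    """Return a random non-negative integer as a digit string (assumed POSINT)."""
    return str(rng.randrange(0, 10 ** rng.randrange(1, 12)))


def gen_byte_range(rng):
    """Pick one alternative and fill in its POSINT placeholders."""
    parts = rng.choice(BYTE_RANGE_ALTERNATIVES)
    return "".join(gen_posint(rng) if p == "POSINT" else p for p in parts)


def gen_range_header(rng, max_ranges=4):
    """Emit one header line with 1..max_ranges comma-separated byte ranges."""
    count = rng.randrange(1, max_ranges + 1)
    ranges = ",".join(gen_byte_range(rng) for _ in range(count))
    return "Range: bytes=" + ranges


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so a failing case can be replayed
    for _ in range(5):
        print(gen_range_header(rng))
```

The generator follows only the shape of the rule, so it will happily
emit ranges whose last-byte-pos is smaller than first-byte-pos. For
fuzzing that is arguably useful, since such inputs are well-formed at
this level but must still be handled by the receiver.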

It would certainly also be possible to have a code-generator spit
out H1 parsers/validators for the headers from this.
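
A minimal sketch of the kind of validator such a code-generator might
produce, here written by hand in Python for the Retry-After example
above. The rule encoding (a list of token-type/name pairs) and all
function names are hypothetical. POSINT is assumed to be a run of ASCII
digits, and the standard library's email date parser is used as a
stand-in for DATE; it is more lenient than the HTTP date format.

```python
import re
from email.utils import parsedate_to_datetime


def parse_posint(text):
    """Assumed POSINT token: one or more ASCII digits."""
    if not re.fullmatch(r"[0-9]+", text):
        raise ValueError("not a POSINT: %r" % text)
    return int(text)


def parse_date(text):
    """Stand-in DATE token: accepts whatever the email date parser accepts."""
    try:
        return parsedate_to_datetime(text)
    except (TypeError, ValueError) as exc:
        # Older Python versions raise TypeError on unparsable input.
        raise ValueError("not a DATE: %r" % text) from exc


TOKEN_PARSERS = {"POSINT": parse_posint, "DATE": parse_date}

# Hypothetical machine-readable form of:
#   "Retry-After:" ( <DATE when> / <POSINT delay-seconds> )
RETRY_AFTER = [("DATE", "when"), ("POSINT", "delay-seconds")]


def parse_alternatives(rule, text):
    """Try each alternative in order; return (name, value) for the first match."""
    text = text.strip()
    for token_type, name in rule:
        try:
            return name, TOKEN_PARSERS[token_type](text)
        except ValueError:
            continue
    raise ValueError("no alternative matched: %r" % text)


if __name__ == "__main__":
    print(parse_alternatives(RETRY_AFTER, "120"))
    print(parse_alternatives(RETRY_AFTER, "Fri, 31 Dec 1999 23:59:59 GMT"))
```

Because the caller gets back the name from the rule ("when" or
"delay-seconds") together with a typed value, the same rule table could
in principle drive both validation and a generic data structure.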

I can see how you could automatically produce definitions of
semantic on-the-wire compression schemes from this, but
security concerns would probably veto that.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Received on Sunday, 30 April 2017 18:50:44 UTC