Syntax and sticky/compressed headers.

Hi,

	I've been thinking about Paul Leach's sticky headers
note. Basically the problem I see is a "slippery slope" one.
It's not too difficult to do sticky headers, but then again it's
not much more difficult (if at all) to go all the way to a
binary protocol.

	As I see it, each MIME header has a pretty uniform abstract
syntax; basically the data structure is:

message : struct
	headers		List(header)
	body		Array(Octet)
	footers		List(header)

header : struct
	tag		token
	parameters	List(attribute_value)

attribute_value : struct
	tag		token
	value		Alt (token | string)

[The last alt being disambiguated by the specific tag token]
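To make that concrete, here is a rough C rendering of the above. The
struct and field names are mine and purely illustrative, not a proposed
API:

#include <stddef.h>

/* in-memory form of the abstract syntax above */
typedef struct attribute_value {
	char *tag;			/* attribute token        */
	char *value;			/* token or quoted string */
	struct attribute_value *next;
} attribute_value;

typedef struct header {
	char *tag;			/* e.g. "Content-Type"    */
	attribute_value *parameters;	/* attribute/value list   */
	struct header *next;
} header;

typedef struct message {
	header *headers;
	unsigned char *body;		/* Array(Octet)           */
	size_t body_len;
	header *footers;
} message;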

Basically in MIME all tokens are strings, so the distinction
between tokens and strings is not apparent. The distinction I would
draw is that a token is simply a conventional symbol and could be
exchanged for any other symbol without changing the meaning of
the message (i.e. Content-Type: could be "MIMETYPE" or 42 and
the protocol would still work identically). Some values (e.g. URLs)
cannot be arbitrarily changed without the meaning of the message
itself changing.


Rather than exchange Content-Type for #f as Paul suggests, why not
go to a 100% binary protocol and represent tokens as numbers?
[NB I'm not suggesting ASN.1 here or similar profanity; ASN.1 is
a bad implementation of a good idea and, like COBOL, deserves to
die regardless of which standards bodies tout it.]


The scheme I would suggest is to learn from ASN.1 and construct
an unambiguous format that can be written out via a simple linear
traversal of the data structure (i.e. no BER nonsense). The MIME
data structure is simple enough to make this possible without
excessive complexity.
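As a sketch of what that single-pass write might look like, in C,
assuming the structs sketched earlier and a deliberately naive framing
(one length byte per string, a zero byte as a list terminator), chosen
only to show that no BER-style back-patching of lengths is needed:

#include <stdio.h>
#include <string.h>

/* crude length-prefixed string: one length byte, then the bytes;
   assumes non-empty strings shorter than 256 octets               */
static void put_str(FILE *out, const char *s)
{
	size_t n = strlen(s);
	fputc((int)(n & 0xff), out);
	fwrite(s, 1, n, out);
}

static void put_headers(FILE *out, const header *h)
{
	for (; h != NULL; h = h->next) {
		put_str(out, h->tag);
		for (const attribute_value *av = h->parameters; av != NULL; av = av->next) {
			put_str(out, av->tag);
			put_str(out, av->value);
		}
		fputc(0, out);	/* end of this header's parameters */
	}
	fputc(0, out);		/* end of header list */
}

static void put_message(FILE *out, const message *m)
{
	put_headers(out, m->headers);
	/* a real format would length-prefix the body as well */
	fwrite(m->body, 1, m->body_len, out);
	put_headers(out, m->footers);
}

The point is simply that the writer is one linear walk over the
structure, and a reader can be equally linear because every element
announces its own length up front.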

The traditional point raised against binary formats is that it
is not possible to create new tags without danger of conflicts.
If a variable-length encoding were used for identifiers this need
not be an issue. For example, the high bit of a token could
determine whether the token was a single-byte (i.e. commonly used)
token or a multi-byte token. If the token was a multi-byte one, the
low 7 bits could give the actual length. This would allow private
extensions without a significant probability of collision.
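A minimal sketch of that encoding, again in C; the specific registered
values and the function name are my own invention, just to show the
mechanics:

#include <stdio.h>
#include <string.h>

/* illustrative assignments only; a real registry would fix these */
enum { TOK_CONTENT_TYPE = 1, TOK_CONTENT_LENGTH = 2 };

/* well_known >= 0: emit a single-byte registered token (high bit clear).
   well_known <  0: emit a private token by name, i.e. one byte with the
   high bit set and the length in the low 7 bits, then the token text.   */
static int put_token(FILE *out, int well_known, const char *text)
{
	if (well_known >= 0) {
		if (well_known > 0x7f) return -1;
		fputc(well_known, out);
		return 0;
	} else {
		size_t n = strlen(text);
		if (n == 0 || n > 0x7f) return -1;
		fputc(0x80 | (int)n, out);
		fwrite(text, 1, n, out);
		return 0;
	}
}

/* e.g.	put_token(out, TOK_CONTENT_TYPE, NULL);
	put_token(out, -1, "x-my-experimental-header");                   */

That gives 128 cheap well-known tokens and an open-ended space of
named private tokens, which is why collisions stop being a worry.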


The other major change that I would like to see, which is linked to
this, is to change HTTP from its present synchronous design to an
asynchronous one. At present a client must send requests and receive
responses in order. Pipelining means that the client can send
multiple requests before receiving a response, but there is no
allowance for out-of-order receipt of responses. Adding this
facility would make HTTP a general-purpose object messaging
protocol.
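To illustrate what I mean at the client end, here is a minimal sketch
(C again) assuming each request carries a small integer id that the
server echoes back on the matching response; the id and the table are
my own illustration, not part of any current spec:

#include <stddef.h>

#define MAX_PENDING 64

typedef struct pending {
	int  id;	/* id sent with the request */
	int  in_use;
	void (*on_response)(int id, const unsigned char *body, size_t len);
} pending;

static pending table[MAX_PENDING];

/* record an outstanding request so its response can arrive in any order */
static int remember_request(int id,
	void (*cb)(int, const unsigned char *, size_t))
{
	for (int i = 0; i < MAX_PENDING; i++) {
		if (!table[i].in_use) {
			table[i].id = id;
			table[i].on_response = cb;
			table[i].in_use = 1;
			return 0;
		}
	}
	return -1;	/* too many outstanding requests */
}

/* called for each response, in whatever order the server chose to reply */
static int dispatch_response(int id, const unsigned char *body, size_t len)
{
	for (int i = 0; i < MAX_PENDING; i++) {
		if (table[i].in_use && table[i].id == id) {
			table[i].in_use = 0;
			table[i].on_response(id, body, len);
			return 0;
		}
	}
	return -1;	/* response for an unknown request */
}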

More to the point, it would mean that we could move to HTTP NG
without obsoleting the vast installed base. Here the parts that
most concern me are CGI-, NSAPI- and ISAPI-type scripts. It is
all very well for Netscape to release a new server, but the
content providers won't be willing to support a new protocol
unless it is transparent to their back-end apps. I don't think
that the Perl people are likely to want to support multiple
protocols unless it is entirely transparent. This leads me
to think that there is no room to change the semantics of the
HTTP protocol as Simon originally proposed in NG.


As I said, it's a slippery slope which, as far as I can see, points
pretty clearly towards the territory that NG maps out. If one added
in flow control and batched acknowledgements the feature set
would be complete.


		Phill

Received on Tuesday, 27 August 1996 11:34:22 UTC