Inter-Stream Compression and Delta Encodings from Patrick McManus on 2017-04-25 (ietf-http-wg@w3.org from April to June 2017)

From: Patrick McManus <mcmanus@ducksong.com>
Date: Mon, 24 Apr 2017 21:07:44 -0400
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAOdDvNpCFdXjx3O3FbpVVhcXcOxhEueePjv+DgpaiTyS=r6CDg@mail.gmail.com>

Hi WG!

Right before we met in Chicago, Vlad updated his compression draft. I'll
highlight it here in case you lost it in the shuffle of the meeting (as I
did originally):
https://datatracker.ietf.org/doc/draft-vkrasnov-h2-compression-dictionaries/?include_text=1
He presented on this in Seoul, so it is in the Meetecho video archive if
you missed it:
http://oldrecs.conf.meetecho.com/Playout/watch.jsp?recording=IETF97_HTTPBIS_II&chapter=chapter_1

I would like to start a discussion on whether the working group has an
interest in adopting a work item in this general area (which may or may
not be this draft depending on consensus). The general topic is recurring,
and I know of interest from at least three parties - though other than Vlad
(who has submitted a draft!) they should speak for themselves in this
thread. But the chairs would need to hear a wider array of viewpoints
before asking the group to take on the burden.

To summarize the top-level pros and cons:
Pro: saves lots of bytes and serialization time (Vlad had data on that which
you can find in the IETF 97 meeting materials). Arguably it also fixes a
regression from h1: h2 discourages inlining, which results in less
efficient content-encodings.

Con: mixing compression and encryption is a scary business - a la CRIME.
Vlad's draft attempts to address this by creating different sets of
compression contexts and letting the clients determine the sets and the
servers determine whether or not they will compress individual resources
within those sets.
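
As a rough illustration of the concern (a sketch only, not anything from
the draft): when two messages share one compression context, the
compressed size of the second depends on the contents of the first, and
that size is visible even under encryption. The example below uses
Python's zlib as a stand-in for whatever codec would be used; the
function names, the cookie value, and the guesses are all made up for
illustration.

```python
import zlib


def second_message_len(first: bytes, second: bytes) -> int:
    """Bytes emitted for `second` when it shares a deflate context with `first`."""
    ctx = zlib.compressobj()
    ctx.compress(first)
    # Z_SYNC_FLUSH emits pending output but keeps the history window,
    # which is what a context shared across streams would do.
    ctx.flush(zlib.Z_SYNC_FLUSH)
    return len(ctx.compress(second) + ctx.flush(zlib.Z_SYNC_FLUSH))


def isolated_len(message: bytes) -> int:
    """Bytes emitted for `message` in a fresh context with no shared history."""
    ctx = zlib.compressobj()
    return len(ctx.compress(message) + ctx.flush(zlib.Z_SYNC_FLUSH))


# Hypothetical secret carried on one stream, and two attacker-chosen guesses
# carried on another stream of the same connection.
SECRET = b"Cookie: session=7f3a9c21d4e8b605"
RIGHT_GUESS = SECRET
WRONG_GUESS = b"Cookie: session=0123456789abcdef"

shared_right = second_message_len(SECRET, RIGHT_GUESS)
shared_wrong = second_message_len(SECRET, WRONG_GUESS)

print("shared context, right guess:", shared_right)
print("shared context, wrong guess:", shared_wrong)
print("isolated context, right guess:", isolated_len(RIGHT_GUESS))
```

In the shared case the correct guess compresses to fewer bytes than the
wrong one, which is the length side channel. With isolated contexts the
output size cannot depend on the secret at all, which is the property
that separate sets of contexts are meant to preserve.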

Does the working group think that is a mechanism that can be effectively
used in a safe way? Thanks for your comments.

-Patrick

-- here are a few drive-by review comments from a first take on the text --

I guess the settings go c->s with encodings flowing s->c. We often manage
to make these things symmetrical; can this operate in the other direction
too? We've long lamented content-encoding: gzip not working well for h1
POST, for example.

what's the interaction with push?

set_compression_context default should probably be 254 to be conservative
rather than 0.

set_dictionary can't be "set on any stream" - it needs to be subject to
opt-in

these two statements seem to conflict:
"If not enough DATA was
sent, the Dictionary for the given ID is considered uninitialized"
vs
"If Size is greater than the length of the
transmitted data, then all of the data will be used."

having a definition for a context very early in the document would help.
maybe "a context is a non-overlapping set of response streams and
dictionaries"

h1 bindings in a document with h2 in its title are weird. I would just
get rid of the h1 definitions completely - h1 has a much richer tradition
of transaction independence on a persistent connection, and things that have
tried to bypass that (e.g. connect auth) have a checkered history.

"In addition when binary data is expected on the stream, the client SHOULD
hint to the server by sending a SET_COMPRESSION_CONTEXT with the special
value of 255." This is really better advice to the server than anything a
client should be guessing. There are N representations potentially each
with a unique MIME type as possible responses for a request.

"If a USE_DICTIONARY frame arrives for an uninitialized dictionary, this is
considered as stream error of type COMPRESSION_ERROR." Given the
cross-stream nature of this, that's probably a protocol error. Any decoding
error is probably a protocol error too.

what does it mean for the extension to be disabled by default? Is that
something different than the SETTINGS frame, or are you informing how the
server config switches need to work?

probably not enough bits for contexts (and maybe dictionaries).

Received on Tuesday, 25 April 2017 01:08:27 UTC