Re: Working Group Last Call: Compression Dictionary Transport

Sorry, this is my first foray into the standards process, but from reading
over RFC 2026 on the standards track from proposed -> draft -> standard,
it looked like Proposed Standard was appropriate and Draft Standard is the
point where multiple independent implementations become the defining factor.

Pulling out the relevant section for proposed standard:

   A Proposed Standard specification is generally stable, has resolved
   known design choices, is believed to be well-understood, has received
   significant community review, and appears to enjoy enough community
   interest to be considered valuable.  However, further experience
   might result in a change or even retraction of the specification
   before it advances.

   Usually, neither implementation nor operational experience is
   required for the designation of a specification as a Proposed
   Standard.  However, such experience is highly desirable, and will
   usually represent a strong argument in favor of a Proposed Standard
   designation.

And for experimental:

   The "Experimental" designation typically denotes a specification that
   is part of some research or development effort.  Such a specification
   is published for the general information of the Internet technical
   community and as an archival record of the work, subject only to
   editorial considerations and to verification that there has been
   adequate coordination with the standards process (see below).  An
   Experimental specification may be the output of an organized Internet
   research effort (e.g., a Research Group of the IRTF), an IETF Working
   Group, or it may be an individual contribution.

Maybe I haven't been transparent enough about the process of Chrome's origin
trials, but it feels like the work was already experimental when we adopted
the draft into the WG, having done the research and internal testing beforehand.

The origin trials started with Chrome 117 last March with the draft-00
design. There have been three rounds of trials across three different
revisions of the draft, with the current V3 trial implementing the features
in the current draft-05.

The trials included different types of sites, from the largest properties
(Google and others) to sites of various sizes spanning rich applications,
ecommerce, and published content, to make sure the developer ergonomics
worked as we expected and that the design failed safe when exposed to the
web at scale. This included testing through most of the popular CDNs to
make sure it either worked out of the box as a passthrough cache or could
be configured to work (and, more importantly, that it didn't break
anything). The trials have been hugely successful, with the expected 80%+
reduction in bytes for static content and significant performance wins for
dynamic content (even for the most latency-sensitive sites).

As far as breakage goes, the only issue discovered was with some security
devices (middleboxes) that inspect traffic but don't rewrite the
Accept-Encoding header passing through them so that only encodings they
understand are advertised. We are planning to "fix" the ecosystem when the
Chrome feature rolls out by providing a time-locked enterprise policy that
will make admins aware of the issue and put pressure on the device vendors
to fix their interception.
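
To make the failure mode concrete, here is a rough sketch (header values
are illustrative; dcb/dcz are the dictionary content-encoding tokens in the
current draft). What the browser sends when it has a dictionary available:

   Accept-Encoding: gzip, deflate, br, zstd, dcb, dcz

What a well-behaved intercepting device that only speaks gzip should
forward upstream:

   Accept-Encoding: gzip

The problem devices forward the first header unchanged and then can't
handle the dictionary-compressed response that comes back.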

There haven't been any fundamental changes to the design since the original
draft. We moved a few things around, but the basic negotiation and encoding
have been stable and we've converged on the current, tested design. It
feels like we have quite a bit of both implementation and operational
experience deploying it, putting it pretty solidly at "Proposed Standard"
maturity.

It's possible that further experience, once CDNs or servers start
implementing features to automate the encoding, will show that the standard
would benefit from revision but, as far as I can tell, that's the purpose
of Proposed Standard before it matures to Draft Standard.


Stage aside, for "Use-As-Dictionary" specifically and the risks of matching
every fetch, "clients" can decide the constraints around when they think it
would be advantageous to check for a match and when they would be better
off ignoring it and falling back to non-dictionary compression. Chrome, for
example, has a limit of 1000 dictionaries per origin in an LRU store (and
200 MB per origin). Those limits may change, but there are no MUSTs around
using the advertised dictionaries.
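
For reference, the advertisement is just a response header along these
lines (path and values are illustrative):

   HTTP/1.1 200 OK
   Content-Type: text/javascript
   Use-As-Dictionary: match="/js/app*.js", match-dest=("script")

Whether a client stores it, how long it keeps it, and whether it checks for
a match on any given fetch are all client policy, which is where limits
like Chrome's come in.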

For that matter, there is no requirement that a client and server use the
Use-As-Dictionary header as the only way to seed dictionaries in the
client. It's entirely possible to embed a dictionary in a client and still
use the Available-Dictionary/Content-Encoding part of the spec. The same
can apply to a CDN when it is configured to talk to an origin. There's
nothing stopping a CDN from providing a config where dictionaries can be
uploaded (or provided) and certain types of requests back to the origin
could advertise the configured dictionaries as available.
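
For example, a CDN edge configured with a shared dictionary could send
something like this to the origin even though the origin never advertised
the dictionary via Use-As-Dictionary (again illustrative; the value is the
SHA-256 hash of the configured dictionary and dcb is the
dictionary-compressed Brotli encoding in the current draft):

   GET /product/1234 HTTP/1.1
   Host: origin.example
   Accept-Encoding: br, zstd, dcb, dcz
   Available-Dictionary: :pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=:

   HTTP/1.1 200 OK
   Content-Encoding: dcb

The negotiation just depends on both ends agreeing on the dictionary bytes
that the hash identifies, not on how the dictionary got there.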

I'm hopeful that what we have designed and tested has the flexibility to
support a lot of use cases beyond those we have already deployed, but
accommodating them is largely what the progression from Proposed Standard
to Draft Standard allows for.

On Thu, Jun 13, 2024 at 1:42 AM Martin Thomson <mt@lowentropy.net> wrote:

> On Thu, Jun 13, 2024, at 15:36, Yoav Weiss wrote:
> > Yeah, I don't think this is the way to go.
>
> As I said, obviously.  But your strategy only really addresses the serving
> end.
>
> >> All of which is to say, I think this needs time as an experiment.
> >
> > I'll let Pat chime in with his thoughts, as I don't have strong
> > opinions on that particular front.
>
> I should have said before: I'm supportive of experimentation in this
> area.  Even to the extent of publishing an RFC with the code points and
> whatnot.  But I don't think that this meets the bar for Proposed Standard.
>
>

Received on Thursday, 13 June 2024 13:51:15 UTC