Re: Working Group Last Call: Compression Dictionary Transport

Mostly correct.

The origin servers are all independent implementations, but all of them
use the "zstd" and "brotli" CLIs and libraries published by Facebook and
Google for the dictionary encoding.
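
As a rough illustration of the encoding step for the static/delta case
(a minimal sketch using the python-zstandard bindings; the file names
are hypothetical and the response framing defined in the draft is
omitted):

    import zstandard

    # The previous version of the resource acts as a raw-content
    # dictionary for compressing the new version.
    old_bytes = open("app.v1.js", "rb").read()
    new_bytes = open("app.v2.js", "rb").read()

    dict_data = zstandard.ZstdCompressionDict(
        old_bytes, dict_type=zstandard.DICT_TYPE_RAWCONTENT)
    delta = zstandard.ZstdCompressor(
        level=19, dict_data=dict_data).compress(new_bytes)

The client decompresses with the same dictionary (the old version it
already has cached).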

The CDNs aren't entirely in passthrough mode. They aren't actively
participating in the encoding, but they are being used, with "Vary"
support, to cache and serve the delta-compressed versions of resources
in the static case.
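
Roughly, the split such a cache needs amounts to keying on the
dictionary negotiation headers (a hypothetical sketch, not any CDN's
actual config):

    import hashlib

    def cache_key(url, request_headers):
        # Dictionary-compressed responses vary on both the encoding and
        # the specific dictionary, so both headers join the URL in the
        # cache key.
        parts = [url,
                 request_headers.get("accept-encoding", ""),
                 request_headers.get("available-dictionary", "")]
        return hashlib.sha256("\x00".join(parts).encode()).hexdigest()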

One client (Chrome).

On Fri, Jun 14, 2024 at 3:32 PM Mike Bishop <mbishop@evequefou.be> wrote:

> If I’m distilling this correctly, the current state of implementations is:
>
>    - Several origin servers
>       - All using the same implementation, or multiple independent
>       implementations?
>    - Multiple CDNs in pass-through mode (i.e. don’t break, let origin
>    send diffs)
>    - Zero CDNs performing the diff themselves
>    - One browser
>
>
>
> Is that accurate?
>
>
>
> *From:* Patrick Meenan <patmeenan@gmail.com>
> *Sent:* Thursday, June 13, 2024 9:51 AM
> *To:* Martin Thomson <mt@lowentropy.net>
> *Cc:* Yoav Weiss <yoav.weiss@shopify.com>; ietf-http-wg@w3.org
> *Subject:* Re: Working Group Last Call: Compression Dictionary Transport
>
>
>
> Sorry, this is my first foray into the standards process, but from
> reading over RFC 2026 on the standards track (Proposed -> Draft ->
> Standard), it looked like Proposed was appropriate and that Draft is
> the point where multiple independent implementations become the
> defining factor.
>
>
>
> Pulling out the relevant section for Proposed Standard:
>
>
>
>    A Proposed Standard specification is generally stable, has resolved
>    known design choices, is believed to be well-understood, has received
>    significant community review, and appears to enjoy enough community
>    interest to be considered valuable.  However, further experience
>    might result in a change or even retraction of the specification
>    before it advances.
>
>    Usually, neither implementation nor operational experience is
>    required for the designation of a specification as a Proposed
>    Standard.  However, such experience is highly desirable, and will
>    usually represent a strong argument in favor of a Proposed Standard
>    designation.
>
>
>
> And for Experimental:
>
>
>
>    The "Experimental" designation typically denotes a specification that
>    is part of some research or development effort.  Such a specification
>    is published for the general information of the Internet technical
>    community and as an archival record of the work, subject only to
>    editorial considerations and to verification that there has been
>    adequate coordination with the standards process (see below).  An
>    Experimental specification may be the output of an organized Internet
>    research effort (e.g., a Research Group of the IRTF), an IETF Working
>    Group, or it may be an individual contribution.
>
>
>
> Maybe I haven't been transparent enough about the process of Chrome's
> origin trials, but it feels like this was already experimental when we
> adopted the draft into the WG, having done the research and internal
> testing.
>
>
>
> The origin trials started with Chrome 117 last March with the draft-00
> design. There have been three rounds of trials across three different
> revisions of the draft, with the current V3 trial implementing the
> features in the current draft-05.
>
>
>
> The trials included different types of sites, from the largest
> properties (Google and others) to sites of various sizes spanning rich
> applications, ecommerce, and published-content sites, to make sure the
> developer ergonomics worked as we expected and that the design failed
> safe when exposed to the web at scale. This included testing through
> most of the popular CDNs to make sure it either worked out of the box
> as a passthrough cache or could be configured to work (and, more
> importantly, that it didn't break anything). The trials have been
> hugely successful, with the expected 80%+ reduction in bytes for
> static content and significant performance wins for dynamic content
> (even for the most latency-sensitive sites).
>
>
>
> As far as breakage goes, the only issue discovered was with some
> security devices (middleboxes) that inspect traffic but don't modify
> the Accept-Encoding header passing through them to ensure that only
> encodings they understand are advertised. We are planning to "fix" the
> ecosystem when the Chrome feature rolls out by providing a time-locked
> enterprise policy that will make admins aware of the issue and put
> pressure on the device vendors to fix their interception.
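>
> As an aside, the filtering a well-behaved intermediary would apply is
> small (a hypothetical sketch, not any vendor's code):
>
>    SUPPORTED = {"gzip", "br", "zstd"}  # what this device can decode
>
>    def filter_accept_encoding(value):
>        # Strip anything the device can't decode (e.g. the dictionary
>        # encodings) so the origin never responds with them.
>        tokens = [t.strip() for t in value.split(",") if t.strip()]
>        kept = [t for t in tokens
>                if t.split(";")[0].strip() in SUPPORTED]
>        return ", ".join(kept)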
>
>
>
> There haven't been any fundamental changes to the design since the
> original draft. We moved a few things around, but the basic
> negotiation and encoding have been stable, and we've converged on the
> current, tested design. It feels like we have quite a bit of both
> implementation and operational experience deploying it and that it
> sits pretty solidly at "Proposed Standard" maturity.
>
>
>
> It's possible that further experience, once CDNs or servers start
> implementing features to automate the encoding, will show that the
> standard would benefit from revision, but, as far as I can tell,
> that's the purpose of Proposed Standard before it matures to Draft
> Standard.
>
>
>
>
>
> Stage aside, for "Use-As-Dictionary" specifically and the risks of
> matching every fetch, "clients" can decide the constraints around when
> they think it would be advantageous to check for a match and when they
> would be better off ignoring it and falling back to non-dictionary
> compression. Chrome, for example, has a limit of 1000 dictionaries per
> origin in an LRU store (and 200 MB per origin). Those may change, but
> there are no MUSTs around using the advertised dictionaries.
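>
> As a sketch of that kind of client-side policy (the limits are
> Chrome's numbers from above; everything else is hypothetical):
>
>    from collections import OrderedDict
>
>    MAX_DICTS_PER_ORIGIN = 1000
>    MAX_BYTES_PER_ORIGIN = 200 * 1024 * 1024
>
>    class OriginDictionaryStore:
>        def __init__(self):
>            self.entries = OrderedDict()  # dictionary hash -> bytes
>            self.total_bytes = 0
>
>        def add(self, digest, data):
>            if digest in self.entries:
>                self.total_bytes -= len(self.entries.pop(digest))
>            self.entries[digest] = data
>            self.total_bytes += len(data)
>            # Evict least-recently-used dictionaries past either cap.
>            while (len(self.entries) > MAX_DICTS_PER_ORIGIN
>                   or self.total_bytes > MAX_BYTES_PER_ORIGIN):
>                _, evicted = self.entries.popitem(last=False)
>                self.total_bytes -= len(evicted)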
>
>
>
> For that matter, there is no requirement that the Use-As-Dictionary
> header be the only way to seed dictionaries in the client. It's
> entirely possible to embed a dictionary in a client and still use the
> Available-Dictionary/Content-Encoding part of the spec. The same can
> apply to a CDN when it is configured to talk to an origin. There's
> nothing stopping a CDN from providing a config where dictionaries can
> be uploaded (or provided) and certain types of requests back to the
> origin could advertise the configured dictionaries as available.
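>
> For example, all a client or CDN with a pre-shared dictionary needs in
> order to advertise it is the hash (a sketch against my reading of the
> draft; the exact header serialization is defined there):
>
>    import base64, hashlib
>
>    def available_dictionary(dictionary_bytes):
>        # Dictionaries are identified by SHA-256 hash, carried as a
>        # structured-field byte sequence (":<base64>:").
>        digest = hashlib.sha256(dictionary_bytes).digest()
>        return ":" + base64.b64encode(digest).decode() + ":"
>
>    # headers["Available-Dictionary"] = available_dictionary(embedded)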
>
>
>
> I'm hopeful that what we have designed and tested has the flexibility
> to allow for a lot of use cases beyond those we have already deployed,
> but exploring that is largely what the progression from Proposed
> Standard to Draft Standard allows for.
>
>
>
> On Thu, Jun 13, 2024 at 1:42 AM Martin Thomson <mt@lowentropy.net> wrote:
>
> On Thu, Jun 13, 2024, at 15:36, Yoav Weiss wrote:
> > Yeah, I don't think this is the way to go.
>
> As I said, obviously.  But your strategy only really addresses the serving
> end.
>
> >> All of which is to say, I think this needs time as an experiment.
> >
> > I'll let Pat chime in with his thoughts, as I don't have strong
> > opinions on that particular front.
>
> I should have said before: I'm supportive of experimentation in this
> area.  Even to the extent of publishing an RFC with the code points and
> whatnot.  But I don't think that this meets the bar for Proposed Standard.
>
>

Received on Friday, 14 June 2024 21:34:55 UTC