Re: Compression Dictionary follow-up from IETF 119

To some extent, yes, but usually with regex matching to pull out just the
hex string characters - though if you have access to the HTTP headers, you
likely have access to the URL as well. The other place where hex caused
less friction was day-to-day development: inspecting the headers in dev
tools and manually checking the equivalent file on the origin (though
base64url encoding of the hash on the origin would make this easier to
eyeball).
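
For anyone wiring this up on the serving path, here is a rough Python
sketch of the sf-binary to hex conversion quoted below, with the validation
Lucas is asking about. The function names and the "." separator are made up
for illustration, and it assumes the hash is a 32-byte SHA-256 digest; it
is not what any particular site runs.

```python
import base64
import binascii
import re
from typing import Optional

# sf-binary is standard base64 wrapped in colons, e.g. ":AAEC...=:".
_SF_BINARY = re.compile(r"^:([A-Za-z0-9+/]*={0,2}):$")


def dictionary_hash_hex(field_value: str) -> Optional[str]:
    """Return the hash as lowercase hex, or None if the value is malformed.

    Hypothetical helper: assumes the header carries a 32-byte SHA-256 digest.
    """
    match = _SF_BINARY.match(field_value.strip())
    if match is None:
        return None
    try:
        raw = base64.b64decode(match.group(1), validate=True)
    except (binascii.Error, ValueError):
        return None  # fail closed on anything that is not valid base64
    if len(raw) != 32:
        return None
    return raw.hex()


def delta_candidate_path(url_path: str, field_value: str) -> Optional[str]:
    """Build the path of the delta-compressed file, or None to fall back.

    The "<path>.<hex>" naming is an assumption; use whatever the build emits.
    """
    hex_hash = dictionary_hash_hex(field_value)
    if hex_hash is None:
        return None  # caller serves the unmodified URL
    return f"{url_path}.{hex_hash}"
```

Because only the decoded bytes are re-encoded as hex, nothing from the raw
header value ever reaches the file path, so a value like "../../etc/passwd"
just falls back to the unmodified URL.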

On Wed, Apr 24, 2024 at 8:59 AM Lucas Pardue <lucas@lucaspardue.com> wrote:

> Hi Pat,
>
> On Sun, Mar 24, 2024, at 18:57, Patrick Meenan wrote:
>
> Thanks for the great discussion. There were two points of discussion that
> were left a bit open that I wanted to follow up on:
>
> 1 - Potential for "match" to be a client DoS vector
>
> All of the match patterns for a given origin (partitioned by page origin)
> need to be evaluated before a decision can be made and there was a concern
> that a lot of dictionaries could DoS the client (or be a footgun).
>
> Not matching is a graceful fallback so things are entirely within the
> client control (much as the HTTP cache is).
>
> Chrome currently has a limit of 1000 dictionaries per partition so if a
> site sets more than that, some will be evicted. We may tune that number if
> we start to see impact on the request times from running the matches.
>
>
> 2 - Questions about the use case for hex-encoded dictionary hashes.
>
> There was some question about the cases where developers are using the
> hex-encoded hash values where sf-binary was causing extra friction.
>
> The main flow where that has been an issue is when delta-encoding static
> assets (e.g. javascript bundles). At build time, the current version of a
> bundle is compressed using a previous version as a dictionary and is stored
> with the hex dictionary hash as part of the file name (then published to
> wherever they are served from).  Hex encoding is easy to use at build time
> since that is the output from cli tooling and is filesystem-safe during the
> build.  At serving time, the Available-Dictionary header value is appended
> to the URL and the file is checked, falling back to the unmodified URL.
>
> Most that I have talked to are keeping the hex encoding and adding
> processing to the serving path to convert the sf-binary to hex (e.g.
> hexencode(base64decode(strip(AvailableDictionary, ':'))) ).
>
>
> This makes me wonder something. In the version where it was hex encoded,
> were people actually parsing the HTTP field value before appending it to a
> URL? If not, that would seem like a possible attack vector.
>
> The processing step you illustrate at least adds some validation check,
> since the base64 decode would fail closed on bad input if the field value
> was passed verbatim and not parsed first.
>
> Just a thought.
>
> Cheers
> Lucas
>
>
>
> We'll keep an eye on feedback from the updated Chrome origin trial to get
> a sense for how common it is and if there are any situations where it isn't
> easy to work with.
>
> Thanks,
>
> -Pat
>
>
>

Received on Wednesday, 24 April 2024 13:42:26 UTC