Re: Compression Dictionary follow-up from IETF 119

Hi Pat,

On Sun, Mar 24, 2024, at 18:57, Patrick Meenan wrote:
> Thanks for the great discussion. There were two points of discussion that were left a bit open that I wanted to follow up on:
> 
> 1 - Potential for "match" to be a client DoS vector
> 
> All of the match patterns for a given origin (partitioned by page origin) need to be evaluated before a decision can be made, and there was a concern that a large number of dictionaries could DoS the client (or be a footgun).
> 
> Not matching is a graceful fallback so things are entirely within the client control (much as the HTTP cache is).
> 
> Chrome currently has a limit of 1000 dictionaries per partition so if a site sets more than that, some will be evicted. We may tune that number if we start to see impact on the request times from running the matches.
> 
> 
> 2 - Questions about the use case for hex-encoded dictionary hashes.
> 
> There was some question about the cases where developers are using the hex-encoded hash values where sf-binary was causing extra friction.
> 
> The main flow where that has been an issue is delta-encoding static assets (e.g. JavaScript bundles). At build time, the current version of a bundle is compressed using a previous version as a dictionary and is stored with the hex dictionary hash as part of the file name (then published to wherever the assets are served from). Hex encoding is easy to use at build time since that is the output from CLI tooling and it is filesystem-safe during the build. At serving time, the Available-Dictionary header value is appended to the URL and that file is checked for, falling back to the unmodified URL.
> 
> Most of the developers I have talked to are keeping the hex encoding and adding processing to the serving path to convert the sf-binary to hex (e.g. hexencode(base64decode(strip(AvailableDictionary, ':'))) ).

This makes me wonder something. In the earlier version, where the hash was hex encoded, were people actually parsing the HTTP field value before appending it to a URL? If not, that would seem like a possible attack vector.

The processing step you illustrate at least adds a validation check: if the field value were passed along verbatim rather than parsed first, the base64 decode would fail closed on bad input.
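For what it's worth, here is a minimal Python sketch of that conversion with the validation made explicit. The function name and the strict 32-byte length check (assuming SHA-256 dictionary hashes) are my own choices for illustration, not anything from the draft:

```python
import base64
import binascii

def dictionary_hash_hex(available_dictionary: str) -> str:
    """Convert an Available-Dictionary sf-binary value to lowercase hex.

    Raises ValueError on malformed input instead of passing it through,
    so a bad field value never gets appended to a URL verbatim.
    """
    value = available_dictionary.strip()
    # sf-binary items are wrapped in colons, e.g. ":BASE64:"
    if len(value) < 2 or value[0] != ":" or value[-1] != ":":
        raise ValueError("not an sf-binary item")
    try:
        raw = base64.b64decode(value[1:-1], validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid base64 in sf-binary item") from exc
    # Assumes a SHA-256 dictionary hash, i.e. exactly 32 bytes.
    if len(raw) != 32:
        raise ValueError("unexpected hash length")
    return raw.hex()

# Example: SHA-256 of the empty string, wrapped as sf-binary.
print(dictionary_hash_hex(":47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=:"))
```

Anything that doesn't parse as a colon-delimited, valid-base64, 32-byte value is rejected rather than smuggled into the file lookup.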

Just a thought.

Cheers
Lucas

> 
> 
> We'll keep an eye on feedback from the updated Chrome origin trial to get a sense for how common it is and if there are any situations where it isn't easy to work with.
> 
> Thanks,
> 
> -Pat

Received on Wednesday, 24 April 2024 12:59:50 UTC