Re: No-Vary-Search

Jeremy, 

My first impressions follow.  You're more than welcome to disagree.


Section 2 contains the confusing sentence quoted below; the phrase
"given by the obtain a" is hard to parse.  Please clarify in the doc.

      |  Implementations instead need to
      |  implement the processing model given by the obtain a URL search
      |  variance algorithm (Section 4.2).

Section 3 similarly contains

   The obtain a URL search variance algorithm (Section 4.2) ensures that
   all URL search variances obey the following constraints:

If "obtain a URL search variance" is the name of an algorithm, please
indicate that (perhaps by quoting the name of the algorithm?).
The sentence did not read clearly until I read Section 4.2, which has
that as its title (note the difference in case):
  "4.2.  Obtain a URL search variance"



There are many references in the doc to WHATWG specs rather than IETF
specifications for URLs.  Is this intentional?


The document does not discuss the implications of combining the variants
produced by the Vary and No-Vary-Search response headers.  A CDN or
browser might have to limit the number of variants it caches.


Overall, this document uses idioms I am less used to seeing in RFCs.
Maybe these idioms are more typical in WHATWG documents, but the
pseudo-code is different from what I typically see in RFCs.
Perhaps I am not familiar with the pseudo-markup variant, but it does
not look like markdown to me.


e.g. To _parse a URL search variance_ given _value_:
See also my confusion above reading
  "The obtain a URL search variance algorithm"
which could have been
  "The _obtain a URL search variance_ algorithm"
using the _every-other-word_ idiom from Section 4,
though _I_ _am_ _personally_ _not_ _a_ _fan_ _of_ _this_ formatting.
My preference would be to use a real language (any one) instead of
pseudo-code to create a reference implementation, if that is the goal,
with comments added to describe the required behavior.


The No-Vary-Search syntax with "except" reads to me as a double-negative:
  No-Vary-Search: params, except=("x")
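
To make the double negative concrete, here is a minimal Python sketch of
how a cache might derive a key under this syntax.  The function and
argument names are hypothetical, not taken from the draft, and the
sketch ignores "key-order" and the draft's exact URL parsing rules.

```python
from urllib.parse import parse_qsl, urlsplit


def search_cache_key(url, no_vary_all=False, no_vary=(), vary_except=()):
    """Return a (path, params) tuple usable as a cache key.

    no_vary_all -- hypothetical flag for a bare `params` (ignore every param)
    no_vary     -- hypothetical names from `params=(...)` (ignore only these)
    vary_except -- hypothetical names from `except=(...)`; used only when
                   no_vary_all is set, and these params still vary
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if no_vary_all:
        # "No vary, except x" means: only x contributes to the key.
        kept = [(k, v) for k, v in pairs if k in vary_except]
    else:
        # Otherwise every param contributes unless it is listed as no-vary.
        kept = [(k, v) for k, v in pairs if k not in no_vary]
    return (parts.path, tuple(kept))
```

With `params, except=("x")`, two URLs that differ only in parameters
other than "x" map to the same key, while a change to "x" yields a
different key.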

I do not know how far along this spec document is, but was naming the
header "Vary-Search" considered?  With "Vary-Search", the logic would be
inverted: "params" would default to all params varying (same as not
specifying Vary-Search), and "except" could become "no-vary"
  Vary-Search: params, no-vary=("x")
to indicate no-vary for "x", or
  Vary-Search: params, no-vary
to indicate all search params are no-vary (wildcard).
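
As a rough illustration of the suggested "Vary-Search" reading (a sketch
only; nothing here is in the draft, and the function name and the
None/True/collection encoding of the "no-vary" member are assumptions):

```python
def varies(name, no_vary=None):
    """Say whether search param `name` is part of the cache key.

    no_vary is a hypothetical encoding of the suggested member:
      None       -- member absent: every param varies (the default)
      True       -- bare `no-vary`: no param varies (wildcard)
      collection -- `no-vary=("x" ...)`: only the listed params do not vary
    """
    if no_vary is None:
        return True
    if no_vary is True:
        return False
    return name not in no_vary
```

Each case is then stated positively: a param varies unless "no-vary"
names it or is given as a wildcard.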


7.  Privacy Considerations

The ability to cache variants based on search parameters could possibly
compromise privacy due to fingerprinting and the ability to detect cache
hit versus cache miss even with coarse timing resolution.


Cheers, Glenn


On Wed, Jun 12, 2024 at 01:23:23PM -0400, Jeremy Roman wrote:
> In the interest of continuing discussion on this list, the WICG draft has
> been reformatted in RFC format and reported to the Datatracker:
> 
> https://datatracker.ietf.org/doc/draft-wicg-http-no-vary-search/01/
> or directly on GitHub
> https://jeremyroman.github.io/http-no-vary-search/draft-wicg-http-no-vary-search.html
> 
> The text has been left mostly unchanged so far (modulo very small editorial
> changes), and does not yet reflect any change to RFC 9111 behavior (though
> hopefully it's clear what those changes would be, from the existing text).
> 
> On Tue, Mar 19, 2024 at 2:26 AM Mark Nottingham <mnot@mnot.net> wrote:
> 
> > Hi Jeremy,
> >
> > > On 19 Mar 2024, at 11:44, Jeremy Roman <jbroman@chromium.org> wrote:
> > >
> > > Unfortunately it is not possible for me to join personally (time zones
> > and personal complications). We might be able to brief a Chrome team member
> > who is attending if there is interest (depending when this is), though as
> > you point out it would necessarily be a fairly brief overview on short
> > notice (so it might not be possible).
> >
> > It doesn't look likely that we'll have time for additional presentations.
> > I'd suggest continuing the discussion on the list.
> >
> > Just for some context -- we found this kind of capability useful when I
> > was at Yahoo! way back in 2010:
> >   https://www.mnot.net/talks/pdf/Stupid_Web_Caching_Tricks.pdf#page=36
> >
> > Cloudflare supports configuration to ignore the whole query string, as
> > well as specific arguments in it:
> >   https://developers.cloudflare.com/cache/how-to/cache-keys/
> >
> > As does Fastly:
> >   https://docs.fastly.com/en/guides/making-query-strings-agnostic
> >
> > https://www.fastly.com/documentation/solutions/examples/manipulate-query-string/
> >
> > As does Akamai (apparently, based upon the information available):
> >
> > https://community.akamai.com/customers/s/article/Remove-query-strings-from-forward-request-and-cache-key?language=en_US
> >
> > I know Varnish supports this as well; I've done it with Squid (using a
> > helper) too. Not sure about eg nginx or Apache httpd.
> >
> > So I suspect it's safe to say there's interest in this general feature
> > from people who use HTTP caches.
> >
> > The difference here is the control mechanism to invoke that behaviour --
> > putting it in a response header is really nice because it's a)
> > standardised, so (eventually) interoperable across implementations, and b)
> > driven by the resource on the origin server, who has the most information
> > about the URL's semantics (rather than relying on out-of-band
> > configuration).
> >
> > However, when a cache has multiple stored responses and they have
> > conflicting information about the cache key, we need to be careful about
> > specifying the interaction. In a way, this is similar to Vary -- it faced a
> > similar question, and the decisions made in its design made implementation
> > difficult. We chose a different approach in Key and Variants to address
> > that; we should probably have a similar discussion here.
> >
> > Cheers,
> >
> >
> > --
> > Mark Nottingham   https://www.mnot.net/
> >
> >

Received on Thursday, 13 June 2024 08:16:46 UTC