Re: cache-busting and query-string versioning

Hello Raphaël,

Thanks for taking an interest in this. I can only speak for myself, and I
do so from the perspective of being both a web developer and an HTTP/2 edge
server developer.

As a web developer, I can say that cache busting is difficult. In most web
applications out there, the responsibility of creating URLs for static
assets is distributed: front-end stacks create some, and backend
application frameworks like Django create others. Getting coherent URLs for
cache busting across code that runs in different places (browser and
server), and quite often in different programming languages, tends to be a
draining and time-consuming effort. It also defeats the purpose of the HTTP
caching tools (Cache-Control, ETag, Date, etc.) that recognize that
representations change. Finally, cache busting via URLs leaves the old
version of the asset in the cache instead of replacing it immediately, and
that means that HTTP caches have less space available and are therefore
less effective than they could be.
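
To make the URL side of this concrete, here is a minimal sketch of
query-string cache busting in Python. It is only an illustration under
stated assumptions: the helper name versioned_url, the "v" parameter and
the /static/app.css path are hypothetical and not taken from Django or any
other framework. The version tag is derived from the asset content, so the
URL changes only when the bytes change.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, content: bytes, param: str = "v") -> str:
    """Return `url` with a short content hash added as a query parameter.

    `versioned_url` and the default parameter name "v" are hypothetical
    names chosen for this sketch.
    """
    # Truncated SHA-256 of the asset bytes serves as the version tag.
    digest = hashlib.sha256(content).hexdigest()[:12]
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any previous version tag.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, digest))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Two revisions of the same asset yield two distinct cache keys.
    print(versioned_url("/static/app.css", b"body { color: red }"))
    print(versioned_url("/static/app.css", b"body { color: blue }"))
```

Every place that emits the asset URL (server templates, front-end bundles)
has to compute the same tag, which is where the coordination cost comes
from. The previous URL also stays in the cache until it is evicted.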

On the other hand, round-trips and latency are a real issue, and cache
busting through URLs is a practical and available solution.
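
The round trip in question is the one a cache pays to revalidate an asset.
As a rough sketch (the function names make_etag and respond are
hypothetical, and this is not how any particular server implements it), the
origin side of an ETag check looks like the following. Even when the answer
is 304 with an empty body, the client has still waited one request and
response.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(content: bytes) -> str:
    """Build a quoted strong validator from the content (hypothetical helper)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def respond(
    content: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(content)
    # "no-cache" lets caches store the asset but requires revalidation
    # before each reuse, which is the round trip discussed above.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or any(_strip_weak(t) == etag for t in candidates):
            # The cached copy is still current: no body is sent.
            return 304, headers, b""
    return 200, headers, content
```

With a far-future expiry plus a changing URL, this exchange is skipped
entirely for unchanged assets, which is why the URL approach remains
attractive despite its costs.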

So I will risk an answer to two of your questions, but note that I don't
consider myself knowledgeable on the topic, and it would be really good if
we could get more feedback from other people here:

* No, cache-busting through URLs should not be part of HTTP best practices.
But in practice it is one today. I think that an explicit HTTP mechanism
for cache busting would be highly desirable.

* As far as I know, there is no alternative.


On Sun, Jun 26, 2016 at 11:37 PM, Raphaël D <raphael.droz@gmail.com> wrote:

> On Wed, 22 Jun 2016 13:39:39 +0200 Alcides Viamontes wrote:
>
> > Yes, this is related to caching in general. And it is the reason
> > people have to add query strings for doing cache busting. This problem
> > is a separate issue, but it interacts with cache digests in that old
> > versions of assets are kept in the cache and therefore in the cache
> > digest, and the origin has no way of removing them. The origin can only
> > create a new URL (say, via a new query string) that gets added to the
> > cache and the cache digest.
>
> (email subject renamed to avoid polluting the "Cache Digests for HTTP/2"
>  mailing-list thread)
>
> Hi,
>
> I came across this while trying to understand what alternatives to
> cache-busting exist.
> I'm scratching my head to find which overlooked IETF HTTP or W3C concept
> matches.
>
>
> Nowadays most CMSes and JavaScript/CSS frameworks feature cache-busting
> (usually via a query string), which is considered the unavoidable
> answer to the "can't forcefully refresh browser cache" issue (and web
> traffic pattern).
>
>
> Among the facts/reasons given are:
>
> 1) web applications have to cope with the fact that some web servers set
>    far-future Expires times for assets (css, js, fonts)
>
> 2) downstream proxies and browser caches are not accessible to the web
>    server
>
> 3) but the web application needs to force an HTML page to use the freshest
>    version of some or all assets even if they were already cached
>    in-browser with a far-future Expires time
>
> 4) people don't use Last-Modified or ETags because zero request
>    always seems better than requesting and waiting for a 304 response.
>
> 5) static files are usually not routed through CGI but left to the web
>    server configuration rather than the web application itself, which is
>    related to assumptions like item 1 above
>
>
> "Is it a mistake to set Expires +1 year on /js/jquery.js?" seems to be one
> of the underlying meta-questions (like the definitions of "persistence"
> and "version").
>
>
> Even if assets are not cached that long (say, only 1 week) and all
> (reverse) proxies are correctly configured, the issue still arises under
> some circumstances.
> For example, when a webapp upgrades assets, it leaves inconsistencies
> between the main resource fetched by the user-agent (the HTML page) and
> some of its "dependencies". Since old, cached assets can't be easily
> invalidated, new resources are created in caches.
> (Query-string versioning looks a lot like a N.I.H. ETag mechanism,
> the difference being that one HTTP response (the main web page) sends the
> ETags of the multiple sub-resources it depends upon.)
>
>
>
>
> * Should cache-busting be part of HTTP best practices? What references and
>   knowledgeable voices have something to say about it?
>
> * Are there HTTP 1.x or 2.0 alternatives? Is there a need for one? Or
>   should the solution be solely in the hands of asset
>   deployment/distribution/versioning tools and/or web application tweaks
>   like query-string suffixes?
>
> * Does the situation imply new precautions when using the HTTP Expires
>   header?
>
>
>
> thank you
>
>
>
> Ref:
> ¹ https://tools.ietf.org/html/rfc7234
> ² https://css-tricks.com/strategies-for-cache-busting-css/
> ³ https://www.shimmercat.com/en/info/articles/caching/#url-query-parameters-are-still-needed-to-update-assets-at-clients-
> ⁴ https://www.mnot.net/cache_docs/#FAQ ("My images expire a month from
>   now, but I need to change them in the caches now!")
> ⁵ https://core.trac.wordpress.org/ticket/29201#comment:12
>

Received on Monday, 27 June 2016 05:22:26 UTC