Re: #445: Transfer-codings

From: Matthew Kerwin <matthew@kerwin.net.au>
Date: Wed, 9 Apr 2014 00:12:35 +1000
Message-ID: <CACweHNCOSd968ktNyuZ+BoM8jYiaQB5XX_tcHA2-pPYMpna6Cg@mail.gmail.com>
To: David Krauss <potswa@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
On 7 April 2014 16:13, David Krauss <potswa@gmail.com> wrote:

> Range requests don't get cached because they are uncommon (not chicken and
> egg, but because there are few potential applications)

... they don't get cached because they're not common, and an argument
against promoting them to commonality is that they're not cached. Maybe
"Catch-22" would have been a better description than "chicken and egg".

> and it greatly complicates cache design. The underlying resource is likely
> to be too large to be stored entirely by the client, so the cache would
> have to store ranges separately and coalesce them on demand, per query. A
> single cache miss potentially generates multiple sub-range requests on the
> network.

Just so I can follow this paragraph, are you talking about a browser
cache? Because a client's storage capacity has no bearing on how a caching
proxy stores objects/fragments, or how that cache freshens stale data.
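To make concrete what "store ranges separately and coalesce them on demand" would mean for such a cache, here's a hedged sketch (not any real cache's code; `missing_subranges` is a made-up helper) of the interval bookkeeping involved: given the fragments a cache already holds, one client request fans out into several upstream Range requests.

```python
def missing_subranges(want, have):
    """want: (start, end) half-open byte range the client asked for.
    have: sorted, disjoint (start, end) ranges already in the cache.
    Returns the gaps that must still be fetched from the origin."""
    gaps = []
    pos, end = want
    for s, e in have:
        if e <= pos:        # cached fragment entirely before the request
            continue
        if s >= end:        # cached fragment entirely after the request
            break
        if s > pos:         # uncovered gap before this fragment
            gaps.append((pos, s))
        pos = max(pos, e)   # advance past the cached fragment
    if pos < end:           # trailing gap after the last fragment
        gaps.append((pos, end))
    return gaps

# One cache miss over bytes 100-900, with fragments 0-200 and 400-500
# already cached, becomes two upstream sub-range requests:
print(missing_subranges((100, 900), [(0, 200), (400, 500)]))
# → [(200, 400), (500, 900)]
```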

To my mind, browsers don't play any part in range discussions, unless it's
in some new-fangled technology like video streaming. Most of the time a
browser wants to receive the whole object so it can evaluate and/or render
it. It doesn't make sense to grab half a page, or a chunk of an image. And
if it is a streaming thing, how often are those actually cached?

The model in my mind is a non-browser application that uses HTTP APIs.

> Custom, application-specific caching, which may know about specific usage
> patterns or server performance characteristics, looks better.

Why do you assume the application doesn't have any say in the caching? If
the application layer is a JavaScript app, and the HTTP semantic and
transport layers are provided by a browser, then sure (if XHR requests even
pass through the browser cache). But surely a non-browser HTTP middleware
package would provide either a super-duper opaque caching mechanism that I
know will suit my application, or an API so I can handle my own cache (like
libcurl does).

An analogy on the response-providing side is content-encoding. It's an
HTTP semantic-level thing, the purview of the HTTP middleware, but it's
something the application should be managing.


> A simple adaptation, which surely isn't my original invention just now, is
> to store large sub-ranges as virtual files. For example, go by powers of
> two, so you can download any two megabytes starting at any offset multiple
> of 2^21, any 64 kilobytes starting at a multiple of 2^16, etc. These are
> cache-friendly and simple to merge in client-side logic, and they work with
> content-encoding.

A convenient workaround is still a workaround. Now rather than allowing
the client to request the data it wants, the server dictates what it can
have. Might as well just provide query parameters to define arbitrary
dynamic/lazy-initialised/whatever resources.
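For what it's worth, the aligned-blocks scheme you describe is easy enough to sketch. This is just an illustration (my own hypothetical helper, not anything from a spec): a client wanting an arbitrary range is restricted to blocks whose size is a power of two and whose offset is a multiple of that size.

```python
def aligned_blocks(start, end, block_bits=16):
    """Return (offset, length) blocks of size 2**block_bits that
    cover the half-open byte range [start, end). With block_bits=16
    these are 64 KiB blocks at 64 KiB-aligned offsets; block_bits=21
    gives the 2 MiB case."""
    size = 1 << block_bits
    first = (start // size) * size  # round down to block alignment
    return [(off, size) for off in range(first, end, size)]

# Bytes 100000-200000 have to be fetched as three 64 KiB blocks,
# then trimmed/merged client-side:
print(aligned_blocks(100_000, 200_000))
# → [(65536, 65536), (131072, 65536), (196608, 65536)]
```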

The other issue here (in either case) is that the server now has a lot more
entities to manage. And twice as many, if they're all gzip'able. That's a
lot of PUSH_PROMISEs if you update something.
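Some rough back-of-the-envelope arithmetic on that entity count, assuming (my assumption, purely for illustration) the server materialises every aligned sub-range at each power-of-two size from 64 KiB (2^16) up to 2 MiB (2^21), plus a gzip'd variant of each:

```python
def entity_count(resource_bytes, min_bits=16, max_bits=21, gzip=True):
    """Count the distinct sub-range entities a server would manage
    under the hypothetical power-of-two scheme, optionally doubled
    for a gzip'd variant of each block."""
    total = 0
    for bits in range(min_bits, max_bits + 1):
        size = 1 << bits
        total += -(-resource_bytes // size)  # ceiling division
    return total * 2 if gzip else total

# A single 1 GiB resource already yields tens of thousands of
# distinct cacheable entities:
print(entity_count(1 << 30))
# → 64512
```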

  Matthew Kerwin
Received on Tuesday, 8 April 2014 14:13:04 UTC