Re: #445: Transfer-codings from Matthew Kerwin on 2014-04-08 (ietf-http-wg@w3.org from April to June 2014)

From: Matthew Kerwin <matthew@kerwin.net.au>
Date: Wed, 9 Apr 2014 00:12:35 +1000
To: David Krauss <potswa@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CACweHNCOSd968ktNyuZ+BoM8jYiaQB5XX_tcHA2-pPYMpna6Cg@mail.gmail.com>

On 7 April 2014 16:13, David Krauss <potswa@gmail.com> wrote:

>
> Range requests don't get cached because they are uncommon (not chicken and
> egg, but because there are few potential applications)
>

... they don't get cached because they're not common, and an argument
against promoting them to commonality is that they're not cached. Maybe
"Catch-22" would have been a better description than "chicken and egg".

> and it greatly complicates cache design. The underlying resource is likely
> to be too large to be stored entirely by the client, so the cache would
> have to store ranges separately and coalesce them on demand, per query. A
> single cache miss potentially generates multiple sub-range requests on the
> network.
>
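
To make the coalescing point concrete, here is a minimal sketch of how a range-aware cache might work out which sub-ranges it still has to fetch. The function name `missing_ranges` and the half-open `(start, end)` byte intervals are assumptions for illustration; they are not taken from any real cache implementation.

```python
def missing_ranges(cached, start, end):
    """Return the half-open byte ranges inside [start, end) that are not
    covered by any of the cached (start, end) fragments.

    Hypothetical helper: 'cached' is a list of half-open byte intervals.
    """
    gaps = []
    pos = start  # first byte not yet known to be covered
    for c_start, c_end in sorted(cached):
        if c_end <= pos:
            continue  # fragment lies entirely before the uncovered point
        if c_start >= end:
            break  # fragment (and all later ones) lie after the request
        if c_start > pos:
            gaps.append((pos, c_start))  # hole before this fragment
        pos = max(pos, c_end)
        if pos >= end:
            break
    if pos < end:
        gaps.append((pos, end))  # tail of the request is uncovered
    return gaps


# Two stored fragments, one request that straddles both of them.
fragments = [(0, 100), (200, 300)]
print(missing_ranges(fragments, 50, 350))
```

With the fragments above, the single request for bytes 50 to 349 leaves two holes, so one cache miss would turn into two sub-range requests upstream.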

Just so I can follow this paragraph, are you talking about a browser
cache? Because a client's storage capacity has no bearing on how a caching
proxy stores objects/fragments, or how that cache freshens stale data.

To my mind browsers don't play any part in range discussions, unless it's
in some new-fangled technology like video streaming. Most of the time a
browser wants to receive the whole object so it can evaluate and/or render
it. It doesn't make sense to grab half a page, or a chunk of an image. And
if it is a streaming thing, how often are those actually cached?

The model in my mind is a non-browser application that uses HTTP APIs.

> Custom, application-specific caching, which may know about specific usage
> patterns or server performance characteristics, looks better.
>

Why do you assume the application doesn't have any say in the caching? If
the application layer is a JavaScript app, and the HTTP semantic and
transport layers are provided by a browser, then sure (if XHR requests even
pass through the browser cache). But surely a non-browser HTTP middleware
package would provide either: a super-duper opaque caching mechanism that I
know will suit my application, or an API so I can handle my own cache (like
libcurl does).

An analogy on the response-providing side is content-encoding. It's an HTTP
semantic level thing, purview of the HTTP middleware, but it's something
the application should be managing.

[snip]

> A simple adaptation, which surely isn't my original invention just now, is
> to store large sub-ranges as virtual files. For example, go by powers of
> two, so you can download any two megabytes starting at any offset multiple
> of 2^21, any 64 kilobytes starting at a multiple of 2^16, etc. These are
> cache-friendly and simple to merge in client-side logic, and they work with
> content-encoding.
>

A convenient workaround is still a workaround. Now rather than allowing
the client to request the data it wants, the server dictates what it can
have. Might as well just provide query parameters to define arbitrary
dynamic/lazy-initialised/whatever resources.

The other issue here (in either case) is that the server now has a lot more
entities to manage. And twice as many, if they're all gzip'able. That's a
lot of PUSH_PROMISEs if you update something.
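
As a rough sketch of the power-of-two scheme discussed above, the following shows which aligned blocks a client would have to request to cover an arbitrary byte range, and how many block entities the server ends up with. The names `covering_blocks` and `entity_count`, the half-open intervals, and the 1 GiB example size are all assumptions for illustration.

```python
def covering_blocks(start, end, exponent):
    """Return the aligned blocks of size 2**exponent that cover the
    half-open byte range [start, end). Hypothetical helper."""
    size = 1 << exponent
    first = start >> exponent      # index of the block holding 'start'
    last = (end - 1) >> exponent   # index of the block holding the last byte
    return [(i * size, (i + 1) * size) for i in range(first, last + 1)]


def entity_count(length, exponents):
    """Count the block entities needed to expose a resource of 'length'
    bytes at every block size 2**k for k in 'exponents'."""
    # -(-a // b) is integer ceiling division.
    return sum(-(-length // (1 << k)) for k in exponents)


# Bytes 100000..199999 in 64 KiB blocks: the client gets three whole blocks.
print(covering_blocks(100_000, 200_000, 16))

# A 1 GiB resource offered in 64 KiB and 2 MiB blocks.
print(entity_count(1 << 30, [16, 21]))
```

In this sketch the client that wanted 100,000 bytes receives 196,608, and the 1 GiB resource becomes 16,896 separate entities at just two block sizes, before counting any gzip'd variants.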

-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/

Received on Tuesday, 8 April 2014 14:13:04 UTC