Re: Why should caches and intermediaries ignore If-Match?

On Mar 3, 2017, at 3:24 PM, Tom Bergan <tombergan@chromium.org> wrote:
> On Fri, Mar 3, 2017 at 2:58 PM, Mark Nottingham <mnot@mnot.net> wrote:
> > On 4 Mar 2017, at 9:30 am, Roy T. Fielding <fielding@gbiv.com> wrote:
> >
> > On Mar 1, 2017, at 5:49 PM, Tom Bergan <tombergan@chromium.org> wrote:
> >>
> >> Here is the use case:
> >>
> >> We have a content-optimization (compression) proxy sitting between the browser and origin server. Among other things, the proxy can compress videos. When the browser starts playing a video, it makes an initial HTTP request to fetch (part of) the video, then builds an in-memory representation of the video and uses additional HTTP range requests as needed to fetch the rest of the video. For example, range requests are used to implement seeking.
> >>
> >> The challenge is that we now have multiple representations of every video: the original representation (from the origin server) and one or more compressed representations served by the proxy. When the browser makes an initial request for a video, it gets one of these representations. When it makes a subsequent range request, we want to ensure that it receives the *same* representation that it received on the initial request. Otherwise the browser cannot combine the second response with the first response and video playback will fail.
> >>
> >> An additional challenge is that the browser and proxy both have a cache. In theory, we control the entire connection and could add custom code to the browser, proxy, and caches to implement any protocol that we invent. In practice, both caches are intended to be HTTP-compliant caches and we'd rather not add custom hacks for use cases like this if we can avoid it.
> >>
> >> The browser needs to label each range request with the ETag it expects to receive. If-Match originally seemed like the perfect solution: The browser adds `If-Match: ETag` to every range request. If a cache has a copy of the video with a *different* ETag, the cache forwards the request to the next server in the chain rather than returning its cached copy (as would happen if we used If-Range instead of If-Match). Similarly, the proxy knows if the browser is requesting a compressed video or the original video, so it can respond accordingly. However, as discussed previously in this thread, If-Match doesn't work like this.
> >>
> >> Note that I agree it doesn't make sense for a cache to return 412 and we don't need that behavior. The semantics I'm looking for is: "Send me this representation if you have it, otherwise forward to the next server. A 4xx means that this representation is not current in the origin or in any intermediate cache or proxy."
> >>
> >> Hope that makes sense.
> >
> > You have several choices:
> >
> > 1) implement this using transfer encodings because they don't change range offsets;
> >     presumably, these would be added/removed by the protocol handlers before
> >     the caches ever see them.
> 
> Ew.
> 
> Can you expand what you mean by this? I'm not sure I followed.
> 
> In case I wasn't clear, the proxy actually produces a completely different transcode of the original video, possibly in a different container format or codec. The "compressed" video is actually a completely different file than the original video; this is not just compression via Content-Encoding.

If the encoding is reversible (lossless), transfer encoding is a better idea.  Of course,
this has zero chance of being implemented already -- it would be custom code.
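
To illustrate why range offsets are unaffected: a byte range selects from the representation,
and the transfer coding is applied afterwards to whatever bytes are sent, so the receiver sees
the same slice once the coding is removed. A minimal sketch, assuming zlib as a stand-in for a
lossless transfer coding; the helper names are hypothetical and do not describe any real proxy.

```python
import zlib

def select_range(representation: bytes, first: int, last: int) -> bytes:
    """Return bytes first..last inclusive, as in 'Range: bytes=first-last'."""
    return representation[first:last + 1]

def apply_transfer_coding(payload: bytes) -> bytes:
    """Sender side: code the selected bytes for the hop (zlib is an assumption)."""
    return zlib.compress(payload)

def remove_transfer_coding(wire_bytes: bytes) -> bytes:
    """Receiver side: undo the coding before anything above the protocol handler sees it."""
    return zlib.decompress(wire_bytes)

original = bytes(range(256)) * 4  # stand-in for the video representation
wire = apply_transfer_coding(select_range(original, 100, 199))
part = remove_transfer_coding(wire)

# The decoded bytes are exactly the requested slice of the original representation.
assert part == original[100:200]
```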

> > 2) use If-Range and configure your proxy to forward the request when no match;
> >     yes, that's legitimate HTTP (a server is free to ignore partial requests and a proxy
> >     can forward any request it likes).
> 
> Nod.
> 
> This doesn't help with the caches, which return 200 when there is no match on the If-Range etag rather than forwarding the request. If we didn't have any HTTP caches in the middle, we would have already done this :)

You control those HTTP caches, right? Change that behavior.  We are talking about a trivial
configuration change (or a one-line source code change), as opposed to a change to HTTP
semantics which, even if we agreed to it, wouldn't be deployed for another five years.
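
A minimal sketch of that change, assuming the cache's decision for a range request can be
reduced to one function and that If-Range carries an entity-tag rather than a date. The
function and the forward_on_mismatch flag are hypothetical, not an option of any particular
cache.

```python
from typing import Optional

def strong_match(a: str, b: str) -> bool:
    """If-Range uses strong comparison: weak validators (W/"...") never match."""
    return not a.startswith("W/") and not b.startswith("W/") and a == b

def cache_action(cached_etag: Optional[str],
                 if_range: Optional[str],
                 forward_on_mismatch: bool) -> str:
    """Decide what a cache does with a range request.

    Returns "206" (serve the range from cache), "200" (serve the full cached
    representation), or "forward" (send the request to the next server).
    """
    if cached_etag is None:
        return "forward"  # nothing stored, so nothing to serve
    if if_range is None or strong_match(if_range, cached_etag):
        return "206"      # the client's copy matches ours
    # Mismatch: the usual behavior is a full 200; the changed behavior forwards.
    return "forward" if forward_on_mismatch else "200"
```

With forward_on_mismatch set, a browser holding the compressed representation never receives
the cache's copy of a different one; the request goes to the next server instead.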

OTOH, you can just do the sensible thing and use a different URL for the compressed stream.
Then the proxy can redirect initial (normal) requests to the compressed stream when it already
has the beginning of that stream in cache.
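
Roughly, and only as a sketch: the "/compressed" prefix and the set of cached paths below are
assumptions made for illustration, not anything your proxy has today.

```python
from typing import Optional, Set

COMPRESSED_PREFIX = "/compressed"  # hypothetical URL scheme for the transcoded stream

def redirect_target(path: str, cached_compressed_paths: Set[str]) -> Optional[str]:
    """Return the URL to redirect an initial request to, or None to pass it through."""
    if path.startswith(COMPRESSED_PREFIX + "/"):
        return None  # already a request for the compressed stream
    candidate = COMPRESSED_PREFIX + path
    # Redirect only when the start of the compressed stream is already cached.
    return candidate if candidate in cached_compressed_paths else None
```

Later range requests then go to whichever URL the browser ended up on, so each URL has a
single representation and If-Range against its etag works as usual.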

> > 3) use If-Match and deal with the extra round-trip after a 412.
> 
> Why doesn't the logic in #2 apply here as well? Intermediary servers aren't required to 412.

They are required to either not implement it or not perform the method.  Either way, the
response isn't going to be what you want (a 2xx status) because that would change the
semantics of the field. Your use case fits If-Range's purpose, not that of If-Match.
To be clear regarding the subject, the RFC doesn't say caches and intermediaries always
ignore If-Match; it says they may.  Deployed practice will just as often respond with a 412
when an unmatched etag is received in If-Match.

....Roy

Received on Saturday, 4 March 2017 00:12:12 UTC