Re: NEW ISSUE: example for matching functions, was: Weak and strong ETags from Jamie Lokier on 2007-05-29 (ietf-http-wg@w3.org from April to June 2007)

From: Jamie Lokier <jamie@shareable.org>
Date: Tue, 29 May 2007 23:27:50 +0100
To: Henrik Nordstrom <henrik@henriknordstrom.net>
Cc: ietf-http-wg@w3.org
Message-ID: <20070529222750.GA10806@mail.shareable.org>

Henrik Nordstrom wrote:
> > (That would also improve cache hits with Apache's method - which uses
> > a weak Etag for resources with a recent time, then converts to a
> > strong Etag with the same value after the time has passed).
> 
> Already the case, if Apache follows the RFC.. The initial response will
> have a weak ETag, which is then used in If-None-Match and should compare
> true even after it has been upgraded to a strong one.

You're right.

(But beware: If the client sends the weak ETag in If-None-Match, it
can get the _wrong_ response if the resource has changed during that
small time window (i.e. in the same second)).

Clients which require an accurate response after the request is sent
must send a strong Etag, or none at all.

In this case, converting the weak one to a strong one in the client
doesn't help, as you point out.
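
To make the two comparison rules concrete, here is a minimal Python
sketch of weak and strong matching as I read RFC 2616 section 13.3.3.
The function names are made up for illustration; they are not taken
from Apache or any library.

```python
def split_etag(raw):
    """Split an entity tag into (is_weak, opaque_tag)."""
    raw = raw.strip()
    if raw.startswith("W/"):
        return True, raw[2:]
    return False, raw


def weak_match(a, b):
    """Weak comparison: opaque tags equal, weakness flags ignored."""
    return split_etag(a)[1] == split_etag(b)[1]


def strong_match(a, b):
    """Strong comparison: opaque tags equal and neither tag is weak."""
    a_weak, a_tag = split_etag(a)
    b_weak, b_tag = split_etag(b)
    return not a_weak and not b_weak and a_tag == b_tag


# Client cached the weak tag; the server has since upgraded it to strong.
print(weak_match('W/"abc"', '"abc"'))    # True: If-None-Match still hits
print(strong_match('W/"abc"', '"abc"'))  # False: a weak tag never matches
```

So the upgrade from weak to strong keeps weak-comparison hits working,
but a client holding the weak tag can never get a strong match.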

> > But what about the other way around?  Clients which require only weak
> > comparison, but the server must send a strong Etag so that _other_
> > clients, which require a strong comparison for the same resource, can
> > use conditionals?
> > 
> > It seems that it would be useful if a server could specify two Etag
> > for the same resource - a weak one, and a strong one.
> 
> Why? A strong one can always be used.

Because for some clients, a strong one is excessive (weak is good
enough), while for other clients accessing the same resource, a strong
one is required.

> > Currently, a server can support conditionals for one type of client
> > but not the other.  That seems to be an unnecessary limitation.
> 
> Can you give some examples please. I just don't see what you are talking
> about here..

For example: if there's a client like, oh, a calendar application that
may not depend on the exact form of XML.  A weak Etag is fine for that.

But if you want to fetch the response for some other application like
displaying the XML, and have a proxy cache which converts
unconditional requests to conditional ones, then that must use a
strong Etag, or none at all.

Also, if you have an application that fetches the XML in pieces using
byte-range conditional requests - or a proxy which does that - it must
use a strong Etag.
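
As a rough sketch of what that means on the client side, assuming the
client only has the Etag it received earlier: it can attach Range and
If-Range when the tag is strong, and otherwise has to fall back to a
plain full fetch.  The function name and return shape are hypothetical.

```python
def range_request_headers(etag, first_byte, last_byte):
    """Return extra headers for fetching bytes first_byte..last_byte.

    A weak (or missing) Etag can't validate a byte range, so in that
    case return no extra headers and let the caller do a full GET.
    """
    if etag is None or etag.strip().startswith("W/"):
        return {}
    return {
        "Range": "bytes=%d-%d" % (first_byte, last_byte),
        "If-Range": etag.strip(),
    }


print(range_request_headers('"v1"', 0, 99))    # Range plus If-Range
print(range_request_headers('W/"v1"', 0, 99))  # {} - full GET instead
```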

It's reasonable to want a server to support all those clients with
efficient caching.

If it serves a strong Etag, the calendar client (for which weak
comparison would be fine) will sometimes cause unnecessary network
traffic - when a weak match would be fine, but a strong match fails.

If it serves a weak Etag, then the other client applications which
require a strong comparison can't use it, and they too will cause
unnecessary network traffic.

Either way, the limitation means redundant network traffic that can be
avoided if either (a) the server can send two Etags, or (b) the client
can convert a weak Etag to a strong one, when it requires a strong one
or none at all.

Note that allowing (b) would require the table which started this
thread to be changed: request's strong Etag + resource's weak Etag
would return false, while request's weak Etag + resource's strong
Etag would return true.
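
A minimal sketch of that changed table, in Python.  Only the two mixed
cases come from the paragraph above; treating weak+weak and
strong+strong as matches when the opaque values are equal is my
assumption here, and the function names are hypothetical.

```python
def split_etag(raw):
    """Split an entity tag into (is_weak, opaque_tag)."""
    raw = raw.strip()
    if raw.startswith("W/"):
        return True, raw[2:]
    return False, raw


def proposed_match(request_etag, resource_etag):
    """Asymmetric matching under option (b).

    The strength of the request's tag says what the client needs;
    the strength of the resource's tag says what the server promises.
    """
    req_weak, req_tag = split_etag(request_etag)
    res_weak, res_tag = split_etag(resource_etag)
    if req_tag != res_tag:
        return False
    # Client demands strong validation but the server only offers weak.
    if not req_weak and res_weak:
        return False
    return True


print(proposed_match('"v1"', 'W/"v1"'))  # False
print(proposed_match('W/"v1"', '"v1"'))  # True
```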

> The only reasons to specify a weak ETag is if you can't or don't want to
> fulfill the uniqueness requirements of a strong one.

My point is that "if you want to fulfill the uniqueness requirements"
is a property of the client application (and also how the data will be
used), not just the server.

A server can legitimately have reason to support clients which require
strong validation, as well as clients for which weak validation is
good enough, for the same resource.

> can't: When binary equivalence can not be guaranteed between two
> requests. For example dynamic compression with a embedded timestamp, or
> when there is other "random" elements in each response.
> 
> don't want to: When each response is unique, but it's at the same time
> not very important that the user always gets the most current. For
> example a hit counter.

Even with a hit counter, some clients may benefit from a byte-range
conditional request on it (e.g. if it's a large image file, and the
client checks the first few x00 bytes before deciding whether to get
the rest).  The weak Etag isn't enough for them.  For other clients,
it will be enough.

See other examples above.

-- Jamie
Received on Tuesday, 29 May 2007 22:27:57 UTC