Re: v11-03 COMMENT: 16.1 Semantic Transparency from Jeffrey Mogul on 1996-05-20 (ietf-http-wg@w3.org from April to June 1996)

From: Jeffrey Mogul <mogul@pa.dec.com>
Date: Mon, 20 May 96 16:25:04 MDT
To: Koen Holtman <koen@win.tue.nl>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9605202325.AA13733@acetes.pa.dec.com>

Koen makes some useful comments regarding a few places where
I got sloppy with terminology.  In particular, I often used
"semantic transparency" when I meant "approximation of
semantic transparency."

I don't agree with Koen that "semantic transparency" is unimportant.
While it is true that certain sections of the text should be
talking about freshness instead of semantic transparency, the
latter is still the key concept because it is the best way to
think about whether a design or implementation decision could
cause trouble.

For example, Koen and Larry Masinter and I have been having
a private discussion during which the text in section 16.2.7
became relevant:

    If a cache that is pooling cached responses from other caches sees
    two fresh responses for the same resource entity with different
    validators, it SHOULD use the one with the newer Date header.

This is, in effect, a rule that says that semantic transparency
takes precedence over mere "freshness" (since both of the responses
in question are fresh).  (Actually, this rule should probably apply
to any situation where a cache ends up with two apparently "fresh"
responses with different Dates.)

Consider this situation:
    client requests (via intermediate cache)
        GET /foo.html HTTP/1.1

    server returns
        ETag: "x"
        Date: Wed, 15 May 1996 00:00:00
        Expires: Sat, 18 May 1996 23:59:59

    Then, a day or so later, the server administrator announces
    that the expiration date was set too far in the future, and
    users should "reload" foo.html.

    So the client requests
        GET /foo.html HTTP/1.1
        Cache-Control: no-cache

    server now returns
        ETag: "y"
        Date: Thu, 16 May 1996 00:00:00
        Expires: Thu, 16 May 1996 00:00:01

Now think about what happens when another request is made, via
the same cache, for /foo.html at Thu, 16 May 1996 00:01:00.
If the cache (for some odd reason) simply held onto the "freshest"
response, then it would presumably return the first response
(ETag = "x").  But I think this would violate the "Principle
of Least Astonishment" ... i.e., it's clearly not the most
"semantically transparent" thing that the cache could do at that
time, because the cache has received sufficient information from
the origin server to know (for sure!) that the response with
ETag = "x" is no longer the right one.
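The scenario above can be replayed as a small sketch. Everything in it is illustrative: the names CachedResponse, serve_any_fresh, store_newest_date and lookup are hypothetical, the dates are assumed to be GMT (the example gives no time zone), and freshness is reduced to "the Expires time has not yet passed". It contrasts a freshness-only cache with one that lets a newer Date supersede the stored response.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class CachedResponse:
    etag: str          # validator from the ETag header
    date: datetime     # value of the Date header
    expires: datetime  # value of the Expires header

    def is_fresh(self, now: datetime) -> bool:
        # Simplified freshness test: fresh until the Expires time passes.
        return now < self.expires


def serve_any_fresh(stored: Iterable[CachedResponse],
                    now: datetime) -> Optional[CachedResponse]:
    """Freshness-only policy: keep every copy, serve one that is still fresh."""
    fresh = [r for r in stored if r.is_fresh(now)]
    return max(fresh, key=lambda r: r.expires, default=None)


def store_newest_date(entry: Optional[CachedResponse],
                      incoming: CachedResponse) -> CachedResponse:
    """Newer-Date policy: a response with a newer Date replaces the old one."""
    if entry is None or incoming.date > entry.date:
        return incoming
    return entry


def lookup(entry: Optional[CachedResponse],
           now: datetime) -> Optional[CachedResponse]:
    """Return the stored response if fresh; None means 'go revalidate'."""
    return entry if entry is not None and entry.is_fresh(now) else None


def at(day: int, hour: int = 0, minute: int = 0, second: int = 0) -> datetime:
    # Helper for May 1996 timestamps; GMT is an assumption.
    return datetime(1996, 5, day, hour, minute, second, tzinfo=timezone.utc)


x = CachedResponse('"x"', date=at(15), expires=at(18, 23, 59, 59))
y = CachedResponse('"y"', date=at(16), expires=at(16, 0, 0, 1))
request_time = at(16, 0, 1, 0)

# Freshness-only cache: "y" has expired, so the superseded "x" is served.
print(serve_any_fresh([x, y], request_time).etag)   # prints "x"

# Newer-Date cache: "y" replaced "x" on arrival; "y" is now stale, so the
# cache has nothing it may serve without contacting the origin server.
entry = store_newest_date(store_newest_date(None, x), y)
print(lookup(entry, request_time))                  # prints None
```

In this sketch the newer-Date cache never returns ETag "x" after it has seen "y", which is the behavior argued for above; the cost is one extra revalidation.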

This is a slightly contrived (although not impossible) example.
My point is that thinking purely in terms of freshness does not
resolve certain potential ambiguities in the protocol design.
I assert that if we use "semantic transparency" as the way to
evaluate the ideal behavior of a cache, then we will avoid (as
much as possible) the "astonishing" results that tend to confuse
users.  Since "ideal" behavior in this sense does NOT include
"best possible performance", we can then use the concept of
"freshness" to bound (but not eliminate) the potential semantic
errors, while greatly improving the performance of the entire
system.

Here are specific rewrites for the bugs that Koen found:

16.1
Replace
  3. Protocol features that allow a cache to attach warnings to
     responses that do not preserve semantic transparency.
with
  3. Protocol features that allow a cache to attach warnings to
     responses that do not preserve the requested approximation
     of semantic transparency.

16.1.2 Cache-control Mechanisms
Replace
    However, in some cases, Cache-Control directives are explicitly
    specified as weakening semantic transparency (for example,
    "max-stale" or "public").
with
    However, in some cases, Cache-Control directives are explicitly
    specified as weakening the approximation of semantic transparency
    (for example, "max-stale" or "public").
    
16.1.3 Warnings
Replace
    Whenever a cache returns a response that is not semantically
    transparent, it must attach a warning to that effect, using a
    Warning response header. This warning allows clients and user
    agents to take appropriate action.
with
    Whenever a cache returns a response that is neither firsthand nor
    "fresh enough" (in the sense of condition 2 in 16.1.1), it must
    attach a warning to that effect, using a Warning response header.
    This warning allows clients and user agents to take appropriate
    action.
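As a rough illustration of the reworded rule, the sketch below attaches a warning only when a response is neither firsthand nor fresh enough. The function name, the two boolean inputs, and the warning text are all hypothetical; in particular, the text is a placeholder and not a value taken from the draft.

```python
from typing import Dict

# Placeholder warning text; the draft defines the real Warning values.
PLACEHOLDER_WARNING = "response is not fresh enough"


def with_warning_if_needed(headers: Dict[str, str],
                           firsthand: bool,
                           fresh_enough: bool) -> Dict[str, str]:
    """Return a copy of the headers, adding Warning when the rule requires it."""
    out = dict(headers)
    if not firsthand and not fresh_enough:
        out["Warning"] = PLACEHOLDER_WARNING
    return out
```

A firsthand response, or a cached one that is still fresh enough, passes through without a Warning header.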

16.1.6 Client-controlled Behavior
Replace
    A client's request may specify the maximum age it is willing to
    accept for an unvalidated response; specifying a value of zero
    forces the cache(s) to revalidate all responses. A client may also
    specify the minimum time remaining before a response expires. Both
    of these options increase constraints on the behavior of caches,
    and so cannot decrease semantic transparency.
with
    A client's request may specify the maximum age it is willing to
    accept for an unvalidated response; specifying a value of zero
    forces the cache(s) to revalidate all responses.  A client may also
    specify the minimum time remaining before a response expires. Both
    of these options increase constraints on the behavior of caches,
    and so cannot further relax semantic transparency.

or perhaps (making this more precise at the expense of extra words)

    and so cannot further relax the cache's approximation of
    semantic transparency.

Replace
    A client may also specify that it will accept stale responses, up
    to some maximum amount of staleness. This loosens the constraints
    on the caches, and so may violate semantic transparency, but may be
    necessary to support disconnected operation, or high availability
    in the face of poor connectivity.
with
    A client may also specify that it will accept stale responses, up
    to some maximum amount of staleness.  This loosens the constraints
    on the caches, and so may violate the origin server's specified
    constraints on semantic transparency, but may be necessary to
    support disconnected operation, or high availability in the face of
    poor connectivity.
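The three client options discussed in this section can be sketched as one decision function. This is a simplified model under stated assumptions: the function and parameter names are hypothetical, ages and lifetimes are plain seconds, and the parameters max_age, min_fresh and max_stale stand for the maximum acceptable age, the minimum remaining lifetime, and the maximum acceptable staleness respectively.

```python
from typing import Optional


def may_serve_unvalidated(age: int,
                          lifetime: int,
                          max_age: Optional[int] = None,
                          min_fresh: Optional[int] = None,
                          max_stale: Optional[int] = None) -> bool:
    """Decide whether a cached response may be returned without revalidation.

    age      -- seconds since the response was generated
    lifetime -- seconds the origin server allowed the response to stay fresh
    """
    # A maximum age tightens the constraint; zero always forces revalidation.
    if max_age is not None and (max_age == 0 or age > max_age):
        return False

    remaining = lifetime - age  # zero or negative once the response is stale

    # A minimum remaining lifetime also tightens the constraint.
    if min_fresh is not None and remaining < min_fresh:
        return False

    if remaining > 0:
        return True  # still fresh by the origin server's own limit

    # Accepting staleness loosens the origin server's constraint.
    return max_stale is not None and -remaining <= max_stale


print(may_serve_unvalidated(age=100, lifetime=300))                 # True
print(may_serve_unvalidated(age=100, lifetime=300, max_age=0))      # False
print(may_serve_unvalidated(age=100, lifetime=300, min_fresh=250))  # False
print(may_serve_unvalidated(age=400, lifetime=300))                 # False
print(may_serve_unvalidated(age=400, lifetime=300, max_stale=150))  # True
```

In this model the first two options can only turn an acceptable response into an unacceptable one, while the staleness option is the only one that can make the cache return something the origin server's expiration would otherwise forbid.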


16.2.7 Disambiguating Expiration Values
Replace
    If a cache that is pooling cached responses from other caches sees
    two fresh responses for the same resource entity with different
    validators, it SHOULD use the one with the newer Date header.
with
    If a cache sees two fresh responses for the same resource entity
    with different validators, it MUST use the one with the newer
    Date header.  This situation may arise because the cache is pooling
    cached responses from other caches, or because a client has asked
    for a reload or a revalidation of an apparently fresh cache entry.

-Jeff
Received on Monday, 20 May 1996 16:36:12 UTC