(CONTENT NEGOTIATION,VARY) Newer draft text for Vary header and content negotiation `hooks'

The text below is a new version of the text I posted on Saturday.  I
wish to thank Jeff Mogul, Larry Masinter, Roy Fielding, and Dave
Kristol for their comments on that version.  These comments led to
some improvements and additions in the text below.

Changes with respect to the first version posted on Monday are marked
with | change bars.  Changes with respect to the second version posted
on Saturday are marked with + change bars.  All change bars were added
by hand, so no 100% accuracy can be guaranteed.

There have been some changes in the text to synchronize with the
planned caching text by Jeff Mogul.  I expect that both texts will
need additional minor changes to get completely in sync.

If you have comments on this text, now is the time to comment.  I
intend to close this issue in the middle of this week.

--snip--


** I. Vary+content negotiation new/changed header descriptions

[##Note: The current section 12 needs to be deleted completely from the
April 1.1 draft.##]

10.v  Vary

+  [##This Vary section mixes the description of a header with a
+  description of rules for caching.   These two things could be
+  separated in the final 1.1 document, with the rules for caching
+  being integrated in the caching section.##]

|  [##The first few paragraphs below were rewritten to account for
|  range retrievals, to shorten sentences, and to make some stylistic
|  improvements.  The text now accounts for range retrievals, I forgot
|  to cover them in the previous version.  The role of the
|  Authorization header has also been made more clear.##]

|  The Vary response-header field is used by an origin server to
|  signal that the resource identified by the current request is a
|  varying resource.  A varying resource has multiple entities
|  associated with it, all of which are representations of the content
|  of the resource.  If a GET or HEAD request on a varying resource is
|  received, the origin server will select one of the associated
|  entities as the entity best matching the request.  Selection of
|  this entity is based on the contents of particular header fields in
|  the request message, or on other information pertaining to the
|  request, like the network address of the sending client.

   If a resource is varying, this has an important effect on cache
   management, particularly for caching proxies which service a
|  diverse set of user agents.  All 200 (OK) responses from varying
   resources must contain at least one Vary header or Alternates
   header (Section 10.a) to signal variance.

   If no Vary headers and no Alternates headers are present in a 200
   (OK) response, then caches may assume, as long as the response is
|  fresh, that the resource in question is not varying, and has only
|  one associated entity.  Note however that this entity can still
   change through time, as possibly indicated by a Cache-Control
   response header (section 10.cc).

|  After selection of the entity best matching the current request,
|  the origin server will usually generate a 200 (OK) response, but it
|  can also generate other responses like 206 (Partial Content) or 304
|  (Not modified) if headers which modify the semantics of the
|  request, like Range (Section 10.ran) or If-Valid (Section 10.ifva),
|  are present.  An origin server need not be capable of selecting an
+  entity for every possible incoming request on a varying resource;
|  it can choose to generate a 3xx (redirection) or 4xx (client error)
|  type response for some requests.

|  In a request message on a varying resource, the selecting request
|  headers are those request headers whose contents were used by the
|  origin server to select the entity best matching the request. The
|  Vary header field specifies the selecting request headers and any
   other selection parameters that were used by the origin server.

       Vary                 = "Vary" ":" 1#selection-parameter

       selection-parameter  = field-name
                            | "{" "accept-headers" "}"
                            | "{" "other" "}"
                            | "{" "unknown" "}"
                            | "{" extension-parameter "}"

       extension-parameter  = token

   The presence of a field-name signals that the request-header field
   with this name is selecting.  The field-name will usually be, but
   need not be, a request-header field name defined in this
   specification.  Note that field names are case-insensitive.  The
   presence of the "accept-headers" parameter signals that all request
   headers whose names start with "accept" are selecting.

+  The inclusion of the "{other}" parameter in a Vary field signals
   that parameters other than the contents of request headers, for
   example the network address of the sending party, play a role in
   the selection of the response.

+     Note: This specification allows the origin server to express
+     that other parameters were used, but does not allow the origin
+     server to specify the exact nature of these parameters.  This
+     is left to future extensions.

+  The "{unknown}" parameter signals that the origin server is not
   willing or able to specify the selection parameters used.  If an
   extension-parameter unknown to the cache is present in a Vary
+  header, the cache must treat it as the "{unknown}" parameter.

+     Note: HTTP/1.1 caches have to treat the "{other}" and
+     "{unknown}" parameters in the same way.  For example, presence
+     of the response header
+
+       Vary: accept-language, {other}
+
+     requires the same caching behavior as does the presence of 
+    
+       Vary:  {unknown}
+     
+     Use of the "{unknown}" parameter is discouraged.  Header fields
+     which use "{other}" are more readable for humans, and better
+     support the use of heuristics to improve caching performance.
+
+

   If multiple Vary and Alternates header fields are present in a
   response, these must be combined to give all selecting parameters.

+  The field name "Host" must never be included into a Vary header;
|  clients must ignore it if it is present.  The names of fields which
|  change the semantics of a GET request, like "Range" and "If-Valid"
|  must also never be included, and must be ignored when present.  

+  [##Note: Dave Kristol suggested that I change the "must never be
+  included" above to "must be omitted", but I think this evokes the
+  wrong mental model of how servers go about making Vary headers##]

|  Servers which use access authentication are not obliged to send
|  "Vary: Authorization" headers in responses.  It must be assumed
|  that requests on authenticated resources can always produce
|  different responses for different users.  Note that servers can
|  signal the absence of authentication by including a "Cache-Control:
|  public" header in the response.

+  [##Note: the text below could be moved to a separate subsection
+  elsewhere in the 1.1 draft.  Some of the text at the start of this
+  section could be repeated at the start of that separate
+  subsection.##]

   A cache may always store the relayed 200 (OK) responses from a
   varying resource, and can refresh them according to the rules in
|  Section aa.bb [##Which will be written by Jeff Mogul##].  The
|  partial entities in 206 (Partial Content) responses from varying
|  resources may also be stored.

   When getting a request on a varying resource, a cache can only
|  return a cached 200 (OK) response to one of its clients in two
   particular cases.

   First, if a cache gets a request on a varying resource for which it
   has cached one or more responses with Vary or Alternates headers,
   it can relay that request towards the origin server, adding an
+  If-Invalid header listing the cval-info values in the Cval headers
+  (Section 10.cval) of the cached responses.  If it then gets back a
+  3xx (Ppp Qqq) [##TBS ##] response with the cval-info of a cached
   200 (OK) response in its Cval header, it can return this cached 200
   (OK) response to its client, after merging in any of the 3xx
   response headers as specified in Section xx.yy [##Which will be
   written by Jeff Mogul##].

   Second, if a cache gets a request on a varying resource, it can
   return to its client a cached, fresh 200 (OK) response which has
   Vary or Alternates headers, provided that

       - the Vary and Alternates headers of this fresh response
         specify that only request header fields are selecting
         parameters,

       - the specified selecting request header fields of the current
         request match the specified selecting request header fields
         of a previous request on the resource relayed towards the
         origin server,

       - this previous request got a 200 (OK) or 3xx (Ppp Qqq)
+        response which had the same cval-info value in its CVal
         header as the cached, fresh 200 (OK) response.

   Two sequences of selecting request header fields match if and only
   if the first sequence can be transformed into the second sequence
   by only adding or removing whitespace at places in fields where
   this is allowed according to the syntax rules in this
   specification.

   [##Note that a more complicated matching rule could be defined in a
   future specification.  The rule above reflects the consensus of the
   editorial group on how complex we can get in HTTP/1.1##]

|  [##Note that the above rule says sequences, not sets of request
|  headers.  It cannot say sets because, for some request headers
|  (like Via?) which contain comma-separated lists, if you have two in
|  a request, the order in which they appear matters.  A simple
|  matching rule which would allow some forms of re-shuffling and
|  collapsing of request headers to get a match turned out to be
|  beyond my capabilities to write.##]

+  [##Jeff Mogul has made some suggestions for a better matching rule,
+  though the specification of this rule uses much more text.  A
+  future version of this text may have a better matching rule in a
+  separate subsection.##]

|  If a cached 200 (OK) response may be returned to a request on a
|  varying resource which included Range request header, then a cache
|  may also use this 200 (OK) response to construct and return a 206
|  (Partial Content) response with the requested range.

         Note: Implementation of support for the second case above is
         mainly interesting in user agent caches, as a user agent
         cache will generally have an easy way of determining whether
         the sequence of request header fields of the current request
         equals the sequence sent in an earlier request on the same
         resource.  Proxy caches supporting the second case would have
         to record diverse sequences of request header fields
         previously relayed; the implementation effort associated with
         this may not be balanced by a sufficient payoff in traffic
         savings.  A planned specification of a content negotiation
         mechanism will define additional cases in which proxy caches
         can return a cached 200 (OK) response without contacting the
         origin server.  The implementation effort associated with
         support for these additional cases is expected to have a much
         better cost/benefit ratio.

  [##Note that the `planned specification of a content negotiation
  mechanism' above does not necessarily have to be draft-holtman!'  In
  theory, a content negotiation mechanism totally unlike draft-holtman
  could just as well live up to these cost/benefit expectations.##]

10.a  Alternates

   The Alternates response-header field is used by origin servers to
   signal that the resource identified by the request-URI and the Host
   request header (present if the request-URI is not an absoluteURI)
   has the capability to send different responses depending on the
   accept headers in the request message.  This has an important
   effect on cache management, particularly for caching proxies which
   service a diverse set of user agents.  This effect is covered in
   Section 10.v.

       Alternates           = "Alternates" ":" opaque-field

       opaque-field         = field-value

   The Alternates header is included into HTTP/1.1 to make HTTP/1.1
   caches compatible with a planned content negotiation mechanism.
   HTTP/1.1 allows a future content negotiation standard to define the
   format of the Alternates header field-value, as long as the defined
   format satisfies the general rules in Section 4.2.

   To ensure compatibility with future experimental or standardized
   software, caching HTTP/1.1 clients must treat all Alternates
   headers in a response as synonymous to the following Vary header:

         Vary: {accept-headers}

   and follow the caching rules associated with the presence of this
   Vary header, as covered in Section 10.v.  HTTP/1.1 allows origin
   servers to send Alternates headers under experimental conditions.


10.u  URI

+  [##Note: there used to be text for the URI header here.  It was
+  removed because Roy Fielding will supply the final text, which will
+  differ significantly from the text that was here.##]
+

** II. Changed status code descriptions

300 Multiple Choices

   This status code is reserved for future use by a planned content
   negotiation mechanism.  HTTP/1.1 user agents receiving a 300
   response which includes a Location header can treat this response
   as they would treat a 303 (See Other) response.  If no Location
   header is included, the appropriate action is to display the entity
   enclosed in the response to the user.

406 None Acceptable

   This status code is reserved for future use by a planned content
   negotiation mechanism.  HTTP/1.1 user agents receiving a 406
   response which includes a Location header can treat this response
   as they would treat a 303 (See Other) response.  If no Location
   header is included, the appropriate action is to display the entity
   enclosed in the response to the user.


** III.  New text for the (new) caching section

13.x Interoperability of varying resources with HTTP/1.0 proxy caches

  [## Note: the text in 13.x could be part of a larger subsection in
  the 1.1 document##]

  If the correct handling of responses from a varying resource
  (Section 10.v) by HTTP/1.0 proxy caches in the response chain is
  important, HTTP/1.1 origin servers can include the following Expires
  (Section 10.exp) response header in all responses from the varying
  resource:

     Expires: Thu, 01 Jan 1980 00:00:00 GMT

  If this Expires header is included, the server should usually also
  include a Cache-Control header for the benefit of HTTP/1.1 caches,
  for example

     Cache-Control: max-age=604800

  which overrides the freshness lifetime of zero seconds specified by
  the included Expires header.


13.y Cache replacement for varying resources

+ [##Note to Jeff: You will have to sync your caching text with the
+ text below.  I cannot move here without loosing upwards
+ compatibility, which means that I cannot move at all.##]

  If a new 200 (OK) response is received from a non-varying resource
  while an old 200 (OK) response is cached, caches can delete this old
  response from cache memory and insert the new response.  For 200
  (OK) responses from varying resources (Section 10.v), cache
  replacement is more complex.

  HTTP/1.1 allows the authors of varying resources to guide cache
  replacement by the inclusion of elements of so-called replacement
  keys in the responses of these resources.  The replacement key of a
  varying response consists of two elements, both of which may be
  empty strings, separated by a semicolon:

       replacement-key  =  variant-id ";" absoluteURI

  The variant-id element of the replacement key is the variant-id
| value in the Cval header of the response, if a Cval header which
  such a value is present, and an empty string otherwise.  The
  absoluteURI element of the replacement key is the absolute URI given
  in, or derived from, the Content-Location header of the response if
  present, and and an empty string if no Content-Location header is
  present.

|
  If a cache has stored in memory a 200 (OK) response with a certain
  replacement key, and receives, from the same resource, a new 200
  (OK) response which has the same replacement key, this should be
  interpreted as a signal from the resource author that the old
  response can be deleted from cache memory and replaced by the new
  response.

  The replacement key mechanism cannot cause deletion from cache
  memory of old responses with replacement keys that will no longer be
  used.  It is expected that the normal `least recently used'
  replacement heuristics employed by caches will eventually cause such
  old responses to be deleted.

| All 200 (OK) responses from varying resources should include
| replacement key elements.  Resource authors may not assume that
| caches will be able to cache responses not including replacement key
| elements.  If a Vary header is used to signal variance, the response
| should include a variant-id value as the replacement key element.
| The Content-Location header should only be used to supply a
| replacement key element if an Alternates header is present in the
| response.


[End of document]

Received on Monday, 8 April 1996 16:23:09 UTC