Re: Variant IDs from Koen Holtman on 1996-02-10 (http-caching-historical@w3.org from February 1996)

From: Koen Holtman <koen@win.tue.nl>
Date: Sat, 10 Feb 1996 22:37:57 +0100 (MET)
To: http-caching@pa.dec.com
Cc: koen@win.tue.nl (Koen Holtman)
Message-Id: <199602102137.WAA08191@wsooti04.win.tue.nl>
I've been away for a few days, so I'll reply to several remarks in the
Variant IDs thread at once in this message.  Here are my current views
on a number of Variant ID issues.

1) About the `is unwillingness to list all variants in a URI header a
frequently occurring case' issue: I'm convinced by the

   21 languages * (1 zipped + 1 unzipped) * 3 MIME types * 3 charsets

example by Paul Leach that there are indeed cases where you do not
want to list every available version in the URI header.
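
As a quick check of the arithmetic in that example, here is a minimal
sketch that enumerates the combinations.  The dimension values are
placeholders of my own choosing, not names from Paul Leach's example;
only the counts (21, 2, 3, 3) come from the text above.

```python
from itertools import product

# Placeholder values: only the number of entries per dimension matters.
languages = [f"lang{i}" for i in range(21)]
encodings = ["zipped", "unzipped"]
mime_types = ["type-a", "type-b", "type-c"]
charsets = ["charset-a", "charset-b", "charset-c"]

# Every combination of the four dimensions is one variant that a
# complete URI header would have to list.
variants = list(product(languages, encodings, mime_types, charsets))
variant_count = len(variants)
print(variant_count)
```

So a URI header that lists everything would need one entry per
combination, which is where the size concern comes from.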

I'm not yet sure if this case will occur often enough to require
optimizations in the protocol beyond the optimizations (302 (See
other) redirection, URI headers that vary) that would already be
possible under 1.1.


2) About Variant-ID/Variant-set duplicating the
Variant-If-modified-since (and variant-if-validator-valid) mechanism
in (URI header based) content negotiation: I think now that I
previously overestimated the amount of duplication that would happen.

My content negotiation definitions are big, but the parts devoted to
the Variant-IMS mechanism are not.  The main complexity generators in
URI based content negotiation are elsewhere.

I suspect it would be possible to define the semantics of Variant-ID
and Variant-Set in a page or half a page of text.


3) Putting 1) and 2) together,
  - I no longer see a good reason to oppose putting Variant-ID and
    Variant-set in the 1.1 spec as optional optimizations.
  - But I am also not yet convinced that they should go in.


4) A side issue: I'd rather not solve charset problems by building
charset converters into servers.  With server side conversion, a lot
of bandwidth and proxy cache memory is used to transmit and cache all
the different versions.

Charset conversion intelligence should be at the user agent side as
much as possible.


5) A note for Jeffrey Mogul and Paul Leach who are (I believe) writing
up 1.1 spec language for Variant-ID and Variant-Set:

The addition of Variant-IDs does add some subtle issues that will be
tricky to define: suppose that a response varies only on user-agent.
Consider the following scenario:

Step 1:

Cache sends request:
  GET blah HTTP/1.1
  User-Agent: Blebber1.1
  ...

Origin server sends response:
  HTTP/1.1 200 OK
  Variant-ID: 5
  Cache-control: max-age=1000
  Length: 3000
  Validator: pppp
  ....
  [3000 byte body]

Step 2:

Cache sends request:
  GET blah HTTP/1.1
  User-Agent: Blebber2.2
  Variant-Set: 5;pppp
  Vary: user-agent
  ...

Origin server sends response:
  HTTP/1.1 3xx Variant response not included
  Variant-ID: 5
  Vary: user-agent
  ...

Now, the cache knows that the origin server maps User-Agent:
Blebber2.2 to variant 5.  For how long may the cache then serve
variant 5 for requests with User-Agent: Blebber2.2 without validation?
Is this determined by the Cache-control: max-age=1000 header in the
first response?  Or is it determined by any caching related header in
the step 2 response?

Both approaches would work, but defining semantics for all possible
cases will be tricky.
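
To make the two options concrete, here is a minimal sketch of both
rules.  All class and function names are hypothetical, and so are the
timing values (step 2 at t=600, a max-age of 200 on the step 2
response); the scenario above only fixes max-age=1000 on the first
response.  This is an illustration of the question, not proposed spec
semantics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredVariant:
    variant_id: str
    validator: str
    max_age: int       # seconds, from the response that carried the body
    stored_at: float   # time the body was received


@dataclass
class AgentMapping:
    user_agent: str
    variant_id: str
    learned_at: float        # time of the bodiless step 2 response
    max_age: Optional[int]   # caching header on the step 2 response, if any


def fresh_until_first_response(variant: StoredVariant,
                               mapping: AgentMapping) -> float:
    """Option A: the first (body-carrying) response decides."""
    return variant.stored_at + variant.max_age


def fresh_until_step2_response(variant: StoredVariant,
                               mapping: AgentMapping) -> float:
    """Option B: the step 2 response decides.

    Assumption: with no caching header in step 2, the mapping gets no
    freshness lifetime and must be validated on every use.
    """
    if mapping.max_age is None:
        return mapping.learned_at
    return mapping.learned_at + mapping.max_age


# Hypothetical timeline: body stored at t=0, step 2 happens at t=600.
variant5 = StoredVariant("5", "pppp", max_age=1000, stored_at=0.0)
with_header = AgentMapping("Blebber2.2", "5", learned_at=600.0, max_age=200)
no_header = AgentMapping("Blebber2.2", "5", learned_at=600.0, max_age=None)

print(fresh_until_first_response(variant5, with_header))
print(fresh_until_step2_response(variant5, with_header))
print(fresh_until_step2_response(variant5, no_header))
```

The two rules give different expiry times for the same exchange, and
option B needs an extra decision for the case where the step 2
response carries no caching header at all.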


6) About 925 byte URI headers vs. 350 byte Variant-Set headers:

- With some work (allowing "l" to stand for "language", "v" for
"variant", etc), the URI header can be made a lot shorter.

- Also, as Roy Fielding pointed out, the URI header says something
that is useful; it is not just there to optimize caching.

- I don't expect 925 byte URI headers to be used that often.  Most
resources will not be negotiable, and most negotiable resources will
only have a few variants.

- I believe that a 350 byte _request_ header will usually slow things
down more than a 925 byte _response_ header.  I understand that if a
request is larger than one packet, sending the request may take
several RTTs due to TCP/IP slow start/flow control mechanisms.  Adding
925 bytes to a response has less impact, as responses are too large to
fit into one packet anyway.
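
The packet argument can be tried out with a toy model.  This is a
rough sketch only: it assumes a 536 byte segment size (the TCP default
when no MSS option is exchanged), an initial window of one segment
that doubles every round trip, and no losses or delayed ACKs.  The
function names and the 300 byte base request size are my own
assumptions; only the 350, 925 and 3000 byte figures come from this
thread.

```python
import math


def segments(nbytes: int, mss: int = 536) -> int:
    """Number of TCP segments needed for nbytes (assumed MSS)."""
    return math.ceil(nbytes / mss)


def round_trips(nsegments: int, initial_window: int = 1) -> int:
    """Round trips to send nsegments under idealized slow start,
    where the window doubles after every round trip."""
    sent, window, rtts = 0, initial_window, 0
    while sent < nsegments:
        sent += window
        window *= 2
        rtts += 1
    return rtts


base_request = 300  # hypothetical size of a request without Variant-Set

# Request with and without a 350 byte Variant-Set header.
print(round_trips(segments(base_request)),
      round_trips(segments(base_request + 350)))

# 3000 byte response with and without a 925 byte URI header.
print(round_trips(segments(3000)),
      round_trips(segments(3000 + 925)))
```

The outcome depends heavily on where the assumed sizes fall relative
to segment boundaries, so this gives a way to try numbers rather than
a result.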

Koen.
Received on Saturday, 10 February 1996 21:52:07 UTC