Record of HTTP discussions at the Paris WWW conference

Below I document some of the HTTP topics discussed at the Paris web
conference.  These are my own recollections: I can't guarantee
that everything is correct.  In fact, I expect some things below to be
incorrect, or at least one-sided.


1. HTTP session on developer's day

Some questions from the audience (there were more, but the others
concerned things already discussed on the mailing list):

Q: will 1.1 support a mode in which a cache does authentication for
   cached pages on behalf of the origin server?

A: not directly, but you can label your content
     Cache-control: public, must-revalidate
   which is almost as efficient: the origin server still needs to do
   the authentication itself, but the data is served from the cache.
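
A minimal sketch of this arrangement, in Python.  The Origin and Cache
classes are hypothetical stand-ins, not part of any real HTTP stack,
and the sketch assumes the cached entry is always treated as stale, so
that every request is revalidated with the origin server.

```python
# Sketch only: Origin and Cache are hypothetical illustrations.

class Origin:
    """Origin server that authenticates every request itself."""

    def __init__(self, body, etag, allowed_users):
        self.body = body
        self.etag = etag
        self.allowed_users = allowed_users
        self.bodies_sent = 0  # number of full-body transmissions

    def handle(self, user, if_none_match=None):
        if user not in self.allowed_users:
            return 401, {}, b""
        if if_none_match == self.etag:
            # Authenticated and unchanged: no body has to be sent.
            return 304, {"ETag": self.etag}, b""
        self.bodies_sent += 1
        headers = {"ETag": self.etag,
                   "Cache-Control": "public, must-revalidate"}
        return 200, headers, self.body


class Cache:
    """Shared cache that revalidates with the origin on every request."""

    def __init__(self, origin):
        self.origin = origin
        self.stored = None  # (etag, body) of the cached response, if any

    def get(self, user):
        etag = self.stored[0] if self.stored else None
        status, headers, body = self.origin.handle(user, if_none_match=etag)
        if status == 304:
            # The origin did the authentication; the data comes from here.
            return 200, self.stored[1]
        if status == 200:
            self.stored = (headers["ETag"], body)
        return status, body
```

In this sketch the origin sees every request and can refuse any user,
but it transmits the body only once for as long as the entity tag does
not change.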

Q: will transparent content negotiation [not in 1.1, but planned on
   top of 1.1] support negotiation on bandwidth requirements /
   transmission time of variants?

A: feature negotiation will allow one to do such things, but it is not
   known whether a bandwidth negotiation mechanism built on the
   feature negotiation framework will be powerful enough.
   There are no special protocol elements to measure or adapt to
   available bandwidth.

Q: does content negotiation intend to support the retrieval of old
   versions of a document?

A: No. No version retrieval mechanisms are planned for inclusion in
   1.1.

[Side remark: With help from someone in the audience, we were able to
make some good propaganda for the Host header.]

2. HTTP BOF

Several things were discussed, mostly related to caching.  Proposed
draft text will hopefully be posted by some of the participants soon.

One important change I want to mention here is that the requirement
that proxies MUST listen to Cache-Control: no-cache in `reload button'
requests from the user agent is dropped.  If this header is ignored, a
Warning response header MUST of course be generated.  This change is
to allow some (educational) sites to use a caching proxy as a
non-circumventable bandwidth-reducing device for requests on `sites of
dubious educational value'.
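
A rough sketch of the relaxed proxy behaviour, in Python.  The function
name, its parameters and the warning text are all hypothetical
placeholders; in particular the Warning value is not taken from the
draft.

```python
# Sketch only: names and the warning text are hypothetical placeholders.

PLACEHOLDER_WARNING = 'placeholder "no-cache request directive was ignored"'

def proxy_respond(request_headers, cached, fetch_from_origin, honor_no_cache):
    """Return (headers, body) for a request passing through a caching proxy.

    cached is a (headers, body) pair or None; fetch_from_origin is a
    callable returning a fresh (headers, body) pair.
    """
    wants_reload = "no-cache" in request_headers.get("Cache-Control", "")
    if cached is None or (wants_reload and honor_no_cache):
        return fetch_from_origin()
    headers, body = cached
    headers = dict(headers)  # copy, so the stored entry is not modified
    if wants_reload:
        # The reload request was ignored, so the response must say so.
        headers["Warning"] = PLACEHOLDER_WARNING
    return headers, body
```

A proxy configured with honor_no_cache=False serves the cached copy
even on a reload, but never silently: the Warning header is always
attached in that case.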


3. Meetings of the editorial group and hallway conversations

3a.

We shared the opinion that section 16 (caching) would need a careful
rewrite.  Some of us expect that a rewrite would make the section much
shorter (by removing redundant stuff).

3b.

In general, we seemed to agree that it would be very good if
simplifications could be made.  However, even among ourselves, it
proved extremely difficult to cut features without making at least one
person unhappy.  So it seems that the simplifications will have to be
made to the explanation of the 1.1 mechanisms, rather than to the
mechanisms themselves.

3c.

However, for a combination of procedural and technical reasons, most
of us _would_ like to cut the directives

                        "no-store"
                        "no-transform"
                        "proxy-revalidate"
                        "only-if-cached"
                        "min-vers" "=" HTTP-Version

which seem to have been introduced by Jeff Mogul at the last minute.

3d.

Also, it seems that among the editorial group members present at the
web conference, the people who want to remove the If-Range header
outnumber the people that want to keep it.

3e.

The editorial group spent most of its time discussing how to simplify
the caching mechanism for generic resources.  In the end, there seemed
to be agreement that the mechanism needs to support all the
functionality it supports now.  

The associated complexity of cache implementations can thus not be
reduced, but we expect that this complexity can be made more
manageable by moving it to another corner of the protocol, i.e. by
putting information in other headers and changing some concepts and
terminology.

We worked out the following simplification:

  - adopt a model in which generic resources bind to other resources
    (called variant resources below), not to resource entities

  - thus, for a single resource, at most one response can be cached.
    A resource only has one expiration time associated with it.  A
    generic resource is nothing but a `portal' through which variant
    resources are accessed.

  - when a generic resource is accessed the model of `what happens' is
    as follows:
     1) a request on a generic resource is received
     2) using the request, the server chooses one of the variant
        resources bound to the generic resource
     3) the server then internally redirects the request to the chosen
        variant resource, i.e. it generates a response message as if
        the request was done directly on the variant resource
     4) the server takes the response from step 3) and adds a
        Content-Location header with the URI of the selected variant
        resource.  This URI may be a relative URI.
     5) this response is then sent to the client.

  - variant resources would all have different URIs, but these would
    not necessarily have a meaning outside the scope of the negotiated
    resource that binds to the variant resource.  In particular,
    direct GETs on the variant resource URI need not do anything other
    than return an error message.  Direct GETs need not even return
    the same entity.  The cache key for responses from a variant
    resource is (request-URI, variant-resource-URI) where request-URI
    is the URI of the generic resource on which the request was made.
    The cache key is _not_ (variant-resource-URI).

  - There is no spoofing mechanism: i.e. a response with some
    Content-Location header may _not_ be used by 1.1 caches for
    serving later requests which are done directly on the
    Content-Location URI.

  - actually, the only thing that the Content-Location URI is useful
    for in the context of HTTP is cache memory management heuristics.
    Such heuristics will not be defined by HTTP; the spec will only
    point out that Content-Location is useful for these heuristics.

  - variant-IDs are deleted from the protocol.  Their function is
    taken over by the contents of the Content-Location field.  

  - whether a resource is generic or plain is a binary property. A
    resource may change from being generic to plain, and the other way
    around, at any point in time.  All variant resources bound to a
    generic resource are plain by definition.

  - entity tags will have the property of being what was called
    `selecting opaque validators' in the 02 version of the draft.
    This means that the entity tags of any two different entities
    returned by different variant resources bound to the same generic
    resource are guaranteed to be different.  Another way to look at
    this is to say that a variant-ID is opaquely encoded in the entity
    identifier.

  - This eliminates the need to define `resource entities'.  Also, the
    ETag and conditional request headers get a bit simpler.

  - Renaming `generic resources' to `negotiated resources' is
    considered to be a good idea by some.

  - Renaming `entity tags' to `entity identifiers' is considered to be
    a good idea by some.
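
The model in the list above can be sketched in a few lines of Python.
Everything here is a hypothetical illustration: the VARIANTS table,
the function and class names, and the rule of falling back to "en"
are assumptions, not something the editorial group specified.

```python
# Sketch only: all names and data below are hypothetical illustrations.

# Variant resources bound to the generic resource /blah/paper, keyed by
# language.  Each entry is (variant URI, entity tag, body).
VARIANTS = {
    "en": ("paper.en.html", '"3420-en"', b"English text"),
    "fr": ("paper.fr.html", '"3442-fr"', b"Texte francais"),
}

def serve_generic(accept_language):
    """Steps 2) to 4): choose a variant, build its response, label it."""
    lang = accept_language if accept_language in VARIANTS else "en"
    variant_uri, etag, body = VARIANTS[lang]
    # Response as if the request was done directly on the variant resource.
    headers = {"ETag": etag, "Content-Language": lang}
    # Step 4): add the (possibly relative) URI of the selected variant.
    headers["Content-Location"] = variant_uri
    return 200, headers, body


class VariantCache:
    """Cache keyed on (request-URI, variant-resource-URI)."""

    def __init__(self):
        self.entries = {}

    def store(self, request_uri, headers, body):
        key = (request_uri, headers["Content-Location"])
        self.entries[key] = (headers["ETag"], body)

    def lookup(self, request_uri, variant_uri):
        # No spoofing: an entry is only found under the request-URI on
        # which the original request was made.
        return self.entries.get((request_uri, variant_uri))
```

Because the request-URI is part of the key, a response stored for
/blah/paper is never returned for a request made directly on the
Content-Location URI.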


Some examples:

 draft 03 style response from a generic resource:

   HTTP/1.1 200 OK
   ETag: "3420";"en"
   Content-Language: en
   ....

 new editorial group style response from a generic resource:

   HTTP/1.1 200 OK
   ETag: "3420-en"
   Content-Location: paper.en.html
   Content-Language: en
   ....

  [Note: of course, "3420-en" will often be something more opaque like
  "3423223".]

 draft 03 style conditional request on a generic resource:

   GET /blah/paper HTTP/1.1
   If-NoneMatch: "3240";"en", "3442";"fr", "3240";"dk"
   ....

 new editorial group style conditional request on a generic resource:

   GET /blah/paper HTTP/1.1
   If-NoneMatch: "3240-en", "3442-fr", "3240-dk"
   ....
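
With the variant encoded in the entity tag, the server-side check for
such a conditional request reduces to a membership test.  A minimal
Python sketch follows; the function names are hypothetical, and the
parser assumes that entity tags contain no commas.

```python
# Sketch only: function names are hypothetical illustrations.

def parse_entity_tags(header_value):
    """Split an If-NoneMatch value into its quoted entity tags.

    Assumes no entity tag contains a comma.
    """
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]

def respond_to_conditional_get(selected_etag, if_none_match):
    """Return 304 if the client already holds the selected variant."""
    if selected_etag in parse_entity_tags(if_none_match):
        return 304  # client's copy of this variant is still valid
    return 200      # send the selected variant in full
```

For the request above, a server that selects the variant tagged
"3442-fr" would answer 304, while a variant with any tag not in the
list would be sent in full.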


Koen.

Received on Tuesday, 14 May 1996 15:09:25 UTC