Re: HTTP Protocol Extensions from Arthur van Hoff on 1997-09-15 (www-push@w3.org from July to September 1997)

From: Arthur van Hoff <avh@marimba.com>
Date: Mon, 15 Sep 1997 09:54:37 -0700
To: Larry Masinter <masinter@parc.xerox.com>
CC: Push Workshop <www-push@w3.org>, DRP Mailing List <drp@marimba.com>
Message-ID: <341D684D.663FA473@marimba.com>

Hi Larry,

> It seems like what you want is some kind of globally unique
> ETag, e.g., such that a cache could presume that if it had
> data with same globally unique ETag anywhere in its cache,
> it could use that data; on the other hand, you also want
> those etags to be 'global'
> 
> Since there are both 'strong' and 'weak' etags, maybe we could
> add global etags too. For the most part, you can just use
> them like (strong) etags, but the origin server guarantees
> global uniqueness.

We could suggest the following change to the entity tag
definition in the HTTP/1.1 specification:

   entity-tag = [ weak | global ] opaque-tag
   weak       = "W/"
   global     = "G/"
   opaque-tag = quoted-string

This approach has two problems:

1) Because the actual entity tag is opaque, there is no
   structure to it. As a result, there is no scheme that
   can be used to avoid clashes. What if one server uses a
   hexadecimal number which represents a checksum, and another
   server uses a hexadecimal number which represents a version?

2) Because of the lack of structure, it will be necessary
   to specify a verifiable checksum using a separate header.
   I don't like this because it means duplication, and it
   means that each file is identified by two identifiers:
   an opaque one and a checksum. The two will be mostly
   interchangeable.

I would prefer an approach where a global entity tag is
not opaque. For example:

   entity-tag = [ weak ] opaque-tag | global global-tag
   weak       = "W/"
   opaque-tag = quoted-string
   global     = "G/"
   global-tag = <"> URI *( "," URI ) <">
   md5-URN    = "urn:md5:" base64-number
   sha-URN    = "urn:sha:" base64-number
 
This would assign some structure to the global tag name
space, which means that collisions can be avoided,
and it would allow a client to parse the URIs and verify
checksums when necessary.
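
As a rough illustration of the parsing and verification step, here
is a minimal Python sketch. It is only one possible reading of the
grammar above, and it makes several assumptions that the proposal
does not settle: "urn:sha:" is taken to mean SHA-1, URIs are assumed
to contain no commas, and quoted-string escapes are not handled. The
function names (parse_entity_tag, checksum_urn, verify_global_tag)
are made up for this example.

```python
import base64
import hashlib

# Assumption: "urn:sha:" means SHA-1; the proposal does not say which SHA.
_DIGESTS = {"urn:md5:": hashlib.md5, "urn:sha:": hashlib.sha1}


def parse_entity_tag(tag):
    """Split an entity tag into (kind, value).

    kind is "strong", "weak" or "global". For "global" the value is
    the list of URIs inside the quotes; otherwise it is the opaque string.
    """
    if tag.startswith("G/"):
        kind, quoted = "global", tag[2:]
    elif tag.startswith("W/"):
        kind, quoted = "weak", tag[2:]
    else:
        kind, quoted = "strong", tag
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("entity tag value must be quoted: %r" % tag)
    body = quoted[1:-1]
    if kind == "global":
        # Assumption: URIs contain no commas, so a plain split is enough.
        return kind, [uri.strip() for uri in body.split(",")]
    return kind, body


def checksum_urn(prefix, data):
    """Build a checksum URN such as "urn:md5:<base64 digest>" for data."""
    digest = _DIGESTS[prefix](data).digest()
    return prefix + base64.b64encode(digest).decode("ascii")


def verify_global_tag(uris, data):
    """Check data against every checksum URN in uris.

    Returns False on any mismatch, True if at least one checksum was
    checked and all matched, and None if no URI was a known checksum URN.
    """
    verified = None
    for uri in uris:
        for prefix in _DIGESTS:
            if uri.startswith(prefix):
                if uri != checksum_urn(prefix, data):
                    return False
                verified = True
    return verified
```

A client holding a body and a tag of the form G/"urn:md5:..." could
call parse_entity_tag on the header value and pass the resulting URI
list to verify_global_tag together with the body bytes.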

> "Differential-ID" is like a range retrieval and has many of
> the same caveats. The header name should reflect that it is
> a modifier to the request rather than (just) an identifier.

Maybe we can define a range-specifier as follows:

   range-specifier       = byte-range-specifier | diff-range-specifier
   diff-range-specifier  = "diff=" global-tag

This would allow the client to use the Range header to
request a diff. The Content-Range header would be
used to identify the returned diff in the reply.
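
To make the header syntax concrete, here is a small Python sketch
that builds and classifies Range header values under the grammar
above. It is illustrative only: the function names are invented,
the checksum value is a placeholder, and the sketch again assumes
that URIs contain no commas.

```python
def diff_range_header(uris):
    """Build a Range header value of the form diff="URI,URI,..."."""
    return 'diff="' + ",".join(uris) + '"'


def parse_range_specifier(value):
    """Classify a Range header value.

    Returns ("bytes", spec) for a byte-range-specifier and
    ("diff", [uris]) for the proposed diff-range-specifier.
    """
    # Split on the first "=" only; base64 padding may add more "=" later.
    unit, sep, rest = value.partition("=")
    if not sep:
        raise ValueError("missing '=' in range specifier: %r" % value)
    if unit == "bytes":
        return "bytes", rest
    if unit == "diff":
        if len(rest) < 2 or rest[0] != '"' or rest[-1] != '"':
            raise ValueError("global-tag must be quoted: %r" % value)
        return "diff", rest[1:-1].split(",")
    raise ValueError("unknown range unit: %r" % unit)


# Placeholder checksum URN, used only to show the shape of the header.
header = diff_range_header(["urn:md5:XUFAKrxLKna5cZ2REBfFkg=="])
```

A client would send the resulting value in the Range header, and a
server that understands the diff unit could use parse_range_specifier
to tell it apart from an ordinary byte range.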

This would work fine, except that it may be incompatible
with existing HTTP/1.1 implementations. Also, if the
specifier is ignored, it will be less efficient than the
method that we originally proposed, which allowed diffs to
be cached in any HTTP/1.1 server.

By the way, both of these solutions would require changes
to the HTTP/1.1 specification. I'm not sure if that is a
wise thing to do since HTTP/1.1 seems almost baked.

Have fun,

	Arthur van Hoff

Received on Monday, 15 September 1997 12:55:35 UTC