Re: ETags Re: The range of the HTTP dereference function from Jeffrey Mogul on 2002-04-01 (www-tag@w3.org from April 2002)

From: Jeffrey Mogul <mogul@pa.dec.com>
Date: Mon, 01 Apr 2002 15:19:40 -0800
To: <LMM@acm.org>
Cc: "'Jeff Bone'" <jbone@jump.net>, <www-tag@w3.org>, mogul@pa.dec.com
Message-Id: <200204012319.PAA30557@wera.pa.dec.com>
    > But ETags don't really fully solve the problem, do they?  
    (reference to preprint of Jeff Mogul's www2002 paper).

Not being a member of the www-tag list, it took me a while
to dig up Jeff Bone's original question:

    But ETags don't really fully solve the problem, do they?  This
    stems from the ambiguity in the relationships between resources,
    representations, and the bits that one gets back in response to an
    HTTP request.  ETags interact badly with things like range
    requests, compression, delta encoding, etc.  Does the ETag relate
    to the bag-of-bits received in such a case, or the potentially
    reconstructed "snapshot" / representation of resource state that
    can be built from several of these things?  Mogul's recent preprint
    [1] IMO does a good job both of elaborating the problem and
    presenting a model and mechanisms to address it.

Glad to see someone likes the paper!  However, I'm concerned that you
may have missed a point that I tried to make.  "ETags" (more formally,
"entity tags") can easily be made to interact well with "things like
range requests, compression, delta encoding, etc."  The key is to
realize that they are NOT connected with "entities"; the name is an
unfortunate consequence of several terminology choices made during the
HTTP/1.1 design process.

There's no consistent way to associate an "entity tag" with the
"bag-of-bits" that HTTP/1.1 defines an entity to be.  Moreover, this is
not a very useful thing to tag; what we *do* need to be able to tag is
the "snapshot ... of resource state".

If one adopts my "instance" terminology for the resource-state
snapshot, then what HTTP/1.1 calls an "entity tag" is really an
"instance tag", even if we need to continue to use the "ETag" header to
carry these values (since we don't want to change the on-the-wire
specification).  In that case, I think that ETags do solve the problem
that they were designed to solve (even though we didn't really
understand, while writing RFC2616, how to formalize this).
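
To make the "instance tag" reading concrete, here is a minimal sketch in
Python.  It is an illustration under stated assumptions, not anything
taken from RFC 2616 or from the paper: the function names, the dict-shaped
"response", and the choice to derive the tag from a SHA-256 hash of the
instance are all hypothetical.  The only point it tries to show is that
the tag is computed from the whole resource-state snapshot, so every
partial response cut from that snapshot carries the same tag.

```python
import hashlib
from typing import Dict, List


def instance_tag(instance: bytes) -> str:
    """Hypothetical tag derived from the full instance (the snapshot)."""
    return '"' + hashlib.sha256(instance).hexdigest()[:16] + '"'


def range_response(instance: bytes, first: int, last: int) -> Dict:
    """Build a 206-style response for bytes first..last (inclusive)."""
    last = min(last, len(instance) - 1)  # clip to the end of the instance
    return {
        "status": 206,
        # The tag names the instance, not the bytes in this message body.
        "ETag": instance_tag(instance),
        "Content-Range": f"bytes {first}-{last}/{len(instance)}",
        "body": instance[first:last + 1],
    }


def reassemble(parts: List[Dict]) -> bytes:
    """Join in-order parts, refusing to mix parts of different instances."""
    tags = {part["ETag"] for part in parts}
    if len(tags) != 1:
        raise ValueError("parts come from different instances")
    return b"".join(part["body"] for part in parts)
```

Under this reading a client can safely combine two range responses
exactly when their tags match, which is what makes the tag useful for
range requests rather than in conflict with them.  The sketch assumes
the parts are passed to reassemble() in order and without gaps.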

Larry wrote:
    
    ETags don't fully solve "the" problem, but they solve
    "a" problem. That they can't reliably be used with
    range requests, compression, delta encoding, etc. is
    a controllable problem: don't do it.
    
    "Doctor, Doctor, it hurts when I use etags with
     range requests, compression, and delta encoding."
    
No, this isn't true!  What hurts is trying to think about
"entity tags" as having something to do with "entities".  When
you give up trying to do that, it's pretty painless.

    Jeff makes a useful analysis and lays out a direction
    for a solution, but I'm not sure there's a compelling
    case for it being "worth it" to add another layer of
    tags. It might be simpler to just disallow late-stage
    ETags and use the ETag header for what Jeff wants to
    call Instance Tags.
    
Again, I think this is exactly what I tried to say.  (I've
been told that this part of the paper isn't as clear as it
might have been; sorry!)  In particular, I can't see any
utility for a system of tagging the actual "entities" (as
defined in HTTP/1.1) because they are too ephemeral to be
worth referring to more than once.  (With one possible
exception: if you are trying to optimize performance by
suppressing duplicate message transmission.  The WWW 2002
paper that I co-wrote with Terence Kelly discusses this
issue, but here I think the appropriate identification mechanism
is an implicit one based on message digests, not an explicit
one based on tags.  However, we haven't yet tried to figure
out how to make this work in the presence of partial-result
messages, and that could force me to re-examine the issue.)
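
As a rough illustration of the implicit, digest-based identification
mentioned above, the following Python sketch shows a receiver that skips
a transfer when it already holds a body with the advertised digest.  It
is not the protocol from the WWW 2002 paper; the class and function
names, the use of SHA-256, and the idea that the sender advertises the
digest before the body are all assumptions made for the example.  It
also ignores partial-result messages entirely.

```python
import hashlib


def digest_of(body: bytes) -> str:
    """Digest used as the implicit identifier of a message body."""
    return hashlib.sha256(body).hexdigest()


class DigestCache:
    """Receiver-side store of message bodies keyed by their digest."""

    def __init__(self):
        self._bodies = {}

    def store(self, body: bytes) -> str:
        digest = digest_of(body)
        self._bodies[digest] = body
        return digest

    def lookup(self, digest: str):
        return self._bodies.get(digest)


def fetch(cache: DigestCache, advertised_digest: str, transfer_body):
    """Return (body, transferred); transfer only if the digest is unknown.

    transfer_body is a hypothetical callable that performs the real
    transmission and returns the body bytes.
    """
    cached = cache.lookup(advertised_digest)
    if cached is not None:
        return cached, False  # duplicate transmission suppressed
    body = transfer_body()
    if digest_of(body) != advertised_digest:
        raise ValueError("body does not match advertised digest")
    cache.store(body)
    return body, True
```

No explicit tag is assigned by anyone here: two messages are treated as
the same simply because their bytes hash to the same value.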

    I think Jeff's paper (and Bala's book, for that matter)
    call attention to the fact that HTTP isn't "done".
    While there's a lot of attention on "XML protocol"
    and dealing with the problems of using HTTP for things
    other than the classical web, since HTTP-NG, there's
    been no current chartered activity working through
    the HTTP-for-web issues. (This is out of scope for
    WEBI and OPES, I think.)

Larry and I have occasionally discussed the need for some sort
of standing committee of the IETF to bring some coherency
to whatever HTTP-related efforts continue to crop up (or, perhaps,
to create the efforts that seem to be in need of a prod).
I'm sure he would do a fine job of managing such a group!

-Jeff
Received on Monday, 1 April 2002 18:19:53 UTC