Re: Issue 39: proposed example for varying the etag based on conneg from Nathan on 2010-04-02 (ietf-http-wg@w3.org from April to June 2010)

From: Nathan <nathan@webr3.org>
Date: Fri, 02 Apr 2010 23:18:07 +0100
To: Jamie Lokier <jamie@shareable.org>
CC: Yves Lafon <ylafon@w3.org>, Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4BB66D1F.7040202@webr3.org>

Jamie Lokier wrote:
> Yves Lafon wrote:
>> On Thu, 1 Apr 2010, Jamie Lokier wrote:
>>>> I don't see how... Would you consider the gzipped version of a
>>>> text/plain resource to be "semantically equivalent"?
>>> Yes, without a doubt they are semantically equivalent.  The compressed
>>> and uncompressed versions say exactly the same thing in different
>>> ways.  Their relationship is even purely mechanical and reversible.
>>> Isn't that about as semantically equivalent as you can get?
>> No they are not. At least not directly. The fact that it is compressed is 
>> a property of the entity, in that way it is not the same thing as the 
>> uncompressed text.
>>
>> Same thing for an image of CT image/gif and the exact same image with a
>> CT of image/png, they are not equivalent unless the application does some
>> extra processing to figure out they are equal.
> 
> The question was about *weak* etags.  Two things do not have to be
> identical for the purpose of weak etags - strong etags are used for that.
> 
> "Semantically equivalent" for the purpose of *weak* etags is not well
> defined in RFC 2616, and seems to be left to interpretation.
> 
> But it clearly does not mean "the same thing" or "they are equal", nor
> does it require that equivalence could be determined by application
> processing.
> 
> I think it comes down to "if one document was used in place of the
> other, for example shown in a browser, would it have essentially the
> same meaning, differing only in unimportant details?"
> 
> If two documents that are byte-for-byte identical except for compression
> aren't sufficiently equivalent to fit your idea of "semantically
> equivalent", can you give an example of two things which *are*
> semantically equivalent but not the same (i.e. suitable for weak etag
> but not strong)?
> 
> I cannot imagine what would fit the criteria, if compression of a type
> which is transparently decompressed in the context of its use does not.
> 
>> If you want the compression to be transparent, it has to be done at the 
>> connection level using TE/Transfer-Encoding, and in that case the ETag will
>> be the same (which is the second example in Julian's proposal)
> 
> The question was about weak Etag, not strong.
> 
> The question of semantic equivalence is rather contextual - it depends
> on the meaning that will be extracted from the document.  In the
> context of web browsers, compressed text/* is transparently
> decompressed and so I believe it to be semantically equivalent in that
> context.
> 
> I don't know what semantic equivalence means at all without context.

Last time I asked the following (about 2 weeks back):

"What about caches giving back the wrong media-type, would a change in
media-type be a semantic change of such magnitude that a weak entity tag
couldn't be used (or would be misleading)?"

Since then, a few simple mental use-cases for conneg with weak entity
tags have crossed my mind.

Same document / different languages; would a cached German document
suffice when the user wanted an English one (and the server had one)?
- imo, no

Same "content", different media-type, text/plain vs text/html; would a
cached html do when the user wanted text - debatable, but swap that for
RDF & SVG and I think the answer is no.

I continued this on with all the Accept headers and came to the same
conclusions; however Content-Encoding is the grey area for me I'm afraid.

Personally, not that it means anything, I'd be happy to use weak entity
tags only for when a minor change occurs to a representation; such as
stripping whitespace, fixing a (minor) typo, or swapping new line
terminators. For everything else I think I'd rather be safe than sorry
and stay clear.
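
To make the "minor change" case concrete, here is a rough sketch of a weak
tag that survives whitespace and line-terminator changes while a strong tag
does not. The function names and the normalisation rule are my own
assumptions for illustration; RFC 2616 does not prescribe how a server
derives either kind of tag.

```python
import hashlib


def strong_etag(body: bytes) -> str:
    """Strong validator: changes whenever any byte of the body changes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def weak_etag(body: bytes) -> str:
    """Weak validator (assumed rule): ignore line terminators and
    collapse runs of whitespace before hashing."""
    normalized = " ".join(body.decode("utf-8").split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    return 'W/"' + digest + '"'


# Same text, different newline terminators and spacing.
unix_body = b"Hello,  world\nsecond line\n"
dos_body = b"Hello, world\r\nsecond line\r\n"

assert strong_etag(unix_body) != strong_etag(dos_body)
assert weak_etag(unix_body) == weak_etag(dos_body)
```

A typo fix would still change this particular weak tag; deciding which
edits count as "minor" is left to whoever generates the tags.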

A secondary point is that any sharing of weak entity tags across conneg
resources would indicate to me that each resource would need its own
strong etag, whilst each conneg point would need a weak entity tag -
that feels wrong, and the logistics of managing this are entirely
impractical - especially when you consider that each representation more
than likely has a different update schedule. Effectively you'd always just
have a single weak entity tag for a conneg point that never changed, as in:

You have four representations which you negotiate across, so these are
all semantically equivalent. Update only one of them and the notion of
conneg indicates they are all still equivalent, so the weak entity tag is
still valid, yet all caching would be broken. Further, after say 100
incremental updates to all resources, the weak etag would still be the
same - completely breaking caching.
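
A toy model of that scenario, as a sketch only: the media types, bodies and
the single shared tag are made up, and the comparison follows the weak
comparison function as I read it in RFC 2616 (tags match if identical,
ignoring any W/ prefix).

```python
# Hypothetical conneg point with one weak tag shared by every representation.
SHARED_WEAK_ETAG = 'W/"doc-1"'

representations = {
    "text/html": "<p>v1</p>",
    "text/plain": "v1",
    "application/rdf+xml": "<rdf:RDF>v1</rdf:RDF>",
    "image/svg+xml": "<svg>v1</svg>",
}


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: equal opaque tags match, with or without W/."""
    def strip(tag: str) -> str:
        return tag[2:] if tag.startswith("W/") else tag
    return strip(a) == strip(b)


def revalidate(if_none_match: str) -> int:
    """Status a server would send for a conditional GET on this point."""
    return 304 if weak_match(if_none_match, SHARED_WEAK_ETAG) else 200


# A cache stores the HTML representation together with the shared tag.
cached_body = representations["text/html"]
cached_etag = SHARED_WEAK_ETAG

# The HTML representation is updated; the shared tag does not move.
representations["text/html"] = "<p>v2, substantially rewritten</p>"

status = revalidate(cached_etag)
assert status == 304                                  # "Not Modified"
assert cached_body != representations["text/html"]    # but the cache is stale
```

The server keeps answering 304 however many times the representations
change, which is the broken caching described above.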

Thus I can answer my own aforementioned question and say that a change
in media-type would be a semantic change that meant a weak entity tag
couldn't be used.

As for Content-Encoding and weak entity tags, the only thing I can think
of that might influence it is range requests; if the byte offsets differ
between different encodings then ranges taken from one encoding wouldn't
be valid against another, so a strong tag would be needed.

?
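
For what it's worth, the byte-offset concern is easy to see in a quick
sketch. The body below is invented, and byte_range is a hypothetical helper
that mimics what "Range: bytes=first-last" selects; it is not an HTTP
implementation.

```python
import gzip

# Made-up text/plain body and its gzip Content-Encoding.
body = b"".join(b"line %d of a text/plain resource\n" % i for i in range(200))
encoded = gzip.compress(body)


def byte_range(data: bytes, first: int, last: int) -> bytes:
    """Bytes first..last inclusive, as 'Range: bytes=first-last' would select."""
    return data[first:last + 1]


plain_part = byte_range(body, 10, 49)
gzip_part = byte_range(encoded, 10, 49)

# The same offsets select unrelated bytes in the two encodings, so a range
# fetched from one cannot be spliced into a cached copy of the other.
assert len(encoded) < len(body)
assert plain_part != gzip_part
```

So if both encodings carried the same validator, a cache combining partial
responses could end up with a corrupt entity.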

Regards,

Nathan

ps: maybe I asked an invalid question!
Received on Friday, 2 April 2010 22:18:45 UTC