Re: Draft TAG finding available: Client handling of MIME headers from noah_mendelsohn@us.ibm.com on 2003-05-07 (www-tag@w3.org from May 2003)

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 7 May 2003 17:11:21 -0400
To: Tim Bray <tbray@textuality.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, www-tag@w3.org
Message-ID: <OF17CC2538.7353AAB8-ON85256D1F.00708818@lotus.com>
> > Issue #2: if a client *does* send the content type with PUT, is a server
> > allowed to override that (that's what currently happens with
> > Apache/moddav).?
> 
> Really?  Blecch.  This seems completely 100% architecturally wrong and 
> should be fixed.

I would welcome a little guidance on this one from those who know web 
architecture a bit better than I do.  Here's how I understand the 
architecture to work.  If I'm right, then the analysis is a bit trickier 
than I've seen so far in this discussion.  I start by setting out my 
admittedly imperfect understanding of the pertinent web architecture, then 
suggest the answer to the question above.

Assumptions about Web Architecture
==================================

The instruction in RFC 2616 that "The PUT method requests that the 
enclosed entity be stored under the supplied Request-URI" has always 
seemed to me a bit strange, given the base architectural assumption that 
what we exchange on the Web are representations of resources.   It seems 
incoherent to say "store the entity", except maybe as a hint.  I would 
have thought it more appropriate to say "put the resource at the supplied 
Request-URI into a state corresponding to the representation provided with 
the PUT."  If that's true, we don't know and can't know what's actually 
stored at the server, except in the particular case that the server or 
resource documentation chooses to tell us.  All we know is that the 
supplied representation has been used to set some suitable state of the 
resource.  If I do a PUT to a clock resource with some text/plain stream 
that indicates a time, we don't know in general whether the clock stores 
the text, or whether it merely sets a clock that may or may not continue 
to run.  Do I have that right? 

Furthermore, whether or not the server tends to store octet streams (as 
opposed to, say, clocks), I thought there was no general requirement to 
respond to a GET with the media type last used on a PUT.  A clock might 
well respond with an image/gif of a clock face even if it was originally 
set with a PUT of text/plain.  Am I still getting this right?
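
To make the clock case concrete, here is a minimal sketch of a resource 
whose state is a time of day rather than a stored entity.  The class name 
ClockResource, the "HH:MM" body format, and the use of a small SVG string 
in place of an image/gif clock face are all assumptions made purely for 
illustration; nothing here comes from RFC 2616 or from any real server.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ClockResource:
    """Hypothetical resource whose state is a time of day, not stored bytes."""

    minutes_since_midnight: int = 0

    def put(self, content_type: str, body: bytes) -> None:
        # Only one inbound representation is understood in this sketch.
        if content_type != "text/plain":
            raise ValueError(f"unsupported media type: {content_type}")
        hours, minutes = body.decode("ascii").strip().split(":")
        self.minutes_since_midnight = (int(hours) * 60 + int(minutes)) % (24 * 60)

    def get(self) -> tuple[str, bytes]:
        # The response is derived from the state; the PUT body was never kept.
        h, m = divmod(self.minutes_since_midnight, 60)
        svg = (
            '<svg xmlns="http://www.w3.org/2000/svg">'
            f"<text>{h:02d}:{m:02d}</text></svg>"
        )
        return "image/svg+xml", svg.encode("ascii")


clock = ClockResource()
clock.put("text/plain", b"17:11")
media_type, body = clock.get()
```

In this sketch the GET returns a media type and octets that differ from 
what was PUT, yet both are representations of the same state.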

If so, then we have to admit that the case in question here is a 
particular special case.  It's a very common special case, but in some 
ways tricky.  Why?  Because this Apache-like system is a case where the 
server is in some sense dumb.  It doesn't really understand what it's 
storing, at least not reliably and in all cases.  If I do a PUT of an 
octet stream with some mysterious media type it may indeed hold the bits 
and respond to a subsequent GET with the same bits, but even as the 
seeming owner of the resource, it doesn't really in any sense understand 
the import of the state it's holding, except insofar as the state is the 
pair {content-type,octet stream}.
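
By contrast, the "dumb" server described above might be sketched as 
follows.  OpaqueStore and the media type application/x-mystery are 
hypothetical names chosen for the example; the point is only that the 
state is exactly the pair handed over on the PUT.

```python
from __future__ import annotations


class OpaqueStore:
    """Hypothetical 'dumb' server: state is exactly the pair it was given."""

    def __init__(self) -> None:
        self._state: dict[str, tuple[str, bytes]] = {}

    def put(self, uri: str, content_type: str, body: bytes) -> None:
        # No parsing or validation: the server does not understand the bits.
        self._state[uri] = (content_type, body)

    def get(self, uri: str) -> tuple[str, bytes]:
        # The only representation it can vouch for is the stored pair itself.
        return self._state[uri]


store = OpaqueStore()
store.put("/doc", "application/x-mystery", b"\x00\x01\x02")
stored_pair = store.get("/doc")
```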

A Proposed Answer to the Question in this Thread
================================================

We seem to be asking the question:  given a URI owner (embodied as the 
server) that chooses to model the internal state of its resource as such a 
{content type, octet stream} pair, and that chooses to establish that 
state in the obvious manner following a PUT with explicit content type, 
what are the legal responses to a GET?  It seems to me the answer is: 
well, you're responsible for responding with an octet stream and content 
type that "represents" the state of your resource.  Surely the obvious 
response is with the pair you've been storing.  Are you allowed to 
"override" the content type as suggested above?  I would think >only if 
you can make the case that the octet stream, interpreted per the 
overriding type, is an accurate representation of your resource's state<. 
It seems to me that it's usually not, and that overriding is therefore 
usually inappropriate. 

If I've understood all this correctly, then the precise conclusion is: 
blind overriding of the media type is inappropriate, but not specifically 
because a PUT was used to establish the state.  The overriding is 
inappropriate only in the case where the resource warrants that its true 
state is indeed to be thought of as a {content type, octet stream} pair 
>AND< that the server does not understand the media type well enough to do 
a conversion.  In that specific (but common) case, the overriding type is 
likely to lie.   If, on the other hand, I warrant that the state of my 
resource is a time of day, then it seems perfectly reasonable to accept 
something like text/plain on a PUT, and respond to a GET with an image/gif 
of a clock face.  Furthermore, it seems to me that relabeling text/html as 
text/plain need not in all cases be an error, since the server may choose 
to claim "My resource state for text files is always modeled as plain -- 
like vi or Notepad, if I'm given text/html, I, the owner of the resource, 
choose to consider it as merely plain text.  For that reason, a response 
of text/plain is in fact a correct representation of my state."
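
The distinction drawn in this section could be sketched as a small 
decision rule.  The table WARRANTED_RELABELINGS and the function 
response_type are hypothetical; the sketch assumes a server holding opaque 
octet streams that overrides the stored type only where the resource 
owner has explicitly warranted the relabeling, and otherwise serves the 
type it was given.

```python
# Hypothetical table of relabelings the resource owner is prepared to warrant.
WARRANTED_RELABELINGS = {("text/html", "text/plain")}


def response_type(stored_type: str, proposed_type: str) -> str:
    """Return the media type to serve for an opaque stored octet stream."""
    if proposed_type == stored_type:
        return stored_type
    if (stored_type, proposed_type) in WARRANTED_RELABELINGS:
        # The owner claims the same bytes, read this way, still represent the state.
        return proposed_type
    # Blind override: the server cannot show the bytes mean the same thing.
    return stored_type


relabeled = response_type("text/html", "text/plain")
refused = response_type("application/x-mystery", "text/plain")
```

Here the text/html to text/plain relabeling is allowed only because the 
owner has declared it; the mysterious type is served back unchanged.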

From the outside, all of these cases are hard to distinguish.  As a 
client, unless I have specific knowledge of how a particular resource 
behaves, I must be prepared to do a PUT to some URI, and later find that a 
GET responds with a different media type or content (even if I know from 
external means that no subsequent PUTs or POSTs have been done).

Again, I'm not a web architecture expert, and would be very grateful if 
someone who is could clarify where I've got this right and where not.  If 
I do have it right, then perhaps this note sheds some light on the 
question being discussed.  Thank you!

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Wednesday, 7 May 2003 17:20:30 UTC