Re: How do you POST to a "document"? from Sandro Hawke on 2003-01-24 (www-tag@w3.org from January 2003)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 24 Jan 2003 01:36:20 -0500
To: "Roy T. Fielding" <fielding@apache.org>
cc: Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
Message-Id: <200301240636.h0O6aKI26996@wadimousa.hawke.org>
> >> I still don't understand how that system explains a POST
> >> of a message to an HTTP-to-SMS gateway that is identified by
> >> an http URI.  I'd like to understand that.
> 
> [...]
> 
> > So I think of the web as mediated shared memory.  Each web address
> > (URI) points to a storage location.  GET means to read the contents of
> > a location, PUT means to store replacement contents in a location.
> > Sometimes I think of the locations as individual whiteboards, bulletin
> > boards, shelves, slots, or parts of a landscape where a signboard
> > could be placed.
> 
> That doesn't solve the issue that TimBL mentioned, because it simply
> replaces resource (a concept independent of any implementation)
> with a conceptual definition of one particular implementation.  We
> still have the issue that the identifier is being used to identify
> both the shared memory and what lies behind that shared memory.

Um, no.  With the MSM model we can say "the shared memory location
with address foo" and perhaps go on to talk about other things about
that location (like that it is devoted to a discussion of Pianos, or
whatever).   If the location is being used to store the state of some
object, then perhaps we can call that object the "subject" of that
location.   

> Worse, we've broken the consistency of the REST model for non-http
> URIs when introduced into the same interface -- it isn't reasonable
> to claim that those URIs identify shared memory, but it is quite
> reasonable for us to produce representations of them on demand.

I apply MSM to ftp and other stored-information protocols/schemes, but
I haven't figured out a good way to apply it to mailto and telnet
URIs.  But mostly I'll defer to TimBL's e-mail on that subject in this
thread.

> Getting back to the problem that TimBL described, he would
> like to define the URI as identifying the virtual Web page -- the
> sameness that is perceived from all responses to GET over time.
> What I can't seem to get across is that the resource in REST is
> the sameness that is perceived from all responses to all methods
> over time.

I think those samenesses are different.  If a location is devoted to
storing the state of some object, then users will perceive a sameness
because the information stored at the location is maintained and
presented consistently AND they will perceive a sameness because some
qualities of the abstract resource thing never change very much.
These are almost the same samenesses [ :-) ] but sometimes the
difference matters.

In particular, the difference matters when two or more locations are
devoted to maintaining state information about the same object, but
they do so differently.  Perhaps one is more trusted than another,
more timely, more complete, or throws in fewer pop-up ads.  The user
experience is different on the two sites, yet as far as anyone can
tell, the sites are about exactly the same thing.  Let's imagine the
thing is the Sun, and the locations are both mine.  I declare
http://www.hawke.org/sun-a and http://www.hawke.org/sun-b to both
identify the Sun, and my server gives nice data at both addresses.
But on sun-b, sometimes I give the wrong data, because of a bug in my
software.  People learn this, and learn to stick with sun-a instead.
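
A minimal Python sketch of that scenario in MSM terms.  The names here
(Location, SharedMemory, remarks) are hypothetical, the stored contents
are placeholders, and nothing below is a real HTTP server; it only shows
two locations sharing one subject while a remark attaches to just one of
the locations.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Location:
    """One shared-memory location: an address plus whatever is stored there."""
    address: str
    contents: str = ""
    subject: Optional[str] = None  # what the location is devoted to, if anything
    remarks: Dict[str, str] = field(default_factory=dict)  # about the location itself


class SharedMemory:
    """Hypothetical stand-in for the mediated store (an assumption, not an API)."""

    def __init__(self) -> None:
        self._locations: Dict[str, Location] = {}

    def put(self, address: str, contents: str, subject: Optional[str] = None) -> None:
        """PUT: store replacement contents at the location."""
        loc = self._locations.setdefault(address, Location(address))
        loc.contents = contents
        if subject is not None:
            loc.subject = subject

    def get(self, address: str) -> str:
        """GET: read the current contents of the location."""
        return self._locations[address].contents

    def location(self, address: str) -> Location:
        """Return the location record, so we can talk about the location itself."""
        return self._locations[address]


web = SharedMemory()
web.put("http://www.hawke.org/sun-a", "placeholder data about the Sun", subject="the Sun")
web.put("http://www.hawke.org/sun-b", "placeholder data about the Sun", subject="the Sun")

# A statement about the location sun-b, not about the Sun.
web.location("http://www.hawke.org/sun-b").remarks["reliability"] = "sometimes wrong"

sun_a = web.location("http://www.hawke.org/sun-a")
sun_b = web.location("http://www.hawke.org/sun-b")
```

Here sun_a.subject and sun_b.subject are the same, yet the reliability
remark belongs to sun_b alone, which is the distinction at issue.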

How do you talk about that with REST?

> As you say, changes on a shared memory can cause changes on the
> associated backing "reality".  When you make those changes, are
> you thinking to yourself that you really want to change that
> shared memory, or that you really want to change the state of
> that object to which it is only acting as an interface?
> I am firmly convinced that users of a web interface to a
> microwave oven are not thinking about its shared memory when
> they select "five minutes", "high power", and then "start".

I think that's just a case of humans being really good at resolving
ambiguity.  If I point to a picture of the Eiffel tower on the wall
across the room and say "I want to go there", you can probably figure
out if I mean I want to go to Paris or the other side of the room.

So in some sense the URI identifies both the shared memory location
and the subject of the information in that location (and probably some
other things).  And HTTP does its job in transferring the data in that
location while caring nothing about the subject matter, and humans do
their job and ignore the location, jumping straight to the subject
matter, when they feel like it.   Is either one the universal, one true
meaning of the URI?  I don't think so.  I just want RDF to say which
one(s) it's using.   [ more on that in the next message. ]

> Does that mean the identifier is ambiguous?  No, it means that
> the URI alone is insufficient to target the assertion.
>
> I can just hear people thinking: "Well, that's a silly example,
> everyone knows that presentation should be separated from content."

Yep, the claim should have been phrased differently.  Something about
how it looked in your browser at some time, or something.  Still, the
bit about composition is excellent.  Inclusion/importation issues have
been rather painful in RDF/OWL.

> Okay, let's claim for a second that the URI actually identifies
> the virtual notion of it being a Web page, which holds true regardless
> of the subsidiary presentation resources.  Fine, but then consider
> that the reason it is called content negotiation in HTTP, rather
> than simply format negotiation, is because the server can deliver
> different content based on aspects of the request *other* than
> the method and URI.  So it isn't a virtual web page that is being
> identified, but rather a set of potential web pages, one of which
> the server will select for a given request.  To what degree then
> can these individual virtual web pages differ before they are no
> longer considered to be "the same resource"?  The answer is:
> to whatever degree that the authority considers sufficient to
> maintain the sameness of representation that characterizes it
> as being a resource.  

In MSM, my answer so far is that the MIME entities obtained via GET of
a given location over a range of time in which the contents of the
location have not changed SHOULD all convey the same information, the
same facts.  If some facts are considered much more important than
others, then some responses MAY omit the less important ones.  (I.e.,
there may be lossy formats.)  I think this covers various natural and
formal languages, as well as data format variation.  It does not cover
cookies or address/browser sniffing.  I think I can leave the sniffing
to violations of "SHOULD" and for cookies say the locations are really
addressed by the tuple <uri, cookieSet>.  Cookies are better eaten
than formalized.  :-(
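
A rough Python sketch of the rule above, under stated assumptions: facts
are modelled as plain strings, and the helper names (address,
lossy_violations) and the example.org URI are hypothetical.  It treats a
location's real address as the tuple <uri, cookieSet> and checks that
each variant omits none of the important facts and conveys nothing
beyond what is stored.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def address(uri: str, cookies: Iterable[str] = ()) -> Tuple[str, FrozenSet[str]]:
    """Treat the real address of a location as the tuple <uri, cookieSet>."""
    return (uri, frozenset(cookies))


def lossy_violations(stored: Set[str], important: Set[str],
                     variants: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted names of variants that break the rule.

    A variant is acceptable if it conveys no fact beyond those stored at
    the location and omits none of the important ones.
    """
    return sorted(name for name, facts in variants.items()
                  if not (important <= facts <= stored))


# Hypothetical location holding three facts, one of them important.
stored = {"fact-1", "fact-2", "fact-3"}
important = {"fact-1"}
variants = {
    "text/html": {"fact-1", "fact-2", "fact-3"},  # complete
    "text/plain": {"fact-1", "fact-2"},           # lossy but acceptable
    "image/png": {"fact-3"},                      # drops an important fact
    "sniffed": {"fact-1", "fact-4"},              # conveys something not stored
}
bad = lossy_violations(stored, important, variants)

# The same URI with different cookie sets names different locations.
plain = address("http://example.org/page")
with_cookie = address("http://example.org/page", ["session=abc"])
```

In this sketch the sniffing case shows up as a violation, while the
cookie case is not a violation at all: it simply addresses another
location.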

> In other words, the URI identifies a
> conceptual mapping to a set of entities, and because it is a
> Uniform Resource Identifier, it follows that this must be our
> definition of resource on the Web.

At this moment, I don't feel bound to use or justify terminology which
seems to be more confusing than helpful.   

> There are no models that I know of for which the definition of
> resource is not a superset of what they wish to identify.
> Other systems can restrict the domain of resources used within
> that system however they like, but as soon as they make reference
> to a resource in another system, such as the Semantic Web
> making reference to Web resources, then they have no choice
> but to recognize the meaning of that other system's identifiers.
> The results will be ambiguous otherwise, and it's not the other
> system's fault, and it's not because the definition of resource
> is vague or tied to any one model.  The definition is in 2396
> because of the very long and painful debate about URNs, in which
> confusion of the scope of resources (e.g., assuming they were
> machines or files) led to a huge waste of energy on pointless
> debates far worse than this one.

I don't know if I'm glad this one is relatively minor or sad that one
got so bad.  I used to call it The URI Wars, because I noticed that
every veteran of 2396 I came across seemed to have certain behaviors
best explained as post-traumatic disorders.....  I honestly appreciate
your willingness to revisit this area at all.   

I can't really make sense of that argument about "then they have no
choice but to recognize the meaning of that other system's
identifiers."   I'm going to follow up this message immediately with
something that will hopefully be more concrete and useful.  

    -- sandro
Received on Friday, 24 January 2003 01:38:47 UTC