- From: Roy T. Fielding <fielding@apache.org>
- Date: Thu, 23 Jan 2003 15:52:06 -0800
- To: Sandro Hawke <sandro@w3.org>
- Cc: Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
>> I still don't understand how that system explains a POST
>> of a message to an HTTP-to-SMS gateway that is identified by
>> an http URI. I'd like to understand that.

[...]

> So I think of the web as mediated shared memory. Each web address
> (URI) points to a storage location. GET means to read the contents of
> a location, PUT means to store replacement contents in a location.
> Sometimes I think of the locations as individual whiteboards, bulletin
> boards, shelves, slots, or parts of a landscape where a signboard
> could be placed.

That doesn't solve the issue that TimBL mentioned, because it simply replaces resource (a concept independent of any implementation) with a conceptual definition of one particular implementation. We still have the issue that the identifier is being used to identify both the shared memory and what lies behind that shared memory. Worse, we've broken the consistency of the REST model for non-http URIs when introduced into the same interface -- it isn't reasonable to claim that those URIs identify shared memory, but it is quite reasonable for us to produce representations of them on demand.

[...]

> But now I'm handwaving. Are you nodding in understanding or scowling?

I am trying to understand why it is necessary for the architecture to recognize a complex implementation model behind the interface for "http" URIs when it clearly does not do so for non-"http" URIs. I could understand it if you simply declared that an http URI identifies an HTTP interface itself, rather than a conceptual model of the implementation behind the interface, since that would solve part of the problem that TimBL is talking about (direct identification being ambiguous). That is why I said that what Tim and I actually disagree on is not the same as what people (not just you) have been saying we disagree on. The other part, however, is the RDF issue that you have been working on, which I'll try to cover as I go along.

People don't use URIs to refer to the interface, but rather to the consistent sameness found through interacting with that interface. That consistent sameness could be thought of as a virtual web page, but thinking so is no different than thinking of it as a concept with a degree of sameness that matches a web page. It means that we either agree that we know nothing about the resource aside from it identifying one sameness of concept (hence the definition), or we agree that we will allow indirect identification within the system and that indirection requires additional context to disambiguate between differing indirect targets. Note that this does not change either TimBL's or my position that each http URI only directly identifies one thing.

Please understand that the old-Web's perspective of a resource is always through the generic interface. It doesn't matter what the scheme is or what the URI identifies, the Web interacts with it through a generic interface that already makes it look like a shared memory system (via message passing). RFC 2396's definition, however, encompasses all of the uses of URIs, including for such things as inventory control of real objects. The identity type of a resource has no impact on how it can be viewed via the Web, even if it does have impact on those other systems.

People are familiar with this idiom -- they enter an identifier into an information system and the system responds with information about the thing identified by that identifier. I don't know of any system where a car sales organization types a VIN into a computer and expects the car to pop out of the speakers. They are just names; expectations will depend on what actions are being applied to those names, not based on how the name is directly bound.

HTTP places no requirements on the binding of http names to resources other than the syntax for interpretation and access to the authority. It does not even require that a representation be available for a bound name.
VIN places one requirement: that it be indelibly stamped on several places within a single manufactured car and not be reused throughout the expected lifetime of that car. If I were to tattoo an http URI on Mark's forehead and forbid anyone else from tattooing the same URI, then I can reasonably claim that it directly identifies him every bit as much as the VIN directly identifies the car. The validity of that binding is a social problem, not a technical one distinguished by the naming syntax.

Likewise, if I perform a GET request on that URI, I must accept the fact that what I get back is not Mark -- it isn't even necessarily a picture of Mark. What I get back is only a representation as he defines it, and whether or not the result is a useful resource will depend on his ability to maintain the accuracy of its apparent state over time. Of course, I would never do that -- I'd just tag him with a URI that has no representations and claim he isn't a useful resource *when on the Web*, even though he usually is in real life. Anyway, I hate extreme analogies like that because they really don't illuminate anything that we would implement.

Now, let's consider the needs of the Semantic Web. Like other systems that use URIs outside of the Web, the SW does not interact with resources through the generic interface. That's fine. However, when the SW makes assertions about *behavior* on the Web, then it must take into account the fact that clients of the Web interact with resources through that generic interface. The SW cannot make assertions about the potential result of an interaction on the Web without specifying time, method, URI, and perhaps a few other things depending on the nature of the assertion. That is because Web behavior is defined by those elements as much as it is by the URI.

Getting back to the problem that TimBL described, he would like to define the URI as identifying the virtual Web page -- the sameness that is perceived from all responses to GET over time.
What I can't seem to get across is that the resource in REST is the sameness that is perceived from all responses to all methods over time. They are the same model, though I have so far failed to convince everyone that the web page is just how the interface is presented on the Web rather than the object of interaction. REST only knows about information resources (see my dissertation) because that is all the Web knows about. 2396, on the other hand, and resources in general, are not limited to information resources.

As you say, changes on a shared memory can cause changes on the associated backing "reality". When you make those changes, are you thinking to yourself that you really want to change that shared memory, or that you really want to change the state of that object to which it is only acting as an interface? I am firmly convinced that users of a web interface to a microwave oven are not thinking about its shared memory when they select "five minutes", "high power", and then "start". Does the URI identify the control or just an interface to the controller? I just don't care -- the perceived interactions are identical and therefore the same resource, whether you imagine it to be the interface or the control itself. The only way to distinguish the two is to exit the system of discourse entirely, at which point we can no longer use the same identifiers as used within the system without additionally defining context.

I'll try to illustrate with prose rather than attempt an RDF description (and risk incorrect syntax):

Let's say I make a set of assertions like "<http://www.w3.org/> is presented in a way that is clear and easy to read through use of a three-column format". I hope that all of us agree that, for this context, the URI is being used to (in)directly identify the Web page. But is the target of that assertion the resource?

A Web page, as observed by the user, is actually a coordinated set of responses to GET requests on multiple resources that eventually results in an application steady-state known as the completely rendered page. In this case, Navigator makes the following requests of separate resources in order to form the Web page:

    http://www.w3.org/
    http://www.w3.org/StyleSheets/home.css
    http://www.w3.org/Icons/w3c_main
    http://www.w3.org/Icons/right
    http://www.w3.org/Icons/Logo_25wht.gif
    http://www.w3.org/Icons/valid-xhtml10
    http://www.w3.org/Icons/valid-css
    http://www.w3.org/WAI/wcag1AA

and the result is something that I agree is a clear and easy to read source of information in three-column format.

If, however, I switch to the "links" browser, then I get a Web page consisting of one representation derived from http://www.w3.org/ and I am happy to say that it also is clear and easy to read. However, it is not in three-column format. That is because the Web page is not just a product of the first resource, but a product of the capabilities and behavior of the browser in interpreting a sequence of related actions. "Render this" is not equivalent to "this". Does that mean the identifier is ambiguous? No, it means that the URI alone is insufficient to target the assertion.

I can just hear people thinking: "Well, that's a silly example, everyone knows that presentation should be separated from content." Okay, let's claim for a second that the URI actually identifies the virtual notion of it being a Web page, which holds true regardless of the subsidiary presentation resources. Fine, but then consider that the reason it is called content negotiation in HTTP, rather than simply format negotiation, is because the server can deliver different content based on aspects of the request *other* than the method and URI. So it isn't a virtual web page that is being identified, but rather a set of potential web pages, one of which the server will select for a given request.
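
As a rough illustration of that last point, here is a minimal sketch of one URI bound to a set of potential pages, with the request's language preference deciding which one comes back. Everything in it is an assumption made up for the example: the example.org URI, the two variants, the `select_representation` helper, and the fallback rule. Real HTTP negotiation also weighs q-values and other request headers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    language: str
    body: str


# Hypothetical variants that the authority has bound to a single URI.
VARIANTS = {
    "http://example.org/greeting": [
        Representation("en", "Hello"),
        Representation("fr", "Bonjour"),
    ],
}


def select_representation(uri, accept_language):
    """Pick one variant for this request; the URI alone does not decide which."""
    variants = VARIANTS[uri]
    for rep in variants:
        if rep.language == accept_language:
            return rep
    # Assumed fallback for this sketch: the first variant listed.
    return variants[0]
```

Two requests with the same method and URI can therefore yield different content, which is why an assertion about "the page" needs more than the URI to pin down its target.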

To what degree then can these individual virtual web pages differ before they are no longer considered to be "the same resource"? The answer is: to whatever degree that the authority considers sufficient to maintain the sameness of representation that characterizes it as being a resource. In other words, the URI identifies a conceptual mapping to a set of entities, and because it is a Uniform Resource Identifier, it follows that this must be our definition of resource on the Web. The fact that it is desirable for that sameness of mapping to be as broad and consistent as possible for most resources does not imply that it must be so for all resources. A resource will be as consistent as it needs to be in order to be useful as a future source of information for its intended consumers. Any assumptions about the nature or form of what is being identified are hidden by the interface.

Miles apparently wants me to remove the definition of resource because it arbitrarily constrains other models. I disagree. There are no models that I know of for which the definition of resource is not a superset of what they wish to identify. Other systems can restrict the domain of resources used within that system however they like, but as soon as they make reference to a resource in another system, such as the Semantic Web making reference to Web resources, then they have no choice but to recognize the meaning of that other system's identifiers. The results will be ambiguous otherwise, and it's not the other system's fault, and it's not because the definition of resource is vague or tied to any one model. The definition is in 2396 because of the very long and painful debate about URNs, in which confusion of the scope of resources (e.g., assuming they were machines or files) led to a huge waste of energy on pointless debates far worse than this one.

Cheers,

Roy T. Fielding, Chief Scientist, Day Software
2 Corporate Plaza, Suite 150
Newport Beach, CA 92660-7929
fax:+1.949.644.5064
(roy.fielding@day.com) <http://www.day.com/>

Co-founder, The Apache Software Foundation
(fielding@apache.org) <http://www.apache.org/>

Received on Thursday, 23 January 2003 18:52:05 UTC