- From: Tim Berners-Lee <timbl@w3.org>
- Date: Thu, 23 Jan 2003 22:48:51 -0500
- To: "Roy T. Fielding" <fielding@apache.org>
- Cc: Sandro Hawke <sandro@w3.org>, www-tag@w3.org
>>> A resource, thus defined, has access mechanisms whereby you can
>>> retrieve and update representations. This formalism is complete,
>>> consistent, and highly robust in practice, underlying the
>>> construction of the most successful information system in history.
>>
>> In fairness, I think this only applies to HTTP 1.1, not the entire
>> web.
>
> No. Go look at the code and see how it handles all URIs. HTTP is
> an extension of that interface across the Internet.

Ok, here is one hook to a difference in the model you and I have, Roy.

You point out that the API in libwww basically provides the
functionality of HTTP, and at the same time gives access to FTP and so
on. You use this as an illustration of a theory that all URIs have the
same interface as HTTP, that HTTP extends over the web the interface of
libwww in a quite generic way, while other protocols only support some
of the features. Hence the ability of HTTP proxies to provide access to
FTP and Gopher. Which is logical. However, it does not address the
range of all URI schemes, and of course as HTTP basically doesn't play
with the fragid, it doesn't involve that at all.

It is a reasonable bit of software design for libwww to generalize
where generalization can be done, and it is not surprising that HTTP,
as a later design, "embraces and extends" FTP. And HTTP is in fact a
good model for the Web, and the category of URIs for which this model
holds (http, https, ftp, gopher) is important, because they form a web
of network information objects. (I'm happy to call that the Web, and
exclude "Web" Services, by the way. We can call them "Internet
Services" if you like. I think this so far is what you call the REST
model.)

But other URIs don't fall into that scheme. mailto: URIs identify
mailboxes, and to say that you can make an HTTP proxy represent a
mailbox is a kludge.
A web site can have various pages which give various sorts of
information related to a mailbox, but conceptually a mailbox is a
delivery point, not an information object. You could map HTTP's POST to
it but not HTTP's GET. Similarly, telnet: URIs are end points for
interactive sessions. You can connect to one by a Java object in a web
page, but that doesn't mean they are like web pages any more than a
flower pressed in a book is a piece of paper.

So that is, I think, one way in which our formalizations of URIs differ.

[..expletives deleted...]

>> Working *perfectly* for HTTP is not evidence that it works anywhere
>> else. (other people have cited the parable of the blind men and the
>> elephant.) And the success of the Web is of course due to many, many
>> factors.
>
> I have seen no evidence that it doesn't work, anywhere. Some SW folks
> *claim* that if you allow the RDF producer to make ambiguous statements
> about both representations and the resource using only the URI as the
> target subject, then it results in ambiguity. Well, of course that
> would cause ambiguity, which is why they are NEVER THE SAME THING on
> the Web itself. The answer is: DON'T DO THAT.

RDF people do not in my experience use a URI to represent both the
resource and a representation. Well, I don't. (Cwm has, for example, a
relationship -- a built-in function -- log:semantics which relates a
resource to what you get from retrieving a representation and parsing
it, and another, log:contents, which relates a resource to the bits of
any representation of it.)

If you assumed that is what people are doing, it may be because you are
mapping their words onto your concepts, not theirs. Maybe you forget
that for me, for example, the car and the picture of the car are
distinct. It is the confusion between those which causes a problem.

Now, you don't write RDF so I am not sure how to discuss this with you.
I've written a lot of http://www.w3.org/DesignIssues/HTTP-URI
specifically about this and I don't know where to start.

I think you must agree that once my program accesses the web page which
we will say is a picture of a car, then it has a representation of a
picture in bits. It has therefore a concept of the picture. The picture
itself has important properties such as who owns it and made it, and
what its copyright information is. You say that that is information
about the representation, but I would point out that a picture can have
many representations, in JPG, PNG and GIF at various levels of
resolution. They share owner, copyright, date of creation, creator,
focal length, genre, exposure, orientation, and so on, because they are
all what I would call representations of the same picture, the same
conceptual work. This commonality is very strong, and points to the
value of being able to identify the thing they have in common: the
picture. And normally, when I want to make a hypertext link to that, it
is to the picture, not to a representation, that I want to make the
link.

So the argument that we are "just talking about representations"
doesn't fit the bill. It doesn't meet the requirement of being able to
talk about the picture as a conceptual work.

Now, you say the owner of the HTTP URL can declare that it actually
identifies the car. I say that messes things up. Suppose the owner does
that -- suppose they mark up the JPEG with a comment field indicating
that. Now my client program has no ID for the picture.

Now here's the rub. When the URI was for the picture, I could
indirectly identify the car with it, as "x, where <car.jpg> is a
picture of x". In N3 that looks like "That which has picture car.jpg":

   [ has :picture <car.jpg> ].

That's cool. It's what we do all the time to identify things, for
example people by SSN. "The car whose picture hangs above your mother's
fireplace" and stuff. KR systems thrive on it.
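
A minimal Python sketch of the "that which has picture car.jpg" pattern, assuming a toy in-memory set of triples rather than a real RDF or N3 toolkit. The node name `_:car1`, the property names and values, and both helper functions are hypothetical and exist only for illustration. It shows the picture URI carrying its own properties while the car is picked out indirectly through it.

```python
# Toy triple store: each entry is (subject, predicate, object).
# Every name and value here is an illustrative assumption.
TRIPLES = {
    ("_:car1", "picture", "<car.jpg>"),    # the car has picture <car.jpg>
    ("_:car1", "colour", "red"),           # a property of the car itself
    ("<car.jpg>", "creator", "Alice"),     # properties of the picture,
    ("<car.jpg>", "copyright", "2003"),    # shared by all its representations
}


def that_which_has(predicate, obj, triples=TRIPLES):
    """Indirect identification: all x such that (x, predicate, obj) holds."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def properties_of(subject, triples=TRIPLES):
    """Collect the predicate -> set-of-objects map for one subject."""
    props = {}
    for s, p, o in triples:
        if s == subject:
            props.setdefault(p, set()).add(o)
    return props


# "[ has :picture <car.jpg> ]": the car, reached via the picture's URI.
cars = that_which_has("picture", "<car.jpg>")

# The picture keeps its own identifier, so we can still talk about it.
picture_props = properties_of("<car.jpg>")

print(sorted(cars))
print(sorted(picture_props))
```

In this sketch the creator and copyright attach to `<car.jpg>` and the colour attaches to the indirectly identified car, so neither set of statements is confused with the other.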
What doesn't work is if we say that <car.jpg> actually is an identifier
for the car. Because "the picture of the car" doesn't identify the
picture - it identifies any picture of the car.

   [ is :picture of <car.jpg> ]

You can write it but it doesn't work. It's not a bug in RDF. It is a
fundamental problem with the URI system we are assuming: you don't have
an identifier for the conceptual work.

An example you give often is a robot. To an RDF system, a robot which
can be driven by the control panel at <robot.html> can be formally
referred to in just the same way, as

   [ :controlPanel <robot.html> ].

(That which has control panel <robot.html>.) This works.

Let me summarize:

- Web software needs to be able to express things about conceptual
  works. They are a big part of the web system and of our society.
- When you identify a conceptual work, you can retrieve representations
  and you can indirectly identify abstract things.
- If you say that the URI identifies an abstract thing, you cannot
  refer to the conceptual work.

Of course, to use the same identifier for two different things in a
formal system is a contradiction of the term "identifier". The power of
URIs is that they are context-independent identifiers.

So I say that necessarily HTTP URIs directly formally identify
conceptual works. Indirectly they are used to identify consortia and
cars and things.

Mailto: URIs do not identify conceptual works. They are not part of the
REST model.

I think that you will find that the REST model is not harmed in any way
by introducing an extra concept of the conceptual work between
"representations" and what you used to call the resource. I think you
will find it has a nice consistency and solidity.

You asked for examples, by the way. I could give you some.
Some are linked from the N3 primer, and one, which uses concepts of
resources and representations explicitly to make rules for a trusted
system where information is trusted only when it is derived from a
representation which is signed, is at
http://www.w3.org/2000/10/swap/test/crypto

>> Once you step outside the formalism, not only do you want to know what
>> kind of thing a specific Resource is, but you notice that everyone is
>> using each URI to identify several distinct things. So the
>> fundamental premise of 2396 breaks as soon as you step outside the
>> formalism.
>
> Nothing in 2396 breaks because of that. 2396 defines the syntax for
> identification. It doesn't define how URIs are used. It doesn't even
> define how they are used on the Web. What it does define is that they
> are identifiers and they identify resources and they do so using a
> uniform syntax. Resources in RFC 2396 are not even limited to
> information objects, since they are specifically intended to include
> the naming of physical things and do so quite well. The scope of the
> REST model, for example, is more restricted than the scope of 2396.

Yes. HTTP is basically a REST protocol.

> Regardless, how people use URIs (how a URI can be used to identify
> something indirectly, including those things other than the resource)
> is an entirely separate issue from the identity of a resource. If the
> Semantic Web is only interested in identity, then it doesn't matter
> how many other ways that the URI is being used.

?

> Likewise, regardless
> of how many new terms are invented to redefine the holy grail, there
> is no way to stop people on the Web from using a URI (any URI,
> regardless of scheme) in ways that the originator did not intend,
> and thus indirectly identifying things other than the originally
> intended resource.

These we call "errors". If you own a URI you have the right to say what
it identifies. Other people can lie about it, but they are lying if
they do.
> The problem occurs when we face up to the fact that the Semantic Web
> is not just a generic KMS, and in fact is very interested in the Web
> and what people identify when they create anchors.

I think you are saying that the semantic web must use URIs to identify
exactly the same thing as anyone else. Absolutely.

> Once there, we must
> accept the fact that the Web defines URIs and methods as two separate
> protocol elements, and therefore it would be incorrect to define
> resources other than how they are defined in 2396 and used on the
> existing Web interfaces described by HTTP and implemented in dozens
> of independent open source projects that you are free to inspect.

Yes.

> URIs alone are not sufficient to target assertions about content on
> the Web, even if we restrict our discussion to resources that act
> like information repositories.

?

> It therefore behooves the Semantic Web to adapt to the Web as it
> exists and works in practice, not try to force new definitions on
> it that are misdirected and impotent.

The semantic web system I have outlined above is neither misdirected
nor impotent. It is potent in that it can do everything it needs to:
identify what it needs to, directly or indirectly. It is not
misdirected, because it fits in with the way HTTP works.

While it may not seem intuitive to you, I think you will find the idea
that the URI identifies a web page is not alien to most people. Not
that most people have to worry about the formalism, but even then most
folks understand that a URI at Amazon is primarily that of a page at
Amazon, and only identifies the book indirectly. People give the URI of
someone's home page and know that is what they are doing - they don't
really feel they are giving the person's URI.

I suppose one can bicker about how other people conceptualize things,
and it isn't helpful. Sometimes people have used that sort of argument
here.

The system I have is consistent and meets the requirements.
I haven't seen an alternative system which clearly defines what a URI
identifies, allows conceptual works to be referred to, and allows
arbitrary things to be given identifiers indirectly.

Tim

PS: I haven't talked about the # in this message, and the RDF fragment
ID. That is closely related but is a different set of issues. You do
also need RDF to be able to claim foo#bar as an arbitrary identifier of
an arbitrary thing for the semantic web to work.
Received on Thursday, 23 January 2003 22:48:38 UTC