- From: Roy T. Fielding <fielding@apache.org>
- Date: Fri, 4 Jul 2003 20:49:07 +0200
- To: Tim Berners-Lee <timbl@w3.org>
- Cc: Public W3C <www-archive@w3.org>
Hi Tim,

I am still in Switzerland (through July 17), but I'll try to keep up. BTW, BBC World has been showing bits of an interview with you on TV, with you sitting on a weird sculpture.

> So, I'd like to work with you on this issue. I have two routes.
>
> Route 1.
>
> I think we had got to the point that, while you
> maintained that HTTP URIs can identify anything, you did admit that
> the web in reality depends on the expectation that if I have seen one
> bit of information with a given URI, I expect to be able to quote the
> URI and for someone else to later get back for it essentially the same
> information, for whatever local definition of "essentially". In other
> words, if I refer you to a picture of a car and you get a
> parts list that is stretching it, it probably hinders communication,
> whereas if I see a PNG and you get a JPEG we probably never even know.

I don't think we ever disagreed on that -- it is even in the stuff I wrote four years ago.

> So one route is to point out that because of that expectation, what is
> really invariant about the representations retrieved for the same URI
> is not that they are about the same thing, but that they convey the
> same information. It might be a living page like "the front page of
> the LATimes" or it might be "what my car looks like".

Yes, the resource is the sameness over time, regardless of the format.

> So in the architecture we actually lose information unless we capture
> that.

I thought we did capture that.

> So, please, can we refer to the things identified by HTTP URIs as
> say "information resources" (I won't quibble over terms). This will
> enormously help clear up the indirection bugs we get on many of the
> lists.

I don't think so. You are jumping from "resources are consistent" to "http means document", which is not a logical conclusion.
Let's say that I have an http URI for which no information is forthcoming in response to a GET -- never is and never will be as far as the consumer can see. Is that an identifier of an information resource? I don't know. In fact, nobody except the naming authority knows. Maybe it is a future resource that just doesn't have a representation yet. Maybe it is a sink. We really can't know for sure until someone else tells us what we can do with that URI. What we do know is that they can never be inconsistent, since inconsistency would change the meaning and thus the resource. In other words, there is no case in which a URI that actually identifies a car will ever return a parts list, so we don't have to worry about it.

Another way of thinking about this is to go from the other direction. Let's say that we create a new identifier called "urn:bug:foo" and declare that it corresponds to a particular species of bug -- not information about the bug, but the concept of that species in much the same way that people use "human" to describe us. It follows therefore that we should not be confusing that identifier with information about the bug species. But that simply will not last for long -- it is a URI and thus transferable via information protocols like HTTP, and eventually someone will come along and deploy a urn proxy that takes as input a "urn:bug:foo" and responds with some sort of information. Does the presence of that information invalidate the "realness" of the original concept? No. No more so than the presence of an HTTP server invalidates the "realness" of the resources named by that server's authority.

What can be said about an "http" URI is that it identifies a resource by reference to an HTTP interface address. Whether or not the resource has some nature beyond the HTTP interface is not stated, nor does it need to be constrained. We know that the HTTP interface is constrained to an information exchange, but that says nothing about the nature of the resource.
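
As a rough illustration of the "urn:bug:foo" thought experiment above, here is a minimal Python sketch. The `Resource` and `Proxy` classes, their method names, and the sample representation text are all hypothetical, invented only for this illustration; they are not part of any real URN resolver. The sketch assumes only that a proxy can start returning information for an identifier without altering what the identifier names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Resource:
    """What an identifier names; here a concept, not a document."""
    uri: str
    meaning: str


class Proxy:
    """Hypothetical resolver that hands out information about a resource."""

    def __init__(self) -> None:
        self._representations: Dict[str, str] = {}

    def publish(self, resource: Resource, representation: str) -> None:
        # Deploying information for a URI does not touch the Resource itself.
        self._representations[resource.uri] = representation

    def get(self, uri: str) -> Optional[str]:
        # None models "no information is forthcoming in response to a GET".
        return self._representations.get(uri)


bug = Resource("urn:bug:foo", "a particular species of bug (the concept)")
proxy = Proxy()

before = proxy.get(bug.uri)  # nothing has been deployed yet
proxy.publish(bug, "Field notes describing the species")
after = proxy.get(bug.uri)   # information now exists; the concept is unchanged
```

The design choice worth noticing is that `Resource` is frozen and the proxy only keeps a separate mapping from URI to representation, so the presence or absence of information is a fact about the proxy, not about the thing identified.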
The architecture is complete and well-founded if all we say is that "http" identifiers identify resources; there is no need (and no benefit gained) by further restricting what can be identified by an http URI.

In any case, saying that "http" identifies an information resource would not eliminate the indirection issue. A document can talk about some other document just as easily as a car. We eliminate the indirection case by declaring that assertions that target a URI are assertions on the resource identified by that URI: state that is only reflected by the content of all its representations over all time. The only way to make assertions about the information content returned by an action is to add qualifiers for method and time, since the architecture requires that those be orthogonal to the identifier.

> Route 2.
>
> This is to say, can we please have this distinction for the semantic
> web? In other words, before, it wasn't necessary to formally
> distinguish between whether the Consortium or The Consortium's Home
> Page was identified, as people constantly resolve such things in human
> communication, and they were only used in human communication.
>
> Now we need to build a global KR system. You say you are not into
> RDF, and that's OK, but you are quite smart enough to understand the
> issues without using it or being committed to the language. The
> goal is a language which talks about arbitrary objects using
> dereferenceable global identifiers.

I know. We keep going over the same issues. I am not the person you should be arguing with here -- this discussion was completely thrashed out on the URI mailing list and it was Pat Hayes who clearly demonstrated that this simply is not needed by RDF and is not true in any case. People use URIs based on incomplete knowledge about the nature of what is being referenced, thereby giving the URIs semantics through use that may be absent from the minds of the naming authorities.
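
The earlier point about qualifying assertions by method and time can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the class names, the field names, and the `http://example.org/car` URI do not come from any specification or from RDF itself. The sketch only shows that an assertion about the resource needs nothing beyond the URI, while an assertion about what an action returned carries the method and the time as separate, orthogonal fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ResourceAssertion:
    """A claim about the resource identified by the URI, over all time."""
    uri: str
    claim: str


@dataclass(frozen=True)
class RepresentationAssertion:
    """A claim about the content one action returned at one moment."""
    uri: str
    method: str            # qualifier: which action was performed
    observed_at: datetime  # qualifier: when it was performed
    claim: str


# Hypothetical example data; the URI and claims are placeholders.
about_resource = ResourceAssertion(
    uri="http://example.org/car",
    claim="what my car looks like",
)
about_response = RepresentationAssertion(
    uri="http://example.org/car",
    method="GET",
    observed_at=datetime(2003, 7, 4, 14, 49, tzinfo=timezone.utc),
    claim="was a PNG image",
)
```

Both records share the same identifier; the format of one response (PNG today, JPEG tomorrow) is recorded only on the qualified assertion, so it never leaks into what is said about the resource.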
If we have dereference-able global identifiers then we have an opportunity for secondary semantics to creep into the system. For example, I provide a picture of Laguna Beach, but other people link to it as a picture of the ocean. Nobody else is aware of the distinction until I put up a picture of Laguna Beach that doesn't happen to include a coastal scene. Who introduced the error, and who gets to decide which semantic is more significant? What if I decide, after receiving several million complaints from misled users of that resource, that it really should be "a picture of Laguna Beach that always shows the ocean"? Have I changed the resource or simply "fixed" an anomalous representation?

> Because we want to leverage the web, we obviously want to leverage
> URIs - make those identifiers URIs in some ways. There are two ways
> of going about this. One is to use the flexibility point in the
> design of the fragid syntax. This says that on the web, you can
> make a new language about whatever you like, and the fragid syntax
> connects (in a way you define) with the syntax of the new language,
> and the things identified mean whatever you like. Example: SVG
> defines graphics things, and the frag id syntax can define a 2d window
> on the 2d space. I actually think this is a really important
> flexibility point in the design, as I would like all kinds of new
> languages to introduce all kinds of new abstract concepts in the
> future.

My opinion is that "#" is considerably less flexible than "http", or "urn" for that matter. It is a dead end for indirection, and that is generally a bad thing for evolvability of the system.

> The only alternative is that we abandon the direct use of URIs for
> arbitrary things, and follow the, alas, common view among users that a
> fragment identifier identifies part of a document.

I think you are painting yourself into a corner.
The initial claim that the presence of an "http" identifier implies that an inconsistency will develop between the resource and information about the resource is false. The resource is that which is consistent, which may actually be several different aspects of sameness that are encountered when interacting with that resource. Additional metadata is needed to tell us what aspects the authority considers essential to the resource, which can be accomplished via RDF regardless of the URI scheme. RDF does not need the fragment distinction.

> So, Roy. Could we see our way to resolving this between us so we can
> then advance it in the TAG and clear the way to getting a coordinated
> architecture with the RDF core group?

I don't understand, Tim. I have already resolved all of the issues requested by the RDF core group. In fact, they requested quite the opposite to what you are saying is required for RDF, and vehemently objected to the artificial semantics that Sandro wanted to add. That is why I get so frustrated by this argument: if you can't agree within RDF that this is necessary, then I don't see why the Web has to be constrained in a way that is contrary to the services model wherein robots and sinks and gateways have an equal place.

....Roy
Received on Friday, 4 July 2003 14:49:02 UTC