- From: Roy T. Fielding <fielding@apache.org>
- Date: Fri, 11 Oct 2002 00:27:20 -0700
- To: "David Orchard" <dorchard@bea.com>
- Cc: <www-tag@w3.org>
> I apologize in advance for being clearly dense on this subject. It
> certainly is helpful for me to understand this area a little better. And I
> think we're getting close to the areas of my misunderstanding.

Oh, bugger -- I guess I have to respond to this one because you obviously spent a long time writing it while I was writing the last one saying that I wouldn't respond any further. The chances of me ever having time to work on the architecture document are approaching nil.

> I know you have said it plenty of times, that identification and access are
> separate, but for the longest time http: things were access oriented. Heck
> that's why the web is the set of network available information items.

That simply does not follow. For the longest time, people drive cars. That doesn't mean that cars cease to exist outside the act of driving. There are thousands of people who make their entire living based on designing cars, photographing cars, talking about cars, etc.

Consider the VIN to be a reasonable identifier for a car. Is the fact that a VIN identifies a car in any way restrictive of the ways that people exchange or use VIN numbers? No. When you use a VIN number for the purpose of arranging financing and registration, the car is not accessed in any way. Likewise, when a car is parted-out and found by the police, they use the VIN to perform a registration look-up on the owner. The identifier just identifies the car, but it can be used to identify the owner, the leasing agreement, the manufacturer, the creation date, and even a police record.

I don't understand this notion of access-oriented. "http" URIs are used as cache keys more often than they are used to access the resource. It is therefore a fact of life that "http" URI are more often used for the purpose of identification than for access.
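The cache-key point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UriCache` class, the `fake_origin` function, and the `http://example.org/car` URI are hypothetical names invented here, not part of any HTTP library. The URI is used to identify an entry on every lookup, while the origin is contacted only on a miss.

```python
from typing import Callable, Dict


class UriCache:
    """Toy cache keyed by URI string (hypothetical, for illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.lookups = 0          # times the URI was used as an identifier
        self.origin_accesses = 0  # times the URI led to an actual access

    def get(self, uri: str) -> bytes:
        self.lookups += 1
        if uri not in self._store:
            # Only a cache miss turns identification into access.
            self.origin_accesses += 1
            self._store[uri] = self._fetch(uri)
        return self._store[uri]


def fake_origin(uri: str) -> bytes:
    """Stand-in for an origin server; performs no network I/O."""
    return b"representation of " + uri.encode("ascii")


cache = UriCache(fake_origin)
for _ in range(5):
    cache.get("http://example.org/car")
```

After the loop, `cache.lookups` and `cache.origin_accesses` differ: the same URI identified the entry on every call, but only the first call reached the stand-in origin.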
Likewise, authors use them as identifiers when they put them in links -- they have no way of knowing what mechanism will actually be invoked by the browser when it is told to traverse the link. Is the entire mechanism access-oriented? No, only the part where the furthest downstream client cannot respond to the request on its own and makes an HTTP request on the origin server is an access.

> I really do think that somebody seeing an http: thingy gets a strong
> implication that they ought to be able to have access to a representation.

That's a separate issue. When I see any URI, including URN, I think I ought to have access to a representation. Not because I'm special and deserve it, but because I might find it useful. That's one reason why HTTP is a universal proxying protocol -- it can attempt to obtain a representation of any resource, regardless of URI scheme, using the proxy mechanism. URN, "now", "tdb", or whatever someone comes along with next doesn't change that; it merely makes it more expensive.

> And I did read through almost all of the uri mailing list last night and
> today. Some of the fun things I found: 1) the message from Marc Andreessen
> about wanting to wrap up URLs as back end and out of user-site mechanisms so
> he could start working on URNs; 2) the discussions about whether urls should
> be prefixed with url: in plain text like email to indicate the presence of a
> link. Reminds me of our 4 year old xlink and html discussions on link
> identification. 3) I was briefly confused about the discussions around SOAP
> that happened in 1994.

That's good -- I encourage it of anyone who thinks that they have some new argument about this subject. I am not kidding when I say that this discussion has come up many, many, many times and the result is always the same.

> Exploring this a bit further, I'll quote a part of 2396. I know that you
> know it off by heart so I'm not trying to be cheeky, but it helps me in
> expressing my position.
> "A URI can be further classified as a locator, a name, or both. The term
> "Uniform Resource Locator" (URL) refers to the subset of URI that identify
> resources via a representation of their primary access mechanism (e.g.,
> their network "location"), rather than identifying the resource by name or
> by some other attribute(s) of that resource."
>
> When I take an http: URI and then classify it as a locator, name or both, it
> turns out it is a locator because http: is the primary access mechanism, eg.
> network location, for the URI.

Is that because of the scheme name, or because you happen to control the naming authority and have placed a server at that location that is able to respond to access requests?

> So saying that an http: uri does not imply
> access in any way just blows my mind.

All locators are also identifiers. Whether something is a name or a locator only impacts its ability to be directly used to locate the resource, not its ability to identify the resource.

> The only reason the xmlns identifier
> doesn't imply access is because the namespace spec said so, and still people
> keep on thinking or implying it does. Namespaces had to specifically
> mention the lack of access, and lots of people said it over and over again,
> and still it didn't stick.

What do you mean it didn't stick? Do you have examples of XML namespace parsers that access the namespace URI every time parsing takes place? If not, then clearly it is being used as an identifier, and therefore this argument is specious.

Yes, there is a small community of people that populate the W3C mailing lists who refuse to accept NO for an answer, but I don't think that changes the design one iota. Whether or not there is a representation available for a resource is a decision of the people providing representations, not an aspect of the scheme.

> I want to re-emphasize that I agree totally that http: URIs do not REQUIRE
> access.
> I just don't seem to understand why you don't think it implies access.

Because I implement HTTP client and server software and know from personal experience that most traversals and use of http URI do not result in access of any kind.

What you probably meant to say is that there exists some implication that, given an http URI for identification, there is a separate belief among users that some means of obtaining a representation of the resource identified by that URI exists. Damn right! All important resources should have URI, and all URI are dereferenceable (to varying degrees of cost), so therefore it makes sense that people believe ANY URI to be capable of producing a representation. The only difference between an http URI and a new scheme is the expense of deploying the dereference mechanism, and people who don't understand that simply haven't worked with HTTP proxies.

> Cuz it sure does to me and apparently a whole bunch of other
> people. And having every spec that uses URIs have to say whether or not
> that access should be done (like namespaces), when it could be (at least
> from my pov) more easily expressed in having different schemes, seems to
> place an undue burden on developers and software.

How is having different schemes going to change the definition of a protocol element? Are you suggesting that these protocol elements that are defined to be scheme-agnostic should instead be scheme-specific? That maybe the specification would be less wordy if, instead of saying that the element is used for identification (not dereference), that it rather list the schemes that can or cannot be used within that element? Or maybe you would prefer that the element not be a URI at all, since it should be clear by now that any URI scheme is inherently dereferenceable?

> On to the first assertion. I'm still confused about why you don't see the
> utility in changing behaviour.
> I think this is my central misunderstanding
> of your position, because it I guess I'm too uninformed to figure why we
> wouldn't want software to do something different.

Because that would change the definition of the protocol element, and result in severely sucky performance. I don't know of any example where the decision of whether the protocol/language is using the URI for identification or access should be subject to the URI scheme. That's just plain nuts -- I can't build reliable systems that way.

> I picked up the idea
> because people starting talking about how great the world was going to be
> when things that were only for identifiers could now be used for access, and
> the gloating over those poor folks who were using urns commenced.

No, no, no. What they are saying is that, given an important resource and an identifier for that resource in the form of a URI, people will eventually want to use it for access to representations of the resource. Why? Because people are curious. This has nothing to do with changing the software to perform some special access magic, nor is it prevented by creation of an obscure URI scheme. If it is important, then somebody will provide a proxy for dereferencing that URI, either by proxy or by URI rewriting, such as was described in 1995:

   http://www.apache.org/~fielding/uri/drafts/draft-ietf-uri-roy-urn-urc-00.txt

> So perhaps you can explain to me why software wouldn't need to change when a
> URI went from having no representations available to having representations
> available.

I have an infinite number of http URLs at my disposal that already have that property. Do you see any software changing because of it?

My definition of resource is a discontinuous, multi-valued mapping function of representations over time. The presence or absence of representations at any point along the curve(s) does not alter the function -- only the result of evaluating the function.
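One loose way to picture that definition in Python follows. It is a sketch only: the `Resource` class, its `evaluate` method, the `car_page` mapping, the sample years, and the `http://example.org/car` URI are illustrative assumptions, not anything taken from an HTTP implementation. The URI names the mapping; evaluating the mapping at a given time may yield zero, one, or several representations, and the mapping itself is the same object throughout.

```python
from typing import Callable, Dict, FrozenSet


class Resource:
    """A resource modeled as a mapping from time to a set of representations."""

    def __init__(self, mapping: Callable[[int], FrozenSet[str]]) -> None:
        self._mapping = mapping

    def evaluate(self, year: int) -> FrozenSet[str]:
        # Evaluating the function is the only step that resembles "access".
        return self._mapping(year)


def car_page(year: int) -> FrozenSet[str]:
    """Hypothetical mapping: no representations at first, then one, then two."""
    if year < 2000:
        return frozenset()
    if year < 2002:
        return frozenset({"text/html"})
    return frozenset({"text/html", "application/xml"})


# The URI identifies the function; looking it up evaluates nothing.
registry: Dict[str, Resource] = {"http://example.org/car": Resource(car_page)}
resource = registry["http://example.org/car"]
```

Calling `resource.evaluate(1999)` and `resource.evaluate(2002)` gives different results, yet `registry["http://example.org/car"]` refers to the same `Resource` before and after.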
A URI is an identifier for such a function, so saying that a URI scheme implies access is equivalent to saying that I can't talk about the Pythagorean theorem without performing multiplication. It just isn't true. I don't care how many times that the squares and square roots happen in geometry classrooms around the world, it is still necessary for the teacher's sanity to be able to identify the function separately from using the function.

Can we please stick a fork in this issue?

Cheers,

Roy T. Fielding, Chief Scientist, Day Software
(roy.fielding@day.com) <http://www.day.com/>

Co-founder, The Apache Software Foundation
(fielding@apache.org) <http://www.apache.org/>

Meet me at ApacheCon 2002, Nov. 18-21, Las Vegas <http://www.apachecon.com/>
Received on Friday, 11 October 2002 03:27:29 UTC