- From: Roy T. Fielding <fielding@avron.ICS.UCI.EDU>
- Date: Fri, 01 Dec 1995 01:13:53 -0800
- To: Keith Moore <moore@cs.utk.edu>
- Cc: urn@mordred.gatech.edu, uri@bunyip.com
and now that I have slightly more time (another all-day meeting)...

> Okay, I'll try a different tack. I'll see if I can state this
> one as a design tradeoff.

Good idea. I'll just rephrase the tradeoff from my perspective, and I hope you'll see why I object so strongly to the syntax.

> If we give the client a way to differentiate between URN "schemes",
>
> on one hand,
>
> + we can potentially make resolution more efficient, because
>   the client can customize its search path on a per-scheme basis.
>   (If the client doesn't know the "scheme", it's not a matter of
>   not being able to resolve the URN, it's a matter of having to look
>   in potentially more places before it finds it.)

Right. In other words, if we don't provide a scheme, we prevent any client from making its resolution more efficient on a per-scheme basis, regardless of the local conditions that can be known only by that client's owner.

> + it makes it easy for people to understand how other naming schemes
>   are incorporated into URN-space

Right. It makes it possible for vendors to design that incorporation into their implementations, rather than get surprised by it as systems evolve.

> + we make some unhappy people happier, which may bring us closer to
>   agreement. (I'm not being facetious here; it's often better to
>   have a standard with some "imperfections" than no standard at all.)

Amen.

> + to the extent that it's useful for a client to know the
>   syntax and semantics of a URN (for reasons other than resolution),
>   having the "scheme" name be visible makes that possible.

Right. In other words, if we don't provide a scheme, we prevent any client from taking advantage of the syntax and semantics of a URN even when those semantics are meaningful for that scheme. For example, if the canonical form of a URN is dependent on the scheme, then URNs can be defined for legacy naming systems that allow case-sensitive names.

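To make the canonical-form point concrete, here is a minimal Python sketch of scheme-dependent canonicalization. It is an illustration only, not any existing library's behavior: the scheme names `x-caseless` and `x-legacy`, the `CANONICALIZERS` table, and the functions `canonicalize` and `same_name` are all hypothetical, and folding the scheme itself to lowercase is an assumption made for the example.

```python
# Hypothetical per-scheme canonicalization rules. The scheme names are
# invented for illustration; they are not registered or proposed schemes.
CANONICALIZERS = {
    "x-caseless": lambda rest: rest.lower(),  # case-insensitive naming system
    "x-legacy": lambda rest: rest,            # legacy system with case-sensitive names
}


def canonicalize(uri: str) -> str:
    """Return a canonical form of `uri` using the rule for its scheme."""
    # The scheme is taken to be everything before the first colon.
    scheme, sep, rest = uri.partition(":")
    if not sep or not scheme:
        raise ValueError(f"no scheme in {uri!r}")
    scheme = scheme.lower()  # assumption: scheme names compare case-insensitively
    rule = CANONICALIZERS.get(scheme)
    if rule is None:
        # Unknown scheme: leave the scheme-specific part untouched.
        return f"{scheme}:{rest}"
    return f"{scheme}:{rule(rest)}"


def same_name(a: str, b: str) -> bool:
    """Two identifiers name the same thing if their canonical forms match."""
    return canonicalize(a) == canonicalize(b)
```

With these assumed rules, `same_name("x-caseless:ABC", "x-caseless:abc")` is true while `same_name("x-legacy:ABC", "x-legacy:abc")` is false. A client that cannot see the scheme has no basis for choosing between the two rules.
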
And I'll add:

  + it allows multiple naming mechanisms to be designed, developed, and enhanced independently of the standardization process
  + it does not lock us into a single syntax for the NA+OID for all time
  + *if* relative URNs are found to be useful, we don't have to throw away all the existing URN work to use them
  + it lowers the entry barrier for the introduction of URNs into existing technology

> ...
> on the other hand,
>
> - it increases the probability of client configuration error

Yep, but it doesn't assume that we know the client user's needs better than the client user.

> - if schemes tend to imply particular resolution protocols,
>   they decrease persistence of URNs

Yep, that would be bad, but it is also easy to avoid. I would want any URN scheme to reflect a non-protocol name, like "oid:".

> - schemes increase the probability that the client cannot resolve
>   a URN because it doesn't know about the "scheme", which in turn
>   reduces interoperability

Increases the probability of reduced interoperability, yes. But that is true of any URI scheme until my URI Resolution Table idea becomes a standard (if ever).

> - they make it less likely that URNs have global scope in practice
>   (since the interpretation of a URN is up to the client, and it
>   tempts clients to make special interpretation based on the "scheme")

Yes, but it also allows a URN to have local scope when only local scope is desired (or even possible, as is the case with an emerging technology). The Web could be created one site at a time because any site can be created independent of any other. Sure, that means some names will fail the test of persistence, so people who want guaranteed name persistence will have to get their names from a guaranteed naming organization.

> ...
>> You are assuming that there will be only one URN scheme.
>
> No, I'm assuming all URNs will have a prefix that gives the
> client the ability to recognize it as a URN, and the minimum
> information necessary to use it.
> (by "use it", I essentially mean "find a resolver for it").

Same thing. Honestly, that is how the Web is implemented; just look at any client library: my libwww-perl, CERN (now W3C) libwww, Guido's modules for Python, etc. You can implement URNs natively in libwww-perl simply by creating a Perl module called www<scheme>.pl which includes a request() procedure. The library will load it dynamically when it encounters a URI with that <scheme>. The library doesn't care whether the scheme is associated with a protocol or not -- it uses the scheme to select a resolution mechanism and thus any new scheme can be added without affecting any other part of the library.

Extensibility is one of the most important aspects of Web technology. This does not necessarily mean that all vendors have succeeded in implementing this extensibility; it only means that the design does not prevent them from implementing an extensible system. Those of us who know better have done better.

If the implementation of IETF-sponsored URNs reduces the current URI extensibility, then I will not allow them to become a Web standard, even if that means divorcing Web standards from the IETF process [which I would personally hate to do]. However, that has never been necessary, because every time we have polled the vendors on this matter they have always supported a more extensible design. The problem is that the change was never made to the URN specification because the authors didn't follow through, forgot, or just plain disagreed with the rough consensus. That is not following the IETF process, which is why I am sick of repeating myself every time a new URN draft is produced, and is the primary reason why so little progress has been made over the past three years.

>> Any resource may be identified by multiple names and/or locations.
>> Any resource which is "the current version of X" is also "that specific
>> version of X" -- both of these concepts can and should be assigned names
>> if it is determined (by anyone) that such a name is useful. Thus, any
>> system that purports to define URNs must also allow multiple names per
>> resource.
>
> Yes.
>
> (But I've never thought of a URN as being tightly bound to a "resource"
> ... it's bound to a "definition". So a URN for today's weather map and
> a URN for the weather map on 11/29/95 would be different because they
> *mean different things*, and it doesn't matter that under certain
> circumstances they could refer to the same resource. But this is
> independent of whether URNs have "schemes".)

It matters if the question asked is "have you seen the contents of this map before?", or "by which name should I refer to this resource when I put it in a hotlist/bookmark file?". You are right, though, in that this example does not highlight the need for schemes.

>> Requiring that all URNs have the same properties (i.e., case insensitive,
>> references an entity fixed-for-all-time, etc.) would make it impossible
>> to represent resource names as URNs.
>
> Depends on what you mean by "resource names". I have always assumed
> that URNs must be able to subsume other naming systems that have the
> same basic properties -- global uniqueness, persistence, transcribability,
> etc., but not that URNs must be able to subsume any kind of resource name
> (such as a URL or a file name). Now if the other naming systems that
> we need to subsume into URN space are really so diverse that we cannot
> define a common "umbrella" syntax and registry and clients have to
> be aware of the differences in their syntax in order to "use" them...
> well, I'm tempted to suggest that we try to solve a narrower problem.

I think that's reasonable, but the name "URN" refers to the larger problem of location-independent resource names.
Solving a narrower problem is fine provided that it does not prevent others from solving other parts of the problem, which means that you must have a way to differentiate between solutions, and thus a scheme other than "URN:" is necessary.

> But I'm not yet convinced that we need to support this kind of
> diversity...perhaps you could supply some firm examples?

What I am saying is that unless you can *prove* to me that we will never want to support that diversity, you cannot make that choice for others.

>...
> It's not as if everyone uses the word "scheme" in the same way.
> (sorry, couldn't resist...it's one of my favorite quotes.)

Cute, but there is only one way to use the word "scheme" when referring to the characters preceding a Uniform Resource Identifier. I am not interested in redefining the name associated with two proposed standards and an installed base of >20 million applications.

>> A scheme defines the syntax and
>> semantics associated with the remainder of the identifier. It does not
>> define the resolution protocol; some identifiers have a scheme name which
>> matches a protocol name because that is the most meaningful name to
>> associate with a locator for which the ultimate resolution process defaults
>> to using that protocol. In other words, the Knoxville proposal is using
>> the scheme "URN".
>
> The Knoxville proposal doesn't define the syntax of the name past the NA.

Yes it does -- it defines that it is opaque and case-insensitive and only includes a restricted set of characters.

> The Knoxville proposal doesn't define the semantics of the name at all;
> we narrowed our scope for the purpose of that two day discussion to
> specification of the name and how to find resolution servers for that
> name, and used the term URN to refer to the part of the "resource
> identifier" that we chose to work on.

If you perform a case-insensitive comparison of two Knoxville identifiers and find them to be equal, what does that mean? Semantics.

>> World-Wide Web user agents use the identifier scheme to determine the
>> resolution mechanism (NOT protocol -- mechanism is that *thing* which is
>> responsible, within or outside the client, for resolving identifiers of
>> that particular identifier type -- it may use any protocol defined by
>> the user or vendor for resolving that scheme, including a protocol defined
>> on-the-fly through retrieval of a script).
>
> While I agree with you in principle, this is not the case in general.
> It's certainly possible to add a layer of indirection between a URL
> and its servers. But since the web wasn't designed with a standard
> layer there from day one, it's somewhat difficult to add one now and
> see it universally deployed. (doesn't mean it's not a good idea --
> it's just difficult)

Wrong. The design has been there since day one -- in fact, it preceded the original definition of Universal Document Identifiers, which preceded the creation of the URI WG for the purpose of standardizing those identifiers. Schemes were designed to support extensibility of names by allowing the library resolver module to be determined by scheme name. It was also in libwww-perl since day one.

What has not been there is support for the user to enable that extensibility. Right now, only a programmer can do that. However, this can be added now without any change to the URI syntax and with no entry barrier for implementation on clients. All I need to do is finish writing my paper. ;-)

>> Uniform Resource Names is a category of identifiers, referring to those
>> that identify a resource independent of its network location. It is wrong
>> to use "URN" as a scheme name for the same reason it is wrong to use
>> "URL" as a scheme name.
>>
>> I CANNOT USE ANY IDENTIFIER THAT BEGINS WITH "URN:"
>
> Sure you can. You can use URN: as easily as HTTP:.

Actually, I can't use HTTP either, since schemes are required to be lowercase.

> I don't really care what these things are called.
> I do care about not defining lots of new URI prefixes such that the
> client has to know about each one of them individually, or so that
> URNs get confused with URLs. So in response to your all-caps
> statement, I might say:
>
> I CANNOT USE MORE THAN ONE NEW URI PREFIX
>
> although that, of course, is also false. I do, however, think it's
> highly undesirable to keep extending things in this way.

If you implement a "truly great" URN with a particular scheme, and it turns out that you are right in that your "truly great" URN is sufficient to solve the URN problem in general, then nobody will bother to use some other URN that is "less great". If, however, you are wrong in that some other URN syntax is better than that proposed, or if some other type of URN is necessary to solve the bits of the URN problem which you did not consider "important enough", then allowing multiple URN schemes to exist will allow the proof to be determined by implementation and successful deployment, not by pre-standardization posturing.

If this is just a difference of opinion between "extensibility is bad" and "extensibility is good", then there is no point in continuing this discussion.

>> Which means, obviously, that I will forbid the use of such an identifier
>> in any system which I design or am responsible for standardization.
>> That is what I've said consistently for over 1.5 years now, that is what
>> I will recommend to the W3 Consortium members, and that is the objection
>> I will continue to raise every time this is discussed within the IETF.
>>
>> Is that clear?
>
> In the IETF at least, you have no authority to forbid any such thing.

I wasn't referring to IETF standards. URN is not an IETF standard. URN isn't even an IETF working group. Right now, URN isn't even out of the early research phase. I do have the authority to forbid the use of bogus URNs in any system *I* design, and in any system in which *I* am responsible for standardization (e.g., the W3C use of URIs).
To the extent that my responsibility overlaps with that of the IETF, I defer to the IETF. However, the IETF's responsibility *never* extends to systems that are not yet implemented. Mine does.

> We make decisions by rough consensus, but the consensus of the group can
> override any individual.

Only if that consensus is polled for on the working group mailing list and the results are represented in the WG documents. In the entire history of the URI WG, the only time that the "URN:" prefix *ever* obtained consensus was at a meeting at a bar during the Houston IETF meeting -- yes, that's right, it wasn't even a legitimate decision of those in attendance at the real meeting.

> I personally would think it silly for us to develop this new kind
> of identifier that we have been calling a URN all along, and use
> any prefix for that identifier other than URN:. But if "silly" doesn't
> cause any implementation or operational problems, you might be able
> to get the group to go along with you and use some other prefix.
>
> On the other hand, if we end up defining lots of new URI prefixes,
> we will have been wasting our time for the past 4 years or so, because
> we will have effectively gained nothing over normal URLs. That's
> not silly, that's tragic.

Since when is the existence of only one URN scheme the sole advantage of location-independent names? The only thing that has been wasting our time for the past 4 years or so is this insistence on defining an identifier which is fundamentally incompatible with all existing practice. I am trying to stop yet another waste of time before it starts again. If existing practice will not be a concern of some future URN WG, then there should not be any URN WG in the IETF.

>> Hell, ALL
>> EXISTING IMPLEMENTATIONS OF URIs DEPEND ON THE EXISTENCE OF SCHEME NAMES.
>
> This isn't a justification for anything in particular.
> The reason we're doing this little four-plus year exercise is that
> "existing implementations of URIs" aren't sufficient for our needs.

NO -- URLs aren't sufficient for our needs. There is nothing insufficient about the URI architecture and there is no technical reason to justify a change from that architecture.

> (C'mon. Does the "scheme" really have to be the part of the URI before
> the first colon?

Yes.

> Do URNs really have to share a common syntax
> with URLs... down to including the path structure?

No, but they must be usable within the same URI structure.

> If you really want
> URNs to be persistent, you don't put any semantically loaded
> information in them at all...certainly not information that reflects
> the internal structure of multi-file documents.)

I have seen no implementation that proves such a theory, though I have never suggested that all URNs must contain structural information either. I believe there is no harm in allowing both to coexist.

>> If you don't support the identification of resources that may already
>> be on your local disk, identified within a personal database of resources
>> located in a real-world bookshelf, or located within the user's local
>> University library, then you have failed to solve the URN problem.
>> You don't have to define these resolution mechanisms -- you just have
>> to make them possible with minimum difficulty.
>
> Actually, we do support the identification of such resources, but not
> with names that indicate where the resources are stored. After all,
> a resource originally in my personal database could eventually become
> available to the entire world...should the resource name then change and
> then invalidate all of the references to it?
>
> But I could certainly configure my client to search my personal resource
> database, my mail folders, etc. before searching the DNS registry.

According to what constraints? Do you want every query to search all available sources?
Or, do you want the sources to be ordered and targeted according to the likelihood of their knowledge about the resource? If you know a name is associated with a University Technical Report, don't you want your client to search the TR database before the Library of Congress? If so, how does the client get configured for such preferences without looking at the opaque identifier after the "scheme:"?

The fact is that you cannot anticipate all the needs that I or anyone else may eventually have for URNs, so don't assume you have. Provide a syntax that is extensible not because it will be, but because you cannot be sure it won't need to be.

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92717-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/

Received on Friday, 1 December 1995 04:20:19 UTC