- From: Keith Moore <moore@cs.utk.edu>
- Date: Wed, 29 Nov 1995 04:35:51 -0500
- To: "Roy T. Fielding" <fielding@avron.ics.uci.edu>
- Cc: Keith Moore <moore@cs.utk.edu>, urn@mordred.gatech.edu, uri@bunyip.com
> > I don't know how I can make this any clearer:
> >
> > 1. I (and others who were at Knoxville) have said, repeatedly, that
> > the client chooses how to resolve a URN. (One of our oft-stated
> > design principles was "a client can and will do whatever it wants to".)
>
> The client can only do this if it has some way to differentiate between
> URN schemes.

Okay, I'll try a different tack. I'll see if I can state this one as a
design tradeoff.

If we give the client a way to differentiate between URN "schemes",

on one hand,

+ we can potentially make resolution more efficient, because the
  client can customize its search path on a per-scheme basis. (If the
  client doesn't know the "scheme", it's not a matter of not being able
  to resolve the URN, it's a matter of having to look in potentially
  more places before it finds it.)

+ it makes it easy for people to understand how other naming schemes
  are incorporated into URN-space

+ we make some unhappy people happier, which may bring us closer to
  agreement. (I'm not being facetious here; it's often better to have
  a standard with some "imperfections" than no standard at all.)

+ to the extent that it's useful for a client to know the syntax and
  semantics of a URN (for reasons other than resolution), having the
  "scheme" name be visible makes that possible.
  (however, I'm not sure whether what you call "semantics" of a URN was
  included in the Knoxville URN at all; we had agreed that things like
  service requests were not part of the URN but that there was a need
  to be able to specify those things in a standard manner along with
  a URN)

on the other hand,

- it increases the probability of client configuration error

- if schemes tend to imply particular resolution protocols, they
  decrease persistence of URNs

- schemes increase the probability that the client cannot resolve a
  URN because it doesn't know about the "scheme", which in turn
  reduces interoperability

- they make it less likely that URNs have global scope in practice
  (since the interpretation of a URN is up to the client, and it
  tempts clients to make special interpretation based on the "scheme")

I don't know of any way to objectively decide whether the benefits of
having a "scheme" outweigh the disadvantages. My experience with
multiprotocol email leads me to believe that having one "scheme" that
is flexible enough to subsume all others, and putting the details of
how to resolve names in a network-accessible database, is far
preferable to expecting each client to know the details of each
"scheme".

So it's really a case of how much rope to give the clients, and how
much information to expect them to know in order to do their jobs.
I don't mind giving them the rope as long as they don't really need
to use it all that often.

I could personally live with having a "name space identifier" (NSI)
in the URN as long as

(a) it's not strictly tied to a protocol or registry,
(b) the resolution of the URN doesn't depend on the client knowing
    details of the NSI portion of the URN,
(c) a registry can delegate resolution of URNs on at least a
    per-(NSI+NA) basis (and ideally, to smaller sub-ranges of that
    space).

But somehow I get the impression this isn't what you're getting at.

> You are assuming that there will be only one URN scheme.

No, I'm assuming all URNs will have a prefix that gives the client the
ability to recognize it as a URN, and the minimum information necessary
to use it. (by "use it", I essentially mean "find a resolver for it").

> Any resource may be identified by multiple names and/or locations.
> Any resource which is "the current version of X" is also "that specific
> version of X" -- both of these concepts can and should be assigned names
> if it is determined (by anyone) that such a name is useful. Thus, any
> system that purports to define URNs must also allow multiple names per
> resource.

Yes. (But I've never thought of a URN as being tightly bound to a
"resource" ... it's bound to a "definition". So a URN for today's
weather map and a URN for the weather map on 11/29/95 would be
different because they *mean different things*, and it doesn't matter
that under certain circumstances they could refer to the same resource.
But this is independent of whether URNs have "schemes".)

> Requiring that all URNs have the same properties (i.e., case insensitive,
> references an entity fixed-for-all-time, etc.) would make it impossible
> to represent resource names as URNs.

Depends on what you mean by "resource names". I have always assumed
that URNs must be able to subsume other naming systems that have the
same basic properties -- global uniqueness, persistence,
transcribability, etc., but not that URNs must be able to subsume any
kind of resource name (such as a URL or a file name).

Now if the other naming systems that we need to subsume into URN space
are really so diverse that we cannot define a common "umbrella" syntax
and registry and clients have to be aware of the differences in their
syntax in order to "use" them... well, I'm tempted to suggest that we
try to solve a narrower problem. But I'm not yet convinced that we
need to support this kind of diversity...perhaps you could supply some
firm examples?

> Requiring that all URNs within a
> given URN scheme have certain minimum properties is useful, but not
> sufficient to contain all of the semantics any particular user would
> assign to any particular resource. Allowing a resource to be identified
> by multiple URN schemes, with each such URN scheme defining its own set
> of relevant semantics, is the only way to sufficiently *identify resources*
> using a simple identity string.

I think you're mapping the problem differently than we did in the
Knoxville meeting. Obviously you'd have to add some information to
the Knoxville URN to state what you want to "do" with it. We
recognized the need for this in the Knoxville discussions, but we
didn't try to specify it...we said it wasn't part of the URN but in
some cases would have to be supplied with the URN.

So the "identity string" for the URC of resource FOO might consist of
the Knoxville-style URN for FOO along with a request for a URC, while
the "identity string" for the resource itself might consist of the
Knoxville-style URN for FOO along with a request for the resource.
Sometimes a reference to FOO will want to indicate the URC, other
times it will want to indicate the resource itself, and other times it
just wants to indicate FOO without being more specific. Likewise,
there might be a service request for "most recent version" or "version
1.3" associated with FOO...though trying to put versions in a service
request is probably a rathole. (In BFD/RCDS the revision history of
the resource can be listed in its description (think "URC"). Clients
can peruse this description and select whatever version of a resource
they want using the LIFN of the resource, but there's no explicit
request to "get the latest version of FOO".)

> > 2. I have also said, repeatedly, that the URN syntax that we defined
> > is NOT tied to DNS, that other registries besides the DNS registry
> > are expected.
> > It is essential that the syntax does not imply DNS --
> > if for no other reason than to allow transitions to other registries
> > in the long term.
>
> If the only way a client can determine the type and semantics of an
> identifier is to perform a DNS query on some part of that identifier,
> then the identifier is tied to DNS.

(a) We don't expect that to be true even in the short term. We feel
    sure that there will be "local" registries in many environments
    (which might provide access to resources which cannot be allowed
    to leave that environment, say for security reasons; and might
    also provide access to a local cache). Some of us also envision
    that there might be the net.equivalent of "rare/old book services"
    that you consult after you've looked in the default location,
    perhaps for a higher price and/or longer delay.

(b) Unless DNS lasts a LOT longer than I think it will, it certainly
    won't be true in the long term. The client benefits if there is
    only one registry, but for transition purposes we need to make
    sure we can move from one registry to another, so we must assume
    from day one that there will be multiple registries.

> > 3. URN: in the Knoxville proposal is NOT a "scheme". URN: is a prefix
> > that allows clients to identify URNs in text and to distinguish URNs
> > from other kinds of URIs. The Knoxville proposal doesn't have "schemes",
> > because -- to the extent a "scheme" dictates a resolution protocol --
> > the inclusion of a "scheme" impairs the longevity of the URN.
>
> Then you don't understand what a scheme is.

    ``When I use a word,'' Humpty Dumpty said in a rather scornful
    tone, ``it means just what I choose it to mean--neither more nor
    less.''

    ``The question is,'' said Alice, ``whether you _can_ make words
    mean so many different things.''

    ``The question is,'' said Humpty Dumpty, ``which is to be master--
    that's all.''

It's not as if everyone uses the word "scheme" in the same way.
(sorry, couldn't resist...it's one of my favorite quotes.)

> A scheme defines the syntax and
> semantics associated with the remainder of the identifier. It does not
> define the resolution protocol; some identifiers have a scheme name which
> matches a protocol name because that is the most meaningful name to
> associate with a locator for which the ultimate resolution process defaults
> to using that protocol. In other words, the Knoxville proposal is using
> the scheme "URN".

The Knoxville proposal doesn't define the syntax of the name past the
NA. The Knoxville proposal doesn't define the semantics of the name at
all; we narrowed our scope for the purpose of that two day discussion
to specification of the name and how to find resolution servers for
that name, and used the term URN to refer to the part of the "resource
identifier" that we chose to work on. Once we added additional
components to form the "resource identifier", I suppose we would be
defining both syntax and semantics. But since we didn't do that,
perhaps the Knoxville URN is not a "scheme" after all? :)

> World-Wide Web user agents use the identifier scheme to determine the
> resolution mechanism (NOT protocol -- mechanism is that *thing* which is
> responsible, within or outside the client, for resolving identifiers of
> that particular identifier type -- it may use any protocol defined by
> the user or vendor for resolving that scheme, including a protocol defined
> on-the-fly through retrieval of a script).

While I agree with you in principle, this is not the case in general.
It's certainly possible to add a layer of indirection between a URL
and its servers. But since the web wasn't designed with a standard
layer there from day one, it's somewhat difficult to add one now and
see it universally deployed. (doesn't mean it's not a good idea --
it's just difficult)

> Uniform Resource Names is a category of identifiers, referring to those
> that identify a resource independent of its network location.
> It is wrong
> to use "URN" as a scheme name for the same reason it is wrong to use
> "URL" as a scheme name.
>
> I CANNOT USE ANY IDENTIFIER THAT BEGINS WITH "URN:"

Sure you can. You can use URN: as easily as HTTP:. I don't really
care what these things are called. I do care about not defining lots
of new URI prefixes such that the client has to know about each one of
them individually, or so that URNs get confused with URLs.

So in response to your all-caps statement, I might say:

I CANNOT USE MORE THAN ONE NEW URI PREFIX

although that, of course, is also false. I do, however, think it's
highly undesirable to keep extending things in this way.

Using "URN" (even in our discussions) is dangerous because it means
different things to different people, but without using it we couldn't
communicate at all. Given that we've worked so hard to agree on what
the word URN means, why should we give it a new name?

> Which means, obviously, that I will forbid the use of such an identifier
> in any system which I design or am responsible for standardization.
> That is what I've said consistently for over 1.5 years now, that is what
> I will recommend to the W3 Consortium members, and that is the objection
> I will continue to raise every time this is discussed within the IETF.
>
> Is that clear?

In the IETF at least, you have no authority to forbid any such thing.
We make decisions by rough consensus, but the consensus of the group
can override any individual.

I personally would think it silly for us to develop this new kind of
identifier that we have been calling a URN all along, and use any
prefix for that identifier other than URN:. But if "silly" doesn't
cause any implementation or operational problems, you might be able to
get the group to go along with you and use some other prefix.

On the other hand, if we end up defining lots of new URI prefixes, we
will have been wasting our time for the past 4 years or so, because we
will have effectively gained nothing over normal URLs.
That's not silly, that's tragic.

> >> All you have done is define a single scheme-uber-alles called "URN".
> >> That is not desirable, reduces flexibility and robustness, and standardizes
> >> mechanisms that have no implementation experience on a global scale.
> >
> > URN: is not a scheme. And you have failed to justify your other attacks.
> > In particular, you have not explained why any of the following is
> > undesirable, inflexible, or non-robust:
> >
> > + a common prefix and NA space for all URNs
>
> See above. And this is at least the fourth time I have provided sufficient
> reason and argument for why a common prefix and NA space for all URNs
> is both unnecessary and undesirable [see the mailing list archive].
> Not once has ANYONE come up with any proof that a single URN scheme is
> necessary and sufficient to encompass all resource names. As far as I'm
> concerned, this discussion is closed until such time as that proof is given.

As far as I'm concerned, you haven't produced a reasonable
counterexample.

> > + resolution services for URNs are advertised in one or more
> > global registries. clients need not be configured to resolve
> > URNs on a per-scheme basis; they can simply consult one or more
> > of the registries to see which services/protocols are available.
> > (clients can special-case lookups for part of the name space
> > if they want to; but the ability to resolve a URN doesn't depend
> > on them doing so.)
>
> And how does the client get "configured to consult one of the registries"?

By default, it's shipped to point to whatever registry is in vogue
when the client was built; the site or user can customize it based on
local needs and practice but IT WORKS OUT OF THE BOX for most users in
most environments.

> The WWW mechanism for doing this depends on the existence of scheme names.

Yes, but that mechanism makes it very difficult to deploy new "schemes".
(and a URL "scheme" isn't the same thing as a URN "scheme", at least,
not to everybody)

> Relative URL parsing depends on the existence of scheme names.

We're talking about URNs, not URLs. And if you have URNs you don't
need relative URLs. There are enough potential problems with extending
Relative URLs to the URN world that I am very dubious of requiring
"relative URNs" in URN space.

> Hell, ALL
> EXISTING IMPLEMENTATIONS OF URIs DEPEND ON THE EXISTENCE OF SCHEME NAMES.

This isn't a justification for anything in particular. The reason
we're doing this little four-plus year exercise is that "existing
implementations of URIs" aren't sufficient for our needs.

> > Nothing about our proposal requires the client to use the URN registries.
> > But if we were to design a scheme such that a client NEEDS "to define,
> > without reference to any network, how identifiers are to be resolved",
> > THAT would be undesirable.
> >
> > As for being able to "resolve" a URN without being connected to the network...
> > I don't know what this means. Either the client has access to the external
> > services it needs to make use of a URN, or it doesn't. If the client doesn't
> > have access to those services, the URN isn't very useful to the client
> > except for comparison with other URNs.
>
> You obviously haven't read the references I posted earlier.
> Here they are again:
>
> http://www.acl.lanl.gov/URI/archive/uri-94q4.messages/0093.html
> http://www.acl.lanl.gov/URI/archive/uri-94q4.messages/0101.html
> and
> http://www.ics.uci.edu/pub/ietf/uri/draft-ietf-uri-roy-urn-urc-00.txt

I've read all of these, multiple times. Please consider that other
people have different ways of mapping out the solution space -- just
because they have picked different architectures doesn't mean that
they aren't trying to solve the same problems you are trying to solve,
in different ways.

(C'mon. Does the "scheme" really have to be the part of the URI
before the first colon?
Do URNs really have to share a common syntax with URLs... down to
including the path structure? If you really want URNs to be
persistent, you don't put any semantically loaded information in them
at all...certainly not information that reflects the internal
structure of multi-file documents.)

> If you don't support the identification of resources that may already
> be on your local disk, identified within a personal database of resources
> located in a real-world bookshelf, or located within the user's local
> University library, then you have failed to solve the URN problem.
> You don't have to define these resolution mechanisms -- you just have
> to make them possible with minimum difficulty.

Actually, we do support the identification of such resources, but not
with names that indicate where the resources are stored. After all, a
resource originally in my personal database could eventually become
available to the entire world...should the resource name then change
and then invalidate all of the references to it? But I could certainly
configure my client to search my personal resource database, my mail
folders, etc. before searching the DNS registry.

Keith
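
As an illustration of the last point, a client-side search order
(personal database first, then a global registry) might be sketched in
Python roughly as follows. This is only a sketch under stated
assumptions: the URN strings, the source names, and the table-backed
resolvers are hypothetical stand-ins, not part of the Knoxville
proposal or of any real registry protocol.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A resolver maps a URN string to a location, or None if it has no entry.
Resolver = Callable[[str], Optional[str]]


def make_table_resolver(table: Dict[str, str]) -> Resolver:
    """Build a resolver backed by an in-memory table (a stand-in for a
    personal database, a mail folder index, or a network registry)."""
    def resolve(urn: str) -> Optional[str]:
        return table.get(urn)
    return resolve


def resolve_urn(
    urn: str, search_path: List[Tuple[str, Resolver]]
) -> Optional[Tuple[str, str]]:
    """Try each source in order; return (source_name, location) for the
    first source that knows the URN, or None if none of them do."""
    for name, resolver in search_path:
        location = resolver(urn)
        if location is not None:
            return name, location
    return None


# Hypothetical data: the same name is known locally and globally.
personal_db = make_table_resolver(
    {"urn:example:foo": "file:///home/me/foo.txt"}
)
global_registry = make_table_resolver(
    {
        "urn:example:foo": "http://example.org/foo",
        "urn:example:bar": "http://example.org/bar",
    }
)

# The name itself never says where the resource lives; only the
# client's configured search order decides which source answers first.
search_path = [
    ("personal-db", personal_db),
    ("global-registry", global_registry),
]

foo = resolve_urn("urn:example:foo", search_path)
bar = resolve_urn("urn:example:bar", search_path)
missing = resolve_urn("urn:example:baz", search_path)
```

In this sketch the name stays the same whether the resource is found
locally or through the registry, so moving a resource from a personal
database into a global registry would not invalidate existing
references to it.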
Received on Wednesday, 29 November 1995 04:36:35 UTC