- From: Daniel LaLiberte <liberte@ncsa.uiuc.edu>
- Date: Thu, 29 Jun 95 01:47:06 CDT
- To: sollins@lcs.mit.edu
- Cc: FisherM@is3.indy.tce.com, uri@bunyip.com
Thanks for the involved response, Karen. Just what I need to stay awake at 1am.

From: "Karen R. Sollins" <sollins@lcs.mit.edu>

> You're right - I was mapping name of resolution service to resolution service. Since I tend to think in an object-oriented world, a "name" (think URN) for a resolution service will always name the same service, although it may move around (to my way of thinking, unless the names for resolution services are not long-lived, globally unique, etc.) Did you have some other naming scheme (namespace and resolution mechanism) in mind?

Yes, the name of a resolution service in a URN ought to behave much like a URN itself. The intention for the path scheme is that the name of a resolution service always maps to *logically* the same resolution service in terms of its behavior, but it may be a completely different service, or it may be absorbed by a service higher up in the path name space, never to be seen again. The distinction is analogous to a function that computes factorial versus one that looks up the results in a table. They achieve the same effect in completely different ways.

> Anyway, I see a slight variation on the problem still. Once a set of URNs have the name of a particular name resolution service embedded in them, that group of URNs will always be tied to the same resolution service as each other. I'm reluctant for us to choose a path where that kind of assumption about service in the unknown future is restricted.

This is true for the path scheme as described in the current internet draft. We have since then extended it with a fallback mechanism such that if the original resolution service (or its logical equivalent) does not want to deal at all with the resolution of some set of URNs, the client can effectively go back to higher-level resolvers. (Details forthcoming.) The top-level resolver will therefore accumulate older but still valuable URNs - something like the handle server would be good at this level.
> I agree with what I expect is an underlying motivation here: to be able to easily find a reliable, or just good, but most preferably authoritative, resolution service to resolve a URN.

Another important motivation is to allow the resolution of names to happen close to where the named resource lives, at least initially. It is better to put both under the same administrative unit if possible, since that will put the motivation for continuing to resolve the name in the right place. We also, of course, deal with what happens when the named resource moves.

> In addition, I hope we don't choose a path that will restrict generality in the future.

Generality is my motivation as well. That's why I am reluctant to impose any unnecessary semantics on URN resolution protocols. But that is another subject.

> In fact, a generalization of the path scheme has some interesting features. The path scheme takes a long string composed of potential substrings. It divides the string into a prefix substring and the remainder and hands the remainder to the resolver named by the prefix. That resolver either decides that it can resolve the remainder, or it strips a new prefix to use as the name for the next resolver, to which the new remainder will be handed. And so on.

All correct. The new version is a little different, but this is close enough for the purposes of your argument.

> This can be generalized further, by allowing each resolver to map the remainder either to a location or to a pair consisting of a resolver and string that aren't necessarily simply substrings of the string it was handed. So for example assume the resolver for the B component of path:/A/B/C/D/doc.html is given the remainder "C/D/doc.html". B chooses the next resolver in whatever way it desires and also computes some string for that resolver.

In general, this turns out to be just a sequence of arbitrary redirects which must be followed to discover what each subsequent resolver will do. It is powerful, but time consuming.
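A minimal sketch of the prefix-stripping walk described above may make the basic (non-generalized) case concrete. Everything named here is hypothetical and for illustration only: the `Resolver` class, its `locations` and `children` tables, the `resolve_path_urn` helper, and the example.com-style locations are assumptions, not anything taken from the internet draft.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Resolver:
    """Hypothetical resolver node; not taken from the internet draft."""
    name: str
    # Remainders this resolver can map to a location by itself.
    locations: Dict[str, str] = field(default_factory=dict)
    # Prefixes this resolver delegates, mapped to the next resolver.
    children: Dict[str, "Resolver"] = field(default_factory=dict)

    def resolve(self, remainder: str) -> Optional[str]:
        # Case 1: the resolver decides it can resolve the remainder itself.
        if remainder in self.locations:
            return self.locations[remainder]
        # Case 2: strip a new prefix and hand the new remainder to the
        # resolver that the prefix names.
        prefix, sep, rest = remainder.partition("/")
        child = self.children.get(prefix)
        if not sep or child is None:
            return None  # nothing left to strip, or unknown next resolver
        return child.resolve(rest)


def resolve_path_urn(root: Resolver, urn: str) -> Optional[str]:
    """Resolve a 'path:/A/B/...' name starting from the root resolver."""
    scheme, _, path = urn.partition(":")
    if scheme != "path":
        raise ValueError(f"not a path URN: {urn!r}")
    return root.resolve(path.lstrip("/"))


def build_example_tree() -> Resolver:
    """Build the /A/B/C/D chain from the example; locations are made up."""
    d = Resolver("D", locations={"doc.html": "http://d.example/doc.html"})
    c = Resolver("C", children={"D": d})
    # B answers one name directly and delegates everything under C.
    b = Resolver("B", locations={"readme.txt": "http://b.example/readme.txt"},
                 children={"C": c})
    a = Resolver("A", children={"B": b})
    return Resolver("root", children={"A": a})


if __name__ == "__main__":
    root = build_example_tree()
    print(resolve_path_urn(root, "path:/A/B/C/D/doc.html"))
    print(resolve_path_urn(root, "path:/A/B/readme.txt"))
```

In this sketch the resolver for B receives the remainder "C/D/doc.html", as in the example above. The generalized variant would let `resolve` return an arbitrary (resolver, string) pair instead of a strict suffix, which is exactly what turns the walk into a chain of redirects.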
One of the strong motivations of the path scheme is to allow locality of reference for scalability, so that if a client (or caches near the client) already knows where the resolver for, say, /A/B/C is, it can go there directly. A sequence of redirects does not allow direct access.

> This helps address the problem that I was suggesting above. If two URNs have the same resolver name embedded in them, when that resolver goes out of business and the URNs that would have gone to it now go to a variety of other services, as long as something knows about that dispersal, that something can be put in place of the original resolver service, causing the subsequent strings to be rewritten to reflect the new state of affairs.

That is also the essence of the fallback mechanism I mentioned above. The something that knows about the dispersal would be a higher-level resolver, or the root if none other.

> This is a scheme that at least works, but it leads to permanent inefficiencies because URNs cannot change; they are immutable, so all the URNs with that particular resolution service name will always be an indirection, once the original has gone out of business and its business dispersed.

Correct, as I also argued above. It's thoughts like these that make me think that we need to encourage indefinite retirement of old names to be eventually replaced by new, more direct names. The old names would continue to work but resolution of them would be slower than for the new names.

> The bottom line problem here is that the dispersion may not be algorithmic in the URN, but rather on some other basis that isn't known at the time of resolution (and may both be different for different sets of resources and may change with time). There may be a different way of handling each and every URN that was originally handled by a single resolution service.

This problem would seem to be true no matter what the URN scheme. Can you think of a scheme that avoids this problem?
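The locality argument and the fallback idea can be sketched together, with the caveat that the details of the actual fallback mechanism are not given in this message. The sketch below is only a guess at the general shape: the client tries the deepest resolver it already knows about and, if that resolver declines, backs up toward the root. The names `candidate_prefixes`, `resolve_with_fallback`, the `known` table, and the example locations are all hypothetical.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

# A resolver is modelled as a function from a remainder string to a
# location, returning None when it declines to resolve that remainder.
ResolverFn = Callable[[str], Optional[str]]


def candidate_prefixes(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (resolver prefix, remainder) pairs, deepest prefix first.

    For '/A/B/doc.html' this yields ('/A/B', 'doc.html'),
    ('/A', 'B/doc.html') and finally ('/', 'A/B/doc.html') for the root.
    """
    parts = path.strip("/").split("/")
    # The last component is always part of the remainder, never a resolver.
    for depth in range(len(parts) - 1, -1, -1):
        prefix = "/" + "/".join(parts[:depth])
        remainder = "/".join(parts[depth:])
        yield prefix, remainder


def resolve_with_fallback(
    path: str, known: Dict[str, ResolverFn]
) -> Optional[Tuple[str, str]]:
    """Return (prefix of the resolver that answered, location), or None."""
    for prefix, remainder in candidate_prefixes(path):
        resolver = known.get(prefix)
        if resolver is None:
            continue  # the client has no address cached for this prefix
        location = resolver(remainder)
        if location is not None:
            return prefix, location
        # The resolver declined: fall back to a higher-level resolver.
    return None


if __name__ == "__main__":
    # The /A/B/C resolver still handles current names directly ...
    c_table = {"D/doc.html": "http://c.example/D/doc.html"}
    # ... while the root has accumulated a name that /A/B/C no longer serves.
    root_table = {"A/B/C/old/report.txt": "http://archive.example/report.txt"}
    known = {"/": root_table.get, "/A/B/C": c_table.get}

    print(resolve_with_fallback("/A/B/C/D/doc.html", known))
    print(resolve_with_fallback("/A/B/C/old/report.txt", known))
```

The first lookup goes straight to the cached /A/B/C resolver; the second is declined there and ends up at the root, which illustrates why dispersed names keep working but pay for an extra step.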
Does the solution involve having *no* resolution service name in a URN? What's left is only an opaque string - who will resolve it? If there are N possible resolution services out there, do we try each of them? How large does N have to be to handle the load? Is this essentially the handle service?

> I think you were also suggesting that although the resolver name may be embedded in the URN it need not be used.

Yes, this is a fallback mechanism completely external to the particular URN scheme, in which the client chooses some promising resolution service(s). Consider again that a path URN might be resolved by a handle service which just hashes the whole string and looks up whatever info is associated with it.

Karen, I thought in our last discussion that you were arguing that it is essential to support such external fallback mechanisms, and therefore we need to know in advance that a URI is a name that lives forever, so we know that it is legitimate to attempt to resolve it in whatever way we may choose.

> The problem here is that if it is there, I suspect that users (applications, clients, whatever) will come to depend on it, and expect from early on that the resolution service name must be correct. This is more a matter of human nature forcing us in a direction that has significant drawbacks but apparent short-term payoffs. I think that we as the designers of the scheme need to be careful to be as visionary and long-sighted as possible.

Absolutely. But I disagree that there is a problem simply because URNs might have historical resolver names embedded in them. People, or software, will learn to ignore irrelevant details.

> We won't catch all the pitfalls, but we should try our best to avoid those we know about. (That's part of what makes the process challenging and often drawn out as more and more issues come to mind.)

Yes, each new twist is like the next bend in the river.
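To illustrate what "hashes the whole string and looks up whatever info is associated with it" could look like, here is a toy flat resolver that ignores any structure in the name. It is not a description of the actual handle service; the `FlatResolver` class, the use of SHA-256 to pick one of N tables, and the example location are assumptions made purely for illustration.

```python
import hashlib
from typing import Dict, List, Optional


class FlatResolver:
    """Toy flat name service: the URN is treated as an opaque string."""

    def __init__(self, n_servers: int) -> None:
        if n_servers < 1:
            raise ValueError("need at least one server")
        # One lookup table per (simulated) server.
        self.tables: List[Dict[str, str]] = [{} for _ in range(n_servers)]

    def server_for(self, urn: str) -> int:
        """Pick a server by hashing the whole string, ignoring its structure."""
        digest = hashlib.sha256(urn.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.tables)

    def register(self, urn: str, location: str) -> None:
        self.tables[self.server_for(urn)][urn] = location

    def resolve(self, urn: str) -> Optional[str]:
        return self.tables[self.server_for(urn)].get(urn)


if __name__ == "__main__":
    service = FlatResolver(n_servers=4)
    service.register("path:/A/B/C/D/doc.html", "http://d.example/doc.html")
    print(service.resolve("path:/A/B/C/D/doc.html"))
    print(service.resolve("path:/A/B/C/D/other.html"))
```

Because the embedded resolver names play no part in the lookup, a scheme like this never goes stale when a resolver disappears, but it also gives up the locality that the path scheme is after, and every name has to be registered with the flat service.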
> Anyway, to make a long story shorter, I agree with the desire to make this stuff efficient, but I believe that we shouldn't pay too heavy a price for that and should understand what the price is.

Agreed.

Daniel LaLiberte (liberte@ncsa.uiuc.edu)
National Center for Supercomputing Applications
http://union.ncsa.uiuc.edu/~liberte/
Received on Thursday, 29 June 1995 02:51:41 UTC