Re: URN: vs alternatives

Keith Moore (moore@cs.utk.edu)
Sun, 26 Nov 1995 03:56:09 -0500


Message-Id: <199511260856.DAA10465@wilma.cs.utk.edu>
From: Keith Moore <moore@cs.utk.edu>
To: liberte@ncsa.uiuc.edu (Daniel LaLiberte)
Cc: fielding@avron.ics.uci.edu, moore@cs.utk.edu, uri@bunyip.com,
Subject: Re: URN: vs alternatives 
In-Reply-To: Your message of "Sun, 26 Nov 1995 00:47:04 CST."
             <9511260647.AA05834@void.ncsa.uiuc.edu> 
Date: Sun, 26 Nov 1995 03:56:09 -0500

> Although the syntax does not imply DNS, the protocol for the first
> stage of resolving the "urn:..." does imply DNS, at least now.  

This is not how I understood our discussion in Knoxville. 
I would say it more like:

1. The first thing a client does in resolving a URN is to ask a 
   registry what services and protocols are available to resolve 
   that URN (and look for the service it needs and a protocol
   it understands).

2. One of the available registries is implemented using DNS. 

Now if by "does imply DNS, at least now" you mean that it's likely
that DNS will be the first registry deployed -- well, that's 
possible.  But it's not the only scenario I would consider.

I'm assuming that as soon as we define URNs we will have multiple 
registries, because it's clear that there is a lot of feeling both 
for and against DNS.  (I *wish* we could agree on using a single
registry for all URNs, but I don't see it happening.)  If we're
going to be stuck with multiple registries, it's important that
we be able to register services for any URN in any registry. 
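
A minimal sketch of the two steps above, assuming several registries
that a client consults in turn.  The class names, the "locate" service
label, the "urn:example-na:" prefix and the server name are all
hypothetical, chosen only for illustration; nothing here is defined by
the Knoxville proposal.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class ServiceRecord:
    """One advertised way to resolve part of the URN space (hypothetical)."""
    service: str   # what the resolver offers, e.g. "locate"
    protocol: str  # protocol the client must speak to use it
    server: str    # where the resolver lives


class Registry:
    """A hypothetical registry mapping URN prefixes to advertised services."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._entries: Dict[str, List[ServiceRecord]] = {}

    def register(self, urn_prefix: str, record: ServiceRecord) -> None:
        # Any registry may list services for any part of the URN space.
        self._entries.setdefault(urn_prefix, []).append(record)

    def lookup(self, urn: str) -> List[ServiceRecord]:
        # Return every record whose registered prefix matches this URN.
        found: List[ServiceRecord] = []
        for prefix, records in self._entries.items():
            if urn.startswith(prefix):
                found.extend(records)
        return found


def find_resolver(
    urn: str,
    registries: Iterable[Registry],
    wanted_service: str,
    known_protocols: Set[str],
) -> Optional[ServiceRecord]:
    """Step 1: ask each registry; pick a service we need and can speak to."""
    for registry in registries:
        for record in registry.lookup(urn):
            if record.service == wanted_service and record.protocol in known_protocols:
                return record
    return None


# Two registries; only the second one happens to list this URN.
dns_based = Registry("dns-based")
other = Registry("other")
other.register(
    "urn:example-na:",
    ServiceRecord("locate", "http", "resolver.example.org"),
)
match = find_resolver("urn:example-na:doc-42", [dns_based, other], "locate", {"http"})
```

Note that the choice of registry is made by the client, not by any
part of the URN itself.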

> > 3. URN: in the Knoxville proposal is NOT a "scheme".  URN: is a prefix 
> > that allows clients to identify URNs in text and to distinguish URNs 
> > from other kinds of URIs.  The Knoxville proposal doesn't have "schemes",
> > because -- to the extent a "scheme" dictates a resolution protocol --
> > the inclusion of a "scheme" impairs the longevity of the URN.
> 
> But what I and others have been referring to as the Knoxville URN scheme
> IS a scheme.  The "urn:" prefix on an identifier signifies that it is a
> urn, and that the protocol that *may* be used to look up info in the
> global registry is to use DNS to query for the domain name constructed
> from the naming authority, etc (as needs to be defined in more detail).

I agree, except that instead of "the protocol that *may* be used" 
I'd say "one protocol that may be used".  (Though if we can get
consensus on a single protocol for this, well and good.)

> The opaque string part of the URN is interpreted by these
> secondary resolvers.  The name of the naming authority is essentially a
> sub-scheme name for the opaque string, since the naming authority
> maps to a set of resolution services, each of which resolves the
> remainder of the URN in whatever way is appropriate.

I would refine this further so that the naming authority can register
any subset of its URN space to a set of resolution services for that
subset.

> > Roy said:
> > > That is true of any URI.  If the client is designed correctly, the
> > > resolution protocol is defined at run-time as a binding from scheme
> > > name to some resolution protocol (it doesn't matter what resolution
> > > protocol, so long as it is one that the client has implemented).
> > > The argument that this makes the identifier dependent on the resolution
> > > scheme is just plain false, as proven by any HTTP proxy.
> 
> > It's false only if everyone runs a proxy.  For everyone who doesn't run
> > a proxy (which as far as I can tell, includes most people who run native 
> > IP), a URL is tied to its protocol.  
> 
> I agree with Roy.  Remember, the client can do whatever it wants, and
> this is true for URLs or URNs.  But maybe you mean something different
> by a URL being "tied to" its protocol.

In the case of a URL, the thing a client normally does is to talk to
the host listed in the domain part of the URL, using the protocol
specified in the prefix of the URL.  It's certainly possible to add
an extra layer of indirection here -- either by a proxy, or as I
described in "URNs considered harmful".

Perhaps if we accept the idea that clients can do anything they want,
then we don't need to talk about the special cases.  (But when we have 
different views of the future, we'll disagree about which cases
are likely to be normal and which ones are likely to be special.)


> > Do you assume that everyone *should* run a proxy?  I don't.  
> 
> I do.  Caching proxies are essential for the scalability of the web.

If scalability of the web is to be achieved, it will probably happen
through some combination of caching (e.g. proxies) and replication
(multiple servers for any URN, listed in a location directory).
To me, the latter looks more promising.


> But nevertheless...
> 
> > I'd far rather see the rules for how you look up a URN advertised by the 
> > owner of the resource, and optionally overridden by a client, than 
> > for reliable URN resolution to *depend* on people keeping their 
> > clients/proxies configured.  
> 
> Hmm, we're talking about different things.  The first stage of the
> global resolution of a Knoxville URN is as described above, using DNS.
> The second stage is directed by the info found from the first stage.
> 
> What I believe Roy is suggesting (and I agree with) is that other URN
> schemes are possible for this first stage (or the whole process) that
> do not use DNS.  On the other hand, one scheme using DNS is not a bad
> idea, especially one that is very extensible, but it should not be
> assumed that all schemes need to fit into the same mold, no matter how
> general.  And (incidentally) no matter what the scheme, clients can be
> configured to do whatever they want anyway.

It may be that you are using "scheme" where I am using "registry".
What you describe sounds fine with me, so long as:

a) a URN from one "scheme" cannot be distinguished from a URN from 
   a different "scheme"  (i.e. URNs don't belong to "schemes")
b) any registry can potentially list resources for any URN.

> > We agree, except that you're assuming the existence of a "scheme".
> 
> You can't avoid the existence of a scheme.  But, again, that depends on
> what you mean by scheme.  I mean the syntax and structure of the name
> space.  You mean the particular protocol used after the first stage of
> name resolution that finds the resolution services.

I don't really mean either of these; certainly not the second.
I have been using the word "scheme" to mean: a visible portion of 
a URN which selects one of potentially several mutually disjoint 
subsets of URN space, and which tends to dictate ANY part of the
resolution process (first stage, second stage, whatever.)

As for syntax and structure of the name space: I don't think it's
desirable for clients to have to know how a URN of type "foo" is
organized in order to resolve it.  I'd like to see resolution protocols
(or registries) flexible enough to direct clients through incremental
resolution (say, to delegate aggregate portions of the URN space under
an NA) for most ways that a URN could be organized.  For example,
something like: if the last three characters of the LUI are "xyz",
talk protocol "Q" to server foo.bar.com.
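
A rough sketch of such a rule, assuming the registry hands the client
a list of (pattern, protocol, server) entries and the client only has
to match them against the LUI without understanding how the name is
organized.  The Rule type, the use of regular expressions, and the
sample LUIs are assumptions for illustration; protocol "Q" and
foo.bar.com come from the example above.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    """A hypothetical delegation rule published by a naming authority."""
    pattern: str   # regular expression tested against the LUI
    protocol: str  # protocol to speak for the next resolution step
    server: str    # server to contact for the next resolution step


def next_hop(lui: str, rules: List[Rule]) -> Optional[Tuple[str, str]]:
    """Return (protocol, server) for the first rule matching the LUI."""
    for rule in rules:
        if re.search(rule.pattern, lui):
            return rule.protocol, rule.server
    return None  # no rule applies; the client must try something else


# "If the last three characters of the LUI are 'xyz',
#  talk protocol 'Q' to server foo.bar.com."
rules = [Rule(pattern=r"xyz$", protocol="Q", server="foo.bar.com")]

hop = next_hop("report-17-xyz", rules)
miss = next_hop("report-17-abc", rules)
```

The client applies the rules mechanically; all knowledge of how the
name space is carved up stays with whoever published the rules.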

In general: A "scheme" might be useful for some purposes, but 
only if the client doesn't have to know anything about it to
work reasonably.

> >   + a common prefix and NA space for all URNs
> 
> I see no need to have a common prefix for all names.   We will already
> have failed because of news message ids.  

Put it this way: if we expect to be able to type in a URN into a 
blank labeled "URN:" and have the client do something reasonable,
and if we want to accommodate news message-ids as well as other
kinds of pre-existing resource names, we need a way to distinguish
news message-ids from ISBN numbers from whatever else we're going
to deal with.  One way to do this is by prepending a string to
the URN that identifies which kind of pre-existing resource name 
is being used as a URN.  E.g. URN:/isbn/{isbn} or URN:/mid/{message-id}

News message-ids aren't a problem because news readers that support
URNs will know that they have to prepend that string in order to 
look up a message-id in the URN registry.
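
As a small sketch of that prepending step, assuming the
URN:/{kind}/{name} form shown above: the function names, the
placeholder ISBN, and the choice to write the message-id without its
angle brackets are assumptions for illustration only.

```python
from typing import Tuple

PREFIX = "URN:/"


def to_urn(kind: str, name: str) -> str:
    """Wrap a pre-existing resource name so it can be used as a URN."""
    return f"{PREFIX}{kind}/{name}"


def split_urn(urn: str) -> Tuple[str, str]:
    """Recover (kind, original name) from a URN built by to_urn()."""
    if not urn.upper().startswith(PREFIX):
        raise ValueError(f"not a URN: {urn!r}")
    # Split at the first "/" only; the original name may contain more.
    kind, sep, name = urn[len(PREFIX):].partition("/")
    if not sep or not kind or not name:
        raise ValueError(f"missing kind or name: {urn!r}")
    return kind.lower(), name


# A news reader would do this before consulting a URN registry.
mid_urn = to_urn("mid", "199511260856.DAA10465@wilma.cs.utk.edu")
isbn_urn = to_urn("isbn", "0-00-000000-0")  # placeholder, not a real ISBN
```

A client that sees the result need not know what "mid" or "isbn"
means; the string only keeps the two name spaces from colliding.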

> >   + resolution services for URNs are advertised in one or more
> >     global registries. clients need not be configured to resolve
> >     URNs on a per-scheme basis; they can simply consult one or more
> >     of the registries to see which services/protocols are available. 
> >     (clients can special-case lookups for part of the name space
> >     if they want to; but the ability to resolve a URN doesn't depend
> >     on them doing so.)  
> 
> This is very good, but not sufficient reason to require all future URNs 
> to fit into the mold.

We can't require any such thing, anyway, and we'd be arrogant to try.  
If what we create isn't sufficient for future needs, future users can
and will adapt as best they can.  But we'd like to make a URN system 
which is flexible enough so that when new needs are identified, the 
system can be adapted without (say) invalidating old URNs or old clients.

> > > Using the scheme "URN:" would doom all other names to the same resolution
> > > strategy, which just isn't sufficient for my needs and certainly isn't
> > > sufficient for the World-Wide Web.
> 
> > Nothing in the Knoxville proposal dictates what resolution strategy to use.
> 
> Saying that is confusing.  Given what you have said above, I believe
> you mean the second stage of the resolution is not dictated - and this
> is true.  But what Roy means is that the first stage is dictated (for
> the near future anyway) - and this is true.

I don't even think this is true -- unless we can all agree on a single
protocol for that first stage.  But I'm not holding my breath.  I can
live with multiple ways of doing the "first stage" (what I've been
calling "registries") so long as the decision of which "first stage" 
to use isn't dependent on some part of the URN.

> So once again, I think we can agree if only we could get our terms
> straight.

Yes.  If we're not speaking the same language, we'll never know to what
extent we're agreeing...

Keith