URN: vs alternatives from Daniel LaLiberte on 1995-11-26 (uri@w3.org from November 1995)

From: Daniel LaLiberte <liberte@ncsa.uiuc.edu>
Date: Sun, 26 Nov 95 00:47:04 CST
To: fielding@avron.ics.uci.edu, moore@cs.utk.edu
Cc: uri@bunyip.com, urn@mordred.gatech.edu
Message-Id: <9511260647.AA05834@void.ncsa.uiuc.edu>

> From: Keith Moore <moore@cs.utk.edu>

> I don't know how I can make this any clearer:

> 1. I (and others who were at Knoxville) have said, repeatedly, that
> the client chooses how to resolve a URN.  (One of our oft-stated
> design principles was "a client can and will do whatever it wants to".)

Ah, at last... this meme is taking on a life of its own, complete
with mutation.

> 2. I have also said, repeatedly, that the URN syntax that we defined
> is NOT tied to DNS, that other registries besides the DNS registry
> are expected.  It is essential that the syntax does not imply DNS -- 
> if for no other reason than to allow transitions to other registries 
> in the long term.

Although the syntax does not imply DNS, the protocol for the first
stage of resolving the "urn:..." does imply DNS, at least for now.  Any
other protocol may be used in the future, and clients can do whatever
they want today anyway.  That second sentence is true of all URLs too.

> 3. URN: in the Knoxville proposal is NOT a "scheme".  URN: is a prefix 
> that allows clients to identify URNs in text and to distinguish URNs 
> from other kinds of URIs.  The Knoxville proposal doesn't have "schemes",
> because -- to the extent a "scheme" dictates a resolution protocol --
> the inclusion of a "scheme" impairs the longevity of the URN.

But what I and others have been referring to as the Knoxville URN scheme
IS a scheme.  The "urn:" prefix on an identifier signifies that it is a
URN, and that the protocol that *may* be used to look up info in the
global registry is a DNS query for the domain name constructed from the
naming authority (the details still need to be defined).  This is the
first stage of the resolution protocol.  It is like many other
protocols in that the client communicates with a server to get some
information; in the case of DNS, several servers are contacted.

Now the info that is returned from this first stage is identification
of some (zero or more) resolution services, and what protocol they
speak.  The client can use these services to continue the resolution
process.  

The opaque string part of the URN is interpreted by these
secondary resolvers.  The name of the naming authority is essentially a
sub-scheme name for the opaque string, since the naming authority
maps to a set of resolution services, each of which resolves the
remainder of the URN in whatever way is appropriate.

So the "sub-scheme" does not dictate a protocol so much as the structure
of the name space, since several different servers may resolve URNs in
the same name space, each with a different protocol.

For example, urn:urn/isbn:1234-56-7890 may be first resolved into
servers identified by: http://www.ncsa.uiuc.edu/cgi-bin/isbn and
whoispp://bunyip.com/scheme=isbn (I'm making up stuff, obviously).
Each of those services may be given the urn:urn/isbn:1234-56-7890
in whatever way is appropriate.
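
To make the two stages concrete, here is a rough sketch in Python.  It
is only an illustration under assumptions of my own: the in-memory
REGISTRY table stands in for the DNS lookup, the function names are
hypothetical, and splitting the URN at the first colon after "urn:" is
a guess at a syntax that still needs to be defined.  The service
addresses are the made-up ones from the example above.

```python
# Hypothetical first-stage registry standing in for the DNS lookup.
# It maps a naming authority to (protocol, service address) pairs.
REGISTRY = {
    "urn/isbn": [
        ("http", "http://www.ncsa.uiuc.edu/cgi-bin/isbn"),
        ("whois++", "whoispp://bunyip.com/scheme=isbn"),
    ],
}


def split_urn(urn):
    """Split 'urn:<naming-authority>:<opaque-string>' into its two parts.

    Assumption: the naming authority ends at the first colon after the
    'urn:' prefix; the real syntax may differ.
    """
    prefix, sep, rest = urn.partition(":")
    if prefix.lower() != "urn" or not sep:
        raise ValueError("not a URN: %r" % urn)
    authority, sep, opaque = rest.partition(":")
    if not authority or not sep:
        raise ValueError("missing naming authority or opaque string: %r" % urn)
    return authority, opaque


def first_stage(urn, registry=REGISTRY):
    """Stage one: find zero or more resolution services for the authority."""
    authority, _ = split_urn(urn)
    return list(registry.get(authority, []))


def second_stage_requests(urn, registry=REGISTRY):
    """Stage two: pair each service with the URN it should be handed.

    How a service interprets the opaque string is up to that service,
    so this sketch stops at deciding who gets asked.
    """
    return [(protocol, service, urn)
            for protocol, service in first_stage(urn, registry)]
```

Note that an unknown naming authority simply yields an empty list of
services, matching the "zero or more" above.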

> Roy said:
> > That is true of any URI.  If the client is designed correctly, the
> > resolution protocol is defined at run-time as a binding from scheme
> > name to some resolution protocol (it doesn't matter what resolution
> > protocol, so long as it is one that the client has implemented).
> > The argument that this makes the identifier dependent on the resolution
> > scheme is just plain false, as proven by any HTTP proxy.

> It's false only if everyone runs a proxy.  For everyone who doesn't run
> a proxy (which as far as I can tell, includes most people who run native 
> IP), a URL is tied to its protocol.  

I agree with Roy.  Remember, the client can do whatever it wants, and
this is true for URLs or URNs.  But maybe you mean something different
by a URL being "tied to" its protocol.

> Do you assume that everyone *should* run a proxy?  I don't.  

I do.  Caching proxies are essential for the scalability of the web.
But nevertheless...

> I'd far rather see the rules for how you look up a URN advertised by the 
> owner of the resource, and optionally overridden by a client, than 
> for reliable URN resolution to *depend* on people keeping their 
> clients/proxies configured.  

Hmm, we're talking about different things.  The first stage of the
global resolution of a Knoxville URN is as described above, using DNS.
The second stage is directed by the info found from the first stage.

What I believe Roy is suggesting (and I agree with) is that other URN
schemes are possible for this first stage (or the whole process) that
do not use DNS.  On the other hand, one scheme using DNS is not a bad
idea, especially one that is very extensible, but it should not be
assumed that all schemes need to fit into the same mold, no matter how
general.  And (incidentally) no matter what the scheme, clients can be
configured to do whatever they want anyway.
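
As a rough illustration of that last point, here is a sketch in Python
of a client that binds scheme names to first-stage resolvers at run
time.  Everything in it is hypothetical: the Client class, the two
stand-in resolver functions, and the idea that a first stage just
returns a list of service addresses are my assumptions for the sake of
the example, not part of any proposal.

```python
def dns_first_stage(identifier):
    """Stand-in for the DNS-based first stage (returns a made-up service)."""
    return ["http://www.ncsa.uiuc.edu/cgi-bin/isbn"]


def site_table_first_stage(identifier):
    """Stand-in for some other first stage that does not use DNS."""
    return ["whoispp://bunyip.com/scheme=isbn"]


class Client:
    """Binds scheme names to first-stage resolvers at run time."""

    def __init__(self):
        # Default binding: the "urn" prefix goes through the DNS-based stage.
        self.bindings = {"urn": dns_first_stage}

    def bind(self, scheme, resolver):
        """Add or override the first stage used for one scheme."""
        self.bindings[scheme.lower()] = resolver

    def first_stage(self, identifier):
        """Look up the resolver bound to the identifier's scheme and call it."""
        scheme, sep, _ = identifier.partition(":")
        resolver = self.bindings.get(scheme.lower()) if sep else None
        if resolver is None:
            raise LookupError("no first stage bound for %r" % identifier)
        return resolver(identifier)
```

A client that wants something other than DNS calls
bind("urn", site_table_first_stage) and the identifier itself does not
change.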

> We agree, except that you're assuming the existence of a "scheme".

You can't avoid the existence of a scheme.  But, again, that depends on
what you mean by scheme.  I mean the syntax and structure of the name
space.  You mean the particular protocol used after the first stage of
name resolution that finds the resolution services.

>   + a common prefix and NA space for all URNs

I see no need to have a common prefix for all names.   We will already
have failed because of news message ids.  We don't gain anything we don't
already have for all other URLs, namely global uniqueness (not explicitly
promised, but close enough).  The persistence of URNs is ultimately a
matter of human organizational longevity in any case, though we can make
it better by allowing reorganization more easily.  If there is something
essential gained by knowing an identifier is a URN, then we can have
a registry of URI scheme names that satisfy name semantics, whatever
those turn out to be.

>   + resolution services for URNs are advertised in one or more
>     global registries. clients need not be configured to resolve
>     URNs on a per-scheme basis; they can simply consult one or more
>     of the registries to see which services/protocols are available. 
>     (clients can special-case lookups for part of the name space
>     if they want to; but the ability to resolve a URN doesn't depend
>     on them doing so.)  

This is very good, but not sufficient reason to require all future URNs 
to fit into the mold.

> > Using the scheme "URN:" would doom all other names to the same resolution
> > strategy, which just isn't sufficient for my needs and certainly isn't
> > sufficient for the World-Wide Web.

> Nothing in the Knoxville proposal dictates what resolution strategy to use.

Saying that is confusing.  Given what you have said above, I believe
you mean the second stage of the resolution is not dictated - and this
is true.  But what Roy means is that the first stage is dictated (for
the near future anyway) - and this is true.

So once again, I think we can agree if only we could get our terms
straight.

dan
Received on Sunday, 26 November 1995 01:48:20 UTC