Re: URN Services draft

Ron Daniel Jr. (rdaniel@acl.lanl.gov)
Fri, 16 Jun 1995 16:45:46 -0600


From: "Ron Daniel Jr." <rdaniel@acl.lanl.gov>
Message-Id: <9506161651.ZM1810131@macrdan.acl.lanl.gov>
Date: Fri, 16 Jun 1995 16:45:46 -0600
To: uri@bunyip.com
Subject: Re: URN Services draft

First, I want to thank Keith, Stu, Eric, and Vince for submitting this
draft. I find parts of it very attractive, even though most of this
message is devoted to critiques.

Second, I would like to see this material broken into two drafts - one
on the URN syntax and another on how the URNs are to be resolved. The
level of detail this draft goes into on resolution is too much for a
naming paper but not enough for a resolution paper. I would like to
follow Larry's comment that names are names and resolution is resolution
so we can deal with their details separately. Certainly we don't want a
naming system that impedes resolution. However, if we break things up
into smaller pieces we stand a better chance of swift agreement.


> 1.0 Design Philosophy
>
> Several URN proposals (including this one) propose a URN syntax that
> consists of some variation on "Name Authority/Opaque String" (NA/OS).

This leaves out an important element - a name space identifier. Later on
you suggest a method for allowing both FQDNs and some other naming
scheme to co-exist. That is a really nice idea that we can build on for
interoperability. However, I don't think that a NA/OS syntax is going to
be capable of the necessary level of backward compatibility. Later in
the paper it is suggested that the naming authority is the place to
identify the old scheme to be grandfathered in. This doesn't seem quite right.
For example, DUNS numbers are a good system for identifying
organizations, but not the publications produced by the organizations.
Given a name like DUNS/04-997-747 3 (and what else goes after the last
digit?), does Dun and Bradstreet now have to set a policy that about 30
million organizations are going to follow? I don't think so.

Certainly shorter names are nicer, and we may want to have a default
namespace so that common naming schemes can have short URNs, but I would
like to see this point of other naming systems addressed.
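
To make the default-namespace idea concrete, here is a rough sketch in
Python. The "scheme:authority/opaque" form, the DEFAULT_SCHEME value, and
the parse_name function are all hypothetical illustrations; neither the
draft nor this message defines such a syntax.

```python
from typing import NamedTuple

# Hypothetical label for the name space assumed when none is given.
DEFAULT_SCHEME = "default"


class ParsedName(NamedTuple):
    scheme: str      # name space identifier
    authority: str   # naming authority within that name space
    opaque: str      # opaque string assigned by the authority


def parse_name(text: str) -> ParsedName:
    """Parse '[scheme:]authority/opaque' (an assumed, illustrative syntax)."""
    head, slash, opaque = text.partition("/")
    if not slash or not head or not opaque:
        raise ValueError(f"expected authority/opaque, got {text!r}")
    scheme, colon, authority = head.partition(":")
    if not colon:
        # No explicit name space: short names fall back to the default.
        scheme, authority = DEFAULT_SCHEME, head
    if not scheme or not authority:
        raise ValueError(f"empty scheme or authority in {text!r}")
    return ParsedName(scheme.lower(), authority, opaque)
```

With this, a short name such as OCLC/1234 lands in the default name space,
while a name from another scheme carries its identifier explicitly.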


> I. In the simplest case, a URN should be resolvable to a URL. In other
> cases, a URN may need to be resolved to a list of URLs [URC0] or a
> complete URC. To support this, one cannot merely say that a URN
> resolution request in a document is specified as URN:NA/OS. Instead,
> one should explicitly state the service desired.

The notion of resolution services is very nice, and in fact I have
shamelessly ripped off this idea and put it into the query language of
the SGML URC draft. However, I strongly disagree with the notion that
the author should specify the service. Sometimes I want the location so
my user agent can easily fetch the resource. Other times I want to look
at the URC. This is not a decision the author can or should make for me.

I can think of one case where an author might reasonably want to make
such a specification - when writing a document discussing URN resolution
and showing the difference between the URC and the resource. *Maybe* we
want to allow authors to specify it, but I *strongly* disagree with the
notion that it should be required. I think this is better handled as
part of the protocol used in resolving names.


> II. To promote competitive, value-added URN resolution services, it is
> envisioned that multiple resolution hosts may be available to resolve
> URNs from certain name spaces. This results in a scheme that allows
> authors and users a means by which they can control which URN
> resolution hosts are contacted to resolve a URN. This control is
> independent of the actual "URN".  (see 4.0)

I *strongly* agree with the notion that we should be able to resolve
a URN at any one of a number of services. How much of this the
author should specify is another matter.

This seems to be a forward pointer to section 5 of the document,
Resolution Paths, rather than section 4.

There are certainly times when it will be useful to identify a particular
server's URC. For example, I might want to compare and contrast OCLC's
and LC's URCs for "Moby Dick". Michael Mealling and I had discussed this
issue. His solution, which was much nicer than mine, was to say that we use
a URL to identify a particular server's URC. To talk about the LC and OCLC
URCs for Moby Dick, we might use:
    http://uri.oclc.org/x-dsn-2:foo:MobyDick
    http://urns.lc.gov/x-dsn-2:foo:MobyDick
Here the URN is basically the query string we send to a particular server.
This will be very useful, but these are not URNs anymore.
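
As a small sketch of that construction (the function name is hypothetical;
the hosts and the URN are just the ones from the two example URLs above):

```python
def server_urc_url(server_host: str, urn: str) -> str:
    """Build a URL that names one particular server's URC for a URN.

    The URN is passed verbatim as the path sent to that server, as in
    the two example URLs above. The result is a URL, not a URN.
    """
    return f"http://{server_host}/{urn}"


oclc = server_urc_url("uri.oclc.org", "x-dsn-2:foo:MobyDick")
lc = server_urc_url("urns.lc.gov", "x-dsn-2:foo:MobyDick")
```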

RPs are a step back from the long life and machine independence of URNs.

They are dangerous for the reasons Paul Hoffman pointed out earlier.


> IV. This scheme requires a Naming Authority Registry. Since
> information from this Name Authority Registry will be easily cached, we
> assume a flat name authority space. Currently DNS root servers know the
> top two levels of the hierarchy, so, from an implementation standpoint,
> OCLC and OCLC.ORG are already functionally equivalent. The number
> of naming authorities can reasonably be expected to be smaller than the
> number of such domains, suggesting that the problem of a flat namespace
> is less problematic than it might appear. 
>
> In the event that the flatness of the namespace does present problems,
> there are reasonable ways to speed up initial name authority resolution that
> need not impose hierarchical requirements on the name authority space. 

It is not obvious to me that the number of naming authorities will be
smaller than the number of second level domains. Any arguments on why it
should be?

The URN requirements document stops just short of requiring a
hierarchical space for naming authorities.  Not enough details are
provided to evaluate the Naming Authority Registry part of this proposal
- it would be a separate proposal anyway. However, I would require a hell
of a lot of convincing that any flat namespace with centralized resolution
is going to scale to the degree that we need.


> 2.0 URN Definition
> 
...
> A URN is a string that consists of a naming authority, a "/", and an
> opaque string (e.g., OCLC/1234). 

See comments above about need for naming scheme identifier.

> A URN identifies a unique object (not unique content); different
> representations of the same intellectual content (or service) will have
> different URNs (these URNs may be tied together by a URC.) 

This certainly violates the letter of the URN requirements document,
which says that a URN identifies "whatever a name assignment authority
determines is a distinctly namable entity". It is OK for naming
authorities to decide that URNs identify format and not intellectual
content. The other way is OK too. It is not OK for us to mandate they do
it a particular way. Also, the spirit of the URN as set in discussions
on the list was a strong preference for identifying intellectual
content.


> 3.0 Naming Authorities
>
> The naming authority part of a URN can be of two types: a name
> authority ID or a resolution host. 

I like the ability to allow multiple naming schemes to be combined in this
fashion. This is something we will probably want to build on for
interoperability. Distinguishing between lots of different schemes is
quickly going to become difficult without a name space identifier.


> 4.0 Resolution Services
> ...
> 4.1 BNF for Resolution Requests

See comment above about splitting this draft into 2 parts -
naming and resolution.


> 4.5 URL to URN (L2N)
>
> Resolution of a URL to zero or more URNs. This would allow for old
> URLs to be updated appropriately. 

This will be handy for reverse lookups, which experience has shown is
a useful capability. However, I am uncertain of their utility for
turning crufty old URLs into shiny new URNs. In a URC, old
URLs should be deleted. To take an obsolete URL and get the URN will
require that the URC server keep all the deltas for a URC around
in a form that can be searched. Not impossible, but bulky and slow.

> 5.0 Resolution and Resolution Paths
...
> The client has three types of resolution hosts it can send requests to. 
>
>    1. Client specified Resolution Host from client table. (see 5.2) 
>    2. Author specified Resolution Host from optional Resolution Path.
>    3. Naming Authority from URN (Name Authority ID or Resolution
>       Host). 

Providing 2. makes URNs into URLs, with all that implies for the long-term
utility of a document. 1 and 3 are OK.

> 5.1 Resolution Service (Resolver)
>
> The mechanism that will fulfill the resolution request is called the
> resolution service. 
>
> The resolution service can either resolve the request fully and return the
> requested information or it can return a new resolution request (a redirect)
> telling the client to send its request to another server. 

Indeed, an excellent thing to work into the resolution protocol.
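
A rough sketch of how a client might follow such redirects, in Python. The
reply shape ("result" or "redirect"), the hop limit, and the example hosts
are assumptions made for illustration; the draft does not specify them.

```python
from typing import Callable, Dict, Tuple

# Assumed reply shape: ("result", data) or ("redirect", next_host).
Reply = Tuple[str, str]
Resolver = Callable[[str], Reply]


def resolve(urn: str, host: str, resolvers: Dict[str, Resolver],
            max_hops: int = 5) -> str:
    """Ask `host` to resolve `urn`, following redirects to other hosts."""
    seen = set()
    for _ in range(max_hops):
        if host in seen:
            raise RuntimeError(f"redirect loop at {host}")
        seen.add(host)
        kind, value = resolvers[host](urn)
        if kind == "result":
            return value
        host = value  # a redirect names the next host to ask
    raise RuntimeError("too many redirects")


# In-memory stand-ins for two resolution hosts (hypothetical names).
resolvers = {
    "a.example": lambda urn: ("redirect", "b.example"),
    "b.example": lambda urn: ("result", f"http://b.example/{urn}"),
}
answer = resolve("OCLC/1234", "a.example", resolvers)
```

The hop limit and loop check are there because a client that blindly
follows redirects can be sent in circles.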

> 5.2 Client Table
>
> The client may contain a table to allow override specification for each
> resolution service and naming authority ID. 

A good suggestion for implementors, but should be part of a separate draft
on resolution.
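
For implementors, a minimal sketch of such a table in Python. The key shape
(service, naming authority ID), the host names, and the fallback lookup are
assumptions for illustration, not something the draft spells out.

```python
from typing import Callable, Dict, Tuple

# Assumed key: (resolution service, naming authority ID) -> override host.
ClientTable = Dict[Tuple[str, str], str]


def pick_host(service: str, authority: str, table: ClientTable,
              registry_lookup: Callable[[str], str]) -> str:
    """Prefer a client-table override; otherwise ask the registry."""
    override = table.get((service, authority))
    if override is not None:
        return override
    return registry_lookup(authority)


table = {("L2N", "OCLC"): "local-cache.example"}
registry = {"OCLC": "resolver.example"}.__getitem__  # stand-in registry
```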

> 5.3.1 Naming authority ID is not overridden and a resolution
> path is not specified.
...
> The information returned by a name authority resolver is
> relatively static and can easily be cached by the client to reduce lookups. It
> is further assumed that this information can be mirrored by various
> servers.

By the time you fold in TTL, caching, and non-authoritative servers, this name
authority scheme is starting to look a lot like DNS.
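
To illustrate the resemblance, a client-side cache for name authority
lookups ends up looking like a DNS resolver cache. This is only a sketch;
the class name, the fixed TTL, and the lookup callback are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class AuthorityCache:
    """Cache naming-authority -> resolution-host answers for a fixed TTL."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, authority: str) -> str:
        now = self._clock()
        entry = self._entries.get(authority)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: answer from the cache
        host = self._lookup(authority)  # expired or missing: ask again
        self._entries[authority] = (host, now + self._ttl)
        return host
```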

> 6.1 Requirements for URN Function
...
> Scalability
>       A URN can be assigned to anything. It is up to the resolution
>       host to decide how a URN should be resolved or what it should
>       be resolved to.

Actually, making the resolution service a mandatory part of the request
means that it is NOT up to the resolution host to decide what a URN should be
resolved to.

> Legacy Support
>       Legacy schemes like "ISBN" are simply naming authorities and
>       can be supported assuming a resolution host is made available to
>       handle the resolution requests.

ISBN is a naming authority. DUNS is a naming scheme, a different beast indeed.



Once again, I think there is good stuff in here that we can build on, even
if I oppose many of the details.



-- 
Ron Daniel Jr.                     email: rdaniel@lanl.gov    
Advanced Computing Lab             voice: (505) 665 0597
MS B287                              fax: (505) 665 4939
Los Alamos National Laboratory      http://www.acl.lanl.gov/~rdaniel/
Los Alamos, NM  87545          tautology:"Conformity is very popular"