Re: The Path URN Specification

michael rabinovich (misha@research.att.com)
Mon, 20 Mar 95 17:11:40 EST


Message-Id: <9503202219.AA29164@mocha.bunyip.com>
Date: Mon, 20 Mar 95 17:11:40 EST
From: misha@research.att.com (michael rabinovich)
To: liberte@ncsa.uiuc.edu, martin@mrrl.lut.ac.uk
Subject: Re: The Path URN Specification
Cc: uri@bunyip.com

I also enjoyed reading LaLiberte & Shapiro's proposal. 

However, I think it has some shortcomings.

(1) It will increase traffic due to DNS requests for partial
name resolutions. This problem, however, could be avoided if 
DNS server software could be changed.

(2) A more serious problem, I think, is that scalability (a purely
performance issue) is now not transparent to the end-user. Thus,
certain non-semantic reasons influence the way resource names are
constructed. For instance, assume there is a URN/URL resolver with the
/publishers/uk prefix. Then, as the scale grows beyond this resolver's
capability, we add another server and give it the prefix
/publishers/uk/hamish-hamilton. What should we do with documents
that were already semantically under hamish-hamilton? Change
their names? This would violate name persistence.
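
To make the concern concrete, here is a minimal sketch in Python. It
assumes that a name is handed to the resolver with the longest
registered prefix; that rule, the function name pick_resolver, and the
document name "some-book" are my own illustration, not taken from the
proposal.

```python
def pick_resolver(name, prefixes):
    """Return the longest registered prefix covering `name`, or None."""
    parts = name.strip("/").split("/")
    # Try the full path first, then progressively shorter prefixes.
    for end in range(len(parts), 0, -1):
        candidate = "/" + "/".join(parts[:end])
        if candidate in prefixes:
            return candidate
    return None


# Hypothetical document that was registered before the new server existed.
name = "/publishers/uk/hamish-hamilton/some-book"

before = {"/publishers/uk"}
after = before | {"/publishers/uk/hamish-hamilton"}

owner_before = pick_resolver(name, before)
owner_after = pick_resolver(name, after)
```

Under that assumption the same, unchanged name is routed to a different
resolver once the new prefix appears, so either the new server must
inherit the old entries or the document must be renamed.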

Also, as a resolution server becomes overloaded and a new server is added,
it would be natural to split the load in half, rather than to 
have the parent resolver work at capacity while the newly added
resolver stays almost idle until enough new names are registered.

The root problem is that this scheme makes the server hierarchy
part of a resource name. This is semantically confusing; it also
makes names unstable, as server hierarchies tend to change.


What we are doing as part of an ongoing project here at Bell Labs
is to use URL syntax for URNs, and HTTP as the protocol for talking
to the URN/URL resolution server. For instance, the URN from the
original draft, <URN:dns:path.net:mitra1234>, looks like

http://path.net/resolver/mitra1234.

Here "resolver" is a CGI script that actually does the resolution.
(Removing it from the name is just a technicality.)
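
The mapping between the two forms is mechanical. A minimal sketch in
Python, assuming the draft URN has exactly the four colon-separated
fields shown above; the function name urn_to_resolver_url and the
script parameter are hypothetical, for illustration only.

```python
def urn_to_resolver_url(urn, script="resolver"):
    """Rewrite <URN:dns:HOST:LOCAL> as http://HOST/<script>/LOCAL."""
    body = urn.strip()
    # Drop the optional angle brackets around the URN.
    if body.startswith("<") and body.endswith(">"):
        body = body[1:-1]
    parts = body.split(":", 3)
    if len(parts) != 4 or parts[0].upper() != "URN" or parts[1].lower() != "dns":
        raise ValueError("not a dns-scheme URN: %r" % urn)
    host, local = parts[2], parts[3]
    return "http://%s/%s/%s" % (host, script, local)


url = urn_to_resolver_url("<URN:dns:path.net:mitra1234>")
```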

Then, just as in LaLiberte & Shapiro's proposal, the resolver can
return a redirect with the actual URL of the document (in fact,
we use dynamic replication to deal with information server overloading,
so there are several URLs to choose from).
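
A rough sketch of that redirect step in Python. The round-robin choice
among replicas is only an assumption made for the example (nothing here
says how a replica is actually picked), and the class name
RedirectResolver and the example.org URLs are hypothetical.

```python
class RedirectResolver:
    """Maps a name to one of several replica URLs via an HTTP-style redirect."""

    def __init__(self, table):
        # name -> list of replica URLs
        self._table = {name: list(urls) for name, urls in table.items()}
        # name -> index of the replica to hand out next
        self._next = {name: 0 for name in table}

    def resolve(self, name):
        """Return (status, headers) as an HTTP server would send them."""
        urls = self._table.get(name)
        if not urls:
            return 404, {}
        i = self._next[name]
        self._next[name] = (i + 1) % len(urls)  # rotate through replicas
        return 302, {"Location": urls[i]}


resolver = RedirectResolver({
    "mitra1234": ["http://a.example.org/doc", "http://b.example.org/doc"],
})
first = resolver.resolve("mitra1234")
second = resolver.resolve("mitra1234")
third = resolver.resolve("mitra1234")
missing = resolver.resolve("no-such-name")
```

Because the answer is an ordinary redirect, the client needs no
knowledge of how many replicas exist or which one it was given.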

This scheme also makes it simple to register new names, change
URN->URL mappings when a document moves, etc. Our prototype server
has scripts that do that. Moreover, when registering a new document,
the user can ask for a specific URN, which will be assigned if it
does not exist already. 
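
In outline, registration and remapping amount to something like the
following Python sketch. This is not the prototype's scripts; the names
Registry, NameTaken, register and move are made up for illustration,
and storage is just an in-memory dictionary.

```python
class NameTaken(Exception):
    """Raised when a requested URN is already registered."""


class Registry:
    def __init__(self):
        self._urls = {}  # URN -> current URL

    def register(self, urn, url):
        # A user-chosen URN is assigned only if it does not exist already.
        if urn in self._urls:
            raise NameTaken(urn)
        self._urls[urn] = url

    def move(self, urn, new_url):
        # The document moved: the URN stays, only the mapping changes.
        if urn not in self._urls:
            raise KeyError(urn)
        self._urls[urn] = new_url

    def lookup(self, urn):
        return self._urls[urn]


registry = Registry()
registry.register("mitra1234", "http://a.example.org/doc")
registry.move("mitra1234", "http://b.example.org/doc")
```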

Also, no change to current Mosaic browsers is required.

We deal with scalability issues internally, so that the user is not
affected. We do allow a hierarchical namespace, but the hierarchy
is determined entirely by the semantics, not by the server hierarchy.
In fact, I anticipate that a flat namespace will be used most often
(just as we use a flat namespace for telephone numbers).

We should have a server outside the firewall pretty soon, and I will then
ask people to try it out. In the meantime, does anyone see anything
immediately wrong with our approach?

Michael Rabinovich.