Re: [URN] About relative URNs

Daniel LaLiberte (liberte@ncsa.uiuc.edu)
Thu, 1 May 1997 13:07:31 -0500 (CDT)


From: Daniel LaLiberte <liberte@ncsa.uiuc.edu>
Date: Thu, 1 May 1997 13:07:31 -0500 (CDT)
Message-Id: <199705011807.NAA23146@void.ncsa.uiuc.edu>
To: "Ron Daniel, Jr." <rdaniel@lanl.gov>
Cc: "URN Workgroup" <urn-ietf@bunyip.com>, uri@bunyip.com
Subject: Re: [URN] About relative URNs
In-Reply-To: <3.0.32.19970501093256.009c2970@cic-mail.lanl.gov>

 > > >   1) Unambiguous determination of the base URN
 > >
 > >No problem.

Ron Daniel, Jr. writes:
 > I wish I shared your faith on this Dan. However, I'm uneasy about
 > it.

There is no problem that I am aware of that is specific to URNs.
There are interesting problems having to do with splitting collections
and replication, as you mention.  But these problems are orthogonal to
URNs.  And there are solutions in any event.

 > >the default
 > >base URI for a document, if not specified by the document or the
 > >delivery package of the document, is the last URI known by the client
 > >in accessing the document,
 > 
 > But this only works if resources that refer to each other using relative
 > links are migrated together.

Indeed, relative URIs depend on being used in the context of their
base URI.  This is an advantage if the relative URIs stay in
the correct context and a disadvantage if not.  But all is not lost.
(And again, this is independent of URNs.)
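To make the default rule concrete, here is a minimal sketch in Python.
It assumes the client keeps the chain of URIs it went through to reach
the document, and that an explicitly specified base, if any, wins.  The
function name, the chain representation, and the example URIs are all
hypothetical; only urljoin is a real library call.

```python
from urllib.parse import urljoin

def resolve(relative_ref, access_chain, explicit_base=None):
    """Resolve relative_ref against the document's base URI.

    access_chain is the list of URIs the client used to reach the
    document, in order (for example a URN followed by the URL it was
    redirected to).  explicit_base, if given, overrides the default.
    """
    if explicit_base is not None:
        base = explicit_base
    elif access_chain:
        # Default rule: the last URI known by the client.
        base = access_chain[-1]
    else:
        raise ValueError("no base URI available")
    return urljoin(base, relative_ref)

# Hypothetical example: a URN redirected to a URL.
chain = ["urn:example:doc-42",
         "http://mirror.example/pub/doc42/index.html"]
print(resolve("fig1.gif", chain))
print(resolve("fig1.gif", chain, explicit_base="http://origin.example/doc42/"))
```

The same relative URI resolves to two different places depending on
which base is in effect, which is exactly the advantage and the
disadvantage described above.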

 > There are several reasonable scenarios where
 > this will not hold:
 > 1)  The owner of a set of such resources sells 1/2 of them to another
 >     party, who takes charge of their storage. Now, 1/2 of the relative
 >     links will have the wrong base if it is determined using the "last
 >     URI known by the client" rule.

In the case of splitting resources that refer to each other by
relative URIs, it will be necessary to make some changes, not
necessarily to the documents themselves.  We can change some of the
relative URIs into absolute URIs.  Or we can designate one of the
locations of the split resources as the base for all the resources and
any requests for resources that are actually at another location will
get a redirect.  (Each such redirect can be remembered by clients to
avoid returning to the base server each time a resource needs to
be requested.)
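The redirect approach, including the client remembering redirects, can
be sketched roughly as follows.  This is only an illustration with no
real network access: the base server is modeled as a table of
redirects, and the class name, the table, and the example hosts are
all assumptions.

```python
from urllib.parse import urljoin

class Client:
    """Toy client that remembers redirects issued by the base server."""

    def __init__(self, redirects_at_base):
        # Maps a URI at the base location to the place it moved to.
        self.redirects_at_base = redirects_at_base
        self.remembered = {}
        self.base_server_hits = 0

    def locate(self, base, relative_ref):
        target = urljoin(base, relative_ref)
        if target in self.remembered:
            # Go straight to the new location, skipping the base server.
            return self.remembered[target]
        self.base_server_hits += 1
        location = self.redirects_at_base.get(target, target)
        if location != target:
            self.remembered[target] = location
        return location

# Hypothetical split: b.html was sold off and now lives on another host.
moved = {"http://old.example/set/b.html": "http://new.example/set/b.html"}
client = Client(moved)
print(client.locate("http://old.example/set/a.html", "b.html"))
print(client.locate("http://old.example/set/a.html", "b.html"))
print(client.base_server_hits)
```

The documents themselves are untouched; only the base server needs to
know where the split-off resources went.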

One must ask how likely it is that resources that refer to each other
by relative URIs will be split up.  In general, relative URIs ought to
be used only when the resources are likely to stay together.

 > 2)  Automated replication mechanisms spring up, and the most popular
 >     resource in an interlinked set gets widely replicated while less
 >     frequently used ones are not replicated.

The same kinds of solutions I described for the problem of splitting
resources can apply here too.  Replicas of collections may be full or
partial, and the replicas should probably know which kind they are.  A
partial replica probably needs to work via redirections from the full
base replica, whereas a full replica can let relative URIs use the
replica directly.

Another kind of solution is to dynamically, automatically rewrite
the appropriate relative URIs as the replica is created.   Rewriting
is generally to be avoided though.
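For what it is worth, the rewriting could look something like the
sketch below.  It assumes the links have already been extracted from
the document as a list of strings and that the replica knows which
absolute URIs it actually holds; both assumptions, and all the names,
are mine for the sake of illustration.

```python
from urllib.parse import urljoin

def rewrite_for_partial_replica(refs, original_base, replicated):
    """Keep relative refs whose targets were replicated; make the rest
    absolute so they still point at the original location."""
    rewritten = []
    for ref in refs:
        absolute = urljoin(original_base, ref)
        if absolute in replicated:
            rewritten.append(ref)        # still works relative to the replica
        else:
            rewritten.append(absolute)   # must point back to the original
    return rewritten

# Hypothetical collection: only the popular page and its logo were copied.
base = "http://origin.example/set/popular.html"
replicated = {"http://origin.example/set/popular.html",
              "http://origin.example/set/logo.gif"}
print(rewrite_for_partial_replica(["logo.gif", "rare.html"], base, replicated))
```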

 > >If a URN is redirected to a URL, and the URL is
 > >resolved to a document containing relative URIs, then they are
 > >relative to the URL (if the base is not otherwise specified), not the
 > >URN.
 > 
 > Right, and this can break in the two scenarios I mentioned above.

So you are really pointing out the problems of relative URIs.  URNs
have nothing to do with it, once you have a way of deciding what the
base URI is.

 > I'm more in favor of explicit determination of the base URI, either
 > by the BASE tag in HTML or the "destination" field mentioned in the
 > message yesterday.

Even a single explicit base URI is not sufficient if some of the relative
URIs in a document are relative to one base and some are relative to
another base.  This occurs in the splitting and partial replica cases.

 > But here I think we have to be very careful to
 > say that only one BASE tag is allowed. People may associate any
 > number of identifiers with a work, only one of which will make the
 > relative URNs function correctly.

It is possible that multiple base URIs will in fact work
simultaneously.  This will work as long as the neighborhoods of the
name spaces used by all the relative URIs are all the same.

Another kind of multiple base URI that I am interested in is nested
base URIs.  For example, at the top level of a document one base URI
would apply.  But in one particular section that uses lots of icons,
say, another base URI could be designated (e.g. /my/cool/icons).
Nested base URIs will be even more valuable when embedded documents
are supported.  So the relative URIs in a document that is embedded in
another can be specified as relative to their own base URI.  At every
point in the document, only one base URI applies, but it can be a
different base URI at each point.
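A rough sketch of how nested base URIs might be resolved follows.  No
such markup exists today, so the document is modeled as a hypothetical
tree of ("section", base_or_None, children) and ("ref", uri) tuples.
I also assume that a nested base may itself be relative to the
enclosing one, and that a directory-style base ends in '/'.

```python
from urllib.parse import urljoin

def resolve_nested(node, base):
    """Return the absolute URIs of all refs under node, in document
    order, applying the innermost base in scope at each point."""
    if node[0] == "ref":
        return [urljoin(base, node[1])]
    _, scoped_base, children = node
    if scoped_base is not None:
        # The nested base is resolved against the enclosing base.
        base = urljoin(base, scoped_base)
    resolved = []
    for child in children:
        resolved.extend(resolve_nested(child, base))
    return resolved

doc = ("section", None, [
    ("ref", "intro.html"),
    ("section", "/my/cool/icons/", [("ref", "star.gif")]),
    ("ref", "end.html"),
])
for uri in resolve_nested(doc, "http://www.example/docs/paper.html"):
    print(uri)
```

Once the icon section ends, the outer base applies again, so
"end.html" resolves next to the document rather than among the icons.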

Furthermore, while I'm at it, instead of only one base URI applying at
any one point in a document, I'd like to see multiple named base URIs
available simultaneously so that several contexts could be mixed.
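Named bases could be as simple as a table of names to base URIs, with
each relative URI saying which name it is relative to.  The sketch
below sidesteps the question of syntax by passing the name separately;
the names and URIs are made up.

```python
from urllib.parse import urljoin

def resolve_named(bases, name, relative_ref):
    """Resolve relative_ref against the base registered under name."""
    if name not in bases:
        raise KeyError("unknown base name: " + name)
    return urljoin(bases[name], relative_ref)

# Hypothetical named bases in effect at the same point in a document.
bases = {
    "icons": "http://www.example/my/cool/icons/",
    "text": "http://mirror.example/docs/",
}
print(resolve_named(bases, "icons", "star.gif"))
print(resolve_named(bases, "text", "chapter2.html"))
```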

And then there is the "root" which is always the same for documents on
a server.  A relative URI starting with '/' is rooted relative to that
one server, and there is no way to specify a different root that might
be elsewhere.  This would be useful when replicating a collection of
documents on another server but not relative to the same root. (One
might put each replica in a directory named after the server it came
from.)  Or I might want to say that all the documents under
/groups/sdg/people/liberte/ are "rooted" at that prefix, so
'/resume.html' would be in that directory rather than all the way back
at the server root.  This is much like the Unix chroot.
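Here is a sketch of what such a designated root could mean for
resolution.  It assumes the root is given as an absolute URI prefix
and that only references beginning with a single '/' are affected;
the function name and the example host are hypothetical.

```python
from urllib.parse import urljoin

def resolve_with_root(root, base, ref):
    """Resolve ref, treating a leading '/' as relative to root instead
    of the server root.  Other references use base as usual."""
    if not root.endswith("/"):
        root += "/"
    if ref.startswith("/") and not ref.startswith("//"):
        # Rooted reference: strip the leading '/' and resolve under root.
        return urljoin(root, ref.lstrip("/"))
    return urljoin(base, ref)

root = "http://www.example/groups/sdg/people/liberte/"
base = "http://www.example/groups/sdg/people/liberte/papers/index.html"
print(resolve_with_root(root, base, "/resume.html"))
print(resolve_with_root(root, base, "draft.html"))
```

A replica on another server would only need to supply a different
root; the rooted references in the documents would not change.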

--
Daniel LaLiberte (liberte@ncsa.uiuc.edu)
National Center for Supercomputing Applications
http://union.ncsa.uiuc.edu/~liberte/