Re: Comments on "Generic URN Syntax" from Daniel LaLiberte on 1995-07-10 (uri@w3.org from July 1995)

From: Daniel LaLiberte <liberte@ncsa.uiuc.edu>
Date: Sun, 9 Jul 95 23:55:56 CDT
To: fielding@beach.w3.org, paf@bunyip.com, uri@bunyip.com
Message-Id: <9507100455.AA24162@void.ncsa.uiuc.edu>

This is partly a response to a criticism of the path scheme, and
maybe more importantly a criticism of the generic URN idea and an
idea for how to deal with as yet undefined and unknown properties of
URIs.

> From: paf@bunyip.com (Patrik Faltstrom)

> At 20.52 95-07-09, Roy Fielding wrote:
> >On the other hand, you may want to give a more realistic example, like
> >
> >   <URN: path:/edu/bigstate/physics/thesis12>
> >
> >as a comparison.

> I do not agree with this. There is a need for recognizing the three
> different parts of a URN by the syntax because different resolution
> methods might be used when resolving the service/host part and from
> what is used when sending the "opaque string" to the service.

First of all, why should a different resolution method depend only
on the "service/host" part and not on the opaque string part?  Second,
there should be no requirement that the host part is really a host -
that would be too location-dependent, wouldn't it?  I believe the
original idea was that the "host" is a naming authority, and its
presence, along with the "service" (or scheme), gives you two levels of
subdivision across all URNs, assuming that is sufficient.  For some
schemes, the naming authority would be the name of a host; for others,
not.

> For this reason, I don't think that
> <URN: path:/edu/bigstate/physics/thesis12>
> should be a valid URN, because you don't know
> where the hostname ends and the pathname starts.

There are several ways to consider this, if you really want to fold
the path scheme into a single generic URN scheme.  First, you could
consider "path" to be the naming authority and everything after it to
be opaque.  Second, you could consider the whole path up to the last
"/" as the naming authority and the last component as an opaque
string.  The latter is how the path scheme
defines things.  (If you want a different character in place of the
last "/", I wouldn't object too much.  I think it's not really
necessary and it would clash with current practice.)
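
To make the two readings concrete, here is a minimal sketch in Python.
The function names are hypothetical and this is not the path scheme's
resolution procedure; it only shows where each reading would place the
boundary between naming authority and opaque string.

```python
def split_scheme_as_authority(urn):
    """Reading 1: the scheme name is the naming authority, the rest is opaque."""
    scheme, _, rest = urn.partition(":")
    return scheme, rest


def split_at_last_slash(urn):
    """Reading 2: everything up to the last "/" is the naming authority,
    and the last component is the opaque string."""
    _, _, rest = urn.partition(":")
    authority, _, opaque = rest.rpartition("/")
    return authority, opaque


urn = "path:/edu/bigstate/physics/thesis12"
print(split_scheme_as_authority(urn))  # ('path', '/edu/bigstate/physics/thesis12')
print(split_at_last_slash(urn))        # ('/edu/bigstate/physics', 'thesis12')
```

Either way the boundary is found from the syntax alone, without
consulting DNS or any other resolution protocol.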

The fact that the path naming authority is hierarchical could be
considered irrelevant to a generic URN scheme - it's just the name of
a naming authority, not the name of a host.  On the other hand, there
is great value for scalability in supporting a publicly hierarchical
name space of naming authorities, and I would recommend that any
generic URN syntax support it.  (But I don't see much value in a
generic URN syntax anyway.)

> I know that there is in the path scheme built in
> functionality to find this border by querying DNS,
> but I don't like the fact that the syntax by doing this
> requires a specific protocol for resolution.

The resolution protocol that we define with the specification of the
path scheme does as you say, but another resolution protocol could
be used simultaneously and independently, if people thought that was of
value.  This other resolution protocol could ignore our protocol
completely and do things however it wanted.  E.g. the handle protocol
might be used.

> The syntax must be protocol independent.

I agree, but it is hard for a syntax not to be protocol independent,
if you just throw away the default protocol.  On the other hand, you
and several other folks want to define a syntax for all URNs that 
does impose on the semantics of names more than I believe is necessary.

Why is it necessary and why is it sufficient to have a single naming
authority in every name?

The only possible constraint on the syntax of names that I can see
might be reasonable is that there be some indication *that* a URI is a
name and so it meets the requirements of being a name.  How it meets
those requirements should be up to the internals of each scheme.
So the scheme name would essentially be the naming authority for all
names under its umbrella.

While you are at it, though, why not have some indication that a URI
is a content ID, or that it contains a signature, or that it is a
complete ID for the resource (usable for caching), or other things
I can't think of?  Note I am listing properties not of the resource but
of the URI itself, although some properties are likely to be
relationships between the URI and the resource it references.
How can we know in advance all the properties that might be relevant
in the future?  It turns out that "news" URLs are already URNs, but
did the designers realize that at the time?

If we don't have such indications of the properties of a URI as part
of the URI, what is the alternative?  How bad would it be to have a
table of scheme names and an extensible list of properties that might
be appropriate for each scheme?  IANA could maintain the table of
scheme names, and another table of standardized properties.  When
a new property comes along, we go through the table of scheme names
and add the property where appropriate.
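
As a rough sketch of what such a pair of tables might look like, here
is a small Python example.  Everything in it is an assumption made for
illustration: the property names, the "x-hash" scheme, the empty entry
for "http", and the example URIs are hypothetical, and no such registry
is defined anywhere.

```python
# Table of standardized properties (names are illustrative assumptions).
STANDARD_PROPERTIES = {"name"}

# Table of scheme names and the properties that apply to each scheme.
# "news" and "path" are marked as names; the "http" entry is an assumption.
SCHEME_PROPERTIES = {
    "news": {"name"},
    "path": {"name"},
    "http": set(),
}


def add_property(prop, schemes):
    """Register a new property and attach it to the schemes it fits."""
    STANDARD_PROPERTIES.add(prop)
    for scheme in schemes:
        SCHEME_PROPERTIES.setdefault(scheme, set()).add(prop)


def has_property(uri, prop):
    """Look up a property by the URI's scheme name alone."""
    scheme = uri.partition(":")[0].lower()
    return prop in SCHEME_PROPERTIES.get(scheme, set())


# A new property comes along; "x-hash" is a made-up scheme for the example.
add_property("content-id", ["x-hash"])

print(has_property("news:example.group", "name"))        # True
print(has_property("x-hash:0123abcd", "content-id"))     # True
print(has_property("http://example.org/page", "name"))   # False
```

The point of the sketch is that the URI syntax itself stays untouched:
a client learns the properties from the scheme name plus the tables.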

Daniel LaLiberte (liberte@ncsa.uiuc.edu)
National Center for Supercomputing Applications
http://union.ncsa.uiuc.edu/~liberte/
Received on Monday, 10 July 1995 00:56:11 UTC