Re: The UR* scheme registry, Citing URL/URI specs

Larry Masinter (masinter@parc.xerox.com)
Sun, 26 Oct 1997 10:28:23 PST


Message-ID: <34538BC7.6DBB1281@parc.xerox.com>
Date: Sun, 26 Oct 1997 10:28:23 PST
From: Larry Masinter <masinter@parc.xerox.com>
To: Harald.T.Alvestrand@uninett.no
CC: Al Gilman <asgilman@access.digex.net>, Dan Connolly <connolly@w3.org>,
Subject: Re: The UR* scheme registry, Citing URL/URI specs

> Claiming that the URL syntax document is the URI syntax document
> would certainly seem simpler to me than not doing so, but Larry
> did not agree last time I asked him (or so I understood).

While it might make a lot of sense to try to subsume URNs within the
URL syntax and framework, I'm not sure about every other kind of
resource identifier. For example, in the original URI work, we had
intended to work on URCs: a syntax for describing documents by their
metadata so that one might do an appropriate lookup that would
correspond to the typical literary reference by date, author
and title. The closest proposal that would fill that role is the
RDF syntax being offered by W3C; I think that a partially filled
RDF with 'match' semantics could be used as a resource identifier
and would be uniform, but that trying to shoehorn it into a
"scheme:encoded whatever" syntax doesn't make sense.

In addition, there are other kinds of protocol elements for
resource identification that I think we *need* that I'm uneasy
about assuming will fit; for example, I think we *need* some
way of passing, securely, the credentials for accessing a
resource along with the resource locator. Perhaps that becomes
a scheme-specific component of a current URL, but I'm wondering
if we might not need some wrapper syntax for containing a
URL plus other name/attribute values in order to get the
extensibility needed.
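
A minimal sketch of what such a wrapper might look like as a data
structure, in Python. The names WrappedLocator and with_attribute, and
the "credentials" attribute, are hypothetical and do not come from any
specification. The sketch shows only the extensibility idea (a URL
carried alongside open-ended name/value pairs); it does nothing about
passing the credentials securely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WrappedLocator:
    """Hypothetical wrapper: a URL plus extra name/attribute values."""
    url: str
    attributes: dict = field(default_factory=dict)

    def with_attribute(self, name, value):
        # Return a new wrapper so the original is left unchanged.
        merged = dict(self.attributes)
        merged[name] = value
        return WrappedLocator(self.url, merged)


# Usage: the URL itself is untouched; extra values travel beside it.
plain = WrappedLocator("http://www.purl.org/blah")
wrapped = plain.with_attribute("credentials", "opaque-token")
```

Keeping the attributes outside the URL string means new attribute
names can be added without changing any scheme's syntax.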

As for URLs and URNs, personally, I think that the desire to
distinguish URNs and URLs syntactically (with the distinguishing
"urn:" prefix) should be balanced with the difficulty of deciding
the role of the string in advance. As many have pointed out,
"location independent" is not criterial, since many URLs are
location independent, in the sense that the URL wouldn't change
even if the information moved (mid:* and http://www.purl.org/*),
and the converse (e.g., that ISBN numbers would stay the same
even if the book were republished by a different publisher) is
similarly problematic.

So, while it may seem useful for a recipient of a URL to be
able to tell, syntactically, "is this URL location-independent?",
the number of ambiguous cases leads me to believe that the distinction
is more in the usage (how are you using the URL?) than intrinsic.
I think 'location independent' is only one of several criteria,
and that some finer grained distinctions might have a better use
in many application protocols; for example, it's important to
know whether a URL is unique to a particular entity body (as
with cid: or MD5-based schemes), as a way of facilitating caching,
or whether an embedded URL for the entity itself would need to be
rewritten when moving the entity (e.g., how to implement the
MOVE operation in WebDAV). It would be extremely useful to
distinguish those resources of the entity kind
(most HTTP and FTP URLs, z39.50r) from those that are used as
portals into interactive services (telnet, z39.50s). Of course
this is ambiguous, too.

As it stands, I'd just as soon see URN schemes in the same
registry as URL schemes (no conflicts allowed), and just annotated
as to whether the scheme implies URN-ness. I'd also like to
see "urn:" turned into a more universal URL prefix, e.g., 
allow "urn:http://www.purl.org/blah" as a means of indicating
"I intend this usage of http://www.purl.org/blah to be treated
as a permanent name rather than as just the current location".
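
The two ideas above (one shared registry annotated for URN-ness, and
"urn:" as a generic prefix) can be sketched roughly as follows, in
Python. The registry contents, the True/False annotations, and the
function names register_scheme and classify are illustrative
assumptions only, not an actual registry or a proposed API.

```python
# Hypothetical shared registry: scheme name -> "implies URN-ness".
# The annotations here are placeholders chosen for illustration.
SCHEME_REGISTRY = {
    "http": False,
    "ftp": False,
    "mid": True,
}


def register_scheme(name, implies_urn):
    """Add a scheme; a name already in the registry is a conflict."""
    key = name.lower()
    if key in SCHEME_REGISTRY:
        raise ValueError("scheme already registered: " + key)
    SCHEME_REGISTRY[key] = implies_urn


def classify(identifier):
    """Return (identifier_without_urn_prefix, treat_as_permanent_name)."""
    prefix = "urn:"
    if identifier.lower().startswith(prefix):
        # An explicit "urn:" prefix marks the intended usage as a name.
        return identifier[len(prefix):], True
    scheme = identifier.split(":", 1)[0].lower()
    # Otherwise fall back to the scheme's annotation in the registry.
    return identifier, SCHEME_REGISTRY.get(scheme, False)


named = classify("urn:http://www.purl.org/blah")
located = classify("http://www.purl.org/blah")
```

In this sketch the prefix records the sender's intent, while the
registry annotation supplies a default when no prefix is present.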

Larry
-- 
http://www.parc.xerox.com/masinter