URIs vs. URNs vs. URLs from Sean B. Palmer on 2001-08-04 (uri@w3.org from August 2001)

From: Sean B. Palmer <sean@mysterylights.com>
Date: Sat, 4 Aug 2001 01:55:01 +0100
To: <uri@w3.org>
Cc: <www-rdf-interest@w3.org>
Message-ID: <011101c11c80$4cc42880$4fdc93c3@z5n9x1>

Following the pretty huge discussion on #rdfig [1], here are some
thoughts on the URI vs. URN vs. URL debate.

I think that we can all agree to a couple of things:-

* The very concept of persistence is one that is context based.
* However, although persistence as a "quality of service" is therefore
a qualitative thing, it is evident that in practice there is a
difference between URLs and URNs in general, in that the persistence of
URNs is generally better than that of URLs. This is not quantitative,
and not a rule.

So, obviously, what we all mean by "persistence" varies not only from
person to person, but from context to context. But we can look more
closely at that context.

URI schemes that use a component corresponding to the DNS system tend
to be difficult for some people to use as terms on the Semantic Web,
which are often "just" names, not intended to get you any network
retrievable content upon dereferencing them. Sometimes it's useful to
provide some sort of material at the end of a URL being used on the
Semantic Web, but on some occasions it just isn't possible. Not
everyone wants to use PURL (not everyone's heard of PURL), perhaps
because they don't want to maintain the redirect. And not everyone has
access to the W3C datespace.

So we can avoid linkrot in places where we are using URIs simply as
conceptual identifiers by using some other-than-URL scheme. Or should
that be "termrot"? The Semantic Web has a chance to avoid the 404
problem to an extent: not totally, because obviously all of the
information about the terms on the Semantic Web (i.e. the connections
that are the "semantics") has to be stored as real data, and that
data can only be accessed at an address, which is very often (but not
always) dynamic to the extent that at some point it will be unusable.
That is, the quality of service may often be higher when using non-URLs
for terms, and at any rate, people should have the choice.

So, on to the debate as to whether a tag: URI should be a URN, and as
to why a URN is actually different from a URI at all. DanC said on
#rdfig that:-

21:24:43 <DanC_> it's not different from any other URI.

I respectfully disagreed, but only in that practically, the quality of
service for URLs is worse than for URNs. Conceptually, I agree that
URNs are not different from any other URI, but when do theory and
practice ever coincide? A little bit of evidence for this comes from a
discussion on #rdfig, starting with a good question:-

[[[
23:14:42 <ambient> Is the reason for RFC2611 stating that URNs cannot
be reassigned that URIs can be reassigned?
[...]
<sbp> I think that the practice is that, yes, URIs other than URNs are
often reassigned
[...]
<ambient> then what sense can you make of RDF assertions that use
non-URNs?
<sbp> I suppose that depends upon context...
<sbp> Here's an example
<ambient> exactly
<sbp> at the moment, I want to create a new namespace for a set of
terms that I want to use in my documentation. Let's say I want a URI
for the predicate "countryOfOrigin"
<sbp> Now, I own the domain infomesh.net - so you would have thought
that it would make sense to put that predicate there. Ignoring what
HTTP can retrieve, let's put it at
http://infomesh.net/2001/countryOfOrigin
<sbp> Now, I'm pretty poor. There's a good chance I won't be able to
afford infomesh.net when it comes up for renewal, or I may just choose
to spend the money on something else. So, now the new owner decides to
make a competing term at that same URI, and issues a load of software
to back it up
<sbp> clearly, software that I create that uses "my" term as a built
in, and software that uses the "new" term as a built in cannot
interact: I think that's a given
<sbp> So, HTTP, due to the instability of the DNS system, cannot
guarantee persistence *within an SW context*, or at least, not very
often
<ambient> i think this a problems with URLs rather than URIs in
general.
]]] - [1]

I agree; but that still proves that practically speaking, URLs do not
offer the same quality of service as URNs do.
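
To make the scenario in the log concrete, here is a minimal sketch in
Python. It contrasts naming the term under a bare domain with naming it
under a domain plus a date, which is roughly the idea behind the tag:
proposal. The function names, the second date, and the exact tag: layout
(authority, comma, date, colon, specific part) are assumptions made for
illustration only.

```python
from datetime import date


def http_term(domain: str, year: int, name: str) -> str:
    """Name a term under a bare domain: its meaning follows whoever holds the domain."""
    return f"http://{domain}/{year}/{name}"


def tag_term(domain: str, minted: date, name: str) -> str:
    """Name a term under a domain as held on a given date (assumed tag: layout)."""
    return f"tag:{domain},{minted.isoformat()}:{name}"


# The original owner and a later owner of the domain mint "the same" term.
old_owner = http_term("infomesh.net", 2001, "countryOfOrigin")
new_owner = http_term("infomesh.net", 2001, "countryOfOrigin")
print(old_owner == new_owner)  # identical strings: software cannot tell them apart

# With a dated authority, a later holder of the domain mints a different name.
# The second date is hypothetical.
first = tag_term("infomesh.net", date(2001, 8, 4), "countryOfOrigin")
later = tag_term("infomesh.net", date(2003, 1, 1), "countryOfOrigin")
print(first)
print(first == later)
```

The point is only that the HTTP name carries nothing to separate the
two owners, whereas the dated name does.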

As for arbitrary URI schemes vs. URN schemes... well, you can't be
sure. With arbitrary URI schemes, you cannot be sure of the
persistence unless you know the particulars of the scheme. If it
involves something based upon DNS, then I'd say it would have a lower
persistence than a scheme which uses a domain and a date component.
What would be good is if each URI scheme had a machine-readable
profile, from which an SW processor could gather enough information to
decide whether or not to trust the persistence of a URI using that
scheme in a particular context. I'm not sure how useful that would be,
but you never know.
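
As a rough sketch of what such a profile might look like, assuming
Python and a hand-written table: every field name, every profile value,
and the two contexts ("term" and "link") below are hypothetical, and the
values simply encode the opinions in this message rather than anything
the schemes themselves declare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemeProfile:
    scheme: str
    uses_dns: bool          # authority part is a domain name that can change hands
    dated_authority: bool   # authority is pinned to a date, so a new holder mints new names

    @property
    def reassignment_risk(self) -> bool:
        # A DNS-based name with no date can be silently reused by a new owner.
        return self.uses_dns and not self.dated_authority


# Hypothetical profiles, not taken from any registry.
PROFILES = {
    "http": SchemeProfile("http", uses_dns=True, dated_authority=False),
    "tag": SchemeProfile("tag", uses_dns=True, dated_authority=True),
    "urn": SchemeProfile("urn", uses_dns=False, dated_authority=False),
}


def trust_persistence(uri: str, context: str) -> bool:
    """Decide whether to trust a URI's persistence in a given context.

    context "term": the URI is a bare name whose meaning must not be reassigned.
    context "link": occasional breakage is tolerable.
    """
    scheme = uri.partition(":")[0].lower()
    profile = PROFILES.get(scheme)
    if profile is None:
        return False  # unknown scheme: you can't be sure
    if context == "term":
        return not profile.reassignment_risk
    return True


print(trust_persistence("http://infomesh.net/2001/countryOfOrigin", "term"))
print(trust_persistence("tag:infomesh.net,2001-08-04:countryOfOrigin", "term"))
```

A real profile would need far more than two booleans, but even this
much lets a processor treat the same scheme differently depending on
context.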

So I think that the "urn:" bit on the front of URNs is just a hack to
let (humans, more than processors, it seems) know that the quality of
service for that particular URI is "high". Now, there are going to
be two camps of thought on this, assuming that it's true. One will
assert that the information is harmless, and indeed gives a useful bit
of contextual data to the user of the URI. Another will assert that it
breaks the opacity axiom for URIs.

Perhaps metadata that relates to an identifier scheme can indeed be
placed in the identifier itself: it is only metadata about the
*resource* which cannot be?

Personally, I think that considering the importance of persistence,
and in particular the importance which it carries on the Semantic
Web, having the extra "urn:" bit may well be useful enough to warrant
breaking a Web axiom, because it reassures people against the 404 bug.

[1] http://ilrt.org/discovery/chatlogs/rdfig/2001-08-03.txt

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://webns.net/roughterms/> .
:Sean :hasHomepage <http://purl.org/net/sbp/> .
Received on Friday, 3 August 2001 20:54:20 UTC