[whatwg] Link rot is not dangerous

From: Philip Taylor <excors+whatwg@gmail.com>
Date: Fri, 15 May 2009 20:30:23 +0100
Message-ID: <ea09c0d10905151230o7fb14d36gbd422619beaad75a@mail.gmail.com>
On Fri, May 15, 2009 at 6:25 PM, Shelley Powers
<shelleyp at burningbird.net> wrote:
> The most important point to take from all of this, though, is that link rot
> within the RDF world is an extremely rare and unlikely occurrence.

That seems to be untrue in practice - see
http://philip.html5.org/data/rdf-namespace-status.txt

The source data is the list of common RDF namespace URIs at
http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces
from three years ago. Out of those 284:
 * 56 are 404s. (Of those, 37 end with '#', so the namespace URI itself
really ought to exist. For the others, it's possible that only the
prefix+suffix URIs are meant to exist. Some of the cases are just
typos, but I'm not sure how many.)
 * 2 are Forbidden. (Of those, 1 looks like a typo.)
 * 2 are Bad Gateway.
 * 22 could not connect to the server. (Of those, 2 weren't http://
URIs, and 1 was a typo. The others represent 13 different domains.)

(For the URIs which returned Redirect responses, I didn't check what
happens when you request the URI it redirected to, so there may be
more failures.)
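
A survey like the one above could be sketched roughly as follows. This
is an illustration only, not necessarily how the linked results were
produced: the function names (fetch_status, classify, tally), the
10-second timeout and the category labels are all assumptions. As in
the survey, redirects are recorded but not followed.

```python
import http.client
import urllib.error
import urllib.request
from collections import Counter


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Keep urllib from following redirects, so 3xx codes are reported as-is."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(uri, timeout=10):
    """Return the HTTP status code for uri, or None if no response was obtained."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(uri, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status (404, 403, 301, ...).
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, non-HTTP scheme, malformed URI.
        return None


def classify(status):
    """Map a status code (or None) onto the categories used in the tally."""
    if status is None:
        return "could not connect"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect (not followed)"
    named = {403: "forbidden", 404: "not found", 502: "bad gateway"}
    return named.get(status, "other error")


def tally(uris, fetch=fetch_status):
    """Count how many URIs fall into each category."""
    return Counter(classify(fetch(uri)) for uri in uris)
```

Calling tally() on the list of namespace URIs gives one count per
category. The fetch parameter exists only so the counting logic can be
exercised without touching the network.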

Over a quarter of the most common namespace URIs (82 of 284, about
29%) don't resolve successfully today, and most of those look like
they should have resolved when they were originally used, so link rot
seems to be common.

(Major vocabularies like RSS and FOAF are likely to exist for a long
time, but they're the easiest cases to handle - we could just
pre-define the prefixes "rss:" and "foaf:" and have a centralised
database mapping them onto schemas/documentation/etc. It seems to me
that URIs are most valuable to let any tiny group make one for their
rarely-used vocabulary, and be guaranteed no name collisions without
needing to communicate with a centralised registry to ensure
uniqueness; but it's those cases that are most vulnerable to link rot,
and in practice the links appear to fail quite often.)
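
To make the pre-defined prefix idea concrete, here is a minimal sketch.
The table name, the expand() helper and the rule that locally declared
prefixes override the built-in ones are assumptions for illustration,
not part of any proposal in this thread; the two namespace URIs are the
usual RSS 1.0 and FOAF ones.

```python
# Hypothetical built-in table for a few major vocabularies.
PREDEFINED_PREFIXES = {
    "rss": "http://purl.org/rss/1.0/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(name, local_prefixes=None):
    """Expand 'prefix:local' into a full URI.

    local_prefixes holds prefixes declared by the document itself (the
    small-vocabulary case); they take precedence over the built-in table.
    """
    prefix, sep, local = name.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {name!r}")
    table = {**PREDEFINED_PREFIXES, **(local_prefixes or {})}
    if prefix not in table:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return table[prefix] + local


print(expand("foaf:name"))  # http://xmlns.com/foaf/0.1/name
print(expand("ex:thing", {"ex": "http://example.org/ns#"}))  # http://example.org/ns#thing
```

The built-in entries never need to be dereferenced, so they can't rot;
the locally declared ones still depend on their URIs staying alive.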

(I'm not arguing that link rot is dangerous - just that the numbers
indicate it's a common situation rather than an extremely rare
exception.)

-- 
Philip Taylor
excors at gmail.com
Received on Friday, 15 May 2009 12:30:23 UTC
