Re: Broken Links in LOD Data Sets

Bernhard,

Very valid point, IMHO.

I don't have a sound proposal either, but I'd like to suggest a possible
way forward (whatever we do about it should scale to the size of the Web
and have a fair cost-benefit ratio, right?).

So, regarding your 

> 4.) observer / notification mechanisms

I *could* imagine that voiD [1] (no big surprise that I mention it now,
right? ;) could be extended in this direction. If you also think this
might be sensible, please create an issue for it at [2].
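
To make the idea behind (4) a bit more concrete, here is a minimal Python
sketch of what such an observer could check before notifying a data source.
Everything in it is an assumption for illustration and not part of voiD:
the names LinkReport, classify and check_links, the probe callable (which
is expected to return an HTTP status code and a Location header without
following redirects), and the policy of treating 303 responses as healthy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical probe: takes a URI, returns (HTTP status, Location header or None).
Probe = Callable[[str], Tuple[int, Optional[str]]]


@dataclass
class LinkReport:
    source: str                       # resource in our data set that holds the link
    target: str                       # URI the link points to
    state: str                        # "moved" or "broken"
    new_target: Optional[str] = None  # set only when state == "moved"


def classify(status: int, location: Optional[str]) -> str:
    """Map an HTTP response to a coarse link state (assumed policy)."""
    if 200 <= status < 300:
        return "ok"
    if status in (302, 303, 307):
        # Temporary redirects, including the 303 See Other commonly used
        # for Linked Data resources, are treated as healthy.
        return "ok"
    if status in (301, 308) and location:
        return "moved"
    return "broken"


def check_links(links: Iterable[Tuple[str, str]], probe: Probe) -> List[LinkReport]:
    """Return one report per (source, target) link that is not "ok"."""
    reports = []
    for source, target in links:
        status, location = probe(target)
        state = classify(status, location)
        if state == "ok":
            continue
        new_target = location if state == "moved" else None
        reports.append(LinkReport(source, target, state, new_target))
    return reports
```

The resulting list of reports is what a notification mechanism might send
back to the publisher of the linking data set; how that message is
described or transported is left open here.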

Cheers,
      Michael

[1] http://semanticweb.org/wiki/VoiD
[2] http://code.google.com/p/void-impl/issues/list

-- 
Dr. Michael Hausenblas
DERI - Digital Enterprise Research Institute
National University of Ireland, Lower Dangan,
Galway, Ireland, Europe
Tel. +353 91 495730
http://sw-app.org/about.html


> From: Bernhard Haslhofer <bernhard.haslhofer@univie.ac.at>
> Date: Thu, 5 Feb 2009 16:35:35 +0100
> To: Linked Data community <public-lod@w3.org>
> Subject: Broken Links in LOD Data Sets
> Resent-From: Linked Data community <public-lod@w3.org>
> Resent-Date: Thu, 05 Feb 2009 15:36:13 +0000
> 
> 
> Hi all,
> 
> we are currently working on the question of how to deal with broken
> links/references between resources in (distinct) LOD data sets and
> would like to know your opinion on that issue. If there is some work
> going on in this direction, please let me know.
> 
> I think I do not really need to explain the problem. Everybody knows
> it from the "human" Web when you follow a link and you get an annoying
> 404 response.
> 
> If we assume that the consumers of LOD data are not humans but
> applications, broken links/references are not only "annoying" but
> could lead to severe processing errors if an application relies on a
> kind of "referential integrity".
> 
> Assume we have an LOD data source X exposing resources that describe
> images and these images are linked with resources in DBpedia (e.g.,
> http://dbpedia.org/resource/Berlin). An application built on top of X
> follows links to retrieve the geo-coordinates in order to display the
> images on a virtual map. If now, for some reason, the URL of the
> linked DBpedia resource changes, either because DBpedia is moved or
> re-organized, which I guess could happen to any LOD source in a
> long-term perspective, the application might crash if it doesn't
> consider that referenced resources might move or become unavailable.
> 
> I know that "cool URIs don't change" but I am not sure if this
> assumption holds in practice, especially in a long-term perspective.
> 
> For the "human" Web several solutions have been proposed, e.g.,
> 1.) PURL and DOI services for translating URNs into resolvable URLs
> 2.) forward references
> 3.) robust link implementations, i.e., with each link you keep a set
> of related search terms to retrieve moved / changed resources
> 4.) observer / notification mechanisms
> X.) ?
> 
> I guess (1) is not really applicable for LOD resources because of
> scalability and single-point-of-failure issues. (2) would require that
> LOD providers take care of setting up HTTP redirects for their moved
> resources - no idea if anybody will do that in reality and how this
> can scale. (3) could help to re-locate moved resources via search
> engines like Sindice but not really fully automatically. (4) could at
> least inform a data source that certain references are broken and it
> could remove them.
> 
> Another alternative is of course to completely leave the problem to
> the application developers, which means that they must consider that a
> referenced resource might exist or not. I am not sure about the
> practical consequences of that approach, especially if several data
> sources are involved, but I have the feeling that it is getting really
> complicated if one cannot rely on any kind of referential integrity.
> 
> Are there any existing mechanisms that can give us at least some basic
> feedback about the "quality" of an LOD data source? I think, the
> referential integrity could be such a quality property...
> 
> Thanks for your input on that issue,
> 
> Bernhard
> 
> ______________________________________________________
> Research Group Multimedia Information Systems
> Department of Distributed and Multimedia Systems
> Faculty of Computer Science
> University of Vienna
> 
> Postal Address: Liebiggasse 4/3-4, 1010 Vienna, Austria
> Phone: +43 1 42 77 39635 Fax: +43 1 4277 39649
> E-Mail: bernhard.haslhofer@univie.ac.at
> WWW: http://www.cs.univie.ac.at/bernhard.haslhofer
> 
> 

Received on Thursday, 5 February 2009 17:17:19 UTC