RE: Decommissioning a linked data site

Sustainability of LOD vocabularies? Isn't there "a project for that"?

http://labs.mondeca.com/dataset/lov/about/

From: Bradley Allen [mailto:bradley.p.allen@gmail.com]
Sent: 30 May 2012 23:03
To: Tim Berners-Lee
Cc: Antoine Isaac; public-lod@w3.org community; rufus.pollock@okfn.org Pollock
Subject: Re: Decommissioning a linked data site

Tim- OK, I'm game. I am in the process of hacking together a small redirect server that would replace the existing site, and can easily incorporate the ability to return something like the following on a request for the root resource:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://www.t4gm.info/>
  rdf:type void:Dataset ;
  foaf:homepage <http://www.t4gm.info/> ;
  void:uriSpace "http://www.t4gm.info/concept/" ;
  dcterms:isReplacedBy <http://id.loc.gov/vocabulary/graphicMaterials> .

<http://id.loc.gov/vocabulary/graphicMaterials>
  rdf:type void:Dataset ;
  foaf:homepage <http://id.loc.gov/vocabulary/graphicMaterials.html> ;
  dcterms:replaces <http://www.t4gm.info/> .

<http://www.t4gm.info/linkset/>
  rdf:type void:Linkset ;
  void:linkPredicate skos:exactMatch ;
  void:subjectsTarget <http://www.t4gm.info/> ;
  void:objectsTarget <http://id.loc.gov/vocabulary/graphicMaterials> ;
  void:dataDump <http://www.t4gm.info/linkset/> .

Would something like that suffice? I think this captures the crucial information that there are two datasets, that one is replacing the other, and that (following Antoine's lead) we provide access to a linkset that maps resources from one dataset to its replacement. We could also provide general voiD dataset metadata and provenance information to give further background.
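
A minimal sketch of such a redirect server, using only the Python standard library, might look like the following. It assumes that old concept paths can be looked up in a table derived from the linkset; the CONCEPT_MAP entry (including its target URI) and the names respond and RedirectHandler are hypothetical placeholders, not the actual mapping or server. The Turtle is abbreviated from the description above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# voiD description served at the root resource (abbreviated).
VOID_TURTLE = """\
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://www.t4gm.info/>
  a void:Dataset ;
  void:uriSpace "http://www.t4gm.info/concept/" ;
  dcterms:isReplacedBy <http://id.loc.gov/vocabulary/graphicMaterials> .
"""

# Hypothetical mapping from old concept paths to replacement URIs.
# A real table would be generated from the skos:exactMatch linkset.
CONCEPT_MAP = {
    "/concept/example": "http://id.loc.gov/vocabulary/graphicMaterials/example",
}


def respond(path):
    """Return (status, headers, body) for a request path."""
    if path == "/":
        headers = {"Content-Type": "text/turtle; charset=utf-8"}
        return 200, headers, VOID_TURTLE.encode("utf-8")
    target = CONCEPT_MAP.get(path)
    if target is not None:
        # Permanent redirect to the replacement resource.
        return 301, {"Location": target}, b""
    # Everything else has been removed for good.
    return 410, {"Content-Type": "text/plain; charset=utf-8"}, b"Gone\n"


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = respond(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like HTTPServer(("", 8000), RedirectHandler).serve_forever() would start it on port 8000. Keeping the decision logic in respond, separate from the handler, makes the 200/301/410 behaviour easy to check without opening a socket.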

I could not, however, find a vocabulary that made it easy to express the normative statement that <http://id.loc.gov/vocabulary/graphicMaterials> is in some manner better than, or to be preferred to, <http://www.t4gm.info/>. The provenance vocabularies don't seem to address this question; the closest things I can find are the predicates for stating opinions about scientific discourse in ontologies like SWAN. Perhaps this is an issue that could be taken up in the Open Provenance effort, if someone hasn't already addressed it.

I regret to say that I'd already nuked the CKAN record by the time this email reached me, but I'll follow up and see what support there is there to provide a historical record with similar information for others proceeding down this path in the future.

Finally, I'd like to try to do something to address your suggestion with respect to providing query rewrite rules, but I'm unclear exactly what you mean by that; what form would you expect that kind of information to take? - regards, BPA

Bradley P. Allen
http://bradleypallen.org

On Wed, May 30, 2012 at 3:57 AM, Tim Berners-Lee <timbl@w3.org> wrote:
Seems to me that the crucial bit of information that the data
which is served by your site now can be got much better by
going to the LoC site would be nice to have in machine readable
form.

One idea is *leaving* it in CKAN but marking it
as historical so it can be a place to make the pointer to
the superseding point.  The entry could be a sort of rallying
point for people who were interested in the data.
In a way, the CKAN entry has an added value once you are gone.
I don't know if CKAN has that facility.

For total extra kudos, provide query rewriting rules
from your site to LoC data, linked so that you can write a program
to start with a sparql query which fails
and figures out from metadata how to turn it into one which works!
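
A rough sketch of what such rewriting could look like, assuming the linkset has already been read into pairs of old and new URIs. The EXACT_MATCH entry and the names rewrite_query and IRI_REF are hypothetical, chosen for illustration only; this is plain textual substitution, not a full SPARQL parser.

```python
import re

# Hypothetical skos:exactMatch pairs, as they might be read from the linkset dump.
EXACT_MATCH = {
    "http://www.t4gm.info/concept/example":
        "http://id.loc.gov/vocabulary/graphicMaterials/example",
}

# Matches an IRI written in angle brackets, e.g. <http://example.org/x>.
IRI_REF = re.compile(r"<([^<>\s]*)>")


def rewrite_query(query, mapping):
    """Replace every <IRI> that has an entry in mapping; leave the rest untouched."""
    def swap(match):
        iri = match.group(1)
        return "<" + mapping.get(iri, iri) + ">"
    return IRI_REF.sub(swap, query)


old_query = (
    "SELECT ?label WHERE { "
    "<http://www.t4gm.info/concept/example> "
    "<http://www.w3.org/2004/02/skos/core#prefLabel> ?label }"
)
new_query = rewrite_query(old_query, EXACT_MATCH)
```

This only handles IRIs written out in full; prefixed names would need expanding first, and an IRI that appears inside a string literal would be rewritten too, so a real tool would want to work on the parsed query instead.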


Tim


On 2012-05-29, at 16:47, Antoine Isaac wrote:

> Dear Bradley,
>
> The second part of your plan reminds me of my recent question on "moving" a dataset
> http://lists.w3.org/Archives/Public/public-lod/2012Apr/0123.html
>
> The object moved (a thesaurus) is much the same, as is the cause for the "moving": in both cases an official version has arisen to replace a first prototype.
>
> We have tried to create a redirection, using a 301 code. But I guess that if we had decided to shut down our server altogether, we would have opted for the same 410 code as you!
> and maybe we'll do, one day...
>
> Cheers,
>
> Antoine
>
>
>> Back in 2009, as an experiment in working with RDFa and linked data, I created t4gm.info <http://t4gm.info>. It is based solely on US Library of Congress library linked data (specifically, the Thesaurus for Graphic Materials), which at the time I created the site didn't have any equivalently accessible linked data. That has long since been rectified by the LoC. So t4gm.info <http://t4gm.info> is at best redundant and at worst potentially confusing.
>>
>> So what I want to do is shut the site down. But there doesn't seem to be much if any best practice around doing that, especially when the site by virtue of its listing with CKAN is part of the LOD Cloud diagram. What I want to do is 1) delist it from CKAN, and then 2) shut the site down, perhaps replacing it with a simple web service returning a 410 status code per RFC 2616. I assume it will be removed from the LOD cloud diagram when that is next updated from the CKAN data.
>>
>> Anyone have any suggestions beyond that? Also, if anyone from CKAN is reading this, I could also use some guidance on how deletion of records is accomplished through the online interface. - cheers, BPA
>>
>> Bradley P. Allen
>> http://bradleypallen.org
>
>
>

Received on Thursday, 31 May 2012 08:03:23 UTC