Re: Persistence

Hi,
Let's please remember our mantra, "re-use, re-use, re-use" and "Open Source for Open Government Data."  

On Nov 22, 2011, at 10:12 AM, Phil Archer wrote:

> This makes a lot of sense.
> 
> Who would be that will executor? In the case of public sector websites, presumably the relevant national archive? Is there a business model here I wonder ;-)

Yes, it's called a non-profit, like OCLC, which for better or for worse supports the worldwide library community through the purl.org domain. A look at OCLC's website is instructive.  The library community has wrestled with & solved many of the issues the LD/LOD community raises.  See below.

> As for top level domains, some are more politically acceptable than others of course. Perversely perhaps, it seems that a vocabulary hosted on example.eu, example.us or example.gov.uk might face more resistance to uptake than example.ie or example.ly, especially if it spelled out a nice word like semantical.ly (which appears to be available btw).

Are you kidding?? ly = Libya.  Anyone who approves that for government use should have their badge taken away.  Seriously?!

> 
> What we're talking about is maintaining a set of URIs for the long term for vocabularies. For documents and Web content in general, an archivist might take a different view. Britain's National Archives can, legitimately, say that, for example, the Bercow Report of July 2008 is still publicly available online. It's at:
> 
> http://webarchive.nationalarchives.gov.uk/20080528125538/http://www.dcsf.gov.uk/bercowreview/docs/7771-DCSF-BERCOW.PDF
> 
> The issue though is that it used to be at
> http://www.dcsf.gov.uk/bercowreview/docs/7771-DCSF-BERCOW.PDF

This example is *precisely* the case for implementing a persistent identifier solution (also called a permanent URL architecture).  

PURLs is one such Open Source project. It is used extensively by the worldwide library community, the US Government Printing Office through the US Federal Depository Library Program (for which 3 Round Stones provides commercial support), the National Center for Biomedical Ontology, and Shared Names, among others.

> and if anyone had linked to the original URI then someone following that link would see a short HTML page explaining that the dcsf.gov.uk site is no longer in operation, where the current live version is, and where the archive is. That's a very basic message for humans and no message at all for machines.
> 
> Hmmm... Given that the original URI of the doc is preserved within the new one, it shouldn't be too hard to come up with a script that automatically gave a 301 redirect *if* the target gave a sensible 200 response and a helpful message in case the target led to a 404?
> 
> Phil.

Hang on, there is no need to write a script or reinvent the wheel here.  

For permanent URLs that transcend changing infrastructure, I urge using the modern PURLs server, which has been running in production for 3+ years as OCLC's PURLs service and for 2+ years at the US Government Printing Office.  The predecessor to the modern PURLs server was in production for 12 years.  This is a tested & proven solution for permanent URL architecture.

The Open Source PURLs server is a web-scale, production application with an easy-to-use interface (a bookmarklet for creating PURLs) and nice reporting capabilities for maintenance.

PURLs is based on HTTP and URI specs from the IETF.  Recently we've thrown in some TAG decisions and W3C Best Practices for use with RDF and Linked Data (303 support).
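
To make the idea concrete, here is a rough Python sketch of what a permanent URL resolver does conceptually: a stable path is looked up in a table, and the answer is an ordinary HTTP redirect to wherever the resource lives today. This is not code from the PURLs server. The table, the paths, the function name and the example.org target are all made up for illustration; only the archive URL comes from Phil's example.

```python
# Illustrative only: a toy lookup table, NOT the PURLs server's data model.
from typing import Dict, NamedTuple, Optional, Tuple


class Purl(NamedTuple):
    target: Optional[str]  # current location, or None if the resource is gone
    status: int            # HTTP status code to answer with


# Hypothetical persistent paths. The targets may change over time;
# the paths that people link to never do.
PURLS: Dict[str, Purl] = {
    # A document that moved into an archive (target URL from Phil's example).
    "/docs/bercow-report": Purl(
        "http://webarchive.nationalarchives.gov.uk/20080528125538/"
        "http://www.dcsf.gov.uk/bercowreview/docs/7771-DCSF-BERCOW.PDF",
        302,
    ),
    # A vocabulary term: 303 "See Other" points at a document describing it.
    "/vocab/example/Term": Purl("http://example.org/doc/example-vocab", 303),
    # A withdrawn resource: no target, answer 410 Gone.
    "/docs/withdrawn": Purl(None, 410),
}


def resolve(path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to a persistent path."""
    purl = PURLS.get(path)
    if purl is None:
        return 404, {}                       # never registered
    if purl.target is None:
        return purl.status, {}               # registered, but no longer available
    return purl.status, {"Location": purl.target}


if __name__ == "__main__":
    for p in ("/docs/bercow-report", "/vocab/example/Term", "/docs/withdrawn", "/nope"):
        print(p, resolve(p))
```

When the document moves again, only the table entry changes; every link that uses the persistent path keeps working, and machines get a proper HTTP status instead of an HTML apology page.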

Check out the PURLs site & if you have further questions about production deployments, I'm happy to respond to them.  Re-use, re-use, re-use.

Cheers,

Bernadette Hyland
co-chair W3C Government Linked Data Working Group
charter http://www.w3.org/2011/gld/charter

> 
> On 22/11/2011 14:30, Richard Cyganiak wrote:
>> On 17 Nov 2011, at 19:26, Sandro Hawke wrote:
>>> My strawman proposal would be:
>>> 
>>> - vocabularies should be given their own domain name, probably in .net
>>> (they are infrastructure).   this way full ownership as well as
>>> maintenance duties can be transferred, legally, as necessary.
>> 
>> +1. Getting an own domain for the vocabulary also helps keeping the URIs short.
>> 
>> On the other hand, using something like purl.org also seems reasonable.
>> 
>> I'm agnostic regarding the top-level domain. I note that the .net TLD isn't terribly popular and I can't think of many current examples of vocabularies in the .net namespace.
>> 
>>> - there should be a two-level ownership structure, where one
>>> disinterested, trusted, 3rd party (like the executor of a will) retains
>>> final control, but delegates to the creator/maintainer.   With written
>>> policies about what happens in various eventualities.   But, basically,
>>> if either of these parties loses interest, they can be smoothly
>>> replaced, and if the creator/maintainer ceases operation or stops acting
>>> in good faith, it can be replaced.
>> 
>> Again, +1.
>> 
>> Best,
>> Richard
>> 
> 
> -- 
> 
> 
> Phil Archer
> W3C eGovernment
> http://www.w3.org/egov/
> 
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
> 

Received on Tuesday, 22 November 2011 19:13:00 UTC