Re: PURLs don't matter, at least in the LOD world

From: Gannon Dick <gannon_dick@yahoo.com>
Date: Sat, 18 Feb 2012 15:49:53 -0800 (PST)
Message-ID: <1329608993.33677.YahooMailNeo@web112607.mail.gq1.yahoo.com>
To: David Wood <david@3roundstones.com>
Cc: "eGov IG (Public)" <public-egov-ig@w3.org>
Hi David,

I've been working on an RDF Link Server for urn:lex.  Brazil categorizes its law and legislation this way.  The URN syntax extends easily to ISO 3166 country codes as well as sub-regions.  Right now, it's a MySQL database.  It accepts all possible 2-letter codes (Kosovo has no ISO code yet, however) and returns the link list.  Some codes are redirected, some are User Defined, some are Reserved and some are not assigned.
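As a rough illustration of the lookup described above, here is a minimal Python sketch that sorts a 2-letter code into a category. It is not the actual MySQL schema: the function names and the sample tables are assumptions, the tables hold only the codes from the example list, and the "Reserved" category is left out for brevity. The user-assigned ranges (AA, QM-QZ, XA-XZ, ZZ) are the ones ISO 3166-1 sets aside.

```python
def is_user_defined(code: str) -> bool:
    """True for the ISO 3166-1 user-assigned alpha-2 codes: AA, QM-QZ, XA-XZ, ZZ."""
    return (
        code in ("AA", "ZZ")
        or (code[0] == "Q" and "M" <= code[1] <= "Z")
        or code[0] == "X"
    )


# Illustrative samples only (hypothetical tables); a real database
# would hold every officially assigned code.
ASSIGNED = {"IT": "Italy", "US": "United States", "BR": "Brazil"}
REDIRECTED = {"AC"}  # assumption: handled as a redirect, as in the example list


def classify(code: str) -> str:
    """Return the category of a 2-letter code (case-insensitive)."""
    if len(code) != 2 or not code.isascii() or not code.isalpha():
        raise ValueError(f"not a 2-letter code: {code!r}")
    code = code.upper()
    if code in ASSIGNED:
        return "assigned"
    if code in REDIRECTED:
        return "redirect"
    if is_user_defined(code):
        return "user-defined"
    # Fallback for anything missing from the sample tables above.
    return "unassigned"
```

With the sample tables, classify("QQ") gives "user-defined" and classify("AB") gives "unassigned".  Because the tables are incomplete, real country codes that are not listed would also fall through to "unassigned".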

rdf:first points to the best I could find for a "National Website", presumably a top-level repository for Government Linked Data.  The last of the rdf:rest links points to the PII namespace node.  If a government is putting out datasets, they are not likely to contain PII (except for Public Figures, at work).  So, this gets around the sticking point for Commercial Search Engines.  Their data sets may be tainted by PII collected improperly or without the user's consent.  There is a lot of trouble around this issue and GLD does not want to go there.
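For readers unfamiliar with RDF collections, the sketch below shows how an rdf:first / rdf:rest chain is laid out as triples. It takes one reading of the description above, namely that the PII namespace node is the final member of the list. The example.org URLs and the blank-node labels are placeholders, not the server's real data.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_list(items):
    """Return (subject, predicate, object) triples for an RDF collection.

    Each list node has one rdf:first (the member) and one rdf:rest
    (the next node, or rdf:nil after the last member).
    """
    triples = []
    node = "_:b0"  # blank-node label, chosen arbitrarily
    for i, item in enumerate(items):
        triples.append((node, RDF + "first", item))
        is_last = i == len(items) - 1
        next_node = RDF + "nil" if is_last else f"_:b{i + 1}"
        triples.append((node, RDF + "rest", next_node))
        node = next_node
    return triples


# Placeholder URLs: the first member stands for the "National Website",
# the last member for the PII namespace node.
links = rdf_list([
    "http://example.org/national-website",
    "http://example.org/some-dataset",
    "http://example.org/pii",
])
```

Walking the chain from "_:b0" by following rdf:rest until rdf:nil visits the members in order, so a client can always find the national site first and the PII node last.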

The database also includes many subdivisions as well as a list of Currencies in use.  The examples don't include that.

"IT" - Italy  http://www.rustprivacy.org/2012/urn-lex/IT.html
"QQ" - User Defined  http://www.rustprivacy.org/2012/urn-lex/QQ.html
"AB" - Unassigned  http://www.rustprivacy.org/2012/urn-lex/AB.html
"US" - United States  http://www.rustprivacy.org/2012/urn-lex/US.html
"BR" - Brazil  http://www.rustprivacy.org/2012/urn-lex/BR.html
"AC" - Redirect  http://www.rustprivacy.org/2012/urn-lex/AC.html

The db should be Callimachus ready shortly.


From: David Wood <david@3roundstones.com>
To: Hugh Glaser <hg@ecs.soton.ac.uk> 
Cc: "public-lod@w3.org" <public-lod@w3.org> 
Sent: Friday, February 17, 2012 6:02 PM
Subject: Re: PURLs don't matter, at least in the LOD world
Hi Hugh,

There are several aspects to PURLs that I think are relevant to LOD.  Some of them are:

- PURLs allow a general Web user to curate the location of a persistent identifier without needing administrative access to a DNS server, an Apache server or other non-user-oriented technology.  For many people, this is a big deal.

- PURLs allow for the implementation of http-range-14 (303) redirection without the need for administrator-level access to technology.

- "Partial" PURLs allow for the assignment of bulk persistent identifiers to classes of data (e.g. data set or database aggregation) with a minimum of administrative overhead.

- PURL Federation (currently in Beta, see [1]) allows for long-term persistence to be offered for identifiers in the face of changing hosting providers.
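To make the redirection points above concrete, here is a minimal Python sketch of how a PURL resolver might answer a request. It is not the PURL server's actual code or API: the registry tables, the resolve() function and the example.org target are hypothetical. The /dbpedia/Tokyo entry borrows the example from Hugh's message below. A partial PURL is modelled as a registered prefix whose unmatched remainder is appended to the target.

```python
# Hypothetical registry: exact PURL path -> (HTTP status, target URL).
# 303 ("See Other") is the http-range-14 style redirect.
EXACT = {
    "/dbpedia/Tokyo": (303, "http://dbpedia.org/resource/Tokyo"),
}

# Hypothetical partial PURLs: registered prefix -> (status, target prefix).
PARTIAL = {
    "/vocab/": (302, "http://example.org/hosted/vocab/"),
}


def resolve(path):
    """Return (status, location) for a requested PURL path."""
    if path in EXACT:
        return EXACT[path]
    # Try the longest registered prefix first so nested prefixes behave sensibly.
    for prefix in sorted(PARTIAL, key=len, reverse=True):
        if path.startswith(prefix):
            status, base = PARTIAL[prefix]
            # Append whatever follows the prefix to the target prefix.
            return status, base + path[len(prefix):]
    return 404, None
```

If the vocabulary moves to a new host, only the single PARTIAL entry changes; every identifier under /vocab/ keeps resolving without being re-registered.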

We are also in the process of making Callimachus [2] into a PURL server specifically designed for the LOD community.  That will allow PURLs to fit more naturally with Linked Data in its various forms.

PURLs have historically been used by the library community (e.g. OCLC, a number of universities, and the US Government Printing Office, which uses them to manage persistent Web addresses for US Government documents regardless of physical location).  However, their use by LOD developers seems to be mostly (but not always) for persistence of vocabularies.  Given that vocabulary developers have often hosted their vocabularies on fungible Web hosting providers but LOD applications and users often hard-code vocabulary URLs into their offerings, this use of PURLs seems particularly appropriate to me.

Given what I personally know of the state of US Government agencies, I'll take your bet on whether the Web services of the Library of Congress or OCLC last longer :)  You might look back at the tortured history of id.loc.gov before we agree to a figure.

David Wood, Ph.D.
3 Round Stones
Cell: +1 540 538 9137

[1]  http://purlz.org
[2]  http://callimachusproject.org

On Feb 17, 2012, at 13:48, Hugh Glaser wrote:

> (Sorry if there is a paper/discussion on this that I have missed somewhere. And I may have some of this wrong, as I have essentially not used PURLs.)
> M Scott Marshall and others' comments have prompted me to put pen to paper and ask what the list thinks on this.
> It has long puzzled me why people seem to think that PURLs (and Handles, etc.) solve some actual problem.
> Leaving aside the question of whether it actually adds extra fragility as to whether purl.org will continue to exist.
> (Personally I would bet the Library of Congress will last longer than purl.org, but I would have to wait too long to collect on the bet to make it worthwhile.)
> In the Linked Data world, at least, what does a PURL give protection from?
> Let's say I have http://dbpedia.org/resource/Tokyo. I can:
> a) Use the URI without any URI resolution at all, and it is really useful to do so (as commented, foaf:name is used a lot, and it does not depend on anything being at the other end to resolve to);
> b) I can resolve to find out what DBPedia thinks it "means" (returns as RDF);
> c) I can use it as an ID for another source to find out what that other source thinks it "means".
> Now let's say dbpedia.org goes Phut!
> What I lose is facility (b)
> What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo?
> I have (a), (b) and (c) as before.
> Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.
> Both of these situations can be fixed by persuading someone (the registrar for dbpedia.org or the purl.org organisers respectively) to allow someone else to take over dbpedia.org or purl.org/dbpedia respectively.
> But once dbpedia.org goes Phut!, you get a dead link whatever you do, until someone takes it over.
> Not much to be gained for the overheads of having the purl?
> I can see that in the Web of Text, a URI that has gone 404 is rather painful.
> And I know that people who have curated data find dying links painful, and seem to find Handles etc some sort of comfort for their concerns, even though they don't necessarily solve the perceived problem, in my view.
> But in the Web of Data, given a good guess at somewhere else (such as the LoC, or even the Virtuoso endpoint or sameAs.org), I stand a good chance of finding a skos:*Match or even an owl:sameAs that will get me back on track again.
> Is there something I am missing about PURLs?
> Best
> Hugh
> -- 
> Hugh Glaser,  
>             Web and Internet Science
>             Electronics and Computer Science,
>             University of Southampton,
>             Southampton SO17 1BJ
> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
> Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
> http://www.ecs.soton.ac.uk/~hg/
Received on Saturday, 18 February 2012 23:50:22 UTC