Re: Problems and Opportunities at purl.org - a Dublin Core perspective

I posted a summary of the discussion here (and on the PURLz mailing list) to
the DC-ARCHITECTURE mailing list of the Dublin Core Metadata Initiative [1].

DCMI was one of the very first users of purl.org; indeed, the service
was created by some of the same people at OCLC who managed the Dublin
Core effort itself.

I have created [2] and issued a pull request.  FWIW, I have my own 
inventory of DCMI PURLs that I maintain offline.

Unless anyone suggests otherwise, I will try to keep discussion of
DC-specific issues on DC-ARCHITECTURE and occasionally relay news
between DC-ARCHITECTURE and public-perma-id.

I note that five members of the Dublin Core community are members of
this community group -- Stuart Sutton, Makx Dekkers, John Kunze, Paul
Walk, and myself.

Tom

[1] https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1511&L=DC-ARCHITECTURE&D=0&P=3711
[2] https://github.com/dcmi/w3id.org/tree/master/dc

======================================================================

Date: Thu, 19 Nov 2015 16:11:16 +0100
From: Thomas Baker <tom@tombaker.org>
To: DCMI Architecture <dc-architecture@jiscmail.ac.uk>
Subject: The future of PURLs

Dear all,

The longevity of PURLs is key to the future of DCMI Metadata Terms because DCMI
uses PURLs to identify all of its RDF properties and classes (e.g., [6]).  With
PURLs, the identifiers for DCMI metadata terms resolve to Web pages documenting
the meaning of those terms.  If the PURLs were to stop resolving, there would
be no direct link between an identifier for a metadata term and its
documentation.

The purl.org server has recently experienced outages.  Since late October, it
has no longer been possible for DCMI (or anyone else) to log into the purl.org
server to maintain its PURLs.  According to the purl.org administrator at OCLC,
the SOLR index on the purl.org site stopped updating, preventing effective
maintenance, so the login mechanism has been turned off pending a solution.

The future of the purl.org service at OCLC has recently become a topic of
lively discussion on two mailing lists:

1) The PURLz mailing list on Google Groups, where issues related to PURLs are
   discussed [1], and specifically issues related to the PURLz software 
   implementation of PURLs used by (among others) OCLC.  OCLC points its users 
   to this list.

2) The mailing list [2] for the W3C Permanent Identifier Community Group [3].
   Much of the discussion there has revolved around w3id.org [4], a "secure, 
   permanent URL re-direction service for Web applications" operated by the 
   Community Group.  The thread about PURLs on this list starts at [5].

It has been proposed that the purl.org service be migrated to w3id.org
-- a possibility to which OCLC is apparently receptive.  See below for
my summary of the discussion.

Tom

[1] https://groups.google.com/forum/#!forum/persistenturls 
[2] https://lists.w3.org/Archives/Public/public-perma-id/ 
[3] https://www.w3.org/community/perma-id/
[4] https://w3id.org/
[5] https://lists.w3.org/Archives/Public/public-perma-id/2015Nov/0003.html
[6] http://purl.org/dc/terms/creator
[7] https://github.com/dcmi/w3id.org (created an hour ago...)

----------------------------------------------------------------------

The idea behind the w3id.org site is to maintain .htaccess files (used
to configure redirects on Apache Web servers) in a Github repository.
Anyone can fork the w3id.org repository (e.g., [7]), create a directory
for their organization, edit an .htaccess file and README.md, then issue
a pull request to the w3id.org maintainers to have their redirects added
to the main repository.  The service is backed by a group of software
companies who have pledged to maintain it as a service for the community
and by a W3C community group.
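
As a rough illustration of the kind of file involved, the Python sketch
below renders Apache mod_rewrite rules for a few redirects.  The function
name, the example path and the target URL are hypothetical, and the 302
status is only an assumed default; the actual w3id.org repository may
organize its rules differently.

```python
import re


def render_htaccess(redirects, status=302):
    """Render mod_rewrite rules for a mapping of relative path -> target URL.

    Assumes paths contain no whitespace; the status code is an
    illustrative default, not a w3id.org requirement.
    """
    lines = ["RewriteEngine on"]
    for path, target in sorted(redirects.items()):
        # Anchor the pattern so that only the exact relative path
        # (with an optional trailing slash) is redirected.
        pattern = "^" + re.escape(path) + "/?$"
        lines.append(f"RewriteRule {pattern} {target} [R={status},L]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical redirect for an organization's directory.
    print(render_htaccess({"terms": "http://example.org/terms/"}), end="")
```

The output of such a function is what an organization would commit as the
.htaccess file in its own directory before issuing a pull request.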

It has been proposed that the PURLz group, and/or the operators of the w3id.org
service, simply take over purl.org. Jeff Young at OCLC, currently the part-time
administrator of purl.org, reportedly thinks this is a good idea.  It was also
reported that OCLC "seems willing" to provide the data from purl.org for the
purposes of porting it to w3id.org -- if an acceptable plan were presented.

Issues:

-- Some people find the Github style of collaborative editing on w3id.org
   more congenial than the PURLz style of registering and managing maintainers.
   Others hate Github and argue that purl.org's existing users would need to 
   have a more user-friendly interface.  Dave Wood (one of the developers of 
   PURLz) is looking into options.

-- Github now serves RDF via gh-pages and has a decent workflow, HTTPS, MIME
   types, file extensions and CORS (but no content negotiation).  But there is
   some concern about dependence on Github.  Will Github be around 18 years
   from now?  It is argued, however, that with some effort the w3id.org
   service could "fall back" to plain git.

-- Someone suggested the possibility of writing a bot that would "back up" 
   purl.org by trawling the purl.org service via the API and autogenerating 
   a folder structure with .htaccess files under https://w3id.org/purl.org --
   a theoretical possibility of uncertain legality.
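
The folder-generation half of the "back up" idea in the last item above
could look roughly like the Python sketch below.  It deliberately leaves
out the trawling step: the input is assumed to be a list of (PURL path,
target URL) pairs that has already been obtained somehow, and no purl.org
API call is shown.  The function names, the one-folder-per-top-level-domain
layout and the 302 status are all assumptions made for illustration.

```python
import os
import tempfile
from collections import defaultdict


def group_by_top_level(records):
    """Group (purl_path, target) pairs by the first path segment."""
    groups = defaultdict(dict)
    for purl_path, target in records:
        top, _, rest = purl_path.strip("/").partition("/")
        groups[top][rest] = target
    return dict(groups)


def write_tree(records, root):
    """Write one .htaccess file per top-level folder; return the paths."""
    written = []
    for top, redirects in sorted(group_by_top_level(records).items()):
        folder = os.path.join(root, top)
        os.makedirs(folder, exist_ok=True)
        path = os.path.join(folder, ".htaccess")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("RewriteEngine on\n")
            for rest, target in sorted(redirects.items()):
                # Simplification: paths are assumed to contain no regex
                # metacharacters, so they are written out unescaped.
                fh.write(f"RewriteRule ^{rest}$ {target} [R=302,L]\n")
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical sample data; real targets would come from purl.org.
    sample = [("/dc/terms/creator", "http://example.org/creator")]
    for p in write_tree(sample, tempfile.mkdtemp()):
        print(p)
```

Whether such a tree could legitimately be published is the open legal
question noted above; the sketch only shows that generating it is
mechanically simple.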

Monica Omodei (Project Manager, Australian National Data Service) writes: "Maybe
we need to find a way OCLC can get recognition for having provided this service
in the beginning and committing to its ongoing support.  Supporting
'persistence' is not something you can back away from. I know they are a
non-profit cooperative so perhaps the membership need to be lobbied to support
allocating some funding for better support.  ...  OCLC are to be commended for
developing and supporting this service for so long so that we could generate
purls for persistent identification without them being reliant on the
persistence of our own organisations. This is their strength and hopefully OCLC
will consider either handing over the domain purl.org for the resolver service
to be maintained by an organisation for whom it is core business or decide it
is part of their core business."

-- 
Tom Baker <tom@tombaker.org>

Received on Thursday, 19 November 2015 16:20:08 UTC