
RE: Problems and Opportunities at purl.org [SEC=UNCLASSIFIED]

From: Car Nicholas <Nicholas.Car@ga.gov.au>
Date: Wed, 18 Nov 2015 05:26:04 +0000
To: "Pavel.Golodoniuc@csiro.au" <Pavel.Golodoniuc@csiro.au>, "david@3roundstones.com" <david@3roundstones.com>, "norman@astro.gla.ac.uk" <norman@astro.gla.ac.uk>, "jason.haag.ctr@adlnet.gov" <jason.haag.ctr@adlnet.gov>, "public-perma-id@w3.org" <public-perma-id@w3.org>
cc: "Simon.Cox@csiro.au" <Simon.Cox@csiro.au>
Message-ID: <d84ef992104d4d36aed12d4d8a3df295@win-exch-prod01.prod.lan>
In addition to the system description below, I'll just add a note on deployment.

The Australian Government has a multi-agency Linked Data working group that is using the PID Service to manage agency-independent URIs (data.gov.au) for use by government staff. Several agencies also manage their own URIs with their own PID Services. We are moving towards strong cross-agency URI governance, with URI management undertaken by non-technical officers, and the PID Service lets us do this.

The PID Service is now a stable product and, as Pavel says, is on GitHub. In addition, we are developing hardened operational instances of it on government infrastructure to provide dependable service. Soon we will be able to share the Puppet deploy scripts for operational installs.

Nicholas Car
Data Architect, Geoscience Australia

-----Original Message-----
From: Pavel.Golodoniuc@csiro.au [mailto:Pavel.Golodoniuc@csiro.au] 
Sent: Wednesday, 18 November 2015 4:16 PM
To: david@3roundstones.com; norman@astro.gla.ac.uk; jason.haag.ctr@adlnet.gov; public-perma-id@w3.org
Cc: Simon.Cox@csiro.au; Car Nicholas
Subject: RE: Problems and Opportunities at purl.org


We're watching this thread and believe we have something to add. We faced a similar issue a few years ago while working on the Spatial Identifier Reference Framework (SIRF)[1] and also when managing publication of some semantic web resources. We had a list of requirements that included the ones discussed in your conversation and added some more ideas:

1. UI for non-technical users, API for integration with other services;
2. Rules using regular expressions or some sort of wildcards;
3. Technology-agnostic solution - ability to serialise rules as CSV/XML/JSON/etc. for improved manageability and governance;
4. Cascading mapping rules - i.e. apply a sequence of rules, starting with the most specific match, then falling through to more general patterns;
5. Rules based on HTTP headers as well as URIs;
6. A mechanism for building highly distributed networks of identifier resolution services through delegation;
7. Built-in backup functionality.
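
To make requirements 2, 4 and 5 concrete, here is a minimal Python sketch of cascading rules that match on both the URI path and a request header. It is an illustration only and is not taken from the PID Service code; the `Rule` class, the `resolve` function, the `/id/...` paths and the example.org targets are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    pattern: str   # regular expression matched against the whole request path
    target: str    # redirect template; \1, \2 ... refer to captured groups
    headers: dict = field(default_factory=dict)  # header name -> required substring


def resolve(rules, path, headers=None):
    """Return the target of the first matching rule, or None if nothing matches.

    The rules are assumed to be ordered from most specific to most general,
    so the first hit wins and later rules act as fall-through defaults.
    """
    headers = headers or {}
    for rule in rules:
        match = re.fullmatch(rule.pattern, path)
        if match is None:
            continue
        # Every header condition on the rule must be satisfied by the request.
        if any(want not in headers.get(name, "") for name, want in rule.headers.items()):
            continue
        return match.expand(rule.target)
    return None


# Hypothetical rule set: header-specific rule first, catch-all last.
RULES = [
    Rule(r"/id/site/(\d+)", r"http://example.org/rdf/site/\1", {"Accept": "text/turtle"}),
    Rule(r"/id/site/(\d+)", r"http://example.org/html/site/\1"),
    Rule(r"/id/(.*)", r"http://example.org/search?q=\1"),
]

print(resolve(RULES, "/id/site/42", {"Accept": "text/turtle"}))
print(resolve(RULES, "/id/site/42"))
print(resolve(RULES, "/id/other/x"))
```

Keeping the rules as plain data rather than as server configuration is what would allow them to be serialised to CSV, XML or JSON as in requirement 3.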

After a review of available technologies, we developed the PID Service. Some more information can be obtained from the project Wiki at https://www.seegrid.csiro.au/wiki/Siss/PIDService. It is also described in a short paper to be presented at an upcoming conference [2][3]. 

The PID Service is released under an open source licence and is available from GitHub at http://github.com/SISS/PID. So this is available as a potential basis for a new implementation, if the design appeals. We'd love to see this effort succeed, and would be very happy if our work was able to help.

[1] https://www.researchgate.net/publication/264083782

[2] http://www.mssanz.org.au/modsim2015/

[3] http://www.researchgate.net/publication/284087065

Kind regards,

Pavel Golodoniuc
Research Team Leader
Mineral Resources
E Pavel.Golodoniuc@csiro.au T +61 8 6436 8776 Australian Resources Research Centre (ARRC)
26 Dick Perry Avenue, Kensington WA 6151 www.csiro.au

-----Original Message-----
From: David Wood [mailto:david@3roundstones.com]
Sent: Thursday, 12 November 2015 3:44 AM
To: Norman Gray <norman@astro.gla.ac.uk>
Cc: Haag, Jason <jason.haag.ctr@adlnet.gov>; Permanent Identifier CG <public-perma-id@w3.org>
Subject: Re: Problems and Opportunities at purl.org

Hi Norman,

> On Nov 11, 2015, at 06:11, Norman Gray <norman@astro.gla.ac.uk> wrote:
> On 11 Nov 2015, at 1:41, David Wood wrote:
>> I actually agree with Jason - but think we need an optional UI for non-technical users on top of the GitHub interface.
> Not just for non-technical users, perhaps.
> The w3id.org solution of letting everyone customise a pile of .htaccess files is a very smart one, because it let w3id.org get up quickly, but I hope it's just seen as an interim solution.
> At present, I can apparently use _anything_ from mod_rewrite in there, which gives me a great deal of scope for being Clever, which would be a vice.  It would also tie w3id.org to Apache, or at least to a mod_rewrite work-a-like for all eternity, so may not be an optimal archival solution.
> A pile of .htaccess files is a fine implementation technology, but not, I think, an interface.
> As an alternative, one could imagine something as simple as a CSV file:
>    /people/nxg/myurl,http://example.org/foo/myurl
>    /people/nxg/tree1/*,http://example.org/bar/$$/index.html
>    /people/nxg/tree2/([a-z]*)-v([0-9]*),http://example.org/baz/$1/version-$2
> Put angle brackets round that and call it XML, or curly brackets and call it JSON, and you're up-to-the-minute.  And technology-agnostic.
> Something like that could be prepared (on- or off-line), uploaded, validated, and journaled, quite easily perhaps.
> One could also take a great deal of useful inspiration from DNS zone files.
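
The CSV idea quoted above could be loaded, validated and resolved roughly as in the following Python sketch. The message does not define the semantics, so several things here are assumptions: a trailing `/*` is read as "the rest of the path", `$$` in the target is read as that captured remainder, `$1`, `$2` refer to regular-expression groups, and the character class in the third row is closed as `[0-9]`. The function names are hypothetical.

```python
import csv
import io
import re

PLACEHOLDER = re.compile(r"\$(\$|\d+)")  # matches '$$' or '$<number>' in a target


def compile_row(source, target):
    """Validate one CSV row and return (compiled regex, target template)."""
    wildcard = source.endswith("/*")
    if wildcard:
        # Assumed wildcard form: capture everything after the prefix as 'rest'.
        regex = re.compile(re.escape(source[:-1]) + r"(?P<rest>.*)")
    else:
        regex = re.compile(source)  # raises re.error on a malformed pattern
    for key in PLACEHOLDER.findall(target):
        if key == "$":
            if not wildcard:
                raise ValueError(f"{source}: '$$' used without a trailing '/*'")
        elif int(key) > regex.groups:
            raise ValueError(f"{source}: target refers to missing group ${key}")
    return regex, target


def load_rules(text):
    """Parse 'source,target' rows; patterns containing commas are not supported."""
    rows = csv.reader(io.StringIO(text))
    return [compile_row(row[0].strip(), row[1].strip()) for row in rows if row]


def resolve(rules, path):
    """Return the redirect target for path from the first matching row, else None."""
    for regex, target in rules:
        m = regex.fullmatch(path)
        if m is None:
            continue

        def fill(ref):
            key = ref.group(1)
            return (m.group("rest") if key == "$" else m.group(int(key))) or ""

        return PLACEHOLDER.sub(fill, target)
    return None


MAPPINGS = """\
/people/nxg/myurl,http://example.org/foo/myurl
/people/nxg/tree1/*,http://example.org/bar/$$/index.html
/people/nxg/tree2/([a-z]*)-v([0-9]*),http://example.org/baz/$1/version-$2
"""

rules = load_rules(MAPPINGS)
print(resolve(rules, "/people/nxg/tree2/spec-v12"))
```

Because every row is compiled and checked when the file is loaded, a malformed pattern or a dangling `$2` would be rejected at upload time rather than discovered when a redirect fails.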

Yes, I agree, presuming that we wish to collaborate to create a new implementation from scratch. That is tempting, given the state of the available options. None of them really nail the simplicity of PURLs and the common use cases cleanly IMO. I think I can say that with impunity given how many of them I’ve worked on. Hopefully I’ve learned something from the experiences.

> Also, as a more general point, I consider myself a technical user, but I... am not a fan of git.  Not a fan.  A not-fan.  Not, by any means or in any sense, an Enthusiast.

:) As a friend, I advise you to say what you mean. You wouldn’t want to end up with ulcers.

However, your point is well taken. The issue that I have is the longevity of the commercial GitHub service more than git itself, but we end up in the same place for different reasons.


> All the best,
> Norman
> --
> Norman Gray  :  https://nxg.me.uk
> SUPA School of Physics and Astronomy, University of Glasgow, UK

Received on Wednesday, 18 November 2015 10:09:47 UTC
