Re: Problems and Opportunities at purl.org

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Mon, 29 Feb 2016 21:35:42 +0000
Message-ID: <CAPRnXtmw2tnOxDAtD2=BfWzeh=kzcWrbhL6uCH-LS=8c=O0DPA@mail.gmail.com>
To: Shane McCarron <shane@aptest.com>
Cc: Permanent Identifier CG <public-perma-id@w3.org>, David Wood <david@3roundstones.com>
I assume this one was also meant for the list, Shane? :)  (Perhaps the
W3C folks could turn on a Reply-To header.)


On 29 February 2016 at 14:15, Shane McCarron <shane@aptest.com> wrote:
> Is the intent to, originally, just take the database and map it into one or
> several .htaccess files?
>
> Also, as you all know, there are already a number of w3id permanent URIs.
> Clearly we are not going to override any of these.  Is there a plan for how
> to handle collisions?

The plan is still evolving, but I think we can do this in two steps:

1) A script that generates rules.csv (?) from the purl.org CSV dump and
puts the rules in their appropriate folder (e.g. rules for /fred/soup.pdf
go into fred/rules.csv). We run this once. It might also add a
mini-README.md that shows who made the entries in purl.org -- but I
guess we should not expose their email addresses (they didn't sign up
for that).

2) Another script that generates .htaccess from rules.csv - as
discussed later. This mechanism can be used by non-purl folks as well
- by editing CSV files as github pull requests.  We'll make this
'safe' so that it does not touch an existing w3id.org .htaccess unless
it already has a ## DO NOT MODIFY section (in which case it will only
modify that section).
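
For illustration only, here is a minimal sketch of how the two steps
could fit together. The dump columns (`id`, `target`), the exact marker
lines, the lowercasing of folder names, the 302 redirect type and all
function names are assumptions made for the example; they are not the
real purl.org dump format or existing w3id.org tooling.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed marker lines; the thread only says "a ## DO NOT MODIFY section".
BEGIN = "## DO NOT MODIFY - generated from rules.csv"
END = "## END DO NOT MODIFY"


def split_by_folder(dump_csv):
    """Step 1: group purl rows by their top-level folder.

    Assumes a hypothetical dump with 'id' and 'target' columns.
    Returns {folder: [(path_below_folder, target), ...]}.
    """
    folders = defaultdict(list)
    for row in csv.DictReader(io.StringIO(dump_csv)):
        parts = row["id"].strip("/").split("/", 1)
        folder = parts[0].lower()  # assumption: w3id folder names are lowercase
        rest = parts[1] if len(parts) > 1 else ""
        folders[folder].append((rest, row["target"]))
    return dict(folders)


def render_rules(rules):
    """Render (path, target) pairs as a marked block of mod_rewrite rules."""
    lines = [BEGIN, "RewriteEngine On"]
    for path, target in rules:
        # In a per-folder .htaccess the pattern is relative to that folder.
        lines.append(f"RewriteRule ^{re.escape(path)}$ {target} [R=302,L]")
    lines.append(END)
    return "\n".join(lines)


def update_htaccess(existing, rules):
    """Step 2: return the new .htaccess text, or None to leave the file alone.

    - no file yet (existing is None): create one holding only the section
    - file with the marked section: replace only that section
    - file without the markers: hand-written, so do not touch it
    """
    section = render_rules(rules)
    if existing is None:
        return section + "\n"
    start = existing.find(BEGIN)
    if start == -1:
        return None
    end = existing.find(END, start)
    if end == -1:
        return None
    return existing[:start] + section + existing[end + len(END):]
```

A caller would read each folder's rules.csv, pass the current .htaccess
text (or None if there is no file), and write the result only when it is
not None. Targets that contain spaces or other characters special to
mod_rewrite would need extra quoting, which this sketch does not handle.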


BTW, here are the top-level (potential) conflicts I found:


    <id>/CC/</id>
    <id>/commerce</id>
    <id>/DC</id>
    <id>/greycite/1</id>
    <id>/hydra/</id>
    <id>/isa/isa-rdf/</id>
    <id>/library</id>
    <id>/mtv</id>
    <id>/nidash</id>
    <id>/nkos</id>
    <id>/omn/Omni_Schema/</id>
    <id>/ontolink</id>
    <id>/ontology/bkn</id>
    <id>/payswarm</id>
    <id>/people</id>
    <id>/role/terms/*</id>
    <id>/ro/ont</id>
    <id>/role/terms/*</id>
    <id>/spar</id>
    <id>/xapi/</id>


.. and friends. (The longer we wait, the more the list will grow I guess! :)


I found these by checking the top-level folders of w3id:

stain@biggie:/tmp/1$ echo $dirs
3rs activity-streams als-telemonitoring bctt bundle cc charta77 class
clipc cmip6dr commerce credentials cwl dacura-errors dc dcat-ap
dgarijo dlo env food games geohealth greycite hydra iadb identity isa
isil itil legal_form library libris lio lob lss-usdl mare mtv
national-ocean-council navigation_menu nidash nkos omn ontolink
ontology openbadges ore ost own-pt patent_ontologies payswarm pbs
people plp prohow rdw ro role scc schema.org sdo security smetzger
socomp spar synbio unit valueflows verb web-keys webpayments xapi zdb
zericatalog

.. and then querying the REST API of purl.org for each one:

stain@biggie:/tmp/1$ for d in $dirs ; do curl -s \
    "https://purl.org/admin/purl/?p_id=/$d/" | grep '/id>' ; done
stain@biggie:/tmp/1$ for d in $dirs ; do curl -s \
    "https://purl.org/admin/purl/?p_id=/$d/*" | grep '/id>' | head -n 1; done
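
The same check could also be run offline against the CSV dump rather
than the REST API. A rough sketch, assuming the purl ids are already
available as a list of strings; the function name is made up, and the
case-insensitive comparison is an assumption based on /CC/ and /DC
turning up for the cc and dc folders.

```python
def top_level_conflicts(w3id_dirs, purl_ids):
    """Return purl ids whose first path segment is an existing w3id folder.

    Comparison is case-insensitive (an assumption, see the note above).
    """
    taken = {d.lower() for d in w3id_dirs}
    conflicts = []
    for pid in purl_ids:
        first = pid.strip("/").split("/")[0]
        if first.lower() in taken:
            conflicts.append(pid)
    return conflicts


dirs = "cc commerce dc hydra ro role".split()
ids = ["/CC/", "/commerce", "/DC", "/fred/soup.pdf", "/ro/ont", "/role/terms/*"]
print(top_level_conflicts(dirs, ids))
```

Because whole path segments are compared, a purl under /role/ is not
reported as a conflict with the ro folder.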



-- 
Stian Soiland-Reyes, eScience Lab
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/    http://orcid.org/0000-0001-9842-9718
Received on Monday, 29 February 2016 21:36:36 UTC
