Re: Problems and Opportunities at purl.org from Stian Soiland-Reyes on 2015-11-23 (public-perma-id@w3.org from November 2015)

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Mon, 23 Nov 2015 11:45:33 +0000
To: public-perma-id <public-perma-id@w3.org>
Message-Id: <1448278413-sup-4516@biggie>
Excerpts from Norman Gray's message of 2015-11-10 22:40:21 +0000:
> I'd guess that the truly genuinely necessary functionality would be:
> 
>    * maintain current purl.org redirects (even if some of the odder ones 
> have to be done by one-off hand-hacking);
> 
>    * allow registration of new 307 redirects (and possibly 303, but 
> because it's RTTD rather then necessarily widespread);
> 
>    * allow reservation of new 'domains';
> 
> ...and nothing else in version 1.

Agreed. The first milestone should be to keep the current purl.org redirects
with SOME (if a bit technical) way to update them.

Editing an existing .htaccess file is easier than making a brand new one,
at least if we add a couple of #comments when converting the
purl.org database, and split the result by folder rather than generating
one massive purl.org .htaccess file.
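
As a rough sketch of that conversion (not a definitive implementation), the
Python below groups records by their top-level folder and emits one commented
.htaccess text per folder. The record layout (path, target, note), the function
name split_by_folder and the 307 status are all assumptions for illustration;
the real purl.org export format is not known here.

```python
import re
from collections import defaultdict

STATUS = 307  # assumed redirect status; the real value would come from the PURL record


def split_by_folder(records):
    """Group (path, target, note) records into one .htaccess text per top-level folder."""
    grouped = defaultdict(list)
    for path, target, note in records:
        # "/net/foo" -> folder "net", remainder "foo"
        folder, _, rest = path.strip("/").partition("/")
        grouped[folder].append((rest, target, note))

    files = {}
    for folder, entries in sorted(grouped.items()):
        lines = [f"# Converted from the purl.org database: /{folder}/",
                 "RewriteEngine On"]
        for rest, target, note in entries:
            lines.append(f"# {note}")
            # In a per-directory .htaccess the pattern is relative to that folder.
            lines.append(f"RewriteRule ^{re.escape(rest)}$ {target} [R={STATUS},L]")
        files[f"{folder}/.htaccess"] = "\n".join(lines) + "\n"
    return files


if __name__ == "__main__":
    sample = [
        ("/net/foo", "http://example.org/foo", "maintainer: alice (hypothetical)"),
        ("/dc/terms", "http://example.com/terms", "maintainer: bob (hypothetical)"),
    ]
    for name, text in split_by_folder(sample).items():
        print(f"--- {name}\n{text}")
```

Returning the text per file instead of writing to disk keeps the sketch easy to
review before anything touches the repository.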



> > but we would need to migrate the existing w3id.org <http://w3id.org/> 
> > PURLs forward, I think.
> In the same spirit, is that _really_ the case?

Not migrating would undermine the whole reason for having w3id.org - how would
anyone trust us if we suddenly wiped the existing identifiers?

The current collection should be quite manageable to convert manually in a
couple of days - so I don't see this as a big issue.

Being able to support pretty much all of the existing purl.org redirects is,
however, much more important.  They should all be rewritable to .htaccess.

As for the group management side of purl.org (the "domains" that aren't):
obviously w3id doesn't currently have much in comparison with regard to access
control, as we run on an honour + sanity check system, but a folder/ with a
README.md with some names in it should do.



The danger I see in proposing some new $shinyServerSoftware is that we can
easily bind ourselves into the same trap as purl.org - becoming high maintenance
sysadmin-wise, and potentially relying on abandoned technology.
Apache HTTP server also scales very well, and you can't say it's proprietary
or at immediate risk of being abandoned. :)


What I like are the ideas that have been proposed to have a kind of "build"
stage with more manageable CSV files or something, which then "compile"
into .htaccess or XML or whatever you fancy using a
straightforward Python/Ruby/nodejs script.
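
A minimal sketch of such a "compile" step in Python, assuming a hypothetical
three-column CSV (path, target, status) and mod_rewrite as the output; the
column names, the allowed status codes and the function name compile_htaccess
are made up for illustration, and paths are assumed to contain no whitespace.

```python
import csv
import io
import re

# Redirect types this sketch accepts (an assumption, not an agreed policy).
ALLOWED_STATUS = {"302", "303", "307"}


def compile_htaccess(csv_text):
    """Turn CSV rows of (path, target, status) into .htaccess text."""
    lines = ["RewriteEngine On"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = row["path"].strip()
        target = row["target"].strip()
        status = row["status"].strip()
        if status not in ALLOWED_STATUS:
            raise ValueError(f"unsupported status {status!r} for {path!r}")
        # Keep a human-readable comment next to each generated rule.
        lines.append(f"# {path} -> {target}")
        # Escape the path so characters like "." are matched literally.
        lines.append(f"RewriteRule ^{re.escape(path)}$ {target} [R={status},L]")
    return "\n".join(lines) + "\n"


EXAMPLE_CSV = """path,target,status
vocab,http://example.org/vocab/1.0,307
alice,http://example.com/~alice,303
"""

if __name__ == "__main__":
    print(compile_htaccess(EXAMPLE_CSV))
```

Rejecting unknown status codes at build time means a typo in the CSV fails the
build (or the pull request check) rather than producing a broken rule.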


Making a simple interface on top of such a thing should be possible
in many different ways - e.g. a JavaScript-based wizard that just 
presents the text you need to append, or even loads the existing
CSV (or a JSON derivative) to be able to "edit".

It doesn't have to be "live" like with purl.org - so it is OK if this just ends
up as a github pull request, or later becomes something more traditional
and client/server based. 


There could be the odd .htaccess hack that would then have to be done
differently or not at all. THAT I would be OK with - given a long enough
transition period.

This would also mean that libraries and researchers could use & archive
the w3id "database" without having to parse .htaccess or do thousands of HTTP
requests.  (We might want to clarify the license on that database!)
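
To illustrate that last point, here is a small Python sketch of what consuming
such a "database" could look like, assuming a hypothetical two-column CSV
(path, target). The column names and function names are invented for the
example; nothing here is an existing w3id format.

```python
import csv
import io
import json


def load_redirects(csv_text):
    """Read the CSV "database" into a dict mapping path -> target URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["path"]: row["target"] for row in reader}


def to_json(redirects):
    """Serialise the mapping as a JSON derivative, e.g. for archiving."""
    return json.dumps(redirects, indent=2, sort_keys=True)


SAMPLE = """path,target
vocab/v1,http://example.org/vocab/1.0
people/alice,http://example.com/~alice
"""

if __name__ == "__main__":
    redirects = load_redirects(SAMPLE)
    # Resolve an identifier locally, without any HTTP request.
    print(redirects["vocab/v1"])
    print(to_json(redirects))
```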



-- 
Stian Soiland-Reyes, eScience Lab
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/    http://orcid.org/0000-0001-9842-9718
Received on Monday, 23 November 2015 11:46:05 UTC