- From: Shane McCarron <shane@aptest.com>
- Date: Mon, 29 Feb 2016 11:45:40 -0600
- To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Cc: Pemanent Identifier CG <public-perma-id@w3.org>
- Message-ID: <CAOk_reH3A+Utw6fUbpvw6dg9ka6RtOW_43Ykn-Na9F3BeTg0Sw@mail.gmail.com>
In general I don't *hate* the idea of permitting the use of CSV files to
drive the creation / updating of the .htaccess files. But I would prefer
this to be an option. I think my mental model was that this was a one-time
migration from purl.org - after that we would just use .htaccess files as
we have been. But I appreciate the thought that this might be overly
onerous for some significant number of potential users. Editing those
things is not for the meek!

What would people think about a rule set like:

1. If there is a .htaccess file in a directory, that file can have
   sections in it that are demarcated and will never be automatically
   modified.
2. If there is a rules.csv file in a directory, that file contains
   mapping rules that will update the (non-demarcated) parts of the
   .htaccess file in the directory (creating the file if necessary).

I haven't tried to implement this sort of GitHub post-push processing
magic on branches / pull requests before. Is that even possible?

On Mon, Feb 29, 2016 at 11:32 AM, Stian Soiland-Reyes
<soiland-reyes@cs.manchester.ac.uk> wrote:

> On 29 February 2016 at 16:35, Shane McCarron <shane@aptest.com> wrote:
> > The only downside to a huge top level .htaccess is the difficulty of
> > editing / maintaining it. Otherwise I am not concerned. Apache .htaccess
> > processing is efficient enough for these purposes imho.
>
> I guess you meant to reply to the list, so I've CCed it in.
>
> Another issue then is if we are to allow editing a CSV file to
> re-generate .htaccess (rather than a one-off move), then we have to be
> extra careful that there aren't any other modifications to the
> top-level .htaccess.
>
> I was picturing we could move to a model where you have a folder, like
> let's look at
> https://github.com/perma-id/w3id.org/blob/master/cwl/
> then instead of the current .htaccess there, you could have a CSV file like
>
> https://gist.github.com/stain/c2d668b11b66948b5991
>
> It should be quite easy to generate the corresponding .htaccess from
> such files - they can have some headers to warn you:
>
> ## DO NOT EDIT
> RewriteEngine On
> ## END DO NOT EDIT
>
> I think we can still do regular expressions, if they start with ^
> (which I think is fair enough)
>
> and the src paths are relative to the folder you are in, so on that
> example the one with "context" in src basically means
> https://w3id.org/cwl/context
>
> Special case then is for the folder itself, so either . or empty string.
>
> The Very Advanced Edition can allow full paths like /cwl/context -
> where the prefix from the current directory MUST match (or we can
> say this is the required format, even). This does however not work on
> the regular expression side - as RewriteRules in a folder are relative
> to their location (naturally). It's probably better to have a limited
> number of options, so it's easy to validate the CSV files before
> trying to generate the .htaccess.
>
> > On Mon, Feb 29, 2016 at 10:04 AM, Stian Soiland-Reyes
> > <soiland-reyes@cs.manchester.ac.uk> wrote:
> >>
> >> I started
> >> https://github.com/stain/w3id-csv
> >>
> >> it's quite a simple start.. but it uses a CSV file like
> >>
> >> https://github.com/stain/w3id-csv/blob/master/purl_example.csv
> >> which matches the schema David Wood mentioned.
> >>
> >> and then generates a bunch of .htaccess files.
> >>
> >> You can test it on a dummy install of Apache httpd with Docker - see the
> >> README.
> >>
> >> Obviously now this script is quite naive in that it makes a folder for
> >> every purl.org entry, which (in addition to making loads of files)
> >> would be a bit wrong (e.g.
> >> the purl /fred/soup.html would make the
> >> fred/soup.html/.htaccess which would mean an intermediate HTTP
> >> redirect from soup.html to soup.html/) -- and I've not gone through
> >> the different types yet to do subtree matching or the correct HTTP
> >> redirection status code.
> >>
> >> So one simple improvement would be to check if the path ends with a /
> >> in purl.org or not - and then group those entries within the parent
> >> path so there would be a bigger .htaccess. However I think we want to
> >> avoid a single large top-level .htaccess for registrations like
> >> http://purl.org/pav without a trailing / ?
> >>
> >> As for conflicts this should be modified to only replace its "own"
> >> files by having a magic "#header".
> >>
> >> We also talked about having a "native" CSV file approach for w3id.org
> >> - so this could be modified then to have a better file format that we
> >> can convert the purl.org dump into.
> >>
> >> On 29 February 2016 at 12:29, Stian Soiland-Reyes
> >> <soiland-reyes@cs.manchester.ac.uk> wrote:
> >> > Yeah, let's get this going.
> >> >
> >> > So looking at the purl database schema we don't really need the group
> >> > and user stuff to start with (although that could be added to the
> >> > README).
> >> >
> >> > The purls table itself should be sufficient to start. We can find the
> >> > different "type" values in the purl.org source code I think?
> >> >
> >> > On 29 February 2016 at 11:58, Norman Gray <norman@astro.gla.ac.uk>
> >> > wrote:
> >> >>
> >> >> Greetings, all.
> >> >>
> >> >> A little while ago (and this message is a reply to
> >> >> <https://lists.w3.org/Archives/Public/public-perma-id/2015Dec/0001.html>,
> >> >> to resuscitate the thread), there was some interest expressed in a
> >> >> purl.org successor.
> >> >> That thread ended on a positive note, with David Wood and some
> >> >> others having access to the schema, and OCLC apparently keen on
> >> >> passing forward the current repository.
> >> >>
> >> >> I was asked about purl.org by a colleague today, and this reminded
> >> >> me about last November/December's thread: is there any news about
> >> >> purl.org or the broader preservation plan, that can be passed on?
> >> >> Or is there any way that I or others could help with this?
> >> >>
> >> >> All the best,
> >> >>
> >> >> Norman
> >> >>
> >> >> --
> >> >> Norman Gray : https://nxg.me.uk
> >> >> SUPA School of Physics and Astronomy, University of Glasgow, UK
> >> >
> >> > --
> >> > Stian Soiland-Reyes, eScience Lab
> >> > School of Computer Science
> >> > The University of Manchester
> >> > http://soiland-reyes.com/stian/work/
> >> > http://orcid.org/0000-0001-9842-9718
> >>
> >> --
> >> Stian Soiland-Reyes, eScience Lab
> >> School of Computer Science
> >> The University of Manchester
> >> http://soiland-reyes.com/stian/work/
> >> http://orcid.org/0000-0001-9842-9718
> >
> > --
> > Shane McCarron
> > Managing Director, Applied Testing and Technology, Inc.
>
> --
> Stian Soiland-Reyes, eScience Lab
> School of Computer Science
> The University of Manchester
> http://soiland-reyes.com/stian/work/
> http://orcid.org/0000-0001-9842-9718

--
Shane McCarron
Managing Director, Applied Testing and Technology, Inc.
Received on Monday, 29 February 2016 17:46:11 UTC
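
The two-rule proposal in this message (hand-edited sections of a .htaccess are demarcated and left alone, everything else is regenerated from rules.csv) could be sketched roughly as below. This is only an illustration under stated assumptions, not the w3id-csv script: the marker strings `## BEGIN MANUAL` / `## END MANUAL`, the function names, the `src` and `target` column names, and the fixed 302 status are all hypothetical. The handling of `src` follows the quoted suggestion: a value starting with `^` is treated as a regular expression, `.` or an empty string means the folder itself, and anything else is a literal path relative to the folder.

```python
import csv
import io
import re

# Hypothetical markers for hand-maintained sections (not taken from the thread).
BEGIN_MANUAL = "## BEGIN MANUAL"
END_MANUAL = "## END MANUAL"


def rule_for(src, target, status=302):
    """Turn one CSV row into a RewriteRule line (status code is an assumption)."""
    if src.startswith("^"):
        pattern = src                          # already a regular expression
    elif src in ("", "."):
        pattern = "^$"                         # the folder itself
    else:
        pattern = "^" + re.escape(src) + "$"   # literal path relative to the folder
    return f"RewriteRule {pattern} {target} [R={status},L]"


def manual_sections(htaccess_text):
    """Return the demarcated blocks (markers included), in file order."""
    blocks, current = [], None
    for line in htaccess_text.splitlines():
        if current is None:
            if line.strip() == BEGIN_MANUAL:
                current = [line]
        else:
            current.append(line)
            if line.strip() == END_MANUAL:
                blocks.append("\n".join(current))
                current = None
    if current is not None:
        # Unterminated block: keep it rather than silently dropping manual edits.
        blocks.append("\n".join(current))
    return blocks


def regenerate(htaccess_text, rules_csv_text):
    """Keep manual sections verbatim; rebuild everything else from the CSV."""
    rows = csv.DictReader(io.StringIO(rules_csv_text))
    generated = ["RewriteEngine On"]
    generated += [rule_for(r["src"].strip(), r["target"].strip()) for r in rows]
    parts = manual_sections(htaccess_text) + ["\n".join(generated)]
    return "\n\n".join(parts) + "\n"
```

Passing an empty string as `htaccess_text` covers the "creating the file if necessary" case, since there are then no manual sections to carry over. Validating the CSV before generation, as suggested in the quoted message, would sit in front of `regenerate` and is not shown here.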