- From: Christopher Gutteridge <totl@soton.ac.uk>
- Date: Mon, 16 Apr 2018 10:33:36 +0100
- To: "<public-lod@w3.org>" <public-lod@w3.org>
Hi,

I can't remember if I ever mentioned our project <http://opd.data.ac.uk/> on this list?

It is a method for auto-discovering open data from (and about) an organisation, given its website. It's specifically aimed at "predictable" datasets that you would expect/hope an organisation to provide, rather than unique datasets, which you would find by browsing an organisation's data catalogue rather than by trying to discover them for a specific purpose.

Our pilot use-case was discovering lists of research equipment from UK universities. Every week we check *every* www.???.ac.uk website looking for open data, and if we find some we add it to the daily crawl. Signposting complete datasets from the homepage is more in the spirit of the distributed web, as it means you don't need a big crawler.

One thing we felt was very important was to make it easy for non-experts to create the basic RDF document describing their organisation. To this end we have given verbatim examples rather than just published ontologies <http://opd.data.ac.uk/docs/social>. I'm proud to say we've seen .ttl files produced by non-IT admin staff! To make things easier we provide a tool to check the autodiscovery, validity and meaning of the .ttl file <http://opd.data.ac.uk/checker>.

We provide two alternate mechanisms to discover the OPD (the ttl file describing the organisation and its major datasets) because my experience of website-politics is that the IT dept control the server (and can easily add a redirect) and the comms dept control the content (and can easily add to the homepage).

At the bottom of http://opd.data.ac.uk/ you can see a respectable list of UK universities already implementing this, which I think is an exciting place to build from.
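For anyone wanting to try the two discovery routes mentioned above from the client side, here is a minimal sketch in Python. It is not the OPD crawler's code. The link relation name `openorg` and the path `/.well-known/openorg` are assumptions made for illustration and are not stated in this message, so check the documentation on opd.data.ac.uk before relying on them. The function names are hypothetical too.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: the homepage route is a <link rel="openorg" href="..."> element.
LINK_REL = "openorg"
# Assumption: the server route is a redirect from this well-known path.
WELL_KNOWN_PATH = "/.well-known/openorg"


class _ProfileLinkFinder(HTMLParser):
    """Records the href of the first <link> whose rel contains LINK_REL."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if LINK_REL in rels and attr.get("href"):
            self.href = attr["href"]


def profile_link_from_homepage(html, homepage_url):
    """Route for the comms dept: a link placed in the homepage markup.

    Returns the absolute URL of the profile document, or None if no
    matching <link> element is present.
    """
    finder = _ProfileLinkFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    return urljoin(homepage_url, finder.href)


def well_known_url(homepage_url):
    """Route for the IT dept: the URL a client would request, expecting a redirect."""
    return urljoin(homepage_url, WELL_KNOWN_PATH)
```

A client would fetch the homepage, pass its HTML to `profile_link_from_homepage`, and request `well_known_url(...)` as the other route. The order in which to try them is a design choice and not something this message specifies.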
While all people *had* to do was add the triples for their equipment dataset, many followed our other examples and added sections describing their basic metadata <http://opd.data.ac.uk/dataset/core>, social media accounts <http://opd.data.ac.uk/dataset/social>, and key webpages <http://opd.data.ac.uk/dataset/linkingyou>.

This is the technology that underpins <http://equipment.data.ac.uk/> (but to my frustration, some sites have been added to this service by hand, which increases the long-term support effort to run the site). But if you happen to have a www.*.ac.uk website (we don't check subdomains) you should be able to add your data to the equipment list just by putting the correct information on your website.

Right now, this is all ticking over, and only 27 institutions have implemented it. We've not got any new project work based on it at the moment. However, it's all open source and infinitely extensible.

- Christopher Gutteridge, University of Southampton
Received on Monday, 16 April 2018 14:05:01 UTC