Re: Linked data sets for evaluating interlinking?

Hi all:

Two humanities datasets of potential interest in this regard:

A number of datasets (around 20 different ones, I think) related to the study of antiquity have aligned their geographic/toponymic fields with the Pleiades gazetteer (http://pleiades.stoa.org) and published RDF accordingly. Most of this work has been done under the auspices of the Pelagios Project, and the alignment processes used by many of the participants (most of them a combination of automated and manual) are documented in blog posts at http://pelagios-project.blogspot.com/. Pleiades itself is also a linked data resource, and has a growing number of outbound links to dbpedia, geonames, and OSM, though these still cover only a small percentage of its content. All of those outbound links are hand-curated. Contributors to Pleiades are, where possible, aligned to VIAF (manually), and bibliography in Pleiades is also beginning to be aligned to the Open Library and Worldcat (again, manually).

On a much smaller scale, I offer the "About Roman Emperors" dataset, which, rather than minting its own URIs for the Roman emperors, uses the dbpedia resource URI for each: http://www.paregorios.org/resources/roman-emperors/. The primary purpose of the dataset is to provide a comprehensive list of these for easy access and reuse by third parties, and to associate the dbpedia URIs with corresponding Roman imperial mint and minting authority data in nomisma.org and finds.org.uk, and with a static, late-90s-vintage scholarly encyclopedia of Roman emperors: http://www.roman-emperors.org/

Tom


Tom Elliott, Ph.D.
Associate Director for Digital Programs and Senior Research Scholar
Institute for the Study of the Ancient World (NYU)
http://isaw.nyu.edu/people/staff/tom-elliott



On Aug 26, 2013, at 6:04 AM, Adrian Stevenson wrote:

> Hi All
> 
> As part of the LOCAH and Linking Lives projects, the latter in particular, we've been doing a lot of this auto and manual linking work, mainly to VIAF and DBPedia, with some links to things like LCSH and Geonames. We've been doing a lot of work just recently in fact, and we've published a blog post on this that's picked up quite a bit of interest - http://archiveshub.ac.uk/blog/2013/08/hub-viaf-namematching/. We haven't published our latest run of data yet, but we hope to finish this soon. It'll probably still be about a month or so as a few of us are on holiday soon.
> 
> We do have quite a few links done semi-automatically in our existing data set, accessible via http://data.archiveshub.ac.uk, but as I say we are updating this, so I'd suggest not taking the URIs and data available there as the final word.
> 
> A good example is http://data.archiveshub.ac.uk/page/person/nra/webbmarthabeatrice1858-1943socialreformer
> 
> Project URIs:
> http://archiveshub.ac.uk/locah/
> http://archiveshub.ac.uk/linkinglives/
> 
> Adrian
> _____________________________
> Adrian Stevenson
> Senior Technical Innovations Coordinator
> Mimas, The University of Manchester
> Devonshire House, Oxford Road
> Manchester M13 9QH
> 
> Email: adrian.stevenson@manchester.ac.uk
> Tel: +44 (0) 161 275 6065
> http://www.mimas.ac.uk
> http://www.twitter.com/adrianstevenson
> http://uk.linkedin.com/in/adrianstevenson/
> 
> On 22 Aug 2013, at 16:06, Cristina Sarasua wrote:
> 
>> Hi, 
>> 
>> I am looking for pairs of linked data sets that can be used as gold standard for evaluations. I would need pairs of data sets which have been manually linked, or data sets which have been (semi-)automatically linked with interlinking tools, and afterwards reviewed (to include the links which are not identified by tools). I have looked into the DataHub catalogue and queried VoID descriptions, but unfortunately the information about how the interlinking process was carried out is often missing.
>> 
>> Apart from the data sets which have been used in the OAEI-instance matching track, could anyone recommend (based on past experience) good data sets for evaluating data interlinking processes?
>> 
>> Thanks in advance.
>> 
>> Kind regards, 
>> 
>> Cristina
>> -- 
>> Cristina Sarasua
>> 
>> Institute for Web Science and Technologies (WeST)
>> 
>> Universität Koblenz-Landau
>> Universitätsstraße 1
>> 56070 Koblenz
>> Germany
>> 
>> e: csarasua@uni-koblenz.de
>> 
>> p: +49 261 287 2772
>> f: +49 261 287 100 2772
>> w: http://west.uni-koblenz.de
> 
> 

Received on Monday, 26 August 2013 15:26:29 UTC