- From: <Csarasua@uni-koblenz.de>
- Date: Tue, 27 Aug 2013 19:40:22 +0200
- To: "Hugh Glaser" <hg@ecs.soton.ac.uk>
- Cc: "Adrian Stevenson" <adrian.stevenson@manchester.ac.uk>, "Cristina Sarasua" <csarasua@uni-koblenz.de>, "Linked Data community" <public-lod@w3.org>, "Jane Stevenson" <jane.stevenson@manchester.ac.uk>
> Thanks to all who have mentioned other datasets - more fodder for
> sameAs.org! :-)
>
> Cristina,
> A separate message from me about datasets.
> As the maintainer of sameAs.org, I have quite a lot of such datasets in
> quite convenient forms (only the links) :-)
> So, for example, if you wanted Adrian's data, then I can give it to you.
> (I have queried the SPARQL endpoint to put stuff in sameAs.org. Both
> owl:sameAs and skos:exactMatch.)
> I have lots of bibliographic ones, especially national libraries, who have
> often sent me the data.
> (British, German, US, Japanese, Norwegian, French, Spanish, Hungarian … as
> best I recall.)
> I also have the VIAF data.
> This is all aggregated in http://sameas.org/store/kelle/ and other stuff
> is kept in some sameAs stores - see http://sameas.org/store/
> Freebase is an interesting one (that is Google Graph, and they send me
> their data.)
> LATC has been mentioned, and I have a store with that data.
>
> Also, Rob Warren is spot on!
> owl:differentFrom is your friend.
> It can be used to tell you the resources that might have been considered
> the same, but some more work has been done to find out that the system was
> wrong.
> In some sense it gives you upper and lower bounds on precision/recall.
>
> It so happens (!) that I also run http://differentfrom.org where I gather
> such data.
> Again, Freebase have given me their regression test for asserting
> sameness, and I have a store with that in.
> And LATC published their similar data, and I have put it in a store.
>
> I hope that helps - ask me for data if you need it, although I hope you
> can be as specific as possible.
> (If I don't have it, I may well decide to harvest it to put in
> sameas.org.)
>
> Best
> Hugh

Thanks a lot for all the references that I received.

Best,
Cristina

> On 26 Aug 2013, at 12:04, Adrian Stevenson
> <adrian.stevenson@manchester.ac.uk> wrote:
>
>> Hi All
>>
>> As part of the LOCAH and Linking Lives projects, the latter in
>> particular, we've been doing a lot of this auto and manual linking
>> work, mainly to VIAF and DBpedia, with some links to things like LCSH
>> and Geonames. We've been doing a lot of work just recently in fact, and
>> we've published a blog post that's picked up quite a bit of interest on
>> this - http://archiveshub.ac.uk/blog/2013/08/hub-viaf-namematching/. We
>> haven't published our latest run of data yet, but we hope to finish this
>> soon. It'll probably still be about a month or so as a few of us are on
>> holiday soon.
>>
>> We do have quite a few links done semi-automatically in our existing
>> data set accessible via http://data.archiveshub.ac.uk but as I say we
>> are updating this, I'd suggest not taking the URIs and data available
>> there as the final word.
>>
>> A good example is
>> http://data.archiveshub.ac.uk/page/person/nra/webbmarthabeatrice1858-1943socialreformer
>>
>> Project URIs:
>> http://archiveshub.ac.uk/locah/
>> http://archiveshub.ac.uk/linkinglives/
>>
>> Adrian
>> _____________________________
>> Adrian Stevenson
>> Senior Technical Innovations Coordinator
>> Mimas, The University of Manchester
>> Devonshire House, Oxford Road
>> Manchester M13 9QH
>>
>> Email: adrian.stevenson@manchester.ac.uk
>> Tel: +44 (0) 161 275 6065
>> http://www.mimas.ac.uk
>> http://www.twitter.com/adrianstevenson
>> http://uk.linkedin.com/in/adrianstevenson/
>>
>> On 22 Aug 2013, at 16:06, Cristina Sarasua wrote:
>>
>>> Hi,
>>>
>>> I am looking for pairs of linked data sets that can be used as gold
>>> standard for evaluations. I would need pairs of data sets which have
>>> been manually linked, or data sets which have been (semi-)automatically
>>> linked with interlinking tools, and afterwards reviewed (to include the
>>> links which are not identified by tools). I have looked into the
>>> DataHub catalogue and queried VoID descriptions, but unfortunately the
>>> information about how the interlinking process was carried out is often
>>> missing.
>>>
>>> Apart from the data sets which have been used in the OAEI-instance
>>> matching track, could anyone recommend (based on past experience) good
>>> data sets for evaluating data interlinking processes?
>>>
>>> Thanks in advance.
>>>
>>> Kind regards,
>>>
>>> Cristina
>>> --
>>> Cristina Sarasua
>>>
>>> Institute for Web Science and Technologies (WeST)
>>>
>>> Universität Koblenz-Landau
>>> Universitätsstraße 1
>>> 56070 Koblenz
>>> Germany
>>>
>>> e: csarasua@uni-koblenz.de
>>> p: +49 261 287 2772
>>> f: +49 261 287 100 2772
>>> w: http://west.uni-koblenz.de
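
One possible reading of the remark in the thread that owl:differentFrom gives "upper and lower bounds on precision/recall" can be sketched in Python. This is an illustration only, not a method described by anyone in the thread: it assumes the gold standard is incomplete, treats owl:sameAs pairs as confirmed matches and owl:differentFrom pairs as confirmed non-matches, and bounds precision by counting every unverified link first as wrong and then as right. The function names and the example.org URIs are hypothetical.

```python
def normalise(pair):
    """Order a pair of URIs so that (a, b) and (b, a) compare equal.

    Both owl:sameAs and owl:differentFrom are symmetric properties.
    """
    a, b = pair
    return (a, b) if a <= b else (b, a)


def precision_bounds(predicted, same_as, different_from):
    """Return (lower, upper) bounds on the precision of predicted links.

    lower: only links confirmed by a sameAs pair count as correct.
    upper: every link not refuted by a differentFrom pair counts as correct.
    """
    links = {normalise(p) for p in predicted}
    positives = {normalise(p) for p in same_as}
    negatives = {normalise(p) for p in different_from}
    if not links:
        raise ValueError("no predicted links to evaluate")
    confirmed = len(links & positives)
    refuted = len(links & negatives)
    total = len(links)
    return confirmed / total, (total - refuted) / total


# Hypothetical data: four links produced by some interlinking tool.
predicted = [
    ("http://example.org/a1", "http://example.org/b1"),
    ("http://example.org/b2", "http://example.org/a2"),  # reversed order
    ("http://example.org/a3", "http://example.org/b3"),
    ("http://example.org/a4", "http://example.org/b4"),
]
same_as = [
    ("http://example.org/a1", "http://example.org/b1"),
    ("http://example.org/a2", "http://example.org/b2"),
]
different_from = [
    ("http://example.org/a3", "http://example.org/b3"),
]

lower, upper = precision_bounds(predicted, same_as, different_from)
print(lower, upper)
```

With the sample data, two links are confirmed, one is refuted and one is unverified, so the true precision lies somewhere between the two returned values. The gap between the bounds shrinks as more pairs are reviewed and added to either set.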
Received on Tuesday, 27 August 2013 17:40:45 UTC