Re: automatic data interlinking

Dear François,

This is a great initiative in a crucial area.
I am wondering if there is anything we (rkbexplorer.com and sameas.org) can
do to help.
Clearly we have a lot of datasets that our tools have been grinding over and
aligning for many years, and we would be happy to offer anything you would
find useful.
However, there may also be other things.
I looked at loading the outputs of last year's exercise into a sameas store,
but found that the URIs (at least the few I tried) were not Linked Data (LD)
URIs, so I backed off.
So perhaps the first suggestion would be that whatever datasets you choose,
they should be over LD URIs.
Another suggestion would be that the outputs of the exercise should be
published in such a way that they will be useful to the LD world. Not least,
this would be more motivating to the participants.
We would be happy to bring up a sameas store for this, or indeed a separate
sameas store for each of the participants, where they can post their data,
and they and others can then access it.
And of course results with high precision can safely be put in sameas.org,
which would be very exciting for me.
(Of course, the level of help we can give will be constrained by our limited
resources.)
In choosing datasets, perhaps an obvious place to start is something like
the geographical data in the data.gov.uk world?

Best
Hugh

PS
In fact, there are some useful ones that would help me personally, although
you may feel they are too close to last year's topics: for example, we have
LD datasets of the NSF (National Science Foundation) project data and of the
OAI (Open Archives Initiative) bibliographic data, and aligning these would
be challenging but very interesting.


On 21/05/2010 14:51, "François Scharffe" <francois.scharffe@inria.fr> wrote:

> Hello,
> 
> As part of the Ontology Alignment Evaluation Initiative [1], we will have,
> for the 2nd year, a data interlinking evaluation.
> 
> We propose in this track to evaluate systems able to *automatically*
> find interlinks between Web datasets, in contrast to semi-automatic
> tools. This year we will focus on large datasets. Two datasets are given
> as input, and a set of links between equivalent resources must be
> returned as output.
> 
> We're looking for systems to participate in the evaluation. We're also
> looking for datasets that may be used for the evaluation, that is, datasets
> that have a nicely curated linkset to serve as a reference.
> 
> By the way, I also invite you to look at the results of last year's
> evaluation [2].
> 
> Cheers
> 
> François
> 
> 
> 
> [1] http://oaei.ontologymatching.org
> [2] http://oaei.ontologymatching.org/2009/instances/
> 

Received on Monday, 31 May 2010 11:38:27 UTC