Re: vocabularies and data alignment

From: Aldo Bucchi <aldo.bucchi@gmail.com>
Date: Fri, 12 Jun 2009 10:35:46 -0400
Message-ID: <7a4ebe1d0906120735v21990526x5b4ea864e3f5a0ea@mail.gmail.com>
To: François Scharffe <francois.scharffe@inria.fr>
Cc: Kingsley Idehen <kidehen@openlinksw.com>, Hugh Glaser <hg@ecs.soton.ac.uk>, "public-lod@w3.org" <public-lod@w3.org>, Jerome Euzenat <Jerome.Euzenat@inrialpes.fr>
François,

On Fri, Jun 12, 2009 at 9:07 AM, François Scharffe <francois.scharffe@inria.fr> wrote:
> Kingsley Idehen wrote:
>>
>> François Scharffe wrote:
>>>
>>> Hugh Glaser wrote:
>>>>
>>>> Hi,
>>>> To put it in simple terms for me :-)
>>>> Are you after the algorithms we use to identify when two instances are
>>>> the same?
>>>> Best
>>>> Hugh
>>>
>>> Yes !
>>>
>>> François
>>
>> So if the answer is "Yes", do you mean things in the ABox and TBox?
>> We must be clear here, as being too generic leads to confusion.
>
> Link generators work at the instance level (ABox): they generate
> links between instances. They need some input, a specification of what
> should be interlinked. We think this specification can be lifted to an
> alignment between vocabularies (TBoxes). Well, we are not 100% sure this will
> work, which is why we would like to collect such tools and their linkage
> specifications.
> Take an example, interlinking persons: one dataset is described with
> FOAF, the other with VCard.
> ?x foaf:name ?name .
> ?y vc:n [
>        vc:family-name ?fn ;
>        vc:given-name ?gn
>        ] .
> the linkage specification might be something like:
> if compare(?name, concat(?gn," ",?fn)) > threshold
> then output("?x owl:sameAs ?y")
>
> In fact, this specification says
> foaf:name <-> concat(vc:given-name," ",vc:family-name)
> which is an alignment at the TBox level that can be lifted from the linkage
> specification.
>
> I hope I was clear enough this time ;)

Yes, you were. Hugh got it right, but I was a bit lost ;)

My quick take on this issue is usually:
Strategy:
Use subPropertyOf, etc. if possible; otherwise resort to SWRL or
SPARQL; otherwise use custom code (for example, if the inverse
functional properties, IFPs, are embedded in URIs).
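
As a rough illustration of the custom-code fallback, here is a minimal
Python sketch for the case where an identifying key sits inside the URI
itself. The URI layouts, the key pattern, the helper names and the
example.org / example.net data are all hypothetical, made up for
illustration; real datasets will need their own patterns.

```python
import re
from collections import defaultdict

# Hypothetical layout: URIs in both datasets end with the same identifying
# key (here an ISBN-10-like string). This pattern is an assumption.
KEY_PATTERN = re.compile(r"/([0-9]{9}[0-9Xx])/?$")


def extract_key(uri):
    """Return the key embedded at the end of the URI, or None if absent."""
    match = KEY_PATTERN.search(uri)
    return match.group(1).upper() if match else None


def same_as_links(uris_a, uris_b):
    """Pair URIs from the two datasets that share an embedded key."""
    by_key = defaultdict(list)
    for uri in uris_b:
        key = extract_key(uri)
        if key is not None:
            by_key[key].append(uri)
    links = []
    for uri in uris_a:
        key = extract_key(uri)
        # URIs without a key look up None, which is never stored.
        for other in by_key.get(key, []):
            links.append((uri, "owl:sameAs", other))
    return links


# Made-up sample data.
dataset_a = ["http://example.org/book/0262033844", "http://example.org/book/123"]
dataset_b = ["http://example.net/isbn/0262033844/", "http://example.net/isbn/020161622X"]

for s, p, o in same_as_links(dataset_a, dataset_b):
    print(f"<{s}> {p} <{o}> .")
```

Indexing one dataset by key first avoids comparing every pair of URIs.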

Implementation:
Stick with inference. If that is not possible, materialize an
intermediate graph.
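
One way to read "materialize" is to run the linkage rule once and store
the resulting owl:sameAs triples. Below is a minimal Python sketch based
on the FOAF/VCard example quoted above. It is only a sketch under stated
assumptions: difflib's SequenceMatcher stands in for the unspecified
compare() function, the 0.9 threshold is arbitrary, and the sample
records are made up.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.9  # assumed value; the thread does not give one


def compare(a, b):
    """Similarity in [0, 1]; a stand-in for the unspecified compare()."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def materialize_links(foaf_people, vcard_people, threshold=THRESHOLD):
    """Return owl:sameAs triples for records whose names are similar enough.

    foaf_people:  iterable of (uri, name), as in ?x foaf:name ?name
    vcard_people: iterable of (uri, given_name, family_name), as in ?y vc:n [...]
    """
    links = []
    for x, name in foaf_people:
        for y, given, family in vcard_people:
            # Mirrors: compare(?name, concat(?gn, " ", ?fn)) > threshold
            if compare(name, given + " " + family) > threshold:
                links.append((x, "owl:sameAs", y))
    return links


# Made-up sample data.
foaf_people = [("http://example.org/a/1", "Ada Lovelace")]
vcard_people = [
    ("http://example.net/b/7", "Ada", "Lovelace"),
    ("http://example.net/b/8", "Alan", "Turing"),
]

for s, p, o in materialize_links(foaf_people, vcard_people):
    print(f"<{s}> {p} <{o}> .")
```

The nested loop compares every pair, which is fine for a sketch but
would need blocking or indexing on datasets of any real size.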

Of course the above is not very useful as what you're looking for is
real world examples to mine for patterns, generalize, and try to push
the knowledge up to the TBox.

Good luck, it sounds interesting ;)
Thanks,
A

>
>
> Cheers,
> François
>>
>> sameAs is not the best way to align things in the TBox.
>>
>> Kingsley
>>>
>>>>
>>>> On 11/06/2009 12:57, "François Scharffe" <francois.scharffe@inria.fr>
>>>> wrote:
>>>>
>>>> Dear LODers,
>>>>
>>>> There have already been a couple of discussions on this list about the
>>>> need for a vocabulary to represent correspondences between terms of
>>>> different vocabularies. We also recently saw various tools (e.g. Silk,
>>>> ODDlinker) that automatically interlink datasets given a specification
>>>> of what should be linked.
>>>>
>>>> However, there is currently no common way to publish and share this
>>>> information (i.e., not the links but the way to generate them; see [1]
>>>> for details).
>>>>
>>>> We are setting up an experiment [1] to see if it is possible to provide
>>>> useful services from this data. But for that purpose we need your help.
>>>>
>>>> So this is a call for contributions: we are collecting any
>>>> specifications of link generators for the LOD graph.
>>>>
>>>> Of course, do not hesitate to comment on the idea or to tell us if you
>>>> want to be involved.
>>>>
>>>> We promise a report on this by the end of summer (northern hemisphere
>>>> :).
>>>>
>>>> Cheers,
>>>> François
>>>>
>>>> [1] http://melinda.inrialpes.fr

-- 
Aldo Bucchi
U N I V R Z
Office: +56 2 795 4532
Mobile:+56 9 7623 8653
skype:aldo.bucchi
http://www.univrz.com/
http://aldobucchi.com/

Received on Friday, 12 June 2009 14:36:27 UTC