Re: vocabularies and data alignment

François Scharffe wrote:
> Kingsley Idehen wrote:
>> François Scharffe wrote:
>>> Hugh Glaser wrote:
>>>> Hi,
>>>> To put it in simple terms for me :-)
>>>> Are you after the algorithms we use to identify when two instances 
>>>> are the same?
>>>> Best
>>>> Hugh
>>>
>>> Yes !
>>>
>>> François
>>
>> So if the answer is "Yes", do you mean things in the ABox and the 
>> TBox? We must be clear here, as being too generic leads to confusion.
>
> Link generators are working at the instance level (ABox), they 
> generate links between instances. They need some input, a 
> specification of what should be interlinked. We think this 
> specification can be lifted to an alignment between vocabularies 
> (TBoxes). Well we are not 100% sure this will work, that's why we 
> would like to get such tools and their linkage specifications.
> I can take an example, interlinking persons: one dataset is described 
> with FOAF, the other with VCard.
> ?x foaf:name ?name.
> ?y vc:n [
>     vc:family-name ?fn;
>     vc:given-name ?gn
>     ].
> the linkage specification might be something like:
> if compare(?name, concat(?gn," ",?fn)) > threshold
> then output("?x owl:sameAs ?y")
Fine, that's an instance data (ABox) oriented equivalence algorithm.
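
To make the rule above concrete, here is a rough Python sketch of that 
kind of link generator. It is only an illustration: the string measure 
(difflib's SequenceMatcher), the 0.9 threshold, the dictionary layout 
and the ex: identifiers are all my own assumptions, not what Silk or 
ODDlinker actually do.

```python
from difflib import SequenceMatcher


def compare(a, b):
    """Similarity in [0, 1]; an assumed stand-in for the rule's compare()."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def generate_links(foaf_people, vcard_people, threshold=0.9):
    """Emit owl:sameAs statements for name pairs scoring above threshold.

    foaf_people:  {subject: foaf:name value}
    vcard_people: {subject: (vc:given-name, vc:family-name)}
    """
    links = []
    for x, name in foaf_people.items():
        for y, (gn, fn) in vcard_people.items():
            # concat(?gn, " ", ?fn) from the linkage specification
            if compare(name, f"{gn} {fn}") > threshold:
                links.append((x, "owl:sameAs", y))
    return links


# Hypothetical instance data, for illustration only.
foaf_people = {"ex:a": "Hugh Glaser"}
vcard_people = {"ex:b": ("Hugh", "Glaser"), "ex:c": ("Kingsley", "Idehen")}
links = generate_links(foaf_people, vcard_people)
```

The pairwise loop is quadratic in the number of instances, so a real 
tool would need some form of blocking or indexing before comparing.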
>
> In fact, this specification says
> foaf:name <-> concat(vc:given-name," ",vc:family-name)
> which is an alignment at the TBox level that can be lifted from the 
> linkage specification.
>
In the TBox you would state that the properties are either equivalent, 
or that one property is a sub-property of the other. Once that is done, 
reasoners can navigate the instance data via the TBox mappings. This is 
basically a major aspect of the UMBEL project. Even in its current form, 
if you take the alignment rules (expressed in OWL), you have a pretty 
rich basis for leveraging linkages across many shared ontologies. To 
extend, you simply find your slot and map to that, which brings us back 
to the "embrace and extend" principle.
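
As a rough sketch of what "navigating the instance data via the TBox 
mappings" can look like, the Python below applies the standard RDFS 
sub-property rule (if p is a sub-property of q and "s p o" holds, then 
"s q o" holds) to a small set of triples. The ex:fullName property and 
the ex:hugh instance are hypothetical, and a real reasoner does far 
more than this one rule.

```python
SUB_PROPERTY = "rdfs:subPropertyOf"


def infer_subproperty(triples):
    """Return triples plus everything entailed by the sub-property rule."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple appears (fixpoint)
        changed = False
        mappings = {(s, o) for s, p, o in inferred if p == SUB_PROPERTY}
        for s, p, o in list(inferred):
            for child, parent in mappings:
                if p == child and (s, parent, o) not in inferred:
                    inferred.add((s, parent, o))
                    changed = True
    return inferred


# One TBox mapping and one ABox statement (both made up for illustration).
triples = {
    ("ex:fullName", SUB_PROPERTY, "foaf:name"),
    ("ex:hugh", "ex:fullName", "Hugh Glaser"),
}
result = infer_subproperty(triples)
```

After this, a query for foaf:name also finds the value published under 
ex:fullName, without anyone touching the instance data.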

Anyway, your response provides clarity, including the fact that the end 
product of this effort isn't solely about a bag of ABox-oriented 
"owl:sameAs" links :-)

As I've stated before, coherent Linked Data magic happens when we 
exploit the power of TBox-level mapping across disparate ontologies. 
"Deceptively Simple" always trumps "Simply Simple" over the long haul; 
the latter simply doesn't scale :-)


Kingsley

> I hope I was clear enough this time ;)
>
>
> Cheers,
> François
>>
>> sameAs is not the best way to align things in the TBox.
>>
>> Kingsley
>>>
>>>>
>>>> On 11/06/2009 12:57, "François Scharffe" 
>>>> <francois.scharffe@inria.fr> wrote:
>>>>
>>>> Dear LODers,
>>>>
>>>> There have already been a couple of discussions on this list about
>>>> the need for a vocabulary to represent correspondences between terms
>>>> of different vocabularies. We also recently saw various tools (e.g.
>>>> Silk, ODDlinker) that automatically interlink datasets given a
>>>> specification of what should be linked.
>>>>
>>>> However, there is currently no common way to publish and share this
>>>> information (i.e., not the links but the way to generate them; see [1]
>>>> for details).
>>>>
>>>> We are setting up an experiment [1] to see if it is possible to
>>>> provide useful services from this data. But for that purpose we need
>>>> your help.
>>>>
>>>> So this is a call for contributions: we are collecting any
>>>> specifications of link generators for the LOD graph.
>>>>
>>>> Of course, do not hesitate to comment on the idea or to tell us if you
>>>> want to be involved.
>>>>
>>>> We promise a report on this by the end of summer (northern 
>>>> hemisphere :).
>>>>
>>>> Cheers,
>>>> François
>>>>
>>>> [1] http://melinda.inrialpes.fr
>>>>
>>>
>>
>>
>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Friday, 12 June 2009 14:01:20 UTC