Re: Merging Databases

Hello,

Trying to add some explanation wrt. the SKOS vocabulary, hoping not to conflict with Pat's clarifications ;-)

For skos:exactMatch, the SKOS reference says [1]:
> The property skos:exactMatch is used to link two concepts, indicating a high degree of confidence that the concepts can be used interchangeably across a wide range of information retrieval applications

So there can be substitution. But, contrary to what happens with owl:sameAs, this substitution is not automatic for *all* RDF triples the concept would be involved in. In fact, it is left to implementers or ontology providers to define in which "context" two exactly equivalent concepts may be substituted.
As a (fictitious!) example, one concept may be substituted by another if it is the object of a dc:subject statement, but not if it is the subject of a dc:creator statement.
The idea was really to be able to state some form of semantic equivalence that would be less committing than the RDF/OWL one.
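
To make the "context" idea a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: triples are simple tuples, the concept names and the SUBSTITUTABLE policy table are invented, and nothing in SKOS prescribes this mechanism.

```python
# Triples are plain (subject, predicate, object) tuples; all names are illustrative.
EXACT_MATCH = {("ex:cats", "ex2:felines")}  # hypothetical skos:exactMatch pair

# Application policy (an assumption, not part of SKOS): substitution is
# allowed only in the object position of dc:subject statements.
SUBSTITUTABLE = {("dc:subject", "object")}

def equivalents(concept):
    """Return the concept plus everything directly exactMatch-ed to it."""
    out = {concept}
    for a, b in EXACT_MATCH:
        if a == concept:
            out.add(b)
        elif b == concept:
            out.add(a)
    return out

def expand(triple):
    """Return the triple plus the variants allowed by the policy."""
    s, p, o = triple
    results = {triple}
    if (p, "object") in SUBSTITUTABLE:
        results |= {(s, p, o2) for o2 in equivalents(o)}
    if (p, "subject") in SUBSTITUTABLE:
        results |= {(s2, p, o) for s2 in equivalents(s)}
    return results
```

With this policy, expand(("doc:1", "dc:subject", "ex:cats")) also yields the ex2:felines variant, while a dc:creator triple with ex:cats as subject comes back unchanged. Under owl:sameAs both would have to be expanded.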

Now, skos:exactMatch is transitive, which means that if a concept X in one vocabulary has been mapped to Y in a second vocabulary, and Y is connected to Z in a third vocabulary, then X and Z can be substituted.
This can (and will) be useful, but it may also be harmful if the two mappings were created with different application concerns in mind, so that semantic differences that are negligible over a one-step mapping add up to a bigger gap over a longer path.

skos:closeMatch is meant to deal with a lesser level of commitment. As written in the SKOS Primer [2],
> skos:closeMatch is not defined as transitive, which prevents such similarity assessments to propagate beyond these two schemes. 

Imagine that you are creating mappings between two vocabularies for a specific application. With skos:closeMatch you can create these mappings without worrying that the links may cause dubious substitutions in a different application.
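
The difference in propagation can be sketched as follows. Again this is plain Python under my own assumptions (mappings as pairs of made-up concept names, both properties treated as symmetric, as the SKOS Reference defines them); it is not how any particular SKOS tool works.

```python
from collections import defaultdict, deque

def neighbours(pairs):
    """Build a symmetric adjacency map from mapping pairs."""
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def exact_matches(concept, exact_pairs):
    """Everything reachable over exactMatch links (transitive closure)."""
    adj = neighbours(exact_pairs)
    seen, queue = {concept}, deque([concept])
    while queue:
        for nxt in adj[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {concept}

def close_matches(concept, close_pairs):
    """Only the directly asserted closeMatch links; no propagation."""
    return set(neighbours(close_pairs)[concept])

# Hypothetical mappings: X in vocabulary a, Y in b, Z in c.
pairs = [("a:X", "b:Y"), ("b:Y", "c:Z")]
```

If the pairs are read as skos:exactMatch, exact_matches("a:X", pairs) contains both b:Y and c:Z. If the same pairs are read as skos:closeMatch, close_matches("a:X", pairs) contains only b:Y, so the second mapping cannot leak into the first application.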


SKOS also has other mapping properties. In particular, skos:relatedMatch could be the anchor for the more specialized properties that Alan has in OBO.
skos:relatedMatch really does not have much semantics (even informal), but I feel it would still be better than rdfs:seeAlso. According to the RDFS spec [3]:
> The property rdfs:seeAlso specifies a resource that might provide additional information about the subject resource.

As far as I understand it (and keeping to the distinctions made e.g. in [4]) this means that rdfs:seeAlso can connect a non-document resource to an information resource, which I feel is a bit too broad for Alan's case...

Cheers,

Antoine

[1] http://www.w3.org/TR/skos-reference/#mapping
[2] http://www.w3.org/TR/skos-primer/#secmapping
[3] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/#s2.3.4
[4] http://www.w3.org/TR/cooluris/#distinguishing

> 
> On Jul 21, 2009, at 7:26 PM, Alan Ruttenberg wrote:
> 
>> On Tue, Jul 21, 2009 at 1:23 PM, Toby Inkster<tai@g5n.co.uk> wrote:
>>> On Tue, 2009-07-21 at 19:52 +0300, Bernhard Schandl wrote:
>>>
>>>>> I would say: Never assert sameAs. It's just too big a hammer.
>>>>> Instead use a wider palette of relationships to connect entities
>>>>> to other ones.
>>>>
>>>> which ones would you recommend?
>>>
>>> skos:exactMatch = asserts that the two resources represent the same
>>> concept
> 
> Say, refer to the same thing.
> 
>>> , but does not assert that all triples containing the first
>>> resource are necessarily true when the second resource is substituted
>>> in.
>>
>> I'm having trouble parsing this one. I don't know what concepts are,
>> but they are an odd sort of thing if they can be the same, but can't
>> be substituted.
> 
> This is exactly what is needed in many cases. Philosophical terminology 
> is that they have the same referent but not the same sense, and lack of 
> substitutability reflects the unfortunate but inevitable fact that the 
> Web as a whole is not referentially transparent (yet). More mundane 
> example, the same person might need to be referred to in one way in one 
> context and differently in another, just because the two social contexts 
> require different forms of address. (That example from Lynn Stein.)
> 
>> In any case, this isn't much better when the issue I point out is that
>> there is a specific relation between e.g. the intervention and the
>> drug - that relation is no where near equivalence in any form.
> 
> True, but in cases like this, it is simply a basic conceptual mistake to 
> be using any kind of loose-sameAs property. rdf:seeAlso would be more 
> like what is needed for linking a drug to an intervention. I agree with 
> you about having a selection of better-thought-out relations rather than 
> just using sameAs as a kind of all-purpose knee-jerk connecting link. 
> Maybe this "Linked Data" slogan has a rather dumbing-down effect, as it 
> suggests that 'link' is a simple uniform notion that works in all cases.
> 
>>
>>> skos:closeMatch = same as exact match, but slightly woolier.
>>
>> Seems harmless, assuming one doesn't mind whatever one is dealing with
>> typed a concept.
>> Ditto the broader and narrower relations, which although not to my
>> taste  (i don't how to tell when they hold) are certainly better than
>> using sameAs.
>>
>>> owl:equivalentProperty = if {X equivalentProperty Y} and {A X B} then
>>> {A Y B}. In other words, the properties can be used completely
>>> interchangeably. But perhaps there are other important differences
>>> between X and Y, such as their rdfs:label or rdfs:isDefinedBy.
>>
>> Still near equivalence.
>>
>>> owl:equivalentClass = if {X equivalentClass Y} then all Xs are Ys and
>>> vice versa. Same dealy with owl:equivalentProperty really.
>>
>> Ditto.
>>
>>> ovterms:similarTo = a general, all-purpose wimps' predicate. I use this
>>> extensively.
>>
>> Under the principal "first do no harm", this seems to work, although I
>> note that the intervention (something that happens) isn't similar to
>> the drug used in it (something that is consumed when the intervention
>> happens).
>>
>> seeAlso seems pretty harmless and noncommittal.
>>
>> But better is probably to look more closely at what the entities are
>> and then choose a relationship that better expresses how they relate.
>> In the case of the intervention, one plausible interpretation is that
>> the "intervention" names a class of processes, and that there is a
>> subclass of such processes in which the drug participates. (the other
>> subclass are those in which a placebo is the participant) This can be
>> modeled in OWL.
>>
>> (My real advice for clinical trial resource is to collaborate with the
>> OBI project and use terminology that is being developed for exactly
>> that purpose)
>>
>> In my line of work I start with the OBO Relation ontology,
>> http://www.obofoundry.org/ro/ which provides a basic set of well
>> documented relations, such as the has_participant relationship.
>>
>> OWL also provides some relations of beyond equivalences - subclass
>> relations are an option, when appropriate, as well as making
>> statements that classes overlap - by expressing that the intersection
>> of the two is not empty.
>>
>> That ontology is undergoing some reform, as it should in time. Some of
>> the new candidate relations are documented in links from that page. In
>> addition it is proposed that that there be class level and instance
>> level versions of the relations - the class level relations might
>> better a modeling style that would rather avoid using OWL
>> restrictions, and fits well with OWL 2 which allows a name(URI) to be
>> used as both a class and an instance.
>>
>> Finally, for those cases where there are more than one URI and they
>> *really* mean the same thing - why not try to get the parties who
>> minted them to collaborate and retire one of the URIs. If they really
>> mean the same thing there should be no harm in either party using the
>> other's URI.
> 
> Its not that simple, unfortunately. I'm going to make this issue the 
> center of my invited talk at ISWC later this year :-)
> 
> Pat
> 
>>
>> -Alan
>>
>>>
>>> -- 
>>> Toby A Inkster
>>> <mailto:mail@tobyinkster.co.uk>
>>> <http://tobyinkster.co.uk>
>>>
>>>
>>
>>
>>
> 
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> 

Received on Thursday, 23 July 2009 09:25:57 UTC