Re: Merging Databases

On Tue, Jul 21, 2009 at 9:22 PM, Pat Hayes<phayes@ihmc.us> wrote:
> Here's another example. Cyc lists all the chemical elements, and cross-links
> to other such lists in other ontologies using owl:sameAs. But the Cyc
> ontology says that an element is the set (class) of all pieces of the pure
> element, so that for example sodium in Cyc has a member which is the lump of
> pure metallic sodium I keep safely under glycerin in a glass bottle on my
> shelf. This is a clever ontological device which makes a bunch of inferences
> very slick in Cyc, but I bet it's not the same *idea* of sodium that most
> ontologies would agree with. So that sameAs ought to be (and it is
> understood as meaning) 'same chemical element', but it does not allow mutual
> substitutivity, even if you were to translate those other ontologies into
> CycL, which nobody is ever likely to do.

My gut reaction is that URIs ought to be names that refer, and that
sense ought to be conveyed more explicitly as statements. That seems
to be the basis of the model theory that underlies the semweb
languages (yes, I realize that there's currently room for 2+ different
referencings using the same name). I realize that in natural language
a name can carry both sense and reference (or let's just say "more than
reference" since there seem to be a number of theories of exactly what
goes on with words). But it seems that it's been at least a hundred
years that relatively modern philosophers have been hacking away at
trying to understand exactly what the phenomena are, and how to
understand them. Should we really try to adopt exactly the same model
as language, given that we don't really understand it?

In your sodium example, I don't really know what to do with the "idea
of sodium" being the same or different, but I *can* say that a
molecule of sodium is not the same sort of thing as a lump of sodium
metal. They have different physical properties and some things that
make sense to say about one don't make sense to say about the other
(like the melting point of xxx is 370.87 K).
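
To make the substitution worry concrete, here is a minimal sketch of
what a naive sameAs reasoner does. It is plain Python over a toy set
of triples, not any real reasoner, and every URI and property name in
it (cyc:Sodium, ex:SodiumAtom, ex:meltingPointK, ex:atomicNumber) is
made up for illustration:

```python
# Hypothetical URIs, chosen only for illustration.
CYC_SODIUM = "cyc:Sodium"        # stands for the lump-of-metal reading
OTHER_SODIUM = "ex:SodiumAtom"   # stands for the single-particle reading

triples = {
    (CYC_SODIUM, "ex:meltingPointK", "370.87"),
    (OTHER_SODIUM, "ex:atomicNumber", "11"),
    (CYC_SODIUM, "owl:sameAs", OTHER_SODIUM),
}

def same_as_closure(triples):
    """Naive substitution: if a sameAs b, restate everything said of a about b, and back."""
    result = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in result if p == "owl:sameAs"]
        pairs += [(o, s) for (s, o) in pairs]  # sameAs is symmetric
        for a, b in pairs:
            for (s, p, o) in list(result):
                new = (b if s == a else s, p, b if o == a else o)
                if new not in result:
                    result.add(new)
                    changed = True
    return result

inferred = same_as_closure(triples)
# The melting point now attaches to the other name as well.
print((OTHER_SODIUM, "ex:meltingPointK", "370.87") in inferred)
```

Under these assumptions the melting point statement ends up attached
to the name meant for the other sort of thing, which is exactly the
kind of conclusion I'd rather not license.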

Now you might say: Well, they are the same *concept*. But what am I to
do with that? What can I conclude from that statement? Isn't it
throwing a whole lot under the rug to lump all these sorts of
relations into any single "same" bucket? And for what good? Google is
pretty good at bringing all these different sorts of things together
already - shouldn't the semweb stuff be doing something different?

-Alan
(who's been reading and puzzling too many days in a row about how
words relate to ... everything)

> On Jul 21, 2009, at 7:58 PM, Pat Hayes wrote:
>
>>
>> On Jul 21, 2009, at 7:26 PM, Alan Ruttenberg wrote:
>>
>>> On Tue, Jul 21, 2009 at 1:23 PM, Toby Inkster<tai@g5n.co.uk> wrote:
>>>>
>>>> On Tue, 2009-07-21 at 19:52 +0300, Bernhard Schandl wrote:
>>>>
>>>>>> I would say: Never assert sameAs. It's just too big a hammer.
>>>>>> Instead use a wider palette of relationships to connect entities
>>>>>> to other ones.
>>>>>
>>>>> which ones would you recommend?
>>>>
>>>> skos:exactMatch = asserts that the two resources represent the same
>>>> concept
>>
>> Say, refer to the same thing.
>>
>>>> , but does not assert that all triples containing the first
>>>> resource are necessarily true when the second resource is substituted
>>>> in.
>>>
>>> I'm having trouble parsing this one. I don't know what concepts are,
>>> but they are an odd sort of thing if they can be the same, but can't
>>> be substituted.
>>
>> This is exactly what is needed in many cases. Philosophical terminology is
>> that they have the same referent but not the same sense, and lack of
>> substitutability reflects the unfortunate but inevitable fact that the Web
>> as a whole is not referentially transparent (yet). More mundane example, the
>> same person might need to be referred to in one way in one context and
>> differently in another, just because the two social contexts require
>> different forms of address. (That example from Lynn Stein.)
>>
>>> In any case, this isn't much better when the issue I point out is that
>>> there is a specific relation between e.g. the intervention and the
>>> drug - that relation is nowhere near equivalence in any form.
>>
>> True, but in cases like this, it is simply a basic conceptual mistake to
>> be using any kind of loose-sameAs property. rdfs:seeAlso would be more like
>> what is needed for linking a drug to an intervention. I agree with you about
>> having a selection of better-thought-out relations rather than just using
>> sameAs as a kind of all-purpose knee-jerk connecting link. Maybe this
>> "Linked Data" slogan has a rather dumbing-down effect, as it suggests that
>> 'link' is a simple uniform notion that works in all cases.
>>
>>>
>>>> skos:closeMatch = same as exact match, but slightly woolier.
>>>
>>> Seems harmless, assuming one doesn't mind whatever one is dealing with
>>> being typed as a concept.
>>> Ditto the broader and narrower relations, which although not to my
>>> taste (I don't know how to tell when they hold) are certainly better than
>>> using sameAs.
>>>
>>>> owl:equivalentProperty = if {X equivalentProperty Y} and {A X B} then
>>>> {A Y B}. In other words, the properties can be used completely
>>>> interchangeably. But perhaps there are other important differences
>>>> between X and Y, such as their rdfs:label or rdfs:isDefinedBy.
>>>
>>> Still near equivalence.
>>>
>>>> owl:equivalentClass = if {X equivalentClass Y} then all Xs are Ys and
>>>> vice versa. Same dealy with owl:equivalentProperty really.
>>>
>>> Ditto.
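
The equivalentProperty rule quoted above can be sketched in a few
lines. This is a toy forward-chaining loop in plain Python, not an OWL
reasoner, and the names ex:name, ex2:label, ex:alice and ex:vocab are
hypothetical:

```python
def apply_equivalent_property(triples):
    """Apply {X equivalentProperty Y} and {A X B} => {A Y B}, in both directions."""
    result = set(triples)
    equiv = {(x, y) for (x, p, y) in result if p == "owl:equivalentProperty"}
    equiv |= {(y, x) for (x, y) in equiv}  # the relation is symmetric
    changed = True
    while changed:
        changed = False
        for (a, p, b) in list(result):
            for (x, y) in equiv:
                if p == x and (a, y, b) not in result:
                    result.add((a, y, b))
                    changed = True
    return result

data = {
    ("ex:name", "owl:equivalentProperty", "ex2:label"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:name", "rdfs:isDefinedBy", "ex:vocab"),
}
out = apply_equivalent_property(data)
print(("ex:alice", "ex2:label", "Alice") in out)
print(("ex2:label", "rdfs:isDefinedBy", "ex:vocab") in out)
```

The rule only rewrites the predicate position, so in this sketch the
statement about where ex:name is defined is not carried over to
ex2:label. That is the difference Toby points at, and also why it is
still near equivalence as far as use of the properties goes.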
>>>
>>>> ovterms:similarTo = a general, all-purpose wimps' predicate. I use this
>>>> extensively.
>>>
>>> Under the principle "first do no harm", this seems to work, although I
>>> note that the intervention (something that happens) isn't similar to
>>> the drug used in it (something that is consumed when the intervention
>>> happens).
>>>
>>> seeAlso seems pretty harmless and noncommittal.
>>>
>>> But better is probably to look more closely at what the entities are
>>> and then choose a relationship that better expresses how they relate.
>>> In the case of the intervention, one plausible interpretation is that
>>> the "intervention" names a class of processes, and that there is a
>>> subclass of such processes in which the drug participates. (the other
>>> subclass are those in which a placebo is the participant) This can be
>>> modeled in OWL.
>>>
>>> (My real advice for clinical trial resource is to collaborate with the
>>> OBI project and use terminology that is being developed for exactly
>>> that purpose)
>>>
>>> In my line of work I start with the OBO Relation ontology,
>>> http://www.obofoundry.org/ro/ which provides a basic set of well
>>> documented relations, such as the has_participant relationship.
>>>
>>> OWL also provides some relations beyond equivalences - subclass
>>> relations are an option, when appropriate, as well as making
>>> statements that classes overlap - by expressing that the intersection
>>> of the two is not empty.
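
A rough way to picture the intervention reading and the overlap
statement is to treat each class as the set of its instances. The
sketch below is plain Python sets, not OWL, and the process names,
participants and the helper with_participant are all hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Process:
    name: str
    participants: frozenset  # things related to the process via has_participant

# Hypothetical individual processes in one trial.
p1 = Process("dose-1", frozenset({"patient-A", "drug-X"}))
p2 = Process("dose-2", frozenset({"patient-B", "placebo"}))
p3 = Process("dose-3", frozenset({"patient-C", "drug-X"}))

intervention = {p1, p2, p3}  # instances of the "intervention" class

def with_participant(processes, thing):
    """The subclass of processes that have `thing` as a participant."""
    return {p for p in processes if thing in p.participants}

drug_arm = with_participant(intervention, "drug-X")
placebo_arm = with_participant(intervention, "placebo")
patient_c = with_participant(intervention, "patient-C")

is_subclass = drug_arm <= intervention   # every drug-arm process is an intervention
overlaps = bool(drug_arm & patient_c)    # the intersection is not empty
print(is_subclass, overlaps)
```

Nothing here says the drug is the same as, or similar to, the
intervention. The drug only shows up as a participant in some of the
processes, which is the relationship I would want stated.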
>>>
>>> The Relation Ontology is undergoing some reform, as it should in time. Some of
>>> the new candidate relations are documented in links from that page. In
>>> addition it is proposed that there be class level and instance
>>> level versions of the relations - the class level relations might
>>> better suit a modeling style that would rather avoid using OWL
>>> restrictions, and fits well with OWL 2 which allows a name (URI) to be
>>> used as both a class and an instance.
>>>
>>> Finally, for those cases where there is more than one URI and they
>>> *really* mean the same thing - why not try to get the parties who
>>> minted them to collaborate and retire one of the URIs? If they really
>>> mean the same thing there should be no harm in either party using the
>>> other's URI.
>>
>> It's not that simple, unfortunately. I'm going to make this issue the
>> center of my invited talk at ISWC later this year :-)
>>
>> Pat
>>
>>>
>>> -Alan
>>>
>>>>
>>>> --
>>>> Toby A Inkster
>>>> <mailto:mail@tobyinkster.co.uk>
>>>> <http://tobyinkster.co.uk>
>>>>
>>
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Wednesday, 22 July 2009 01:44:14 UTC