Re: Merging Databases

On Jul 21, 2009, at 8:43 PM, Alan Ruttenberg wrote:

> On Tue, Jul 21, 2009 at 9:22 PM, Pat Hayes<phayes@ihmc.us> wrote:
>> Here's another example. Cyc lists all the chemical elements, and
>> cross-links to other such lists in other ontologies using owl:sameAs.
>> But the Cyc ontology says that an element is the set (class) of all
>> pieces of the pure element, so that for example sodium in Cyc has a
>> member which is the lump of pure metallic sodium I keep safely under
>> glycerin in a glass bottle on my shelf. This is a clever ontological
>> device which makes a bunch of inferences very slick in Cyc, but I bet
>> it's not the same *idea* of sodium that most ontologies would agree
>> with. So that sameAs ought to be (and it is understood as meaning)
>> 'same chemical element', but it does not allow mutual substitutivity,
>> even if you were to translate those other ontologies into CycL, which
>> nobody is ever likely to do.
>
> My gut reaction is that URIs ought to be names that refer, and that
> sense ought to be conveyed more explicitly as statements.

That doesn't work. If there really are opaque contexts out there, then
making statements in a transparent language is never going to capture
the full sense.

> That seems
> to be the basis of the model theory that underlies the semweb
> languages

Indeed, the current model theory presumes, implicitly, a referentially  
transparent system. Obviously this would be great, but I think this  
isn't what we in fact have.

> (yes, I realize that there's currently room for 2+ different
> referencings using the same name). I realize that in natural language
> a name can carry both sense and reference (or let's just say "more than
> reference" since there seem to be a number of theories of exactly what
> goes on with words). But it seems that it's been at least a hundred
> years that relatively modern philosophers have been hacking away at
> trying to understand exactly what the phenomena are, and how to
> understand them. Should we really try to adopt exactly the same model
> as language, given that we don't really understand it?
>
> In your sodium example, I don't really know what to do with the "idea
> of sodium" being the same or different, but I *can* say that a
> molecule of sodium is not the same sort of thing as a lump of sodium
> metal. They have different physical properties and some things that
> make sense to say about one don't make sense to say about the other
> (like the melting point of xxx is 370.87 K).

Of course: they have different masses as well. But one thing they have  
in common is exactly that they are both pieces of sodium, in the exact  
sense of 'piece' required by mereology: a piece of sodium with no  
parts that are not also parts of sodium. The **concept** is perfectly  
clear and coherent, and extremely handy for inference-making. For  
example, an atom is a piece which has no smaller pieces.

So, what in your view should the name of the element, 'sodium', be
taken to denote? One possible answer is, it denotes the class of all
sodium molecules, or the class of all sodium atoms, or some such. This
seems natural to a chemist, but it means that my lump of reactive
metal isn't a piece of sodium: it's a piece of stuff all of whose atoms
are sodium atoms, which seems awkward and unnatural (for example,
people knew about sodium before the idea of atoms was universally
accepted, so the two seem to be conceptually distinguishable). The Cyc
technique is really only a small step from the class-of-all-atoms
idea, but it has the merit that all pieces of sodium are, indeed,
pieces of sodium. My point is only that this theory of what
constitutes a chemical element is, while coherent, also idiosyncratic;
so we seem to need a way to say "denotes same chemical element as"
without also saying "is logically identical to" (sameAs), because we
have to allow ontology A to have a somewhat different conception of,
say, sodium than that used by ontology B, even though they are both
ontologies about the same topic, and we want to be able to record this
useful fact. Or else we need to face up to the possibility that
because A is referentially opaque when seen from B, a bare statement of
equality does *not* automatically give us a licence to substitute one
name for another. That is an ugly but, I'm beginning to think,
inevitable truth.
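
To make the substitution worry concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions: the names
cyc:Sodium, chem:Sodium, ex:myLump, ex:Atom and the weaker link
ex:sameElementAs are all hypothetical, and the closure function is a
naive stand-in for what a reasoner does with sameAs, not any real
library's API.

```python
# Naive illustration of sameAs substitutivity. All names are hypothetical.
SAME_AS = "owl:sameAs"
SAME_ELEMENT = "ex:sameElementAs"  # hypothetical weaker link, no substitution


def substitution_closure(triples):
    """Return all triples obtainable by swapping sameAs-linked names."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = {(s, o) for (s, p, o) in facts if p == SAME_AS}
        pairs |= {(b, a) for (a, b) in pairs}  # sameAs is symmetric
        for a, b in pairs:
            for s, p, o in list(facts):
                # Replace every occurrence of name a in the triple by b.
                swapped = (b if s == a else s, p, b if o == a else o)
                if swapped not in facts:
                    facts.add(swapped)
                    changed = True
    return facts


# Ontology A (Cyc-style): sodium is the class of all pieces of sodium.
ontology_a = [("ex:myLump", "rdf:type", "cyc:Sodium")]
# Ontology B (assumed): sodium is the class of all sodium atoms.
ontology_b = [("chem:Sodium", "rdfs:subClassOf", "ex:Atom")]

with_same_as = substitution_closure(
    ontology_a + ontology_b + [("cyc:Sodium", SAME_AS, "chem:Sodium")]
)
with_weak_link = substitution_closure(
    ontology_a + ontology_b + [("cyc:Sodium", SAME_ELEMENT, "chem:Sodium")]
)

lump_in_b = ("ex:myLump", "rdf:type", "chem:Sodium")
print(lump_in_b in with_same_as)    # does the lump land in B's class of atoms?
print(lump_in_b in with_weak_link)  # does the weaker link license that step?
```

With sameAs the lump becomes a member of B's class, and B's subclass
axiom would then make it an atom. With the weaker link nothing is
substituted, so both ontologies keep their own conception of sodium
while the link between them is still recorded.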

>
> Now you might say: Well, they are the same *concept*.

I'd prefer to avoid the c-word as long as possible. But in fact, I'd  
say that they aren't the same concept (of sodium) and that is  
precisely the issue here.

> But what am I to
> do with that? What can I conclude from that statement? Isn't it
> throwing a whole lot under the rug to lump all these sorts of
> relations into any single "same" bucket?

I entirely agree. What I want is to find a way to keep different  
senses of same-as distinct. But the puzzling thing is that in cases  
like sodium, here, this is **exactly** what sameAs is supposed to  
mean: both ontologies claim to be talking about the actual element in  
the real world (presuming that the real world contains elements, of  
course, which a very strict nominalist might deny) and they mean the  
same one of those by their two names, so this seems like a tailor-made  
case for using sameAs; but it has the potential for misleading  
entailments, all the same. This line of thinking is what makes me  
suggest that we have to treat the SWeb as somewhat referentially  
opaque, at least for the time being.

> And for what good? Google is
> pretty good at bringing all these different sorts of things together
> already

? I don't think so. Google is pretty terrible at detecting that two  
different surface names denote the same thing.

> - shouldn't the semweb stuff be doing something different?
>
> -Alan
> (who's been reading and puzzling too many days in a row about how
> words relate to ... everything)

Join the club :-)


>
>> On Jul 21, 2009, at 7:58 PM, Pat Hayes wrote:
>>
>>>
>>> On Jul 21, 2009, at 7:26 PM, Alan Ruttenberg wrote:
>>>
>>>> On Tue, Jul 21, 2009 at 1:23 PM, Toby Inkster<tai@g5n.co.uk> wrote:
>>>>>
>>>>> On Tue, 2009-07-21 at 19:52 +0300, Bernhard Schandl wrote:
>>>>>
>>>>>>> I would say: Never assert sameAs. It's just too big a hammer.
>>>>>>> Instead use a wider palette of relationships to connect entities
>>>>>>> to other ones.
>>>>>>
>>>>>> which ones would you recommend?
>>>>>
>>>>> skos:exactMatch = asserts that the two resources represent the same
>>>>> concept
>>>
>>> Say, refer to the same thing.
>>>
>>>>> , but does not assert that all triples containing the first
>>>>> resource are necessarily true when the second resource is
>>>>> substituted in.
>>>>
>>>> I'm having trouble parsing this one. I don't know what concepts are,
>>>> but they are an odd sort of thing if they can be the same, but can't
>>>> be substituted.
>>>
>>> This is exactly what is needed in many cases. Philosophical
>>> terminology is that they have the same referent but not the same
>>> sense, and lack of substitutability reflects the unfortunate but
>>> inevitable fact that the Web as a whole is not referentially
>>> transparent (yet). A more mundane example: the same person might
>>> need to be referred to in one way in one context and differently in
>>> another, just because the two social contexts require different
>>> forms of address. (That example from Lynn Stein.)
>>>
>>>> In any case, this isn't much better when the issue I point out is
>>>> that there is a specific relation between e.g. the intervention and
>>>> the drug - that relation is nowhere near equivalence in any form.
>>>
>>> True, but in cases like this, it is simply a basic conceptual
>>> mistake to be using any kind of loose-sameAs property. rdfs:seeAlso
>>> would be more like what is needed for linking a drug to an
>>> intervention. I agree with you about having a selection of
>>> better-thought-out relations rather than just using sameAs as a
>>> kind of all-purpose knee-jerk connecting link. Maybe this "Linked
>>> Data" slogan has a rather dumbing-down effect, as it suggests that
>>> 'link' is a simple uniform notion that works in all cases.
>>>
>>>>
>>>>> skos:closeMatch = same as exact match, but slightly woolier.
>>>>
>>>> Seems harmless, assuming one doesn't mind whatever one is dealing
>>>> with being typed as a concept.
>>>> Ditto the broader and narrower relations, which although not to my
>>>> taste (I don't know how to tell when they hold) are certainly better
>>>> than using sameAs.
>>>>
>>>>> owl:equivalentProperty = if {X equivalentProperty Y} and {A X B}
>>>>> then {A Y B}. In other words, the properties can be used completely
>>>>> interchangeably. But perhaps there are other important differences
>>>>> between X and Y, such as their rdfs:label or rdfs:isDefinedBy.
>>>>
>>>> Still near equivalence.
>>>>
>>>>> owl:equivalentClass = if {X equivalentClass Y} then all Xs are Ys
>>>>> and vice versa. Same dealy with owl:equivalentProperty really.
>>>>
>>>> Ditto.
>>>>
>>>>> ovterms:similarTo = a general, all-purpose wimps' predicate. I use
>>>>> this extensively.
>>>>
>>>> Under the principle "first do no harm", this seems to work, although
>>>> I note that the intervention (something that happens) isn't similar
>>>> to the drug used in it (something that is consumed when the
>>>> intervention happens).
>>>>
>>>> seeAlso seems pretty harmless and noncommittal.
>>>>
>>>> But better is probably to look more closely at what the entities are
>>>> and then choose a relationship that better expresses how they relate.
>>>> In the case of the intervention, one plausible interpretation is that
>>>> the "intervention" names a class of processes, and that there is a
>>>> subclass of such processes in which the drug participates. (The other
>>>> subclass are those in which a placebo is the participant.) This can
>>>> be modeled in OWL.
>>>>
>>>> (My real advice for clinical trial resource is to collaborate with
>>>> the OBI project and use terminology that is being developed for
>>>> exactly that purpose)
>>>>
>>>> In my line of work I start with the OBO Relation ontology,
>>>> http://www.obofoundry.org/ro/ which provides a basic set of well
>>>> documented relations, such as the has_participant relationship.
>>>>
>>>> OWL also provides some relations beyond equivalence - subclass
>>>> relations are an option, when appropriate, as well as making
>>>> statements that classes overlap - by expressing that the intersection
>>>> of the two is not empty.
>>>>
>>>> That ontology is undergoing some reform, as it should in time. Some
>>>> of the new candidate relations are documented in links from that
>>>> page. In addition it is proposed that there be class level and
>>>> instance level versions of the relations - the class level relations
>>>> might better suit a modeling style that would rather avoid using OWL
>>>> restrictions, and fit well with OWL 2, which allows a name (URI) to
>>>> be used as both a class and an instance.
>>>>
>>>> Finally, for those cases where there is more than one URI and they
>>>> *really* mean the same thing - why not try to get the parties who
>>>> minted them to collaborate and retire one of the URIs? If they really
>>>> mean the same thing there should be no harm in either party using the
>>>> other's URI.
>>>
>>> It's not that simple, unfortunately. I'm going to make this issue the
>>> center of my invited talk at ISWC later this year :-)
>>>
>>> Pat
>>>
>>>>
>>>> -Alan
>>>>
>>>>>
>>>>> --
>>>>> Toby A Inkster
>>>>> <mailto:mail@tobyinkster.co.uk>
>>>>> <http://tobyinkster.co.uk>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>> ------------------------------------------------------------
>>> IHMC                                     (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.           (850)202 4416   office
>>> Pensacola                            (850)202 4440   fax
>>> FL 32502                              (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>
>

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Friday, 24 July 2009 15:46:39 UTC