Re: Merging Databases

On Fri, Jul 24, 2009 at 11:11 AM, David Baxter<retxabd@gmail.com> wrote:
>
>
> On Tue, Jul 21, 2009 at 8:43 PM, Alan Ruttenberg <alanruttenberg@gmail.com>
> wrote:
>>
>> On Tue, Jul 21, 2009 at 9:22 PM, Pat Hayes<phayes@ihmc.us> wrote:
>> > Here's another example. Cyc lists all the chemical elements, and
>> > cross-links to other such lists in other ontologies using owl:sameAs.
>> > But the Cyc ontology says that an element is the set (class) of all
>> > pieces of the pure element, so that for example sodium in Cyc has a
>> > member which is the lump of pure metallic sodium I keep safely under
>> > glycerin in a glass bottle on my shelf. This is a clever ontological
>> > device which makes a bunch of inferences very slick in Cyc, but I bet
>> > it's not the same *idea* of sodium that most ontologies would agree
>> > with. So that sameAs ought to be (and it is understood as meaning)
>> > 'same chemical element', but it does not allow mutual substitutivity,
>> > even if you were to translate those other ontologies into CycL, which
>> > nobody is ever likely to do.
>>
>> My gut reaction is that URIs ought to be names that refer, and that
>> sense ought to be conveyed more explicitly as statements. That seems
>> to be the basis of the model theory that underlies the semweb
>> languages (yes, I realize that there's currently room for 2+ different
>> referencings using the same name). I realize that in natural language
>> a name can carry both sense and reference (or let's just say "more than
>> reference" since there seem to be a number of theories of exactly what
>> goes on with words). But it seems that it's been at least a hundred
>> years that relatively modern philosophers have been hacking away at
>> trying to understand exactly what the phenomena are, and how to
>> understand them. Should we really try to adopt exactly the same model
>> as language, given that we don't really understand it?
>>
>> In your sodium example, I don't really know what to do with the "idea
>> of sodium" being the same or different, but I *can* say that a
>> molecule of sodium is not the same sort of thing as a lump of sodium
>> metal. They have different physical properties and some things that
>> make sense to say about one don't make sense to say about the other
>> (like the melting point of xxx is 370.87 K).
>
> For what it's worth, Cyc does not generally consider individual molecules of
> a substance to be instances of that substance. For example, "iodine
> molecule"
> (http://sw.opencyc.org/concept/Mx8Ngh4rwPzt4pwpEbGdrcN5Y29ycB4rvVj8dJwpEbGdrcN5Y29ycA)
> is not a subclass of "iodine"
> (http://sw.opencyc.org/concept/Mx4rvVj8dJwpEbGdrcN5Y29ycA).
>
> David

Thanks for pointing that out! Which one, if either, do you think is
sameAs http://dbpedia.org/resource/Iodine? (Only one of them is,
according to sameAs.org.)

As an aside, another amusing sameAs in that family is

http://dbpedia.org/resource/Iodine

Iodine, is a chemical element that has the symbol I and atomic number
53. Naturally-occurring iodine is a single isotope with 74 neutrons...

http://dbpedia.org/resource/Iodo

Iodo may refer to: Socotra Rock; Iodo (film), South Korean film
directed by Kim Ki-young; Iodine, chemical element...

ref:

http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FIodine&x=0&y=0

I mean no slight to the intentions of creating the sameAs resource
but, realistically, pretty much anywhere you look there are
substantive errors.
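
To make the cost of one such error concrete, here is a minimal sketch of
what a consumer that takes owl:sameAs at face value would conclude from
the Iodine/Iodo link. It is plain Python with no RDF library; the
function name same_as_closure, the ex: predicates, and the three sample
triples are assumptions made up for illustration, not data copied from
DBpedia or sameAs.org.

```python
from itertools import product

SAME_AS = "owl:sameAs"


def same_as_closure(triples):
    """Return the triples plus everything owl:sameAs substitution licenses."""
    triples = set(triples)
    parent = {}  # union-find forest over terms that occur in sameAs links

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    # Merge the two ends of every sameAs link into one equivalence class.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)

    def aliases(term):
        # Terms never mentioned in a sameAs link are only equal to themselves.
        return groups[find(term)] if term in parent else {term}

    # Substitute every alias in subject and object position.
    inferred = set()
    for s, p, o in triples:
        for s2, o2 in product(aliases(s), aliases(o)):
            inferred.add((s2, p, o2))
    return inferred


# Hypothetical triples: one fact about the element, one about the
# disambiguation page, and the questionable sameAs link between them.
triples = [
    ("dbpedia:Iodine", "ex:atomicNumber", "53"),
    ("dbpedia:Iodo", "rdf:type", "ex:DisambiguationPage"),
    ("dbpedia:Iodine", SAME_AS, "dbpedia:Iodo"),
]

inferred = same_as_closure(triples)
for triple in sorted(inferred - set(triples)):
    print(triple)
```

Under these assumptions the disambiguation page acquires an atomic
number and the element becomes a disambiguation page. A weaker link
such as skos:exactMatch passes through the function untouched, because
nothing licenses substitution for it.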

-Alan

>
>>
>> Now you might say: Well, they are the same *concept*. But what am I to
>> do with that? What can I conclude from that statement? Isn't it
>> throwing a whole lot under the rug to lump all these sorts of
>> relations into any single "same" bucket? And for what good? Google is
>> pretty good at bringing all these different sorts of things together
>> already - shouldn't the semweb stuff be doing something different?
>>
>> -Alan
>> (who's been reading and puzzling too many days in a row about how
>> words relate to ... everything)
>>
>> > On Jul 21, 2009, at 7:58 PM, Pat Hayes wrote:
>> >
>> >>
>> >> On Jul 21, 2009, at 7:26 PM, Alan Ruttenberg wrote:
>> >>
>> >>> On Tue, Jul 21, 2009 at 1:23 PM, Toby Inkster<tai@g5n.co.uk> wrote:
>> >>>>
>> >>>> On Tue, 2009-07-21 at 19:52 +0300, Bernhard Schandl wrote:
>> >>>>
>> >>>>>> I would say: Never assert sameAs. It's just too big a hammer.
>> >>>>>> Instead use a wider palette of relationships to connect entities
>> >>>>>> to other ones.
>> >>>>>
>> >>>>> which ones would you recommend?
>> >>>>
>> >>>> skos:exactMatch = asserts that the two resources represent the same
>> >>>> concept
>> >>
>> >> Say, refer to the same thing.
>> >>
>> >>>> , but does not assert that all triples containing the first
>> >>>> resource are necessarily true when the second resource is substituted
>> >>>> in.
>> >>>
>> >>> I'm having trouble parsing this one. I don't know what concepts are,
>> >>> but they are an odd sort of thing if they can be the same, but can't
>> >>> be substituted.
>> >>
>> >> This is exactly what is needed in many cases. Philosophical terminology
>> >> is that they have the same referent but not the same sense, and lack of
>> >> substitutability reflects the unfortunate but inevitable fact that the
>> >> Web as a whole is not referentially transparent (yet). A more mundane
>> >> example: the same person might need to be referred to in one way in one
>> >> context and differently in another, just because the two social
>> >> contexts require different forms of address. (That example is from Lynn
>> >> Stein.)
>> >>
>> >>> In any case, this isn't much better when the issue I point out is that
>> >>> there is a specific relation between e.g. the intervention and the
>> >>> drug - that relation is nowhere near equivalence in any form.
>> >>
>> >> True, but in cases like this, it is simply a basic conceptual mistake
>> >> to be using any kind of loose-sameAs property. rdfs:seeAlso would be
>> >> more like what is needed for linking a drug to an intervention. I agree
>> >> with you about having a selection of better-thought-out relations
>> >> rather than just using sameAs as a kind of all-purpose knee-jerk
>> >> connecting link. Maybe this "Linked Data" slogan has a rather
>> >> dumbing-down effect, as it suggests that 'link' is a simple uniform
>> >> notion that works in all cases.
>> >>
>> >>>
>> >>>> skos:closeMatch = same as exact match, but slightly woolier.
>> >>>
>> >>> Seems harmless, assuming one doesn't mind whatever one is dealing with
>> >>> being typed as a concept.
>> >>> Ditto the broader and narrower relations, which, although not to my
>> >>> taste (I don't know how to tell when they hold), are certainly better
>> >>> than using sameAs.
>> >>>
>> >>>> owl:equivalentProperty = if {X equivalentProperty Y} and {A X B} then
>> >>>> {A Y B}. In other words, the properties can be used completely
>> >>>> interchangeably. But perhaps there are other important differences
>> >>>> between X and Y, such as their rdfs:label or rdfs:isDefinedBy.
>> >>>
>> >>> Still near equivalence.
>> >>>
>> >>>> owl:equivalentClass = if {X equivalentClass Y} then all Xs are Ys and
>> >>>> vice versa. Same dealy with owl:equivalentProperty really.
>> >>>
>> >>> Ditto.
>> >>>
>> >>>> ovterms:similarTo = a general, all-purpose wimps' predicate. I use
>> >>>> this extensively.
>> >>>
>> >>> Under the principle "first do no harm", this seems to work, although I
>> >>> note that the intervention (something that happens) isn't similar to
>> >>> the drug used in it (something that is consumed when the intervention
>> >>> happens).
>> >>>
>> >>> seeAlso seems pretty harmless and noncommittal.
>> >>>
>> >>> But better is probably to look more closely at what the entities are
>> >>> and then choose a relationship that better expresses how they relate.
>> >>> In the case of the intervention, one plausible interpretation is that
>> >>> the "intervention" names a class of processes, and that there is a
>> >>> subclass of such processes in which the drug participates. (the other
>> >>> subclass are those in which a placebo is the participant) This can be
>> >>> modeled in OWL.
>> >>>
>> >>> (My real advice for a clinical trial resource is to collaborate with the
>> >>> OBI project and use terminology that is being developed for exactly
>> >>> that purpose)
>> >>>
>> >>> In my line of work I start with the OBO Relation ontology,
>> >>> http://www.obofoundry.org/ro/ which provides a basic set of well
>> >>> documented relations, such as the has_participant relationship.
>> >>>
>> >>> OWL also provides some relations beyond equivalence - subclass
>> >>> relations are an option, when appropriate, as well as making
>> >>> statements that classes overlap - by expressing that the intersection
>> >>> of the two is not empty.
>> >>>
>> >>> The relation ontology is undergoing some reform, as it should in time.
>> >>> Some of the new candidate relations are documented in links from that
>> >>> page. In addition, it is proposed that there be class-level and
>> >>> instance-level versions of the relations - the class-level relations
>> >>> might better suit a modeling style that would rather avoid using OWL
>> >>> restrictions, and fit well with OWL 2, which allows a name (URI) to be
>> >>> used as both a class and an instance.
>> >>>
>> >>> Finally, for those cases where there is more than one URI and they
>> >>> *really* mean the same thing - why not try to get the parties who
>> >>> minted them to collaborate and retire one of the URIs? If they really
>> >>> mean the same thing there should be no harm in either party using the
>> >>> other's URI.
>> >>
>> >> It's not that simple, unfortunately. I'm going to make this issue the
>> >> center of my invited talk at ISWC later this year :-)
>> >>
>> >> Pat
>> >>
>> >>>
>> >>> -Alan
>> >>>
>> >>>>
>> >>>> --
>> >>>> Toby A Inkster
>> >>>> <mailto:mail@tobyinkster.co.uk>
>> >>>> <http://tobyinkster.co.uk>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>
>> >> ------------------------------------------------------------
>> >> IHMC                                     (850)434 8903 or (650)494 3973
>> >> 40 South Alcaniz St.           (850)202 4416   office
>> >> Pensacola                            (850)202 4440   fax
>> >> FL 32502                              (850)291 0667   mobile
>> >> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>> >>
>> >
>>
>
>

Received on Friday, 24 July 2009 15:35:33 UTC