Re: Merging Databases

As a constructive suggestion, an incremental improvement to sameAs.org
would be to systematically eradicate any references to DBpedia entries
that are disambiguation pages on Wikipedia.
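
A rough sketch of that cleanup in Python, under stated assumptions: the
sameAs bundle is a plain list of URIs, and disambiguation pages can be
recognized from triples that use some "disambiguates" predicate. The
predicate URI and both function names below are hypothetical, chosen
only for illustration; this is not how sameAs.org is actually built.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical predicate marking "this resource is a disambiguation page
# that points at that resource". Substitute whatever your data really uses.
DISAMBIGUATES = "http://example.org/vocab#disambiguates"


def disambiguation_pages(triples: Iterable[Triple]) -> Set[str]:
    """Collect every subject that disambiguates at least one other resource."""
    return {s for (s, p, _o) in triples if p == DISAMBIGUATES}


def clean_bundle(bundle: Iterable[str], triples: Iterable[Triple]) -> List[str]:
    """Return the bundle without URIs identified as disambiguation pages.

    The order of the remaining URIs is preserved.
    """
    pages = disambiguation_pages(triples)
    return [uri for uri in bundle if uri not in pages]


if __name__ == "__main__":
    iodine = "http://dbpedia.org/resource/Iodine"
    iodo = "http://dbpedia.org/resource/Iodo"
    # Assumed input: Iodo is marked as disambiguating to Iodine.
    print(clean_bundle([iodine, iodo], [(iodo, DISAMBIGUATES, iodine)]))
```

With the Iodine/Iodo pair from the message below as input, only the
Iodine URI would survive the filter.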

-Alan

On Fri, Jul 24, 2009 at 11:26 AM, Alan Ruttenberg <alanruttenberg@gmail.com> wrote:
> On Fri, Jul 24, 2009 at 11:11 AM, David Baxter<retxabd@gmail.com> wrote:
>>
>>
>> On Tue, Jul 21, 2009 at 8:43 PM, Alan Ruttenberg <alanruttenberg@gmail.com>
>> wrote:
>>>
>>> On Tue, Jul 21, 2009 at 9:22 PM, Pat Hayes<phayes@ihmc.us> wrote:
>>> > Here's another example. Cyc lists all the chemical elements, and
>>> > cross-links
>>> > to other such lists in other ontologies using owl:sameAs. But the Cyc
>>> > ontology says that an element is the set (class) of all pieces of the
>>> > pure
>>> > element, so that for example sodium in Cyc has a member which is the
>>> > lump of
>>> > pure metallic sodium I keep safely under glycerin in a glass bottle on
>>> > my
>>> > shelf. This is a clever ontological device which makes a bunch of
>>> > inferences
>>> > very slick in Cyc, but I bet it's not the same *idea* of sodium that most
>>> > ontologies would agree with. So that sameAs ought to be (and it is
>>> > understood as meaning) 'same chemical element', but it does not allow
>>> > mutual
>>> > substitutivity, even if you were to translate those other ontologies
>>> > into
>>> > CycL, which nobody is ever likely to do.
>>>
>>> My gut reaction is that URIs ought to be names that refer, and that
>>> sense ought to be conveyed more explicitly as statements. That seems
>>> to be the basis of the model theory that underlies the semweb
>>> languages (yes, I realize that there's currently room for 2+ different
>>> referencings using the same name). I realize that in natural language
>>> a name can carry both sense and reference (or let's just say "more than
>>> reference" since there seem to be a number of theories of exactly what
>>> goes on with words). But it seems that it's been at least a hundred
>>> years that relatively modern philosophers have been hacking away at
>>> trying to understand exactly what the phenomena are, and how to
>>> understand them. Should we really try to adopt exactly the same model
>>> as language, given that we don't really understand it?
>>>
>>> In your sodium example, I don't really know what to do with the "idea
>>> of sodium" being the same or different, but I *can* say that a
>>> molecule of sodium is not the same sort of thing as a lump of sodium
>>> metal. They have different physical properties and some things that
>>> make sense to say about one don't make sense to say about the other
>>> (like the melting point of xxx is 370.87 K).
>>
>> For what it's worth, Cyc does not generally consider individual molecules of
>> a substance to be instances of that substance. For example, "iodine
>> molecule"
>> (http://sw.opencyc.org/concept/Mx8Ngh4rwPzt4pwpEbGdrcN5Y29ycB4rvVj8dJwpEbGdrcN5Y29ycA)
>> is not a subclass of "iodine"
>> (http://sw.opencyc.org/concept/Mx4rvVj8dJwpEbGdrcN5Y29ycA).
>>
>> David
>
> Thanks for pointing that out! Which one, if either, do you think is
> sameAs http://dbpedia.org/resource/Iodine (only one of them is,
> according to sameAs.org)?
>
> As an aside, another amusing sameAs in that family is
>
> http://dbpedia.org/resource/Iodine
>
> Iodine, is a chemical element that has the symbol I and atomic number
> 53. Naturally-occurring iodine is a single isotope with 74 neutrons...
>
> http://dbpedia.org/resource/Iodo
>
> Iodo may refer to; Socotra Rock Iodo (film), South Korean film
> directed by Kim Ki-young Iodine, chemical element...
>
> ref:
>
> http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FIodine&x=0&y=0
>
> I mean no slight to the intentions of creating the sameAs resource
> but, realistically, pretty much anywhere you look there are
> substantive errors.
>
> -Alan
>
>>
>>>
>>> Now you might say: Well, they are the same *concept*. But what am I to
>>> do with that? What can I conclude from that statement? Isn't it
>>> throwing a whole lot under the rug to lump all these sorts of
>>> relations into any single "same" bucket? And for what good? Google is
>>> pretty good at bringing all these different sorts of things together
>>> already - shouldn't the semweb stuff be doing something different?
>>>
>>> -Alan
>>> (who's been reading and puzzling too many days in a row about how
>>> words relate to ... everything)
>>>
>>> > On Jul 21, 2009, at 7:58 PM, Pat Hayes wrote:
>>> >
>>> >>
>>> >> On Jul 21, 2009, at 7:26 PM, Alan Ruttenberg wrote:
>>> >>
>>> >>> On Tue, Jul 21, 2009 at 1:23 PM, Toby Inkster<tai@g5n.co.uk> wrote:
>>> >>>>
>>> >>>> On Tue, 2009-07-21 at 19:52 +0300, Bernhard Schandl wrote:
>>> >>>>
>>> >>>>>> I would say: Never assert sameAs. It's just too big a hammer.
>>> >>>>>> Instead use a wider palette of relationships to connect entities
>>> >>>>>> to other ones.
>>> >>>>>
>>> >>>>> which ones would you recommend?
>>> >>>>
>>> >>>> skos:exactMatch = asserts that the two resources represent the same
>>> >>>> concept
>>> >>
>>> >> Say, refer to the same thing.
>>> >>
>>> >>>> , but does not assert that all triples containing the first
>>> >>>> resource are necessarily true when the second resource is substituted
>>> >>>> in.
>>> >>>
>>> >>> I'm having trouble parsing this one. I don't know what concepts are,
>>> >>> but they are an odd sort of thing if they can be the same, but can't
>>> >>> be substituted.
>>> >>
>>> >> This is exactly what is needed in many cases. Philosophical terminology
>>> >> is
>>> >> that they have the same referent but not the same sense, and lack of
>>> >> substitutability reflects the unfortunate but inevitable fact that the
>>> >> Web
>>> >> as a whole is not referentially transparent (yet). More mundane
>>> >> example, the
>>> >> same person might need to be referred to in one way in one context and
>>> >> differently in another, just because the two social contexts require
>>> >> different forms of address. (That example from Lynn Stein.)
>>> >>
>>> >>> In any case, this isn't much better when the issue I point out is that
>>> >>> there is a specific relation between e.g. the intervention and the
>>> >>> drug - that relation is nowhere near equivalence in any form.
>>> >>
>>> >> True, but in cases like this, it is simply a basic conceptual mistake
>>> >> to
>>> >> be using any kind of loose-sameAs property. rdfs:seeAlso would be more
>>> >> like
>>> >> what is needed for linking a drug to an intervention. I agree with you
>>> >> about
>>> >> having a selection of better-thought-out relations rather than just
>>> >> using
>>> >> sameAs as a kind of all-purpose knee-jerk connecting link. Maybe this
>>> >> "Linked Data" slogan has a rather dumbing-down effect, as it suggests
>>> >> that
>>> >> 'link' is a simple uniform notion that works in all cases.
>>> >>
>>> >>>
>>> >>>> skos:closeMatch = same as exact match, but slightly woolier.
>>> >>>
>>> >>> Seems harmless, assuming one doesn't mind whatever one is dealing with
>>> >>> being typed as a concept.
>>> >>> Ditto the broader and narrower relations, which, although not to my
>>> >>> taste (I don't know how to tell when they hold), are certainly better
>>> >>> than using sameAs.
>>> >>>
>>> >>>> owl:equivalentProperty = if {X equivalentProperty Y} and {A X B} then
>>> >>>> {A Y B}. In other words, the properties can be used completely
>>> >>>> interchangeably. But perhaps there are other important differences
>>> >>>> between X and Y, such as their rdfs:label or rdfs:isDefinedBy.
>>> >>>
>>> >>> Still near equivalence.
>>> >>>
>>> >>>> owl:equivalentClass = if {X equivalentClass Y} then all Xs are Ys and
>>> >>>> vice versa. Same dealy with owl:equivalentProperty really.
>>> >>>
>>> >>> Ditto.
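
To make the equivalentProperty rule quoted above concrete, here is a
small Python sketch over plain string triples. It applies only the rule
"if {X equivalentProperty Y} and {A X B} then {A Y B}" (in both
directions, since the property is symmetric) and deliberately does not
copy statements *about* X, such as its label, over to Y. The function
name and the "ex:" identifiers are made up for illustration; a real
reasoner does considerably more than this.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

EQUIVALENT_PROPERTY = "owl:equivalentProperty"


def apply_equivalent_property(triples: Iterable[Triple]) -> Set[Triple]:
    """Return the input triples plus those implied by equivalentProperty.

    Only the predicate position is rewritten: {A X B} yields {A Y B}.
    Triples whose *subject* is X (labels, isDefinedBy, ...) are left alone.
    """
    inferred: Set[Triple] = set(triples)

    # Gather the declared equivalences, in both directions.
    equivalences: Set[Tuple[str, str]] = set()
    for s, p, o in inferred:
        if p == EQUIVALENT_PROPERTY:
            equivalences.add((s, o))
            equivalences.add((o, s))

    # Repeat until no new triple appears, so chains X=Y, Y=Z are followed.
    changed = True
    while changed:
        changed = False
        for a, p, b in list(inferred):
            for x, y in equivalences:
                if p == x and (a, y, b) not in inferred:
                    inferred.add((a, y, b))
                    changed = True
    return inferred
```

Given {ex:X equivalentProperty ex:Y}, {ex:A ex:X ex:B} and a label on
ex:X, the result gains {ex:A ex:Y ex:B} but ex:Y does not acquire
ex:X's label, which is the difference from sameAs being pointed out.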
>>> >>>
>>> >>>> ovterms:similarTo = a general, all-purpose wimps' predicate. I use
>>> >>>> this
>>> >>>> extensively.
>>> >>>
>>> >>> Under the principle "first do no harm", this seems to work, although I
>>> >>> note that the intervention (something that happens) isn't similar to
>>> >>> the drug used in it (something that is consumed when the intervention
>>> >>> happens).
>>> >>>
>>> >>> seeAlso seems pretty harmless and noncommittal.
>>> >>>
>>> >>> But better is probably to look more closely at what the entities are
>>> >>> and then choose a relationship that better expresses how they relate.
>>> >>> In the case of the intervention, one plausible interpretation is that
>>> >>> the "intervention" names a class of processes, and that there is a
>>> >>> subclass of such processes in which the drug participates (the other
>>> >>> subclass being those in which a placebo is the participant). This can
>>> >>> be modeled in OWL.
>>> >>>
>>> >>> (My real advice for a clinical trial resource is to collaborate with
>>> >>> the OBI project and use the terminology that is being developed for
>>> >>> exactly that purpose.)
>>> >>>
>>> >>> In my line of work I start with the OBO Relation ontology,
>>> >>> http://www.obofoundry.org/ro/ which provides a basic set of well
>>> >>> documented relations, such as the has_participant relationship.
>>> >>>
>>> >>> OWL also provides some relations beyond equivalence - subclass
>>> >>> relations are an option, when appropriate, as well as making
>>> >>> statements that classes overlap - by expressing that the intersection
>>> >>> of the two is not empty.
>>> >>>
>>> >>> That ontology (RO) is undergoing some reform, as it should in time. Some
>>> >>> of the new candidate relations are documented in links from that page. In
>>> >>> addition it is proposed that there be class level and instance
>>> >>> level versions of the relations - the class level relations might
>>> >>> better suit a modeling style that would rather avoid using OWL
>>> >>> restrictions, and fits well with OWL 2, which allows a name (URI) to be
>>> >>> used as both a class and an instance.
>>> >>>
>>> >>> Finally, for those cases where there is more than one URI and they
>>> >>> *really* mean the same thing - why not try to get the parties who
>>> >>> minted them to collaborate and retire one of the URIs? If they really
>>> >>> mean the same thing there should be no harm in either party using the
>>> >>> other's URI.
>>> >>
>>> >> It's not that simple, unfortunately. I'm going to make this issue the
>>> >> center of my invited talk at ISWC later this year :-)
>>> >>
>>> >> Pat
>>> >>
>>> >>>
>>> >>> -Alan
>>> >>>
>>> >>>>
>>> >>>> --
>>> >>>> Toby A Inkster
>>> >>>> <mailto:mail@tobyinkster.co.uk>
>>> >>>> <http://tobyinkster.co.uk>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >> ------------------------------------------------------------
>>> >> IHMC                                     (850)434 8903 or (650)494 3973
>>> >> 40 South Alcaniz St.           (850)202 4416   office
>>> >> Pensacola                            (850)202 4440   fax
>>> >> FL 32502                              (850)291 0667   mobile
>>> >> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>> >>
>>> >
>>>
>>
>>
>

Received on Friday, 24 July 2009 15:29:52 UTC