- From: Kashyap, Vipul <VKASHYAP1@PARTNERS.ORG>
- Date: Sun, 21 Jan 2007 05:53:59 -0500
- To: "William Bug" <William.Bug@DrexelMed.edu>, "Jim Hendler" <hendler@cs.umd.edu>
- Cc: "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>
- Message-ID: <2BF18EC866AF0448816CDB62ADF6538105EAF28A@PHSXMB11.partners.org>
Quite clearly, what is needed is an OR and not an XOR!
I heartily agree.
In the HCLSIG context, we need to figure out a way for better
communication/coordination of these two approaches.
Cheers,
---Vipul
=======================================
Vipul Kashyap, Ph.D.
Senior Medical Informatician
Clinical Informatics R&D, Partners HealthCare System
Phone: (781)416-9254
Cell: (617)943-7120
http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik
To keep up you need the right answers; to get ahead you need the right questions
---John Browning and Spencer Reiss, Wired 6.04.95
________________________________
From: William Bug [mailto:William.Bug@DrexelMed.edu]
Sent: Saturday, January 20, 2007 8:40 PM
To: Jim Hendler
Cc: Kashyap, Vipul; public-semweb-lifesci hcls
Subject: Re: Comments on the tension between bottom-up KR and use of top down
ontologies
I am an ardent and reverent disciple of the position Jim outlines below. :-)
This is the point I was trying to make a few days back in my response to an
issue about how the BioRDF data sets being identified/converted relate to
BioONT's efforts to identify what I was calling distilled knowledge resources
(DKR). That is a neologism designed to shed some of the baggage that artifacts
such as terminologies, taxonomies, classification schemes, thesauri, and
ontologies bring with them, though I'm probably just making matters worse by
adding another term to the bag.
I like to think we are converging on systems of ever increasing, emergent
"intelligence" as we work to combine these approaches - which would include in
the cornucopia of research domains:
- statistically-based data mining techniques (customized for different
types of data such as sequence analysis, gene expression analysis, computer
vision algorithms applied to automatically extracting biologically relevant
shapes from bioimaging data sets, Bayesian analysis applied to time-series
physiological data, etc.)
- modeling techniques - pathway modeling, physiological modeling from
molecules on up through systems, morpho-anatomical modeling, etc.
- text-based KE/KR/KM - e.g., NLP, LSI, enhanced IR of all sorts, etc.
- the related field of computational linguistics and lexical analysis
informing the creation of CVs, taxonomies, classification schemes, and thesauri
- logic formalisms - DL, Rule-based, etc. - used both for bottom-up &
top-down KE/KR/KM
I've been taking this line since I was head of product development for
Biological Abstracts (BIOSIS) in the mid-90s, and further in the ontological
engineering work I did in the late 90s with the bioinformatics company
DoubleTwist.
This is the editorializing I was avoiding in my initial forwarded post. ;-)
Cheers,
Bill
On Jan 20, 2007, at 8:11 PM, Jim Hendler wrote:
While I think there are a lot of interesting issues in this space, let me point
out that the mail below seems to think that the "OR" between top-down and
bottom-up is an "XOR" - i.e. one or the other. The vision I've been pushing for
a long time is one where they both flourish, and even more importantly they link
together into a Web of Semantic Definitions. My very first briefs when the
DAML language was being created included the idea that a key aspect of a
"distributed ontological representation" was that it could combine top-down and
bottom-up approaches - here are the words from a slide from 2000 - you can see
I've been saying this for a long time now...
-Jim H.
p.s. For US readers, this is one of the arguments that helped convince the govt
to spend a chunk of your tax dollars on creating an ontology language :-)
(from DARPA presentation - ca. 2000 - apologies for formatting, copy/pasted
from a slide in ppt format)
- Small communities define common semantics
    - Technical Vocabularies abound
    - Mission specific
    - Technical jargons
    - Shared values
- Larger communities form around shared terms
    - Mapping and "articulation" become crucial
    - Interoperability at web languages level
    - Top-Down (organization defines critical aircraft properties)
      or bottom up (Oh, a "foxbat" is a Mig29)
- Business case for improving communication!
At 11:14 AM -0500 1/20/07, Kashyap, Vipul wrote:
Bill,
I am glad you brought this up! This could make a good topic for a future
BIONT/HCLSIG Telcon Agenda.
I do have another version of the Top-Down/Bottom-Up "tension" which brings up
the same issue in a different context.
Top-Down: Use Cases => Ontologies => Mappings to Data => RDFize Data Sets
Bottom-Up: RDFize Data Sets => Ground RDF Graphs in Ontologies/Terminologies =>
See applicability to Use Cases.
Whereas I do have a preference for one of the above, I do recognize the
validity and appropriateness of the other approach in various scenarios.
I am glad that we are having this debate as a community; if we leverage the
discussions and ideas raised around it, I think we will have made a
contribution to the field.
Look forward to hearing from fellow HCLSIG-ers on this.
Cheers,
---Vipul
From: public-semweb-lifesci-request@w3.org
[mailto:public-semweb-lifesci-request@w3.org] On Behalf Of William Bug
Sent: Saturday, January 20, 2007 10:02 AM
To: public-semweb-lifesci hcls
Subject: Comments on the tension between bottom-up KR and use of top down
ontologies
Hi All,
This was recently posted to the UMLS list.
Given some of the issues we've been discussing, I thought others might
appreciate some of the ideas recounted here by Gary Merrill from GlaxoSmithKline.
I have my own take on this very very important issue, but I'd rather not
editorialize on Gary's points - and give you a chance to process them as he so
clearly expressed them. Some familiarity with UMLS structure is helpful
(http://umlsinfo.nlm.nih.gov).
By the way, a site relevant to our efforts is the Open Clinical site (KM for
Medical Care - http://www.openclinical.org/medTermUmls.html).
Cheers,
Bill
Begin forwarded message:
From: gary.h.merrill@GSK.COM
Date: January 19, 2007 10:52:11 AM EST
To: UMLSUSERS-L@LIST.NIH.GOV
Subject: Re: MRHIER and AUIs
Reply-To: gary.h.merrill@GSK.COM
William:
I think that was a very good non-technical summary of some issues in the
Metathesaurus that can be difficult and confusing. The nature and role of
AUIs (and their relationships to one another and to the CUIs that they
"realize") can require substantial thought.
I am always a little concerned when I see statements such as "In an ideal
harmonious world, NLM and all sources would agree, and Meta would become
a single unified principled ontology." I do not in fact think that this is
necessarily true (under some reasonable constraints it is in fact provably
false), and definitely do not think it should be taken as a desideratum.
Perhaps you do not either, but I wanted to take this opportunity to say that,
particularly in the context of evolving empirical scientific theories, we
should not expect (and not necessarily even strive for) such a unified
ontology. (There are, of course, those who would disagree.) The history of
science and the history of philosophy have shown the folly of this, and I
would argue that while striving for a certain "convergence" is desirable,
striving for the one true theory/ontology is not. That's something of a
digression, but I take the strength of UMLS to lie in providing a way of
"communicating between" and using multiple disparate (at times mutually
inconsistent) world views without imposing a strict ueber-ontology. Again,
there are those who tend to find that the lack of an ueber-ontology leaves
them feeling insecure and adrift in a metaphysical realm of uncertainty.
As I expressed to Chris in a separate communication, from my perspective (as
a very application-oriented user), UMLS provides a usually adequate
representation of "concepts" (via CUIs), and terms/words/linguistic items
(via SUIs, LUIs, etc.). What it does not provide a particularly crisp
representation of at the moment is "things" -- e.g., diseases rather than
disease names or disease concepts (that is, the extensional correlate of
the (intensional) concept/CUI). AUIs are enlisted to support this to some
degree, but they are somewhat too closely allied to linguistic items
(terms) to carry the genuine semantic weight of "things" (extensions). At
best, one ends up using sets of AUIs as equivalence classes to represent
the thing to which each of the AUIs "refers" (though "refer" here is, I
think, a bit misleading). So in terms of a classic thing/word/concept
semantic hierarchy, my feeling is that UMLS does a good job of the
word/concept part, but the thing part is left a bit "mushy". However,
there is room for substantial debate here, and many of the issues are
unclear.
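
The "sets of AUIs as equivalence classes" idea can be pictured with a small
sketch. It is only an illustration under stated assumptions, not a description
of how UMLS is organized or of any particular method: the rows are hypothetical
(CUI, AUI, source, term) tuples, every identifier in them is an invented
placeholder rather than a real UMLS record, and two AUIs are assumed to be
equivalent simply when they share a CUI.

```python
from collections import defaultdict

# Hypothetical rows: (CUI, AUI, source vocabulary, term string).
# All identifiers below are invented placeholders, not real UMLS records.
ROWS = [
    ("C0000001", "A0000001", "SRC_A", "Myocardial infarction"),
    ("C0000001", "A0000002", "SRC_B", "Heart attack"),
    ("C0000001", "A0000003", "SRC_C", "MI"),
    ("C0000002", "A0000004", "SRC_A", "Angina pectoris"),
]


def aui_equivalence_classes(rows):
    """Group AUIs that share a CUI (the assumed equivalence relation)."""
    classes = defaultdict(set)
    for cui, aui, _source, _term in rows:
        classes[cui].add(aui)
    return {cui: frozenset(auis) for cui, auis in classes.items()}


def same_thing(aui_a, aui_b, classes):
    """True if both AUIs fall into the same equivalence class."""
    return any(aui_a in auis and aui_b in auis for auis in classes.values())


classes = aui_equivalence_classes(ROWS)
print(same_thing("A0000001", "A0000002", classes))
print(same_thing("A0000001", "A0000004", classes))
```

Even in this toy form, the "thing" is represented only indirectly, as a set of
term-level atoms, which is the sense in which the thing part stays "mushy".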
Largely this is a consequence of construing UMLS as a -- surprise --
meta*thesaurus* rather than a meta*ontology*, and focusing on meaning
relations (e.g., synonymy) rather than more fundamental semantic relations
(e.g., denotation and extension). I do have some ideas of how this might
be addressed, but won't even mention them here -- partly because working
them out requires substantial thought and care, and partly because I'm not
altogether sure of what the benefit would be (to most UMLS users) of
retrofitting such an approach to UMLS.
------------------------------
Gary H. Merrill, Director
Semantic Technologies Group
Statistical and Quantitative Sciences
GlaxoSmithKline Research and Development
Research Triangle Park, NC
919.483.8456
Bill Bug
Senior Research Analyst/Ontological Engineer
Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)
Please Note: I now have a new email - William.Bug@DrexelMed.edu
THE INFORMATION TRANSMITTED IN THIS ELECTRONIC COMMUNICATION IS INTENDED ONLY
FOR THE PERSON OR ENTITY TO WHOM IT IS ADDRESSED AND MAY CONTAIN CONFIDENTIAL
AND/OR PRIVILEGED MATERIAL. ANY REVIEW, RETRANSMISSION, DISSEMINATION OR OTHER
USE OF OR TAKING OF ANY ACTION IN RELIANCE UPON, THIS INFORMATION BY PERSONS OR
ENTITIES OTHER THAN THE INTENDED RECIPIENT IS PROHIBITED. IF YOU RECEIVED THIS
INFORMATION IN ERROR, PLEASE CONTACT THE SENDER AND THE PRIVACY OFFICER, AND
PROPERLY DISPOSE OF THIS INFORMATION.
--
Prof James Hendler hendler@cs.rpi.edu
Tetherless World Constellation Chair
http://www.cs.umd.edu/~hendler
Computer Science Dept 301-405-2696 (work)
Rensselaer Polytechnic Inst 301-405-6707 (Fax)
Troy, NY 12180
Received on Sunday, 21 January 2007 10:54:35 UTC