RE: Comments on the tension between bottom-up KR and use of top down ontologies

Quite clearly, what is needed is an OR and not an XOR!

I heartily agree.

 

In the HCLSIG context, we need to figure out a way to better communicate and
coordinate between these two approaches.

 

Cheers,

 

---Vipul

 

=======================================

Vipul Kashyap, Ph.D.

Senior Medical Informatician

Clinical Informatics R&D, Partners HealthCare System

Phone: (781)416-9254

Cell: (617)943-7120

http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik

 

To keep up you need the right answers; to get ahead you need the right questions

---John Browning and Spencer Reiss, Wired 6.04.95

________________________________

From: William Bug [mailto:William.Bug@DrexelMed.edu] 
Sent: Saturday, January 20, 2007 8:40 PM
To: Jim Hendler
Cc: Kashyap, Vipul; public-semweb-lifesci hcls
Subject: Re: Comments on the tension between bottom-up KR and use of top down
ontologies

 

I am an ardent and reverent disciple of the position Jim outlines below. :-)

 

This is the point I was trying to make in my response to an issue brought up a
few days back re: the relation between the BioRDF data sets being
identified/converted and the BioONT efforts to identify what I was referring to
as distilled knowledge resources (DKR) - a neologism designed to shed some of
the baggage that artifacts such as terminologies, taxonomies, classification
schemes, thesauri, and ontologies bring with them - though I'm probably just
making matters worse by adding another term to the bag.

 

I like to think we are converging on systems of ever increasing, emergent
"intelligence" as we work to combine these approaches - which would include in
the cornucopia of research domains:

          - statistically-based data mining techniques (customized for different
types of data such as sequence analysis, gene expression analysis, computer
vision algorithms applied to automatically extracting biologically relevant
shapes from bioimaging data sets, Bayesian analysis applied to time-series
physiological data, etc.)

          - modeling techniques - pathway modeling, physiological modeling from
molecules on up through systems, morpho-anatomical modeling, etc.

          - Text-based KE/KR/KM - e.g., NLP, LSI, enhanced IR of all sorts, etc.

          - the related field of computational linguistics and lexical analysis
informing the creation of CVs, taxonomies, classification schemes, and thesauri

          - logic formalisms - DL, Rule-based, etc. - used both for bottom-up &
top-down KE/KR/KM

 

I've been taking this line since the time when I was head of product
development for Biological Abstracts (BIOSIS) in the mid-90s, and further in
the ontological engineering work I did in the late 90s with the bioinformatics
company DoubleTwist.

 

This is the editorializing I was avoiding in my initial forwarded post.  ;-)

 

Cheers,

Bill

 

 

 

On Jan 20, 2007, at 8:11 PM, Jim Hendler wrote:





While I think there are a lot of interesting issues in this space, let me point
out that the mail below seems to think that the "OR" between top-down and
bottom-up is an "XOR" - i.e. one or the other.  The vision I've been pushing for
a long time is one where they both flourish, and even more importantly they link
together into a Web of Semantic Definitions.   My very first briefs when the
DAML language was being created included the idea that a key aspect of a
"distributed ontological representation" was that it could combine top-down and
bottom-up approaches - here are the words from a slide from 2000 - you can see
I've been saying this for a long time now...

 -Jim H.

p.s. For US readers, this is one of the arguments that helped convince the govt
to spend a chunk of your tax dollars on creating an ontology language :-)

 

(from DARPA presentation - ca. 2000 - apologies for formatting, copy/pasted
from a slide in ppt format)

 

Small communities define common semantics

Technical Vocabularies abound

Mission specific

Technical jargons

Shared values

Larger communities form around shared terms

Mapping and "articulation" become crucial

Interoperability at web languages level

Top-Down (organization defines critical aircraft properties)

         or bottom up (Oh, a "foxbat" is a Mig29)

Business case for improving communication!

 

 

At 11:14 AM -0500 1/20/07, Kashyap, Vipul wrote:

Bill,

 

I am glad you brought this up! This could make a good topic for a future
BIONT/HCLSIG Telcon Agenda.

 

I do have another version of the Top-Down/Bottom-Up "tension" which brings up
the same issue in a different context.

 

Top-Down: Use Cases => Ontologies => Mappings to Data => RDFize Data Sets

Bottom-Up: RDFize Data Sets => Ground RDF Graphs in Ontologies/Terminologies =>
See applicability to Use Cases.

 

Whereas I do have a preference for one of the above, I do recognize the validity
and appropriateness of the second approach in various scenarios.

 

I am glad that we are having this debate as a community, and if we leverage the
discussions and thoughts proposed around this, I think we will have made a
contribution to the field.

 

Look forward to hearing from fellow HCLSIG-ers on this.

 

Cheers,

 

---Vipul

 

=======================================

Vipul Kashyap, Ph.D.

Senior Medical Informatician

Clinical Informatics R&D, Partners HealthCare System

Phone: (781)416-9254

Cell: (617)943-7120

http://www.partners.org/cird/AboutUs.asp?cBox=Staff&stAb=vik

 

To keep up you need the right answers; to get ahead you need the right questions

---John Browning and Spencer Reiss, Wired 6.04.95

 

From: public-semweb-lifesci-request@w3.org
[mailto:public-semweb-lifesci-request@w3.org] On Behalf Of William Bug

Sent: Saturday, January 20, 2007 10:02 AM

To: public-semweb-lifesci hcls

Subject: Comments on the tension between bottom-up KR and use of top down
ontologies

 

Hi All,

 

This was recently posted to the UMLS list.

 

Given some of the issues we've been discussing, I thought others might
appreciate some of the ideas recounted here by Gary Merrill from GlaxoSmithKline.

 

I have my own take on this very very important issue, but I'd rather not
editorialize on Gary's points - and give you a chance to process them as he so
clearly expressed them.  Some familiarity with UMLS structure is helpful
(http://umlsinfo.nlm.nih.gov).

 

By the way, a site relevant to our efforts is the Open Clinical site (KM for
Medical Care - http://www.openclinical.org/medTermUmls.html).

 

Cheers,

Bill

 

Begin forwarded message:

 

 

From: gary.h.merrill@GSK.COM

Date: January 19, 2007 10:52:11 AM EST

To: UMLSUSERS-L@LIST.NIH.GOV

Subject: Re: MRHIER and AUIs

Reply-To: gary.h.merrill@GSK.COM

 

William:

 

I think that was a very good non-technical summary of some issues in the
Metathesaurus that can be difficult and confusing.  The nature and role of
AUIs (and their relationships to one another and to the CUIs that they
"realize") can require substantial thought.

 

I am always a little concerned when I see statements such as "In an ideal
harmonious world, NLM and all sources would agree, and Meta would become a
single unified principled ontology."  I do not in fact think that this is
necessarily true (under some reasonable constraints it is in fact provably
false), and definitely do not think it should be taken as a desideratum.
Perhaps you do not either, but I wanted to take this opportunity to say that,
particularly in the context of evolving empirical scientific theories, we
should not expect (and not necessarily even strive for) such a unified
ontology.  (There are, of course, those who would disagree.)  The history of
science and the history of philosophy have shown the folly of this, and I
would argue that while striving for a certain "convergence" is desirable,
striving for the one true theory/ontology is not.  That's something of a
digression, but I take the strength of UMLS to lie in providing a way of
"communicating between" and using multiple disparate (at times mutually
inconsistent) world views without imposing a strict ueber-ontology.  Again,
there are those who find that the lack of the ueber-ontology leaves them
feeling insecure and adrift in a metaphysical realm of uncertainty.

 

As I expressed to Chris in a separate communication, from my perspective (as
a very application-oriented user), UMLS provides a usually adequate
representation of "concepts" (via CUIs), and terms/words/linguistic items
(via SUIs, LUIs, etc.).  What it does not provide a particularly crisp
representation of at the moment is "things" -- e.g., diseases rather than
disease names or disease concepts (that is, the extensional correlate of
the (intensional) concept/CUI).  AUIs are enlisted to support this to some
degree, but they are somewhat too closely allied to linguistic items
(terms) to carry the genuine semantic weight of "things" (extensions).  At
best, one ends up using sets of AUIs as equivalence classes to represent
the thing to which each of the AUIs "refers" (though "refer" here is, I
think, a bit misleading).  So in terms of a classic thing/word/concept
semantic hierarchy, my feeling is that UMLS does a good job of the
word/concept part, but the thing part is left a bit "mushy".  However,
there is room for substantial debate here, and many of the issues are
unclear.
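
[Editorial illustration] The idea of using sets of AUIs as equivalence classes
can be sketched in a few lines of Python. This is only a minimal sketch under
stated assumptions: the identifiers, source labels, and helper names
(`equivalence_classes`, `same_thing`) are hypothetical, and grouping atoms by
their shared CUI is just one possible choice of equivalence relation, not
something the message above prescribes.

```python
from collections import defaultdict

# Hypothetical (AUI, CUI, source, term) rows. All identifiers are invented
# for illustration and are not real UMLS values.
ATOMS = [
    ("A0000001", "C0000001", "SRC_A", "Myocardial infarction"),
    ("A0000002", "C0000001", "SRC_B", "Heart attack"),
    ("A0000003", "C0000001", "SRC_C", "MI"),
    ("A0000004", "C0000002", "SRC_A", "Asthma"),
]


def equivalence_classes(atoms):
    """Group AUIs by the CUI they share (an assumed equivalence relation)."""
    classes = defaultdict(set)
    for aui, cui, _source, _term in atoms:
        classes[cui].add(aui)
    return {cui: frozenset(auis) for cui, auis in classes.items()}


def same_thing(aui_a, aui_b, classes):
    """Return True if both AUIs belong to the same equivalence class."""
    return any(
        aui_a in members and aui_b in members
        for members in classes.values()
    )


classes = equivalence_classes(ATOMS)
```

In this toy setup the "thing" is stood in for by a set of atoms rather than by
any single term, which is roughly the workaround described above; whether such
a set really captures an extension is exactly the open question.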

 

Largely this is a consequence of construing UMLS as a -- surprise --
meta*thesaurus* rather than a meta*ontology*, and focusing on meaning
relations (e.g., synonymy) rather than more fundamental semantic relations
(e.g., denotation and extension).   I do have some ideas of how this might
be addressed, but won't even mention them here -- partly because working
them out requires substantial thought and care, and partly because I'm not
altogether sure of what the benefit would be (to most UMLS users) of
retrofitting such an approach to UMLS.

 

------------------------------

Gary H. Merrill, Director

Semantic Technologies Group

Statistical and Quantitative Sciences

GlaxoSmithKline Research and Development

Research Triangle Park, NC

919.483.8456

 

 

Bill Bug

Senior Research Analyst/Ontological Engineer

 

Laboratory for Bioimaging  & Anatomical Informatics

www.neuroterrain.org

Department of Neurobiology & Anatomy

Drexel University College of Medicine

2900 Queen Lane

Philadelphia, PA    19129

215 991 8430 (ph)

610 457 0443 (mobile)

215 843 9367 (fax)

 

 

Please Note: I now have a new email - William.Bug@DrexelMed.edu

 

 

 

 

 

THE INFORMATION TRANSMITTED IN THIS ELECTRONIC COMMUNICATION IS INTENDED ONLY
FOR THE PERSON OR ENTITY TO WHOM IT IS ADDRESSED AND MAY CONTAIN CONFIDENTIAL
AND/OR PRIVILEGED MATERIAL. ANY REVIEW, RETRANSMISSION, DISSEMINATION OR OTHER
USE OF OR TAKING OF ANY ACTION IN RELIANCE UPON, THIS INFORMATION BY PERSONS OR
ENTITIES OTHER THAN THE INTENDED RECIPIENT IS PROHIBITED. IF YOU RECEIVED THIS
INFORMATION IN ERROR, PLEASE CONTACT THE SENDER AND THE PRIVACY OFFICER, AND
PROPERLY DISPOSE OF THIS INFORMATION.

 

 

-- 

Prof James Hendler                                         hendler@cs.rpi.edu

Tetherless World Constellation Chair
http://www.cs.umd.edu/~hendler

Computer Science Dept                                  301-405-2696 (work)

Rensselaer Polytechnic Inst                             301-405-6707 (Fax)

Troy, NY 12180

 


 

 





 






Received on Sunday, 21 January 2007 10:54:35 UTC