RE: An argument for bridging information models and ontologies at the syntactic level from Ogbuji, Chimezie on 2008-03-26 (public-hcls-coi@w3.org from January to March 2008)

From: Ogbuji, Chimezie <OGBUJIC@ccf.org>
Date: Wed, 26 Mar 2008 14:19:20 -0400
To: "jim herber" <jimherber@gmail.com>, "Booth, David (HP Software - Boston)" <dbooth@hp.com>
cc: public-hcls-coi@w3.org, public-semweb-lifesci@w3.org
Message-ID: <2702D0EBA4F0A749968E52E8644184EA5A86DB@CCHSCLEXMB59.cc.ad.cchs.net>
Yes, "data model to conceptual mapping" is a better description of what
I had in mind.  The fact that it actually happens syntactically (via a
programmatic algorithm that takes HL7 CDA document instances and
transforms them into conforming RDF/XML documents, for instance) is
really an implementation detail.  The main point of emphasis for me is
that a separation of concerns is maintained, so that you aren't modeling
relations such as hasCode in OWL.  Jim's point about this being a basic
tenet of engineering is absolutely correct.
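
To make the idea concrete, here is a minimal sketch of such a syntactic
mapping in Python.  It is only an illustration under stated assumptions:
the XML fragment is a simplified, CDA-like observation (not schema-valid
CDA), and the example.org namespaces, the
SystolicBloodPressureMeasurement class, the hasValue property, and the
cda_to_triples function are all hypothetical names.  The coded element
is resolved to an ontology class through a lookup table that lives
outside both models, so no hasCode relation ever reaches the RDF.

```python
import xml.etree.ElementTree as ET

# Simplified, CDA-like fragment (illustrative only; not schema-valid CDA).
CDA_FRAGMENT = """
<observation id="obs1">
  <code code="8480-6" codeSystem="2.16.840.1.113883.6.1"
        displayName="Systolic blood pressure"/>
  <value value="120" unit="mm[Hg]"/>
</observation>
"""

# Hypothetical namespaces for the target conceptual model and the instance data.
ONT = "http://example.org/ontology#"
DATA = "http://example.org/data/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# The mapping is kept apart from both models:
# (code system OID, code) -> class in the conceptual model.
CODE_TO_CLASS = {
    ("2.16.840.1.113883.6.1", "8480-6"): ONT + "SystolicBloodPressureMeasurement",
}


def cda_to_triples(xml_text):
    """Map one observation element to a list of N-Triples lines."""
    obs = ET.fromstring(xml_text)
    code = obs.find("code")
    value = obs.find("value")

    subject = DATA + obs.get("id")
    # The code selects a class; it is not carried over as a hasCode relation.
    concept = CODE_TO_CLASS[(code.get("codeSystem"), code.get("code"))]
    magnitude = value.get("value")

    return [
        f"<{subject}> <{RDF_TYPE}> <{concept}> .",
        f'<{subject}> <{ONT}hasValue> "{magnitude}"^^<{XSD_DECIMAL}> .',
    ]


for line in cda_to_triples(CDA_FRAGMENT):
    print(line)
```

The output is N-Triples rather than RDF/XML only to keep the sketch
short, and the unit attribute is ignored for the same reason.  Swapping
CODE_TO_CLASS for a different table would retarget the same documents to
another ontology without touching either model.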

Chimezie (chee-meh) Ogbuji
Lead Systems Analyst
Thoracic and Cardiovascular Surgery
Cleveland Clinic Foundation
9500 Euclid Avenue/ W26
Cleveland, Ohio 44195
Office: (216)444-8593
ogbujic@ccf.org 
________________________________

	From: jim herber [mailto:jimherber@gmail.com] 
	Sent: Wednesday, March 26, 2008 1:22 PM
	To: Booth, David (HP Software - Boston)
	Cc: Ogbuji, Chimezie; public-hcls-coi@w3.org; public-semweb-lifesci@w3.org
	Subject: Re: An argument for bridging information models and ontologies at the syntactic level
	
	
	Chimezie, excellent observation.  I agree with the principles you
	are articulating.
	
	I would add:
	
	1. Data models like schemas, structures, and data formats are
	   implementation details.
	2. Concept models operate at many levels.  As an example, concept
	   models may represent the entire data model as a concept, or they
	   may point at an element within a data model as a concept.
	3. Different concept models that are unrelated or loosely related
	   may reference the same data model.
	4. Keeping the two (data models and conceptual models) separate
	   allows them to evolve independently.
	5. Pulling out the mapping, rather than attempting to represent
	   both the mapping and the data model in a conceptual language,
	   follows a basic tenet of engineering: "loosely coupled modules
	   with highly cohesive functionality".
	
	David, do you like "data model to conceptual mapping" better?
	
	
	Jim Herber
	Independent Consultant
	jimherber_at_ gmail.com
	
	
	On Wed, Mar 26, 2008 at 11:47 AM, Booth, David (HP Software - Boston) <dbooth@hp.com> wrote:
	


		+1.  Except I find the term "syntactic mapping" somewhat
		misleading, because to my mind the anti-pattern you are
		describing involves the encoding of syntactic-level concerns
		into the ontology, which, as you point out, shouldn't be
		there.  So personally I would have been more inclined to
		call it "semantic mapping", but maybe someone else has a
		better idea.
		
		
		David Booth, Ph.D.
		HP Software
		+1 617 629 8881 office  |  dbooth@hp.com
		http://www.hp.com/go/software
		
		Opinions expressed herein are those of the author and do not
		represent the official views of HP unless explicitly stated
		otherwise.
		


		> -----Original Message-----
		> From: public-semweb-lifesci-request@w3.org
		> [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of
		> Ogbuji, Chimezie
		> Sent: Tuesday, March 25, 2008 9:07 PM
		> To: public-hcls-coi@w3.org; public-semweb-lifesci@w3.org
		> Subject: An argument for bridging information models and
		> ontologies at the syntactic level
		>
		> For some time I have had a concern about a theme in the more
		> common approaches to bridging information models and
		> ontologies as a path towards bringing the advantages of the
		> Semantic Web technologies to 'legacy' healthcare terminology
		> systems.
		>
		> I have wanted to speak on this topic for some time but have
		> hesitated, mostly because my thoughts were not fully baked
		> and (in addition) I thought this anti-pattern was an anomaly.
		> Today's conversation during the COI teleconference suggested
		> that I should speak up about it.
		>
		> To get right to the point: 1) I consider approaches that
		> attempt to perform this bridging directly between information
		> models and ontologies to be examples of this 'anti-pattern';
		> 2) I think that performing this bridging at the syntactic
		> level addresses the important problem of properly separating
		> these two in a way that emphasizes their strengths.
		>
		> I would like to offer an alternative viewpoint because I
		> think consensus on this particular topic is a significant
		> roadblock to a clear path for moving healthcare terminology
		> systems more towards formal knowledge representation (where
		> they need to be) in a way that doesn't do so at the expense
		> of the strengths of information models and conceptual models
		> ('models of meaning' or ontologies, etc.).
		>
		> Information models are better equipped to handle messaging,
		> data manipulation, validation, and document management (and
		> structured, controlled data entry) than most (I'd venture to
		> say 'all') formal knowledge representations, and knowledge
		> representations are better equipped to handle expressive
		> conceptualizations of the real world and inference.  Neither
		> should attempt to do the job of the other, and doing so
		> seems fundamentally problematic to me.
		>
		> In a perfect world, a messaging dialect (such as HL7 RIM or
		> even Atom for that matter) would be developed with a formal
		> conceptualization as part of its specification.  This
		> conceptualization would be captured in a formal knowledge
		> representation (such as some particular fragment of FOL, for
		> instance) as a way to reach consensus on the 'real world'
		> entities that the messages refer to.
		>
		> Such a conceptualization would re-use philosophical precedent
		> in categorizing these real world entities in a well
		> understood (and fairly rigorous) way.  This could bottom out
		> in an alignment with a particular (high fidelity) upper
		> ontology (Cyc, DOLCE, and BFO come to mind) and in fleshing
		> out specializations relevant to the particular domain
		> associated with the messages (healthcare in the case of HL7
		> RIM and "syndication of web content" in the case of Atom).
		>
		> Consensus on this formal, conceptual model would happen first
		> and then would soon be followed by a process for defining
		> what the syntax would look like (independent of what
		> instances of the syntax denote in the conceptual model).
		> This separation minimizes interference between concerns about
		> data structures and characteristics of the relevant
		> categories of real world entities that the data structures
		> represent.
		>
		> I consider this separation a good practice, and it is
		> (perhaps) no surprise that this is how most Semantic Web
		> knowledge representation dialects are formulated (OWL 1.1 and
		> RIF, for instance): first there is consensus on their
		> semantics, then there is a dialog about how the language is
		> serialized.  Even if they don't happen in that particular
		> order, they typically happen independently.
		>
		> Unfortunately, with regard to healthcare terminologies, we
		> have a situation where there is a large, well-deployed (or at
		> least widely adopted) information model for messaging that
		> was developed without a rigorous (formal) semantics but that
		> is fairly robust with respect to data structures, messaging,
		> syntax, and such.
		>
		> There are two ways to skin this cat, IMHO.  You can attempt
		> to capture both the information model and the
		> conceptualization (or ontology) in a formal knowledge
		> representation (which seems to be the more common approach).
		> Or you can leave the information model as it is and instead
		> map its (XML) serializations into a corresponding knowledge
		> representation serialization (RDF) that conforms to either a
		> pre-existing conceptual model of healthcare (expressed in
		> OWL) or one that was developed in order to formalize the
		> conceptualization of the real world implicitly referenced by
		> the information model.  In the latter case (where, for
		> example, a 'custom' model of meaning for HL7 RIM is developed
		> and expressed formally in OWL), I think it is incredibly
		> important that such a model does not inherit any notions of
		> data constructs, validation, etc., since the necessity of
		> this is completely removed by the syntactic mapping.
		>
		> There are many parallels between the question of how you deal
		> with HL7 in this way and questions that the GRDDL WG
		> discussed about how Atom syndication content (of which there
		> is plenty in the wild) could be mapped to RDF using a
		> syntactic transformation (which is all GRDDL really is when
		> you boil it down).  Would this involve reusing an already
		> existing ontology of web content (independent of Atom) as the
		> target RDF syntax, or would an ontology specifically crafted
		> for Atom (which inherits all the idiosyncrasies of Atom) be
		> adopted instead?
		>
		> In short, I think developing a syntactic mapping eliminates
		> the need to basically bastardize a knowledge representation
		> into doing what it was never designed to do (capture
		> structural, representational, and data-oriented constraints).
		> Leave that to the originating model (which, by all accounts,
		> has done that particular job quite well).  My conviction that
		> this is the better practice has been the main reason why most
		> of my attempts to demonstrate the value of aligning HL7 to
		> 'reference ontologies' for healthcare have been through the
		> use of syntactic mappings (via GRDDL, for instance) rather
		> than trying to bite off an unnecessarily large chunk by
		> capturing both an information model and a model of meaning in
		> a single framework.
		>
		> My $0.02 (and more)
		>
		> Chimezie (chee-meh) Ogbuji
		> Lead Systems Analyst
		> Thoracic and Cardiovascular Surgery
		> Cleveland Clinic Foundation
		> 9500 Euclid Avenue/ W26
		> Cleveland, Ohio 44195
		> Office: (216)444-8593
		> ogbujic@ccf.org
		>
		>
		>
		> Please consider the environment before printing this e-mail
		>
		> Cleveland Clinic is ranked one of the top hospitals in
		> America by U.S. News & World Report (2007).
		> Visit us online at http://www.clevelandclinic.org for a
		> complete listing of our services, staff and locations.
		>
		> Confidentiality Note:  This message is intended for use only
		> by the individual or entity to which it is addressed and may
		> contain information that is privileged, confidential, and
		> exempt from disclosure under applicable law.  If the reader
		> of this message is not the intended recipient or the employee
		> or agent responsible for delivering the message to the
		> intended recipient, you are hereby notified that any
		> dissemination, distribution or copying of this communication
		> is strictly prohibited.  If you have received this
		> communication in error, please contact the sender
		> immediately and destroy the material in its entirety, whether
		> electronic or hard copy.  Thank you.
		>
		>
		
		



Received on Wednesday, 26 March 2008 18:20:12 UTC