RE: An argument for bridging information models and ontologies at the syntactic level from Booth, David (HP Software - Boston) on 2008-03-26 (public-hcls-coi@w3.org from January to March 2008)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Wed, 26 Mar 2008 16:47:17 +0000
To: "Ogbuji, Chimezie" <OGBUJIC@ccf.org>, "public-hcls-coi@w3.org" <public-hcls-coi@w3.org>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Message-ID: <184112FE564ADF4F8F9C3FA01AE50009FCF1CBA2CB@G1W0486.americas.hpqcorp.net>
+1.  Except I find the term "syntactic mapping" somewhat misleading, because to my mind, the anti-pattern you are describing involves the encoding of syntactic-level concerns into the ontology, which as you point out, shouldn't be there.  So personally I would have been more inclined to call it "semantic mapping", but maybe someone else has a better idea.


David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.


> -----Original Message-----
> From: public-semweb-lifesci-request@w3.org
> [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of
> Ogbuji, Chimezie
> Sent: Tuesday, March 25, 2008 9:07 PM
> To: public-hcls-coi@w3.org; public-semweb-lifesci@w3.org
> Subject: An argument for bridging information models and
> ontologies at the syntactic level
>
> For some time I have had a concern about a theme in the more
> common approaches to bridging information models and
> ontologies as a path towards bringing the advantages of the
> Semantic Web technologies to 'legacy' healthcare terminology systems.
>
> I wanted to speak on this topic for some time but have
> hesitated mostly because my thoughts were not fully baked and
> (in addition) I thought this anti-pattern was an anomaly, but
> today's conversation during the COI teleconference suggested
> that I should speak up about it.
>
> To get right to the point, 1) I consider approaches that
> attempt to perform this bridging directly between information
> models and ontologies as examples of this 'anti-pattern.' 2)
> I think that performing this bridging at the syntactic level
> addresses the important problem of properly separating these
> two in a way that emphasizes their strengths.
>
> I would like to offer an alternative viewpoint because I
> think consensus on this particular topic is a significant
> roadblock to a clear path for moving healthcare terminology
> systems more towards formal knowledge representation (where
> they need to be) in a way that doesn't do so at the expense
> of the strengths of information models and conceptual models
> ('models of meaning' or ontologies, etc.).
>
> Information models are better equipped to handle messaging,
> data manipulation, validation, document management (and
> structured, controlled data entry) than most (I'd venture to
> say 'all') formal knowledge representations, and knowledge
> representations are better equipped to handle expressive
> conceptualizations of the real world and inference.  Neither
> should attempt to do the job of the other and doing so seems
> fundamentally problematic to me.
>
> In a perfect world, a messaging dialect (such as HL7 RIM or
> even Atom for that matter) would be developed with a formal
> conceptualization as part of its specification.  This
> conceptualization would be captured in a formal knowledge
> representation (such as some particular fragment of FOL, for
> instance) as a way to reach consensus on the 'real world'
> entities that the messages refer to.
>
> Such a conceptualization would re-use philosophical precedent
> in categorizing these real world entities in a well
> understood (and fairly rigorous) way.  This could bottom out
> in an alignment with a particular (high fidelity) upper
> ontology (Cyc, DOLCE, and BFO come to mind) and fleshing out
> specializations relevant to the particular domain associated
> with the messages (healthcare in the case of HL7 RIM and
> "syndication of web content" in the case of Atom).
>
> Consensus on this formal, conceptual model would happen first
> and then would soon be followed by a process for defining
> what the syntax would look like (independent of what
> instances of the syntax denote in the conceptual model).
> This separation minimizes interference between concerns about
> data structures and characteristics of the relevant
> categories of real world entities that the data structures represent.
>
> I consider this separation a good practice and it is
> (perhaps) no surprise that this is how most Semantic Web
> knowledge representation dialects are formulated (OWL 1.1 and
> RIF for instance): First there is consensus on their
> semantics then there is a dialog about how the language is
> serialized.  Even if they don't happen in that particular
> order they typically happen independently.
>
> Unfortunately, with regard to healthcare terminologies, we
> have a situation where there is a large, well-deployed (or at
> least widely adopted) information model for messaging that
> was developed without a rigorous (formal) semantics but that
> is fairly robust with respect to data structures, messaging,
> syntax, and such.
>
> There are two ways to skin this cat, IMHO.  You can attempt
> to capture both the information model as well as the
> conceptualization (or ontology) in a formal knowledge
> representation (which seems to be the more common approach).
> Or you can leave the information model as it is and instead
> map its (XML) serializations into a corresponding knowledge
> representation serialization (RDF) that conforms to either a
> pre-existing conceptual model of healthcare (expressed in
> OWL) or one that was developed in order to formalize the
> conceptualization of the real world implicitly referenced by
> the information model.  In the latter case (where, for
> example, a 'custom' model of meaning for HL7 RIM is developed
> and expressed formally in OWL) I think it is incredibly
> important that such a model does not inherit any notions of
> data constructs, validation, etc. since the necessity of this
> is completely removed by the syntactic mapping.
>
> There are many parallels between the question of how you deal
> with HL7 in this way and questions that the GRDDL WG
> discussed about how Atom syndication content (of which there
> is plenty in the wild) could be mapped to RDF using a
> syntactic transformation (which is all GRDDL really is when
> you boil it down).  Would this involve reusing an already
> existing ontology of web content (independent of Atom) as the
> target RDF syntax or would an ontology specifically crafted
> for Atom (which inherits all the idiosyncrasies of Atom) be
> adopted instead?
>
> In short, I think developing a syntactic mapping eliminates
> the need to basically bastardize a knowledge representation
> into doing what it was never designed to do (capture
> structural, representational, and data-oriented constraints).
> Leave that to the originating model (which, by all accounts,
> has done that particular job quite well).  My belief that
> this is a better practice has been the main reason why most
> of my attempts to demonstrate the value of aligning HL7 to
> 'reference ontologies' for healthcare have been through the
> use of syntactic mappings (via GRDDL for instance) rather than
> trying to bite off an unnecessarily large chunk of capturing
> both an information model and a model of meaning in a single
> framework.
>
> My $0.02 (and more)
>
> Chimezie (chee-meh) Ogbuji
> Lead Systems Analyst
> Thoracic and Cardiovascular Surgery
> Cleveland Clinic Foundation
> 9500 Euclid Avenue/ W26
> Cleveland, Ohio 44195
> Office: (216)444-8593
> ogbujic@ccf.org
>
>
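A minimal sketch of the kind of syntactic mapping the quoted message argues for: an XML message is left as it is, and a small transformation emits RDF (as N-Triples) whose vocabulary comes from a separate conceptual model.  Everything specific here is an assumption made for illustration: the XML message is invented and is not real HL7 syntax, the example.org namespaces are placeholders, and the ontology terms (Measurement, isAbout, measures, hasValue, hasUnit) are hypothetical.  The mapping is written in plain Python rather than as a GRDDL/XSLT transform.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; this is NOT real HL7 syntax.
MESSAGE = """<observation id="obs1">
  <patient ref="pat42"/>
  <code>systolic-bp</code>
  <value unit="mmHg">120</value>
</observation>"""

# Placeholder namespaces for instance data and for ontology terms.
DATA = "http://example.org/data/"
ONT = "http://example.org/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(base, local):
    """Format an IRI in N-Triples syntax."""
    return f"<{base}{local}>"


def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def map_observation(xml_text):
    """Map one <observation> element to a list of N-Triples lines.

    The target terms describe the real-world measurement, not the
    XML structure: nothing about elements, attributes, or validation
    is carried over into the RDF.
    """
    root = ET.fromstring(xml_text)
    subject = iri(DATA, root.attrib["id"])
    patient = root.find("patient")
    value = root.find("value")
    triples = [
        (subject, f"<{RDF_TYPE}>", iri(ONT, "Measurement")),
        (subject, iri(ONT, "isAbout"), iri(DATA, patient.attrib["ref"])),
        (subject, iri(ONT, "measures"), iri(ONT, root.findtext("code"))),
        (subject, iri(ONT, "hasValue"), literal(value.text)),
        (subject, iri(ONT, "hasUnit"), literal(value.attrib["unit"])),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    print("\n".join(map_observation(MESSAGE)))
```

In this sketch the structural concerns (which elements are required, what the attributes are called) stay with the XML side, while the target vocabulary only names the things the message is about.  Swapping in a pre-existing ontology instead of a custom one would change only the constants and property names in the mapping, not the message format.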
Received on Wednesday, 26 March 2008 16:49:05 UTC