Re: An argument for bridging information models and ontologies at the syntactic level from jim herber on 2008-03-26 (public-hcls-coi@w3.org from January to March 2008)

From: jim herber <jimherber@gmail.com>
Date: Wed, 26 Mar 2008 12:22:00 -0500
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: "Ogbuji, Chimezie" <OGBUJIC@ccf.org>, "public-hcls-coi@w3.org" <public-hcls-coi@w3.org>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Message-ID: <2a9999d90803261022k6efa372epa731918661217d49@mail.gmail.com>
Chimezie, excellent observation.  I agree with the principles you are
articulating.

I would add:

1. Data models like schemas, structures, and data formats are implementation
details.
2. Concept models operate at many levels.  As an example, concept models may
represent the entire data model as a concept, or they may point at an
element within a data model as a concept.
3. Different concept models that are unrelated or loosely related may
reference the same data model.
4. Keeping the two (data models and conceptual models) separate allows them
to evolve independently.
5. Pulling out the mapping, rather than attempting to represent both the
mapping and the data model in a conceptual language, fits a basic tenet of
engineering: "loosely coupled modules with highly cohesive functionality".
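
To make points 1-5 concrete, here is a minimal sketch in Python of keeping
the three pieces apart.  Everything named in it is an assumption made for
illustration: the XML element names are not HL7, the example.org URIs are
not a real ontology, and map_to_triples is a hypothetical helper.  It only
shows the shape of the idea, not a recommended implementation.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/onto#"  # hypothetical concept-model namespace

# Data model: an XML message. Element names are invented for illustration.
MESSAGE = """
<observation id="obs1">
  <templateId>layout-7</templateId>
  <code>heart-rate</code>
  <value>72</value>
</observation>
"""

# The mapping lives apart from both models: XML element name -> ontology term.
ELEMENT_TO_CLASS = {"observation": EX + "Observation"}
ELEMENT_TO_PROPERTY = {"code": EX + "hasCode", "value": EX + "hasValue"}


def map_to_triples(xml_text, base="http://example.org/data/"):
    """Return (subject, predicate, object) tuples for the mapped elements."""
    root = ET.fromstring(xml_text.strip())
    subject = base + root.get("id")
    # The root element as a whole is mapped to a concept (a class).
    triples = [(subject, RDF_TYPE, ELEMENT_TO_CLASS[root.tag])]
    for child in root:
        prop = ELEMENT_TO_PROPERTY.get(child.tag)
        if prop is not None:
            # Individual elements inside the data model map to properties.
            triples.append((subject, prop, (child.text or "").strip()))
        # Unmapped elements (templateId here) are structural details that
        # stay in the data model and never reach the concept model.
    return triples


if __name__ == "__main__":
    for triple in map_to_triples(MESSAGE):
        print(triple)
```

Because the two dictionaries are the only place where the data model and
the concept model meet, either side can change, or a second unrelated
concept model can be pointed at the same message, by editing the mapping
alone.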

David, do you like "data model to conceptual mapping" better?


Jim Herber
Independent Consultant
jimherber_at_ gmail.com

On Wed, Mar 26, 2008 at 11:47 AM, Booth, David (HP Software - Boston) <
dbooth@hp.com> wrote:

>
> +1.  Except I find the term "syntactic mapping" somewhat misleading,
> because to my mind, the anti-pattern you are describing involves the
> encoding of syntactic-level concerns into the ontology, which as you point
> out, shouldn't be there.  So personally I would have been more inclined to
> call it "semantic mapping", but maybe someone else has a better idea.
>
>
> David Booth, Ph.D.
> HP Software
> +1 617 629 8881 office  |  dbooth@hp.com
> http://www.hp.com/go/software
>
> Opinions expressed herein are those of the author and do not represent the
> official views of HP unless explicitly stated otherwise.
>
>
> > -----Original Message-----
> > From: public-semweb-lifesci-request@w3.org
> > [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of
> > Ogbuji, Chimezie
> > Sent: Tuesday, March 25, 2008 9:07 PM
> > To: public-hcls-coi@w3.org; public-semweb-lifesci@w3.org
> > Subject: An argument for bridging information models and
> > ontologies at the syntactic level
> >
> > For some time I have had a concern about a theme in the more
> > common approaches to bridging information models and
> > ontologies as a path towards bringing the advantages of the
> > Semantic Web technologies to 'legacy' healthcare terminology systems.
> >
> > I have wanted to speak on this topic for some time but have
> > hesitated mostly because my thoughts were not fully baked and
> > (in addition) I thought this anti-pattern was an anomaly, but
> > today's conversation during the COI teleconference suggested
> > that I should speak up about it.
> >
> > To get right to the point, 1) I consider approaches that
> > attempt to perform this bridging directly between information
> > models and ontologies as examples of this 'anti-pattern.' 2)
> > I think that performing this bridging at the syntactic level
> > addresses the important problem of properly separating these
> > two in a way that emphasizes their strengths.
> >
> > I would like to offer an alternative viewpoint because I
> > think the lack of consensus on this particular topic is a
> > significant roadblock to a clear path for moving healthcare
> > terminology systems towards formal knowledge representation
> > (where they need to be) without sacrificing the strengths of
> > information models and conceptual models ('models of meaning',
> > ontologies, etc.).
> >
> > Information models are better equipped to handle messaging,
> > data manipulation, validation, document management (and
> > structured, controlled data entry) than most (I'd venture to
> > say 'all') formal knowledge representations, and knowledge
> > representations are better equipped to handle expressive
> > conceptualizations of the real world and inference.  Neither
> > should attempt to do the job of the other and doing so seems
> > fundamentally problematic to me.
> >
> > In a perfect world, a messaging dialect (such as HL7 RIM or
> > even Atom for that matter) would be developed with a formal
> > conceptualization as part of its specification.  This
> > conceptualization would be captured in a formal knowledge
> > representation (such as some particular fragment of FOL, for
> > instance) as a way to reach consensus on the 'real world'
> > entities that the messages refer to.
> >
> > Such a conceptualization would re-use philosophical precedent
> > in categorizing these real world entities in a well
> > understood (and fairly rigorous) way.  This could bottom out
> > in an alignment with a particular (high fidelity) upper
> > ontology (Cyc, DOLCE, and BFO come to mind) and fleshing out
> > specializations relevant to the particular domain associated
> > with the messages (healthcare in the case of HL7 RIM and
> > "syndication of web content" in the case of Atom).
> >
> > Consensus on this formal, conceptual model would happen first
> > and then would soon be followed by a process for defining
> > what the syntax would look like (independent of what
> > instances of the syntax denote in the conceptual model).
> > This separation minimizes interference between concerns about
> > data structures and characteristics of the relevant
> > categories of real world entities that the data structures represent.
> >
> > I consider this separation a good practice and it is
> > (perhaps) no surprise that this is how most Semantic Web
> > knowledge representation dialects are formulated (OWL 1.1 and
> > RIF for instance): First there is consensus on their
> > semantics then there is a dialog about how the language is
> > serialized.  Even if they don't happen in that particular
> > order they typically happen independently.
> >
> > Unfortunately, with regard to healthcare terminologies, we
> > have a situation where there is a large, well-deployed (or at
> > least widely adopted) information model for messaging that
> > was developed without a rigorous (formal) semantics but that
> > is fairly robust with respect to data structures, messaging,
> > syntax, and such.
> >
> > There are two ways to skin this cat, IMHO.  You can attempt
> > to capture both the information model as well as the
> > conceptualization (or ontology) in a formal knowledge
> > representation (which seems to be the more common approach).
> > Or you can leave the information model as it is and instead
> > map its (XML) serializations into a corresponding knowledge
> > representation serialization (RDF) that conforms to either a
> > pre-existing conceptual model of healthcare (expressed in
> > OWL) or one that was developed in order to formalize the
> > conceptualization of the real world implicitly referenced by
> > the information model.  In the latter case (where, for
> > example, a 'custom' model of meaning for HL7 RIM is developed
> > and expressed formally in OWL) I think it is incredibly
> > important that such a model does not inherit any notions of
> > data constructs, validation, etc. since the necessity of this
> > is completely removed by the syntactic mapping.
> >
> > There are many parallels between the question of how you deal
> > with HL7 in this way and questions that the GRDDL WG
> > discussed about how Atom syndication content (for which there
> > is plenty in the wild) could be mapped to RDF using a
> > syntactic transformation (which is all GRDDL really is when
> > you boil it down).  Would this involve reusing an already
> > existing ontology of web content (independent of Atom) as the
> > target RDF syntax or would an ontology specifically crafted
> > for Atom (which inherits all the idiosyncrasies of Atom) be
> > adopted instead?
> >
> > In short, I think developing a syntactic mapping eliminates
> > the need to basically bastardize a knowledge representation
> > into doing what it was never designed to do (capture
> > structural, representational, and data-oriented constraints).
> > Leave that to the originating model (which, by all accounts,
> > has done that particular job quite well).  My conviction that
> > this is a better practice has been the main reason why most
> > of my attempts to demonstrate the value of aligning HL7 to
> > 'reference ontologies' for healthcare have been through the
> > use of syntactic mappings (via GRDDL for instance) rather than
> > through trying to bite off the unnecessarily large chunk of
> > capturing both an information model and a model of meaning in
> > a single framework.
> >
> > My $0.02 (and more)
> >
> > Chimezie (chee-meh) Ogbuji
> > Lead Systems Analyst
> > Thoracic and Cardiovascular Surgery
> > Cleveland Clinic Foundation
> > 9500 Euclid Avenue/ W26
> > Cleveland, Ohio 44195
> > Office: (216)444-8593
> > ogbujic@ccf.org
> >
>
>
Received on Thursday, 27 March 2008 16:35:09 UTC