Re: An argument for bridging information models and ontologies at the syntactic level from Alan Rector on 2008-04-27 (public-semweb-lifesci@w3.org from April 2008)

From: Alan Rector <rector@cs.man.ac.uk>
Date: Sun, 27 Apr 2008 14:58:44 +0100
To: "Kashyap, Vipul" <VKASHYAP1@PARTNERS.ORG>
Cc: <dan.russler@oracle.com>, "Ogbuji, Chimezie" <OGBUJIC@ccf.org>, "Dan Corwin" <dan@lexikos.com>, "Oniki, Tom (GE Healthcare, consultant)" <Tom.Oniki@ge.com>, "Samson Tu" <swt@stanford.edu>, <public-semweb-lifesci@w3.org>, <public-hcls-coi@w3.org>
Message-Id: <C69E6AA2-4A96-42C7-8597-6064704060CE@cs.man.ac.uk>
All

I am coming in a bit late on this, but I have two points:

A)   I'd like to suggest that there are two, largely orthogonal,  
dimensions (at least) being conflated:

i)	The evidence trail or "provenance" of information and our  
consequent degree of belief/willingness to rely on that information

ii)	The information to be transmitted - and what information is  
potentially available on each step of the process.

The informational part (ii) is sketched from a slightly different  
point of view in my Medinfo 07 paper (please use the corrected version)
http://www.cs.man.ac.uk/%7Erector/papers/Whats-in-a-code/Whats-in-a-code-rector-corrected.pdf

and in the paper that I shall present at KR-MED 2008 in June, and is
currently being refereed for JAMIA. (Our preprint server is still
under construction, but I am happy to share the manuscript with
interested individuals. Hopefully it will be on the server by June.)

Although the fundamental problem is reasoning with clinical  
information, the immediate problem for clinical systems is the  
information itself, so I shall concentrate on that here.

My main contention is that the things that we put in medical records
represent statements "ascribing" (or "not ascribing") characteristics
and relationships to patients - i.e. we are saying that the patient
"_has_ a white count = 10,000" or that the patient "_has_ Diabetes".
(For diabetes we may also say that the patient "_does_not_have_
diabetes".)

Whether we enter the coded form in the medical record for "WBC=10,000"  
or whether we enter "Diabetes" we are ascribing that condition to the  
patient (at a given time, place, etc.)

We may be basing this information on information about a sample, an  
artifact (e.g. a Radiology study), a direct observation, or a  
diagnostic inference.  In each case there is a degree of inference -  
indicated by the fact that most information has to be "approved"  
before it gets into the record.

i) concerns the chain of evidence, long or short. Our systems
sometimes conflate the measurement and the statement of belief based
on that measurement (the "ascription"). However, when we come to
reason about them, the reasoning is very different. If we infer that
the patient has an elevated potassium we do something; if we think the
sample has been haemolysed we do something else. But no person "has a
haemolysed K+", although they may be the source from which "a
haemolysed sample" was taken on which a measurement of K+ was performed.
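
To make the distinction concrete, here is a minimal sketch in Python. It is an illustration only: the class names, the function, and the 5.0 mmol/L threshold are assumptions made for the example, not taken from any of the papers or systems mentioned here. The measurement is a fact about the sample; the ascription is a separate statement of belief about the patient that keeps a link back to its evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SampleMeasurement:
    """A measurement performed on a sample; says nothing yet about the patient."""
    analyte: str
    value: float
    units: str
    sample_haemolysed: bool


@dataclass(frozen=True)
class Ascription:
    """A statement of belief that a patient has some characteristic."""
    patient_id: str
    characteristic: str
    evidence: SampleMeasurement  # the link back along the evidence chain


def ascribe_potassium(patient_id: str, m: SampleMeasurement,
                      upper_limit: float = 5.0) -> Optional[Ascription]:
    """Ascribe 'elevated potassium' only if the evidence supports it.

    upper_limit is an illustrative threshold, not a clinical recommendation.
    """
    if m.sample_haemolysed:
        # Haemolysis is a property of the sample, so nothing is ascribed
        # to the patient; a real system would request a repeat sample.
        return None
    if m.value > upper_limit:
        return Ascription(patient_id, "elevated potassium", m)
    return None


clean = SampleMeasurement("K+", 6.1, "mmol/L", sample_haemolysed=False)
spoiled = SampleMeasurement("K+", 6.1, "mmol/L", sample_haemolysed=True)

from_clean = ascribe_potassium("patient-1", clean)
from_spoiled = ascribe_potassium("patient-1", spoiled)
```

The same numeric value leads to an ascription in one case and to none in the other, which is the sense in which the two kinds of reasoning differ.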

ii) concerns what statements can convey information.  Since our
background information model (sometimes oddly called an "ontology")
says that all people at all times have a white count, there is no
point in saying "The patient has a white count" (although there is a
point in saying: "the patient has had a white count performed").

All patients at all times have white counts; we may just be ignorant
of them. Therefore, simply saying that somebody _has_ a white count
tells us nothing we don't know already and does not differentiate
them from other patients.  It conveys no information. To convey
information we have to say something about the white count, usually
its numerical value.

By contrast, "Diabetes" and "Cardiac Murmur" are both things that only
some people have only some of the time.  Simply to say that a patient
_has_ them conveys information, because we don't know it already and
it does differentiate them from other patients, or from the same
patient at different times or as observed by different observers.
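
The contrast can be sketched as a small check, assuming a hypothetical background model that lists the characteristics every patient always has. The set contents and the function name are invented for the illustration.

```python
# Hypothetical background model: characteristics that all patients
# have at all times, whether or not we have measured them.
ALWAYS_PRESENT = {"white count", "serum potassium"}


def conveys_information(characteristic: str, value=None) -> bool:
    """Return True if stating that a patient has `characteristic` tells us
    something the background model does not already entail."""
    if characteristic in ALWAYS_PRESENT:
        # "has a white count" is already known; only a value adds anything.
        return value is not None
    # Things only some patients have some of the time (e.g. diabetes):
    # the bare statement already differentiates the patient.
    return True


bare_wbc = conveys_information("white count")
valued_wbc = conveys_information("white count", value=10000)
bare_diabetes = conveys_information("diabetes")
```

Here only the bare white count statement comes out as uninformative.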

We tend to use the label "Situation" for the entity that represents a
patient at a time as observed by an observer (who records their
information) and "includes" as the property, so the appropriate
level for transforming between ontologies, codes, and information
models must take this into account.

Note that "having diabetes" is different from "diabetes".  There is
different information to be conveyed about "diabetes" and about
"having diabetes" (or more precisely, "situations having diabetes",
or in our usual notation, Situation THAT includes Diabetes).
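
As a rough sketch of the distinction, the condition and the situation that includes it can be kept as separate kinds of object. The Python below is an illustration under assumed names (Condition, Situation, situations_that_include); it is not the notation used in the papers and says nothing about how a description logic would classify these.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Set


@dataclass(frozen=True)
class Condition:
    """The condition itself, e.g. Diabetes; facts about the disease go here."""
    name: str


@dataclass
class Situation:
    """A patient at a time as observed by an observer."""
    patient_id: str
    time: datetime
    observer: str
    includes: Set[Condition] = field(default_factory=set)


def situations_that_include(situations: List[Situation],
                            condition: Condition) -> List[Situation]:
    """Rough analogue of 'Situation THAT includes <condition>'."""
    return [s for s in situations if condition in s.includes]


diabetes = Condition("Diabetes")
seen_with = Situation("patient-1", datetime(2008, 4, 27), "observer-A", {diabetes})
seen_without = Situation("patient-2", datetime(2008, 4, 27), "observer-A")

matching = situations_that_include([seen_with, seen_without], diabetes)
```

Information about diabetes attaches to the Condition; information about having diabetes (who, when, according to whom) attaches to the Situation.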

This approach deliberately makes it possible to get the equivalence
between a finding

	"_has_ WBC >= 10,000"

and what SNOMED has traditionally called an "observable",
"'_has_ WBC' >= 10,000", expressed as a test and value (range).
It also allows us to say of the same WBC that it is considered to be
"elevated".

The evidence chain for the statement that the WBC is elevated goes  
back to the statement about the WBC being above 10,000 which in turn  
goes back to the lab test etc.
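
One way to picture such a chain is as statements that each point back to the statement they are based on. This is a minimal sketch with hypothetical names (Statement, based_on, evidence_chain); the wording of the example statements is invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    text: str
    based_on: Optional["Statement"] = None  # previous link in the evidence chain


def evidence_chain(statement: Statement) -> List[str]:
    """Walk back from a statement to the evidence it ultimately rests on."""
    chain = []
    current: Optional[Statement] = statement
    while current is not None:
        chain.append(current.text)
        current = current.based_on
    return chain


lab_test = Statement("lab test: white count performed on sample")
above_threshold = Statement("patient has WBC >= 10,000", based_on=lab_test)
elevated = Statement("patient's WBC is elevated", based_on=above_threshold)

chain = evidence_chain(elevated)
```

Walking the chain from the "elevated" statement yields the three statements in order, ending at the lab test.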

B) There is different information to be conveyed about the entity that  
is being tested for - e.g. WBC - and the method of testing. Therefore  
it makes sense for there to be separate entities for them at some  
level in our modelling. (You can order a test, you can't order  
somebody to have a WBC).  In the same way, the test result is clearly  
different from the statement that it is valid for the patient.  We may  
often elide these differences and encapsulate two or more entities for
purposes of a more efficient information system and/or a more  
computationally tractable logical model, but they are real. We should  
be clear when we are deliberately eliding different entities.
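
A sketch of what "being clear" could look like: keep the entities separate in the model and make the elision an explicit, named step. All class and function names below are hypothetical, and the units string is only illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analyte:
    """The thing being tested for, e.g. WBC."""
    name: str


@dataclass(frozen=True)
class LabMethod:
    """The method of testing; this is what can be ordered."""
    name: str
    measures: Analyte


@dataclass(frozen=True)
class LabResult:
    """The result produced by running a method on a sample."""
    method: LabMethod
    value: float
    units: str


@dataclass(frozen=True)
class PatientStatement:
    """The statement that a result is valid for a given patient."""
    patient_id: str
    result: LabResult


def elide(statement: PatientStatement) -> dict:
    """Deliberately collapse the separate entities into one flat record."""
    r = statement.result
    return {
        "patient_id": statement.patient_id,
        "analyte": r.method.measures.name,
        "value": r.value,
        "units": r.units,  # the method is dropped here on purpose
    }


wbc = Analyte("WBC")
count = LabMethod("automated white cell count", measures=wbc)
stated = PatientStatement("patient-1", LabResult(count, 10000.0, "cells/uL"))
flat = elide(stated)
```

The flat record is convenient for an information system, but the model still records that four distinct entities stood behind it.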

I hope this is a useful intrusion.

Regards

Alan


-----------------------
Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
www.cs.man.ac.uk/mig
www.clinical-esciences.org
www.co-ode.org




On 16 Apr 2008, at 20:16, Kashyap, Vipul wrote:

>
> Ogbuji, Chimezie wrote:
>> Dan,
>>
>> I'm very familiar with the SOAP model.  The primary motivation for
>> my questions about assessment had more to do with distinguishing an  
>> action from data that is derived from it.  This speaks directly to  
>> the problem of the 'anti-pattern' where ontologies for medical  
>> records are built *directly* from models that were designed with  
>> data-level concerns in mind and thus semantically inconsistent (so  
>> called "information models").
>>
>> The sense of assessment as used in this paper suggests that an  
>> assessment is data (and thus appropriate to consider a diagnosis),  
>> but consider that there are other senses of the word and one in  
>> particular is "the act of judging or assessing a person or  
>> situation or event".  In the latter case, an assessment refers to  
>> the act.  I was simply trying to tease out which of these Tom had
>> in mind.
>>
> <danR> It is true that in traditional lab department systems, the  
> 'data from the assessment' was modeled separately from the  
> 'assessment action.'  This is not exactly "wrong." However, it was  
> noted that one cannot deliver a "numeric result" without restating  
> the action that generated the result, e.g. serum WBC is the action  
> and serum WBC of 10,000 WBcells/ml is the result. In physical  
> sciences, it is considered good practice to always include the  
> methodology of the action when describing the data. Accordingly, it  
> is best practice in the science of healthcare to also report on the  
> nature of the action itself at the same time as reporting on the  
> data derived from the action.
>
> [VK] It may be the case that one can model key properties that can  
> enable the accurate assessment of the action.
> For instance, one could model things like the property being  
> assessed, who is doing the assessment, the qualifiers of the  
> assessment, etc.
> The CEM approach followed by IHC seems to adopt this strategy. From
> what I can see, there doesn't appear to be a need to model all the
> aspects of an action.
>
> On the other hand, if there is indeed a need for more contextual  
> information related to the action of performing the assessment, it  
> is probably a good idea to
> model these two things separately and then link them via
> appropriate relationships modeling the context, but this is likely
> to happen in an application specific manner.
>
> Cheers,
>
> ---Vipul
>
>
Received on Sunday, 27 April 2008 13:59:23 UTC