Re: [BIONT-DSE] Inclusion versus exclusion criteria from Kavitha Srinivas on 2007-09-12 (public-semweb-lifesci@w3.org from September 2007)

From: Kavitha Srinivas <ksrinivs@gmail.com>
Date: Wed, 12 Sep 2007 13:37:50 -0400
To: "Kashyap, Vipul" <VKASHYAP1@PARTNERS.ORG>
Cc: <wangxiao@musc.edu>, "Alan Ruttenberg" <alanruttenberg@gmail.com>, "Andersson, Bo H" <Bo.H.Andersson@astrazeneca.com>, "Landen Bain" <lbain@topsailtech.com>, "Rachel Richesson" <Rachel.Richesson@epi.usf.edu>, "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>, <public-hcls-dse@w3.org>, "Stanley Huff" <Stan.Huff@intermountainmail.org>, "Yan Heras" <Yan.Heras@intermountainmail.org>, "Oniki, Tom (GE Healthcare, consultant)" <Tom.Oniki@ge.com>, "Joey Coyle" <joey@xcoyle.com>, "Bron W. Kisler" <bkisler@earthlink.net>, "Ida Sim" <sim@medicine.ucsf.edu>
Message-Id: <EE330C5D-A5B5-4E79-8299-C4B401ADB42D@gmail.com>

> [VK] It will be great if you could share specific examples of some criteria
> that were not expressible in SQL. We can then incorporate those into the use
> case and help make a case for SW technologies. On the other hand, taking a
> quick look at the SHER project at IBM, looks like you are using a polynomial
> time reasoner (CEL) for the matching. I may be mistaken, but my initial sense
> is that any CEL expression is likely expressed in SQL/Relational Algebra or
> vice versa.
Just a quick correction -- the SHER reasoner is different from the  
CEL reasoner, because it is built on
the standard tableau algorithm (internally SHER uses Pellet).  It  
supports the SHIN subset of DL
(in OWL DL terms, no nominals).

So for instance, SHER handles cardinality constraints, which can change the
nature of the graph that is stored in the relational DB.
E.g., a R b (a has an R relationship to b)
and a R c, where R has a maximum cardinality of 1.
Let's say b has a P edge to d.  The reasoner will merge b and c into
the same node in the graph.
Let's say you now want to know if c has a P edge to something.  A
simple SQL query will not be able to find this edge, because the edge
exists only as a result of the merger that happened in the process of
reasoning.  That's just a general example.
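
A rough sketch of the merge example above, in Python with sqlite3. This is
only an illustration, not how SHER or Pellet work internally: the table
name `edge`, its columns, and all helper functions are hypothetical, the
merge is a single pass rather than a full fixpoint, and it assumes b and c
are not asserted to be different individuals.

```python
import sqlite3

# Asserted edges from the example: a R b, a R c, b P d.
EDGES = [("a", "R", "b"), ("a", "R", "c"), ("b", "P", "d")]


def build_db(edges):
    """Store the asserted edges in an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", edges)
    return conn


def p_targets_sql(conn, node):
    """Plain SQL lookup: sees only the edges that were asserted."""
    rows = conn.execute(
        "SELECT o FROM edge WHERE s = ? AND p = 'P'", (node,)
    ).fetchall()
    return {o for (o,) in rows}


def merge_for_max_card_one(edges, prop):
    """Merge all fillers of `prop` per subject, as max cardinality 1 forces.

    Returns a `find` function mapping a node to its merged representative.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    first_filler = {}
    for s, p, o in edges:
        if p != prop:
            continue
        s_rep = find(s)
        if s_rep in first_filler:
            # Second filler for the same subject: identify it with the first.
            parent[find(o)] = find(first_filler[s_rep])
        else:
            first_filler[s_rep] = o
    return find


def p_targets_after_merge(edges, find, node):
    """P edges of `node`, looked up through the merged representatives."""
    rep = find(node)
    return {find(o) for s, p, o in edges if p == "P" and find(s) == rep}


conn = build_db(EDGES)
find = merge_for_max_card_one(EDGES, "R")
print(p_targets_sql(conn, "c"))                 # nothing asserted for c
print(p_targets_after_merge(EDGES, find, "c"))  # edge inherited from b
```

The plain SQL lookup for c comes back empty, while the lookup over merged
nodes finds the P edge to d.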

In the clinical trials data, we model negations in the lab data
(e.g., lab results ruled out the presence of an organism, A) as
saying that for this particular lab event, any causative agent it
might have cannot be A.  In DL terms, this is a universal restriction
that propagates a concept (not A) along the causativeAgent edge.  If
you now want to find a lab event which indicated the presence of some
agent (X) and not A, you will again miss things using SQL, because all
you will have in the actual database is that the lab event has a
causativeAgent X, and that the lab event is a member of the universal
restriction forAll.causativeAgent.not(A).  One might argue that you can
do syntactic checks on it etc., but it gets hairy quite fast when you
consider that the negation may be on a concept that is itself a
complex concept (e.g., a radiological report ruled out the presence
of a colon neoplasm).
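
A minimal sketch of the propagation described above, in plain Python. All
names here (`lab1`, `agent1`, the string "not(A)", the helper functions)
are made up for illustration, and the negated concept is treated as an
opaque atomic label. A negation over a complex concept would need real
subsumption reasoning, which this sketch does not attempt.

```python
def propagate_universals(types, edges, restrictions):
    """Apply forAll restrictions: push the filler concept onto every successor.

    types:        node -> set of asserted concept labels
    edges:        list of (subject, property, object)
    restrictions: list of (node, property, filler), meaning that
                  node is a member of forAll.property.filler
    """
    inferred = {node: set(labels) for node, labels in types.items()}
    for node, prop, filler in restrictions:
        for s, p, o in edges:
            if s == node and p == prop:
                inferred.setdefault(o, set()).add(filler)
    return inferred


def events_with_agent(types, edges, required):
    """Events whose causativeAgent carries every label in `required`."""
    return {
        s
        for s, p, o in edges
        if p == "causativeAgent" and required <= types.get(o, set())
    }


# What the database holds: the event, its agent typed X, and the restriction.
asserted_types = {"lab1": {"LabEvent"}, "agent1": {"X"}}
edges = [("lab1", "causativeAgent", "agent1")]
restrictions = [("lab1", "causativeAgent", "not(A)")]

wanted = {"X", "not(A)"}
print(events_with_agent(asserted_types, edges, wanted))  # asserted data only

inferred_types = propagate_universals(asserted_types, edges, restrictions)
print(events_with_agent(inferred_types, edges, wanted))  # after propagation
```

Querying the asserted data alone misses lab1, because nothing in the
database says that agent1 is "not A"; that fact only appears once the
restriction has been propagated along the causativeAgent edge.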

Hope this helps?
Kavitha

On Sep 12, 2007, at 11:30 AM, Kashyap, Vipul wrote:

>
>> However, if someone is not explicitly asserted to be on
>> some prescription drug, it is fair to assume that they are not taking
>> the drug (closed world assumption).
>
> [VK] The key issue is how well this assumption is likely to work in
> practice. Guess we need some experimentation to get at this.
>
>> 2.  I tend to think this comes from an understanding of the domain
>> (unfortunately), and what you are modeling rather than the data
>> characteristics per se.
>
> [VK] I agree that whether you need to use OWA/CWA comes from an
> understanding of the domain. However, sometimes it could also be an
> artifact of the data representation scheme. For instance, in Chintan's
> example above, one could have negative assertions for drugs, i.e., patient
> not on drug X, in which case one would use OWA instead of CWA.
>
>> In terms of whether you can do this using SQL querying
>> alone, based on our experience, it's unlikely.  The problem is that
>> the types of clinical exclusion and inclusion criteria we saw on
>> clinicalTrials.gov cannot be easily reduced to SQL querying (at least
>> with the structured medical records we got from Columbia).  From
>> discussions with other institutions, we know this isn't unique to
>> Columbia (i.e., there is a substantial "semantic gap" between what's
>> in the structured record and what is being queried by investigators
>> for clinical trials).
>> this information.
>
> [VK] It will be great if you could share specific examples of some criteria
> that were not expressible in SQL. We can then incorporate those into the use
> case and help make a case for SW technologies. On the other hand, taking a
> quick look at the SHER project at IBM, looks like you are using a polynomial
> time reasoner (CEL) for the matching. I may be mistaken, but my initial sense
> is that any CEL expression is likely expressed in SQL/Relational Algebra or
> vice versa.
>
> ---Vipul

Received on Wednesday, 12 September 2007 17:38:02 UTC