- From: Chimezie Ogbuji <ogbujic@ccf.org>
- Date: Wed, 12 Sep 2007 08:52:00 -0400
- To: wangxiao@musc.edu
- cc: public-semweb-lifesci@w3.org, "Alan Ruttenberg" <alanruttenberg@gmail.com>, "Vipul Kashyap" <VKASHYAP1@partners.org>, "Andersson, Bo H" <Bo.H.Andersson@astrazeneca.com>, "Landen Bain" <lbain@topsailtech.com>, "Rachel Richesson" <Rachel.Richesson@epi.usf.edu>, public-hcls-dse@w3.org, "Stanley Huff" <Stan.Huff@intermountainmail.org>, "Yan Heras" <Yan.Heras@intermountainmail.org>, "Oniki, Tom (GE Healthcare, consultant)" <Tom.Oniki@ge.com>, "Joey Coyle" <joey@xcoyle.com>, "Bron W. Kisler" <bkisler@earthlink.net>, "Ida Sim" <sim@medicine.ucsf.edu>
On Wed, 2007-09-12 at 09:31 +0100, Xiaoshu Wang wrote:
> You SHOULD not choose and you have to use open world reasoning because
> how someone can tell which part of the world is closed and which part is
> not.

Sorry, Xiaoshu, but I don't agree that you *have* to use open world reasoning. That suggests an underlying assumption that medical record content (or bioinformatic content in general) is a natural fit for monotonic logics such as OWL. This is not entirely the case, and as we know medical record content is often inconsistent.

The choice isn't binary. You *can* store your data as RDF, perform monotonic inference (and query matching) where it makes sense to *and* perform non-monotonic inference (i.e., default negation/negation as failure/closed world assumptions) where it makes sense to as well.

Neither of the RDF query languages I am familiar with (Versa and SPARQL, both of which I use on a daily basis to match against medical record datasets) requires entailment. Default negation (closed world assumptions) only comes into play where the query asks for the absence of an assertion and where logical entailment is understood to apply. In SPARQL, the combined use of FILTER/!/BOUND effectively gives you a mechanism for matching records with non-monotonic mechanisms without an entailment regime. This is how we are able to *explicitly* ask for the absence of an assertion based only on what the RDF dataset has in persistence.

> > For example, in pharmacy data, if the patient record does not mention
> > a drug, we can be reasonably sure that the patient is not on that drug
> > -- a case for closed world reasoning, whereas for other datasets such
> > as lab or radiology, often things are explicitly asserted to be
> > negative if not present, for example, negative MRSA results, hence
> > requiring an open world reasoning approach.
> Let's use you example. According to your logic, if someone says that
>
> _:someone a pha:Patient;
>     pha:medicine pha:aspirin.
>
> It triggers a closed world reasoning so that no more properties exist.

This is not true. In a CWA scenario, where you don't have an assertion P(X,Y), you imply ~P(X,Y), where it is understood that '~' is a different operator than the 'classic' *not* operator used in OWL and monotonic logics.

I think a proper definition of scoped negation as failure would help show how SPARQL can be used to match the absence of an assertion against an RDF dataset that can also be subject to open world assumptions at the same time:

[[
Related to the notion of scoped inference is an extension of the concept of default negation, called scoped default negation. The idea is that the default negation inference rule must also be performed within the scope of an explicitly specified knowledge base. That is, not q is said to be true with respect to a knowledge base K if q is not derivable from K.
]] -- "A Realistic Architecture for the Semantic Web" [1]

In the case of SPARQL, the RDF dataset is the knowledge base. So, you can have an open world 'view' on the triples above while at the same time asking questions such as:

  SELECT ?patient
  WHERE {
    ?patient a pha:Patient.
    OPTIONAL {
      ?patient pha:medicine pha:betablocker;
               pha:medicalRecordNumber ?no
    }
    FILTER (!BOUND(?no))
  }

to explicitly match the absence of an assertion without having to do convoluted things like introducing an epistemic operator to OWL, etc.

I think the most important first question is whether entailment is necessary at the point of query. If it isn't, then you don't necessarily have an OWA/CWA conflict. We've been able to get pretty decent mileage out of entailment-free SPARQL evaluation. However, this does not prevent us from 1) performing monotonic DL-inference and 2) using rules where the expressiveness of OWL is insufficient over the *same* dataset.

> But do you mean that _:someone does not have a birthday or doesn't have
> a name either? I sincerely doubt that is what you want.

This is an unfair characterization.
In most non-monotonic KBs, you explicitly assert what you know (and what is relevant for the questions you will most likely ask) and leave the derivation of negation to the default negation rule. So, at the point of query, you will have an idea of what is explicitly asserted. The open world assumption works in a web environment where you may not know what is explicitly asserted, but I don't think medical records (especially curated medical records) should be thought of in the same way. They are typically populated via very controlled mechanisms and are subject to various policies over the nature of the content (especially where the data feeds research).

> If you want to imply specifically that there is no more pha:medicine,
> you should design your ontology accordingly. For instance, making the
> pha:medicine to range over an rdf:List. Or design another property
> say,

... or use a closed, value partition [2]

> pha:numOfMedicine and uses rules to suggest that the numOfMedicine must
> be consistent with the pha:medicine applied to a given person. But do
> not embed closed world reasoning into your ontology. Otherwise, you
> break the foundation of RDF.

The suggestion (at least mine) isn't to 'embed' CWAs in an ontology (in fact this is not possible given the nature of OWL), but rather to allow a scenario where you can use either assumption when appropriate. The tools we have at our disposal allow us to have our cake and eat it too.

[1] ftp://ftp.cs.sunysb.edu/pub/TechReports/kifer/msa-ruleml05.pdf
[2] http://www.w3.org/TR/swbp-specified-values/

-- 
Chimezie Ogbuji
Lead Systems Analyst
Thoracic and Cardiovascular Surgery
Cleveland Clinic Foundation
9500 Euclid Avenue/ W26
Cleveland, Ohio 44195
Office: (216)444-8593
ogbujic@ccf.org
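As a rough illustration of the scoped default negation discussed in the message above, here is a minimal Python sketch. It is not taken from the thread: the toy dataset, the function names (`holds`, `scoped_naf`, `patients_without`) and the blank node labels are all hypothetical, and "derivable from K" is simplified to "explicitly asserted in K", i.e. no entailment regime is applied.

```python
# Toy knowledge base K: a set of (subject, predicate, object) triples.
# All data here is made up; the pha: names echo the example in the thread.
K = {
    ("_:p1", "rdf:type", "pha:Patient"),
    ("_:p1", "pha:medicine", "pha:aspirin"),
    ("_:p2", "rdf:type", "pha:Patient"),
    ("_:p2", "pha:medicine", "pha:betablocker"),
}


def holds(kb, s, p, o):
    """True if the triple is explicitly asserted in kb (assumption: no entailment)."""
    return (s, p, o) in kb


def scoped_naf(kb, s, p, o):
    """Scoped default negation: 'not q' is true with respect to kb if q is not found in kb."""
    return not holds(kb, s, p, o)


def patients_without(kb, medicine):
    """Patients in kb for whom no pha:medicine assertion with this value exists."""
    patients = {s for (s, p, o) in kb if p == "rdf:type" and o == "pha:Patient"}
    return sorted(s for s in patients if scoped_naf(kb, s, "pha:medicine", medicine))


print(patients_without(K, "pha:betablocker"))   # ['_:p1']

# Non-monotonic behaviour: adding an assertion retracts an earlier answer.
K2 = K | {("_:p1", "pha:medicine", "pha:betablocker")}
print(patients_without(K2, "pha:betablocker"))  # []
```

The negation is scoped to whichever set of triples is passed in, so the same data can still be handed to a monotonic reasoner elsewhere; the closed world conclusion is drawn only at the point of query.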
Received on Wednesday, 12 September 2007 12:52:37 UTC