Benefits of RDF/SPARQL [Was Re: Using RDF, Datalog, OWL-RL and a "RIM lite ERPA Ontology" to calculate HEDIS Quality measures] from David Booth on 2015-09-30 (public-semweb-lifesci@w3.org from September 2015)

From: David Booth <david@dbooth.org>
Date: Wed, 30 Sep 2015 10:49:37 -0400
To: Peter.Hendler@kp.org
Cc: its@lists.HL7.org, public-semweb-lifesci@w3.org
Message-ID: <560BF681.7070002@dbooth.org>

Excellent!   Let's talk offline about scheduling.

Also, a question: what benefits did you experience in using an 
RDF/SPARQL approach as opposed to a relational/SQL approach?  (Playing 
devil's advocate) Wouldn't the query that you described have been 
relatively simple in SQL?   If not, why not?

Thanks,
David Booth

On 09/29/2015 11:30 PM, Peter.Hendler@kp.org wrote:
> Even better.  I can just present it to you on a webex or another kind of
> call.
>
> From: David Booth <david@dbooth.org>
> To: Peter Hendler/CA/KAIPERM@KAIPERM
> Cc: its@lists.HL7.org, public-semweb-lifesci@w3.org
> Date: 09/29/2015 03:15 PM
> Subject: Re: Using RDF, Datalog, OWL-RL and a "RIM lite ERPA Ontology"
> to  calculate HEDIS Quality measures
> ------------------------------------------------------------------------
>
> Excellent!   Unfortunately I will miss your presentation.  Can we get
> your slides?  Even better, if your presentation is recorded that would
> be awesome.
>
> Thanks,
> David Booth
>
> On 09/29/2015 05:38 PM, Peter.Hendler@kp.org wrote:
>  > At KP, working with Ian Horrocks's group at Oxford, we have been
>  > experimenting with their new RDF, Datalog, OWL-RL triple store called
>  > "RDFox".
>  >
>  > We have calculated the HEDIS Diabetes quality measure on real data
>  > from a population of over 400,000 patients.
>  >
>  > We still have to compare our numerators and denominators to results
>  > calculated with SQL and traditional DB tables.
>  >
>  > I will be presenting a very simple version of this at HL7 at the AID
>  > work group in Atlanta on Monday Q3.
>  >
>  > I believe this is the first time a complex HEDIS quality measure has
>  > been calculated with RDF, OWL, Datalog, and SPARQL on a large
>  > population of real patients.
>  >
>  > I will not be presenting the complete complex HEDIS measure (which would
>  > take days), but a smaller example to explain how it all works.
>  >
>  > We used SNOMED subsumption to generate a small value set (VS) of SNOMED
>  > codes that are "kinds of Diabetes".  Using that SNOMED VS, we found all
>  > the patients who had a visit coded for Diabetes.  Then we searched all
>  > of their HbA1c values and found the "last value" for each patient.  We
>  > could then look at the numerical HbA1c results and count how many of
>  > them were below 7% (good control).
>  >
>  > In order to do this we had previously created an OWL ontology based on
>  > Entities in Roles that Participate in Acts.  It is not the full HL7 V3
>  > RIM, but only what was needed for this exercise.
>  > This "KCOM" model is what we presented before at HL7 AID meetings.
>  > This entire project would not have been possible without first
>  > mapping the raw clinical data to this ERPA OWL backbone ontology.  All
>  > of our queries were based on this ERPA (Entities in Roles Participating
>  > in Acts) model.
>  >
>  > RDFox is multithreaded, and we were able to run the data
>  > materialization on 8 threads on an 8-core machine with 64 GB of RAM.
>  > It ran in only a few hours, and we have already found ways to speed it
>  > up further.
>  >
>  > Hope to see you at HL7 Atlanta.
>
>
>
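
For readers following the thread, the measure logic described in the quoted message (value set by subsumption, patients with a Diabetes-coded visit, last HbA1c per patient, count below 7%) can be sketched in a few lines of Python. This is only an illustrative sketch, not the KP/RDFox implementation: the concept names, the `IS_A`, `VISITS` and `HBA1C` toy data, and the function names are all hypothetical, and the choice to keep patients with no HbA1c result in the denominator is an assumption, not something stated in the thread.

```python
from datetime import date

# Hypothetical is-a edges (child -> parents). Placeholder names, not real SNOMED codes.
IS_A = {
    "type_1_diabetes": ["diabetes_mellitus"],
    "type_2_diabetes": ["diabetes_mellitus"],
    "diabetes_mellitus": ["endocrine_disorder"],
    "asthma": ["respiratory_disorder"],
}

# Hypothetical visit diagnoses: (patient id, concept).
VISITS = [
    ("p1", "type_2_diabetes"),
    ("p2", "asthma"),
    ("p3", "type_1_diabetes"),
    ("p4", "diabetes_mellitus"),
]

# Hypothetical lab results: (patient id, result date, HbA1c in percent).
HBA1C = [
    ("p1", date(2015, 1, 10), 8.1),
    ("p1", date(2015, 6, 2), 6.8),
    ("p2", date(2015, 3, 3), 5.4),
    ("p3", date(2015, 2, 1), 6.5),
    ("p3", date(2015, 7, 15), 7.4),
]


def subsumed_by(root, is_a):
    """Return root plus every concept that reaches root through is-a edges.

    A naive fixpoint loop: keep adding children of known members until
    nothing changes.
    """
    value_set = {root}
    changed = True
    while changed:
        changed = False
        for child, parents in is_a.items():
            if child not in value_set and any(p in value_set for p in parents):
                value_set.add(child)
                changed = True
    return value_set


def hba1c_measure(root, is_a, visits, results, threshold=7.0):
    """Return (numerator, denominator) as sets of patient ids."""
    value_set = subsumed_by(root, is_a)
    # Denominator: patients with at least one visit coded in the value set.
    # Assumption: patients without any HbA1c result stay in the denominator.
    denominator = {pid for pid, code in visits if code in value_set}
    # Most recent HbA1c result per denominator patient.
    last = {}
    for pid, when, value in results:
        if pid in denominator and (pid not in last or when > last[pid][0]):
            last[pid] = (when, value)
    # Numerator: last value strictly below the threshold.
    numerator = {pid for pid, (_, value) in last.items() if value < threshold}
    return numerator, denominator


numerator, denominator = hba1c_measure("diabetes_mellitus", IS_A, VISITS, HBA1C)
print(len(numerator), "of", len(denominator))  # prints: 1 of 3
```

The subsumption step is the part that needs a recursive or fixpoint computation; the remaining steps are ordinary filtering and a per-patient "latest row" selection.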

Received on Wednesday, 30 September 2015 14:50:05 UTC