
follow-up biosurveillance meeting: Friday March 30

From: Eric Prud'hommeaux <eric@w3.org>
Date: Fri, 30 Mar 2012 09:05:13 -0400
To: "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Cc: "Mead, Charlie (NIH/NCI) [C]" <meadch@mail.nih.gov>, "cecil.o.lynch@accenture.com" <cecil.o.lynch@accenture.com>
Message-ID: <20120330130511.GA4188@w3.org>
Note below the follow-up on Fri 30 March 11AM EST on #hcls.

We'll be digging more deeply into the SemWeb mechanisms used to
classify and dispatch incident (in this case, tests or diagnoses of
TB) reports.

Conference Details:

Date of Call: Friday, March 30, 2012
Time of Call: 11:00 am Eastern Time, 3 pm UK, 4 pm CET
Dial-In #: +1.617.761.6200 (Cambridge, MA)
VoIP address: sip:zakim@voip.w3.org
Participant Access Code: 4257 ("@@HCLS")
IRC Channel: irc.w3.org port 6665 channel HCLS (see W3C IRC page for
details, or see Web IRC), Quick Start: Use
  http://www.mibbit.com/chat/?server=irc.w3.org:6665&channel=%23hcls
Duration: ~1 hour
Convener: Charlie Mead
Scribe: TBD


* Eric Prud'hommeaux <eric@w3.org> [2012-03-27 17:59-0400]
> http://www.w3.org/2012/03/27-HCLS-minutes
> 
>                             HCLS Health Care Domain
> 
> 27 Mar 2012
> 
>    See also: [2]IRC log
> 
> Attendees
> 
>    Chair
>           Charlie Mead
> 
>    Scribe
>           ericP
> 
>    Special Guest Star
>           Cecil Lynch
> 
>    Follow-up Meeting
>           Fri 30 March 11AM EST on #hcls
> 
> Contents
> 
>      * [3]OWL in CDC's Tuberculosis Surveillance/Response
>      __________________________________________________________________
> 
>    [slide 3]
> 
>    <Joanne_Luciano> slides aren't numbered :-(
> 
>    <Joanne_Luciano> ah, but the browser numbers them!
> 
>    <egombocz> If you look at them not in show mode, you can see the
>    numbers on the side thumbnails
> 
>    Cecil: antibiotic-resistant airline passenger prompted review of
>    Tuberculosis Information Management System (TIMS)
>    ... reporting a TB case required passing a brittle set of messaging and
>    business rules
> 
>    [slide 4: Message Processing Integration]
> 
>    Joanne_Luciano: each state wanted their own standard?
> 
>    Cecil: CDC wanted a standard
>    ... states would take anything which makes reporting easier
>    ... [re: slide 4]
>    ... choices about how to import messages to CDC
>    ... .. after message had some processing
>    ... .. as a Web Service RPC
> 
>    [slide 5: Deployment Architecture]
> 
>    Cecil: going with existing CDC infrastructure
>    ... starting from left:
>    ... .. some source, usually state or large counties (53 jurisdictions)
>    reports
> 
>    <Joanne_Luciano> is going with the CDC one of those three options on
>    slide 4 or is it another one (not listed on slide 4)?
> 
>    Cecil: .. goes into data messaging broker, which validates syntax
> 
>    <Joanne_Luciano> looks like it's option 1 on slide 4
> 
>    Cecil: .. if a valid TB message, off to content validation queue
>    ... .. also split into components for e.g. line listing of incoming
>    cases
>    ... .. after validation, email with contents of alert sent to CDC's TB
>    group
> 
>    Joanne_Luciano: this is slide 4 option 1?
> 
>    Cecil: this is slide 4 option 3 (RPC)
>    ... we had tried driving real-time alerting from biosense
>    ... we took messages off the first transport, never queued in DMB
>    [slide 5 left]
>    ... the HL7 2.x standard is fairly loose
>    ... flexible, can take any payload
>    ... can be structured in any way
>    ... segments are well-defined, but segment structure requires point to
>    point negotiation
>    ... p2p neg is a guideline
> 
>    charlie: HL7 2.x is a syntactic standard and a semantics guideline
> 
>    [slide 6: Message Content Validation Architecture]
> 
>    <Joanne_Luciano> JMS?
> 
>    Cecil: after leaving broker, falls into JMS interface
>    ... because this has the 2.5 validation, we don't need the 2.x
>    syntactic validation
>    ... so we don't do the validation
>    ... before we went live, we validated and found 2 errors in HL7
>    messaging
>    ... (was a benefit of 2-tier validation)
>    ... once live, we don't do syntactic validation
>    ... but we do parse out components
>    ... questions like birthday and date of problem were found via OBX
>    extractions
>    ... an OWL ontology tells us how to process a message
>    ... the ontology links all the knowledge
>    ... it guides parsing the message by aligning the OBX-extracted facts
>    with an RDF graph
>    ... we can then use the JESS reasoner for evaluating these facts
>    ... JESS (Java Expert System Shell) is a FW/BW chaining rules
>    engine
>    ... has a Protege plugin, interprets SWRL
>    ... good commercial tool for high-volume processing
>    ... paid for by tax dollars, only free for government use
>    ... $75K otherwise
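The OBX extraction step described above can be illustrated with a minimal Python sketch. It assumes standard HL7 v2 delimiters (carriage return between segments, `|` between fields, `^` between components) and reads the identifier from OBX-3 and the value from OBX-5. The sample message, the local codes (`BIRTH_DATE`, `ONSET_DATE`, `RESISTANT_TO`), the `case:1` subject, and the function name are all hypothetical; the minutes do not show the actual messages or the ontology-driven mapping, which this does not attempt to reproduce.

```python
# Hypothetical HL7 v2.5 message; sender, receiver, and local codes are invented.
SAMPLE = "\r".join([
    r"MSH|^~\&|STATE_LAB|XX|CDC|TB|201203270900||ORU^R01|MSG0001|P|2.5",
    "OBX|1|DT|BIRTH_DATE^Date of birth^L||19700101",
    "OBX|2|DT|ONSET_DATE^Date of problem^L||20120301",
    "OBX|3|ST|RESISTANT_TO^Drug resistance^L||isoniazid",
    "OBX|4|ST|RESISTANT_TO^Drug resistance^L||rifampicin",
])


def extract_obx_facts(message, subject="case:1"):
    """Return one (subject, predicate, value) triple per OBX segment."""
    facts = []
    for segment in message.split("\r"):
        fields = segment.split("|")
        # fields[0] is the segment name, so fields[n] is OBX-n.
        if fields[0] != "OBX" or len(fields) < 6:
            continue
        code = fields[3].split("^")[0]  # OBX-3, first component: identifier
        value = fields[5]               # OBX-5: observation value
        facts.append((subject, code, value))
    return facts


facts = extract_obx_facts(SAMPLE)
for fact in facts:
    print(fact)
```

Triples of this shape are one simple way to hand OBX-derived facts to a graph or rules layer; the real system used an OWL ontology to decide how each observation is interpreted.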
> 
>    <Stuart> Drools
> 
>    <iker> DROOLS
> 
>    <mr_sticky> Drools is from JBoss
> 
>    <mr_sticky> [4]http://www.jboss.org/drools
> 
>    Cecil: we tried Drools, which has FW/BW chaining and similar fact
>    structure
>    ... use JESS if you're processing millions of facts
> 
>    Joanne_Luciano: and Jena?
> 
>    Cecil: no experience with it
>    ... at OTR, we pass what we expect to see and what we got as two graphs
>    ... the choreography of the OTR framework works out that something is a
>    question about e.g. an antibiotic resistance pattern
>    ... we have a set of "listeners" (patterns)
>    ... we built this on V3 semantics, but mapped back to V2 syntax
>    ... once we've matched the graph against the patterns, we pass it to
>    JESS
>    ... we give JESS the profile for e.g. a normal patient, MDR (multi
>    drug resistant) patient, XDR (extensively drug resistant) (potential
>    super-spreader)
>    ... the reasoning framework decides if an event needs action
>    ... another listener strains through alerts from JESS for outbound
>    messaging
>    ... we also use the output for visualization
>    ... folks don't need to use SAS to extract this data from
>    mid-tier, instead just using graph representations
>    ... with agreement from CDC, we could have sent output messages back to
>    reporters
>    ... output:
>    ... .. drug resistant
>    ... .. appropriateness of drugging (per WHO codes)
>    ... .. predictive analysis of whether someone is likely to fall off
>    treatment based on patient history
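As a rough illustration of matching a case against the normal / MDR / XDR profiles, here is a minimal Python sketch. It is not JESS and not the CDC rule set: plain functions stand in for the rules. The profile tests follow the 2006 WHO definitions (MDR: resistant to at least isoniazid and rifampicin; XDR: MDR plus resistance to any fluoroquinolone and at least one of three second-line injectables). The fluoroquinolone list is partial, and the names `classify` and `needs_action`, along with the alerting policy, are assumptions made for the example.

```python
# Drug groups used by the 2006 WHO MDR/XDR definitions.
MDR_CORE = {"isoniazid", "rifampicin"}
FLUOROQUINOLONES = {"ofloxacin", "levofloxacin", "moxifloxacin"}  # partial list
INJECTABLES = {"amikacin", "kanamycin", "capreomycin"}


def classify(resistant_to):
    """Map a set of drugs the isolate resists to a profile name."""
    drugs = {drug.lower() for drug in resistant_to}
    is_mdr = MDR_CORE <= drugs  # resistant to both core first-line drugs
    if is_mdr and drugs & FLUOROQUINOLONES and drugs & INJECTABLES:
        return "XDR"
    if is_mdr:
        return "MDR"
    return "normal"


def needs_action(profile):
    """Hypothetical policy: alert on any drug-resistant profile."""
    return profile in {"MDR", "XDR"}


cases = {
    "case:1": {"isoniazid"},
    "case:2": {"Isoniazid", "Rifampicin"},
    "case:3": {"isoniazid", "rifampicin", "moxifloxacin", "kanamycin"},
}
for case_id, drugs in cases.items():
    profile = classify(drugs)
    print(case_id, profile, "ALERT" if needs_action(profile) else "ok")
```

A rules engine would express the same conditions declaratively, so that a terminology or guideline change means editing rules rather than code.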
> 
>    [slide 7: Types of problems that could be solved by extending the TB
>    framework]
> 
>    Cecil: had to bend to time and budget limitations
>    ... we could have added a d2rq interface to retrofit the pre-existing
>    data
>    ... a lot we could have done
> 
>    [slide 8: The use of an OWL ontology]
> 
>    Cecil
> 
>    [slide 9: HL7 Message Artifact Taxonomy]
> 
>    Cecil: this is how we mapped the OBX structure to the ontology
> 
>    [slide 11: Rule Processing]
> 
>    [slide 12: Message Content Validation Rule Implementation]
> 
>    Cecil: this demonstrates the advantage of using OWL
>    ... the blue is what we deleted
>    ... (from TIMS)
>    ... went from 358 to 175
>    ... reduces frustration of reporters facing conflicting rules
>    ... beyond OWL being able to do syntax, vocabulary, rule processing, we
>    see the advantage of declarative rules
> 
>    [slide 13: Message Content Validation Rules]
> 
>    Cecil: with tons of volume and response time requirements, you need a
>    more efficient bw-chaining system (JESS)
> 
>    [slide 14: Message Content Validation Results View]
> 
>    Cecil: sample output
> 
>    [slide 15: Processing Results]
> 
>    Cecil: average processing time 3.5s round trip
>    ... far faster than a human, and more accurate
>    ... scales up to ~350k messages/day
>    ... ~300K TB messages/year
>    ... could scale to influenza
>    ... at worst case (4 month window), 50-75M, so ~200K messages/day
>    ... in a surveillance, you're also looking at folks who don't have it
>    ... feeds from 800 VA hospitals, + labs such as Quest and LabCorp, ...
>    ... congress says we need response in 2 mins
>    ... had to put everything in memory
>    ... biosense lost funding
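The volume figures above can be put on a common per-day footing with a short Python calculation. The totals come from the minutes; the averaging windows are assumptions, since the minutes do not say how the ~200K/day figure was derived. Two windows are shown for the influenza case: 120 days (taking "4 month window" literally) and 365 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_day(total_messages, days):
    """Average daily volume if total_messages is spread evenly over days."""
    return total_messages / days


# Figures quoted in the minutes.
tb_per_day = per_day(300_000, 365)                # ~300K TB messages/year
capacity_per_second = 350_000 / SECONDS_PER_DAY   # ~350K messages/day ceiling

# Influenza worst case, 50-75M messages. The window length is an assumption.
flu_totals = (50_000_000, 75_000_000)
flu_over_120_days = [per_day(total, 120) for total in flu_totals]
flu_over_365_days = [per_day(total, 365) for total in flu_totals]

print(f"TB: about {tb_per_day:.0f} messages/day")
print(f"Stated ceiling: about {capacity_per_second:.2f} messages/second sustained")
print("Flu over 120 days:", [f"{v:,.0f}" for v in flu_over_120_days])
print("Flu over 365 days:", [f"{v:,.0f}" for v in flu_over_365_days])
```

Spread over a full year, 75M messages averages roughly 205K/day, which is close to the minuted ~200K. Concentrated in 120 days, the same totals average about 417K to 625K/day, above the stated ~350K/day ceiling.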
> 
>    mscottm: summary of SemWeb advantages is very different from our usual
>    tech demos in HCLS
>    ... what are your SemWeb wins?
>    ... what could be improved?
> 
>    charlie: would like formal continuation
>    ... to help us find focal points in HCLS
> 
>    Cecil: SemWeb is a flexible way to extract knowledge
>    ... we were given a TB messaging system and a deadline
>    ... 7 days before deadline, CDC said we'd like to upgrade to 1.2 of our
>    implementation guideline
>    ... had around 35 new rules and 100 terminology changes
>    ... because everything CDC gave us was in the OWL, expected to do it in
>    4 days
>    ... made it in 4 days with no additional charge to CDC
>    ... big commercial motivation is the flexibility at responding to
>    rapidly changing knowledge
>    ... at NCI, i wanted to build an EMR system
>    ... ONC SHARP projects kind of get to this
>    ... win 1: rapid software engineering
>    ... win 2: rule validation
>    ... win 3: can infer things that a human has problems inspecting
> 
>    <mscottm> Nice to hear that experience in the field confirms my main
>    sales pitch about advantage of SemWeb tech for software: easier
>    maintenance and change, agile development, effectively lower cost.
> 
>    Cecil: .. (large systems (e.g. BRIDG's UML) hard to swap into a brain)
>      __________________________________________________________________
> 
> 
>     Minutes formatted by David Booth's [5]scribe.perl version 1.136
>     ([6]CVS log)
>     $Date: 2012/03/27 21:57:08 $
> 
> References
> 
>    1. http://www.w3.org/
>    2. http://www.w3.org/2012/03/27-HCLS-irc
>    3. http://www.w3.org/2012/03/CSTE_TB.ppt
>    4. http://www.jboss.org/drools
>    5. http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
>    6. http://dev.w3.org/cvsweb/2002/scribe/
> 
> -- 
> -ericP

-- 
-ericP
Received on Friday, 30 March 2012 13:05:47 GMT