Re: [TMO] coding XML patient data in the Indivo schema from conor dowling on 2010-07-21 (public-semweb-lifesci@w3.org from July 2010)

From: conor dowling <conor-dowling@caregraf.com>
Date: Wed, 21 Jul 2010 09:23:13 -0700
To: "Eric Prud'hommeaux" <eric@w3.org>
Cc: public-semweb-lifesci@w3.org, Ben Adida <ben@adida.net>, "Zabak, Steve" <Steve.Zabak@childrens.harvard.edu>, Thomas Gambet <tgambet@w3.org>, Michel_Dumontier <Michel_Dumontier@carleton.ca>
Message-ID: <AANLkTikxI6JMPySl9mb92Tg1SdakNrbGtG0yJnnNaq-W@mail.gmail.com>

> > take a graph of patient data, its chief
> > nodes, its bnodes and subordinate nodes and serialize into some flavor of
> > schema'ed XML.
>
> is it reasonable to do that with SPARQL queries and then an XSLT of
> the resulting XML structure?
>

Eric,

yep. You can walk a graph of data picking up the observations and findings,
serializing each to some canonical XML and then XSLT that to whatever.
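
A minimal sketch of that walk, assuming a toy in-memory graph of
(subject, predicate, object) triples in place of real SPARQL results. The
patient id, the predicate names and the XML element names are all made up
for illustration; they are not the VistA or Indivo vocabulary.

```python
import xml.etree.ElementTree as ET

# Toy graph: every identifier below is hypothetical.
TRIPLES = [
    ("patient:1", "hasObservation", "_:b1"),
    ("_:b1", "type", "VitalSign"),
    ("_:b1", "code", "local:BP"),
    ("_:b1", "value", "120/80"),
    ("patient:1", "hasObservation", "_:b2"),
    ("_:b2", "type", "Problem"),
    ("_:b2", "code", "local:DM2"),
]


def objects(subject, predicate):
    """Return every object of (subject, predicate, ?o) in the toy graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def to_canonical_xml(patient):
    """Walk from the chief node through its bnodes, one element per observation."""
    root = ET.Element("patient", id=patient)
    for node in objects(patient, "hasObservation"):
        obs = ET.SubElement(root, "observation")
        # Each property of the bnode becomes a child element.
        for s, p, o in TRIPLES:
            if s == node:
                ET.SubElement(obs, p).text = o
    return root


xml_text = ET.tostring(to_canonical_xml("patient:1"), encoding="unicode")
print(xml_text)
```

The output is deliberately dumb XML; the XSLT step then reshapes it into
whatever schema is wanted.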

SPARQL an EHR: I have an endpoint called 'NextVistA' (
http://nextvista.caregraf.org/ ) that has raw data from a VA VistA. It
follows the VistA scheme and it lets you see/SPARQL EHR data in the raw.

But the meat-in-the-sandwich is translation, mainly code translation, from
the system used in the graph to that required in the serialization.
Generally systems do code; they don't just hold literals. They identify
things from enumerations: a country code, not the name of a country, a drug
code, not ... etc. How much translation you need depends on how large the
code mismatch is.

Let's say a system, and so its graph, has its own set of codes for "functional
status" - can someone walk, are they bed-ridden, etc. Just dumping these
ids as-is is close to useless. It's moving bits. A third system won't
understand these identifiers. So the extraction process has to translate.
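
A rough sketch of that translation step, assuming a hand-built mapping table.
The local ids ("FS-1", "FS-2"), the target system name and the target codes
are placeholders, not real terminology codes.

```python
# Hypothetical map from in-house functional-status ids to a shared vocabulary.
LOCAL_TO_STANDARD = {
    "FS-1": ("example-standard", "AMBULATORY"),
    "FS-2": ("example-standard", "BEDRIDDEN"),
}


def translate(local_code):
    """Translate a local id, flagging anything that has no mapping."""
    if local_code in LOCAL_TO_STANDARD:
        system, code = LOCAL_TO_STANDARD[local_code]
        return {"system": system, "code": code, "translated": True}
    # No mapping: pass the id through, but mark it so a consumer
    # knows it is still an in-house identifier.
    return {"system": "local", "code": local_code, "translated": False}


print(translate("FS-2"))
print(translate("FS-9"))
```

Flagging the unmapped ids, rather than silently passing them through, shows
how much of a code mismatch there really is.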

This browser demo - http://vista.caregraf.org/veteranChristopherTIE.html -
queries a live VistA (the VistA supports a SPARQL subset dubbed FMQL) and
then tries to translate that EHR data to canonical equivalents.

We tend to get fixated on form - this or that XML schema (I'm guilty of that
too, with "oh let's have a CCD") - and not on content (what's in the CCD or
its equivalent? Is it coded, and how?).

The key questions are: how coded is the data? If you like, the more
literals you see, the worse the data. And what codes? In-house codes - who
else can understand these? Enumerations like CVX vaccine codes or NDC codes?
Well, these lack structure: they don't entail like their SNOMED peers. Or do
you have the LOINCs and SNOMEDs and RxNorms etc.? Best.
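
One way to put rough numbers on "how coded is the data", as a sketch only. It
assumes values are written as "prefix:code" and that a value with no prefix is
a literal; the prefixes and sample values are illustrative, not real codes.

```python
from collections import Counter

# Assumed prefixes; a real system would identify code systems differently.
FLAT_ENUMERATIONS = {"cvx", "ndc"}
STRUCTURED_TERMINOLOGIES = {"loinc", "snomed", "rxnorm"}


def classify(value):
    """Bucket a value as literal, in-house, flat or structured."""
    prefix, sep, _ = value.partition(":")
    if not sep:
        # No "prefix:" at all: treat it as free text (crude, by assumption).
        return "literal"
    prefix = prefix.lower()
    if prefix in STRUCTURED_TERMINOLOGIES:
        return "structured"
    if prefix in FLAT_ENUMERATIONS:
        return "flat"
    return "in-house"


def profile(values):
    """Count how many values fall into each bucket."""
    return Counter(classify(v) for v in values)


sample = ["walks unaided", "local:FS-1", "cvx:example", "snomed:example"]
print(profile(sample))
```

The more the tally leans towards "literal" and "in-house", the more
translation the extraction has to do.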

Ok I rambled. Yes: SPARQL over a graph (and translate?) -> canonical XML
---- [XSLTs] ---> XML you need.

Conor

>
> > Conor
> >
> > --
> > Caregraf promotes "Semantic Health":  http://www.caregraf.org
> >
> > On Sun, Jul 18, 2010 at 7:56 PM, Eric Prud'hommeaux <eric@w3.org> wrote:
> >
> > > Thomas Gambet and I have been transforming the XML patients (ordinary
> > > citizens like you and me, tragically afflicted with XML) to follow the
> > > Indivo schema. Indivo uses a bunch of small schemas to represent
> > > e.g. contacts and problems, so we've put together an envelope schema
> > > which references the Indivo schema for most of its meaty data. We
> > > still have some coding to go, but folks can go take a look at
> > >  data:
> > >    http://dvcs.w3.org/hg/TMO-Indivo/file/tip/syntheticPatients/AD_PCHR_1-indivo.xml
> > >  schema:
> > >    http://dvcs.w3.org/hg/TMO-Indivo/file/tip/syntheticPatients/indivo-schemas/envelope.xsd
> > >
> > > Places where the envelope schema references other schema types, e.g.
> > >  <xs:complexType name="PrescriptionsType">
> > >        <xs:sequence>
> > >                <xs:element name="Prescription" type="indivo:Prescription"
> > >                            minOccurs="0" maxOccurs="unbounded"/>
> > >        </xs:sequence>
> > >  </xs:complexType>
> > > , have been mapped to Indivo. Places where we have lots of elements
> > > defined didn't have a pre-existing Indivo schema. Lots of thanks to
> > > Thomas for working on this stuff.
> > >
> > > Folks in the HCLS IG have commit privileges on this Mercurial
> > > repository. Once we finish coding the patient encounters, we'll get
> > > serious about mapping out the patient RDF ontology. The XSLT we use
> > > for this will also be useful for mapping anyone's Indivo data to RDF.
> > >
> > > Thoughts? Suggestions?
> > > --
> > > -ericP
> > >
> > >
>
> --
> -ericP
>

Received on Wednesday, 21 July 2010 16:23:51 UTC