- From: David Booth <david@dbooth.org>
- Date: Thu, 22 Jul 2021 20:33:59 -0400
- To: "its@lists.hl7.org" <its@lists.HL7.org>, w3c semweb HCLS <public-semweb-lifesci@w3.org>
Minutes from today's teleconference are here:
https://www.w3.org/2021/07/22-hcls-minutes.html
and also below in plain text.
David Booth
--------------------------------------------------
Attendees
Present
Brad Simons, Darrell Woelk, David Booth, Emily Pfaff,
Gaurav Vaidya, Gerhard Kober, Gopi Chandrasekharan,
Guoqian Jiang, Sajjad Hussein, Samson Tu
Chair
David Booth
Scribe
dbooth
Contents
1. Introductions
2. UNC work with FHIR RDF -- Emily Pfaff
3. Desired R5 features
4. Summary of action items
Meeting minutes
Introductions
Samson: Bio ontologies
Guoqian: Mayo Clinic, FHIRCat project.
UNC work with FHIR RDF -- Emily Pfaff
Emily's slides:
https://lists.w3.org/Archives/Public/www-archive/2021Jul/att-0005/5-15-RDF_FHIR.pptx
emily: Assistant Prof UNC Chapel Hill, clinical informatics
background. Used FHIR RDF in two projects.
emily: Joining FHIR data with external datasets.
… Main interest in computable phenotyping -- identify cohorts
of patients based on inclusion/exclusion criteria.
Traditionally those criteria are defined by clinicians, but
computable phenotyping means translating those criteria into
computable code to identify massive numbers of candidates
quickly.
… But there are big differences between datasets from different
institutions.
… In an ont-perfect world, I could pull all patients related to
COVID-19, but that's not current reality.
… What would it take to write one script that would work in
multiple institutions and result in consistent and accurate
cohort?
… This is where I brought in FHIR RDF and SNOMED and other data
models.
… Without ont we need to use exact code matches. This is
especially bad with ICD-9.
emily: Opioid triplestore project investigated patterns among
patients that had surgery and got opioids. Wanted to find out
what might lead to dependence on opioids.
… Med data is very hard to deal with in EHRs. The EHR mainly
only tells you that a med was prescribed, not if the pt picked
it up or even took it or how much.
… We tried to join insurance data to at least find out if the
pt picked up the prescription. Also wanted to find out if any
pts got their meds from outside UNC network.
… But ins data looks very different from EHR data. Brought in
FHIR RDF.
… Selected a cohort, converted to FHIR R4, then converted that
to RDF.
… Then did the same thing w ins data. Needed a custom
conversion to FHIR (using python).
… Also linked in some public data, such as unemployment data in
the pt's region.
… This allowed us to find hundreds more patients than we
otherwise could.
… Determined that it was worth the effort.
sajjad: What was the value prop to going into RDF?
emily: Because we wanted to bring in public datasets, in
addition to the ins and EHR data, RDF gave us a common
denominator.
brad: I don't think you could have done this without RDF,
because you need inference also.
darrell: Did you use SPARQL? Emily: Yes.
darrell: Papers? Emily: Yes, in submission, but will also share
the submitted work.
sajjad: When converting to RDF, inference comes to mind. On
maintainability: this was done under R4, so when changes come
you'll need some tuning. The public datasets might change too.
Scalability? Maintenance issues?
emily: Don't know yet, because this was a one-time pilot.
… We tried to build the triplestore in a way that allowed us to
easily rebuild it.
sajjad: Another motivating factor for RDF, might be that you
could use sameAs relations instead of refurbishing the whole
thing.
guoqian: You have FHIR RDF and non-FHIR RDF data. How did you
link them?
emily: Had to make custom predicates to link them. When we did
the project I was very new to RDF. Needed a relation between
census tract and the county where the pt lives. Some datasets
are at either level. Wanted to infer county based on census
tract, so we built custom relationships.
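Editor's sketch of the tract-to-county link described above, in plain Python with triples as tuples rather than a real triplestore. The predicate names (ex:livesInCensusTract, ex:tractWithinCounty, ex:livesInCounty, ex:unemploymentRate) and the toy values are hypothetical; the minutes do not give the actual ones. It relies on the fact that an 11-digit US census tract GEOID begins with the 5-digit state+county FIPS code.

```python
# Hypothetical predicate names; the project's real predicates are not in the minutes.
IN_TRACT = "ex:livesInCensusTract"
TRACT_IN_COUNTY = "ex:tractWithinCounty"
IN_COUNTY = "ex:livesInCounty"


def county_fips_from_tract(tract_geoid: str) -> str:
    """Return the 5-digit state+county FIPS prefix of an 11-digit tract GEOID."""
    if len(tract_geoid) != 11 or not tract_geoid.isdigit():
        raise ValueError(f"not an 11-digit tract GEOID: {tract_geoid!r}")
    return tract_geoid[:5]


def infer_county_triples(triples):
    """Return the input triples plus inferred tract->county and patient->county links."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate == IN_TRACT:
            county = county_fips_from_tract(obj)
            inferred.add((obj, TRACT_IN_COUNTY, county))
            inferred.add((subject, IN_COUNTY, county))
    return inferred


# Toy data (made up): one patient located by tract, one county-level statistic.
data = {
    ("Patient/1", IN_TRACT, "37135010705"),
    ("37135", "ex:unemploymentRate", "4.1"),
}
graph = infer_county_triples(data)

# After inference the patient can be joined to the county-level dataset.
patient_county = {s: o for s, p, o in graph if p == IN_COUNTY}
```

In a triplestore the same effect could come from a rule or a SPARQL update; the point is only that one derived relation lets tract-level and county-level datasets be joined.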
darrell: HL7 defined CQL language that maps to FHIR, for
quality measures. Wondering if they can be expressed in FHIR
RDF.
guoqian: Previously looked at translating CQL to SPARQL.
Paper from Guoqian on CQL and SPARQL:
http://www.swat4ls.org/wp-content/uploads/2017/11/SWAT4LS-2017_paper_40.pdf
emily: Second project was an extension of the first.
… Wanted to see the effect of adding the SNOMED ont.
… Looked at depression and rheumatoid arthritis. Wanted to
compare the coverage of computable phenotype defined by ICD-10
codes vs using SNOMED or HPO ont.
… But we could not interview patients to find out their actual
results, so we ended up measuring the degree of overlap.
… But SNOMED to ICD-10 mapping files are not machine
processable.
… Also, for ICD-10 there are a lot of required digits, but
SNOMED puts in xx?, which will not directly match.
… We ended up removing the concept mappings because we couldn't
use them, and used the xx? rules.
… Ended up having a lot of SNOMED codes mapped from a single
ICD-10 code.
… Used OWL reasoning, added HPO ontology, then ran SPARQL
queries. Some queries looked for exact matches, others used
inference.
… SNOMED and ICD-10 have very different ideas of what codes
constitute depression. Best cohort might be the superset of
cohorts found by both techniques.
… For rheum arth we discovered an error in the SNOMED ont, due
to a missing subclass relation. They were missing knees,
wrists, hips and ankles!
… Overall utility of this work is to help inform researchers
about their choice of codes to use.
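Editor's sketch of the difference between exact-match and inference-based cohort queries, and of how a single missing subclass link silently drops patients. It is plain Python standing in for OWL reasoning plus SPARQL, and every class and patient identifier is a made-up placeholder, not a real SNOMED concept or the actual error found.

```python
from collections import defaultdict


def descendants(root, subclass_of):
    """Return root plus every class reachable from it through subclass links."""
    children = defaultdict(set)
    for child, parent in subclass_of:
        children[parent].add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def cohort(diagnoses, codes):
    """Return the patients with at least one diagnosis code in `codes`."""
    return {patient for patient, code in diagnoses if code in codes}


# Placeholder hierarchy as (child, parent) pairs. The link from
# ex:RA_of_lower_limb up to ex:RheumatoidArthritis is deliberately missing.
SUBCLASS_OF = {
    ("ex:RA_of_hand", "ex:RheumatoidArthritis"),
    ("ex:RA_of_knee", "ex:RA_of_lower_limb"),
}
MISSING_LINK = ("ex:RA_of_lower_limb", "ex:RheumatoidArthritis")

DIAGNOSES = {
    ("pt1", "ex:RheumatoidArthritis"),
    ("pt2", "ex:RA_of_hand"),
    ("pt3", "ex:RA_of_knee"),
}

root = "ex:RheumatoidArthritis"
exact = cohort(DIAGNOSES, {root})                            # exact code match only
inferred = cohort(DIAGNOSES, descendants(root, SUBCLASS_OF))  # with subclass inference
repaired = cohort(DIAGNOSES, descendants(root, SUBCLASS_OF | {MISSING_LINK}))
```

Comparing `inferred` with `repaired` shows which patients depend on the absent link, which is one way such a gap in an ontology could surface during cohort comparison.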
guoqian: For triplestore, what codes do you use?
emily: We had to use the mappings, because most EHRs do not use
SNOMED -- they use ICD-10. But that's a huge limitation,
because the phenotype can only be as good as the ICD-10 code.
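Editor's sketch of matching ICD-10 codes against map targets that contain placeholder characters, as in the mapping discussion above. It assumes, purely for illustration, that a `?` in a map target stands for exactly one arbitrary character; the minutes do not spell out the actual placeholder convention. The SNOMED identifiers and the mapping rows are made up.

```python
import re


def target_to_regex(target: str) -> re.Pattern:
    """Compile a map target into a regex; '?' matches any single character (assumed)."""
    parts = ("." if ch == "?" else re.escape(ch) for ch in target)
    return re.compile("".join(parts))


def snomed_for_icd10(icd10_code, mapping):
    """Return every SNOMED id whose map target matches the full ICD-10 code."""
    return {
        snomed
        for snomed, target in mapping
        if target_to_regex(target).fullmatch(icd10_code)
    }


# Made-up mapping rows: (placeholder SNOMED id, ICD-10 map target).
MAPPING = [
    ("ex:snomed-A", "M05.7?"),  # target with a placeholder character
    ("ex:snomed-B", "M05.79"),  # fully specified target
    ("ex:snomed-C", "M06.9"),
]

matches = snomed_for_icd10("M05.79", MAPPING)
```

Because a placeholder target matches many concrete codes, one ICD-10 code in the EHR can fan out to several SNOMED concepts, consistent with the many-to-one result reported above.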
Desired R5 features
https://github.com/w3c/hcls-fhir-rdf/issues/69
brad: the long property names were a problem for us, because we
could not go beyond 4 levels.
https://github.com/w3c/hcls-fhir-rdf/issues/75
david: Harold drafted a list of things we may want to change in
FHIR RDF R5, but he's on vacation and I have not been able to
find it!
ADJOURNED
Received on Friday, 23 July 2021 00:34:12 UTC