- From: David Booth <david@dbooth.org>
- Date: Mon, 26 Jul 2021 16:55:48 -0400
- To: "its@lists.hl7.org" <its@lists.HL7.org>, w3c semweb HCLS <public-semweb-lifesci@w3.org>
Canceling this week's teleconference, due to multiple vacations.

Thanks,
David Booth

-------- Forwarded Message --------
Subject: Re: FHIR RDF - 11am (Boston) Thur July 22 - UNC's work on FHIR RDF; Objectives for R5
Date: Thu, 22 Jul 2021 20:33:59 -0400
From: David Booth <david@dbooth.org>
To: its@lists.hl7.org <its@lists.HL7.org>, w3c semweb HCLS <public-semweb-lifesci@w3.org>

Minutes from today's teleconference are here:
https://www.w3.org/2021/07/22-hcls-minutes.html
and also below in plain text.

David Booth
--------------------------------------------------

Attendees

Present
  Brad Simons, Darrell Woelk, David Booth, Emily Pfaff, Gaurav Vaidya, Gerhard Kober, Gopi Chandrasekharan, Guoqian Jiang, Sajjad Hussein, Samson Tu

Chair
  David Booth

Scribe
  dbooth

Contents

1. Introductions
2. UNC work with FHIR RDF -- Emily Pfaff
3. Desired R5 features
4. Summary of action items

Meeting minutes

Introductions

Samson: Bio ontologies
Guoqian: Mayo Clinic, FHIRCat project.

UNC work with FHIR RDF -- Emily Pfaff

Emily's slides: https://lists.w3.org/Archives/Public/www-archive/2021Jul/att-0005/5-15-RDF_FHIR.pptx

emily: Assistant Prof UNC Chapel Hill, clinical informatics background. Used FHIR RDF in two projects.
emily: Joining FHIR data with external datasets.
… Main interest is computable phenotyping -- identifying cohorts of patients based on inclusion/exclusion criteria. Traditionally those criteria are defined by clinicians, but computable phenotyping means translating those criteria into computable code to identify massive numbers of candidates quickly.
… But there are big differences between datasets from different institutions.
… In an ontology-perfect world, I could pull all patients related to COVID-19, but that's not current reality.
… What would it take to write one script that would work in multiple institutions and result in a consistent and accurate cohort?
… This is where I brought in FHIR RDF and SNOMED and other data models.
… Without an ontology we need to use exact code matches. This is especially bad with ICD-9.
emily: Opioid triplestore project investigated patterns among patients that had surgery and got opioids. Wanted to find out what might lead to dependence on opioids.
… Medication data is very hard to deal with in EHRs. The EHR mostly tells you only that a medication was prescribed, not whether the patient picked it up, took it, or how much.
… We tried to join insurance data to at least find out if the patient picked up the prescription. Also wanted to find out if any patients got their medications from outside the UNC network.
… But insurance data looks very different from EHR data. Brought in FHIR RDF.
… Selected a cohort, converted to FHIR R4, then converted that to RDF.
… Then did the same thing with insurance data. Needed a custom conversion to FHIR (using Python).
… Also linked in some public data, such as unemployment data in the patient's region.
… This allowed us to find hundreds more patients than we otherwise could.
… Determined that it was worth the effort.
sajjad: What was the value proposition of going into RDF?
emily: Because we wanted to bring in public datasets, in addition to the insurance and EHR data, RDF gave us a common denominator.
brad: I don't think you could have done this without RDF, because you need inference also.
darrell: Did you use SPARQL?
emily: Yes.
darrell: Papers?
emily: Yes, in submission, but will also share the submitted work.
sajjad: When converting to RDF, inference comes to mind. On maintainability: this was done under R4, so when changes come you'll need some tuning. The public datasets might also change. Scalability? Maintenance issues?
emily: Don't know yet, because this was a one-time pilot.
… We tried to build the triplestore in a way that allowed us to easily rebuild it.
sajjad: Another motivating factor for RDF might be that you could use sameAs relations instead of refurbishing the whole thing.
guoqian: You have FHIR RDF and non-FHIR RDF data. How did you link them?
emily: Had to make custom predicates to link them. When we did the project I was very new to RDF. Needed a relation between census tract and the county where the patient lives. Datasets come at either level. Wanted to infer county based on census tract, so we built custom relationships.
darrell: HL7 defined the CQL language, which maps to FHIR, for quality measures. Wondering if those can be expressed in FHIR RDF.
guoqian: Previously looked at translating CQL to SPARQL.

Paper from Guoqian on CQL and SPARQL: http://www.swat4ls.org/wp-content/uploads/2017/11/SWAT4LS-2017_paper_40.pdf

emily: Second project was an extension of the first.
… Wanted to see the effect of adding the SNOMED ontology.
… Looked at depression and rheumatoid arthritis. Wanted to compare the coverage of a computable phenotype defined by ICD-10 codes vs using the SNOMED or HPO ontology.
… But we could not interview patients to find out their actual results, so we ended up measuring the degree of overlap.
… But SNOMED to ICD-10 mapping files are not machine processable.
… Also, for ICD-10 there are a lot of required digits, but SNOMED puts in xx?, which will not directly match.
… We ended up removing the concept mappings because we couldn't use them, and used the xx? rules.
… Ended up having a lot of SNOMED codes mapped from a single ICD-10 code.
… Used OWL reasoning, added the HPO ontology, then ran SPARQL queries. Some queries looked for exact matches, others used inference.
… SNOMED and ICD-10 have very different ideas of what codes constitute depression. Best cohort might be the superset of cohorts found by both techniques.
… For rheumatoid arthritis we discovered an error in the SNOMED ontology, due to a missing subclass relation. They were missing knees, wrists, hips and ankles!
… Overall utility of this work is to help inform researchers about their choice of codes to use.
guoqian: For the triplestore, what codes do you use?
emily: We had to use the mappings, because most EHRs do not use SNOMED -- they use ICD-10. But that's a huge limitation, because the phenotype can only be as good as the ICD-10 code.

Desired R5 features

https://github.com/w3c/hcls-fhir-rdf/issues/69

brad: The long property names were a problem for us, because we could not go beyond 4 levels.

https://github.com/w3c/hcls-fhir-rdf/issues/75

david: Harold drafted a list of things we may want to change in FHIR RDF R5, but he's on vacation and I have not been able to find it!

ADJOURNED
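
The census-tract-to-county linkage Emily described in the discussion can be sketched roughly as follows. This is only an illustration under stated assumptions, not the UNC implementation: the minutes do not record the actual predicates or tooling, so the `ex:` predicate names and the sample identifiers are hypothetical, and plain Python tuples stand in for RDF triples in a triplestore.

```python
# Hypothetical predicate names; the minutes do not record the ones actually used.
LIVES_IN_TRACT = "ex:livesInCensusTract"
TRACT_IN_COUNTY = "ex:tractWithinCounty"
LIVES_IN_COUNTY = "ex:livesInCounty"


def infer_county_triples(triples):
    """Return the (patient, LIVES_IN_COUNTY, county) triples implied by the input.

    Rule: patient livesInCensusTract T, and T tractWithinCounty C
          => patient livesInCounty C.
    """
    # Index tract -> county from the custom linking predicate.
    tract_to_county = {s: o for s, p, o in triples if p == TRACT_IN_COUNTY}
    inferred = set()
    for s, p, o in triples:
        if p == LIVES_IN_TRACT and o in tract_to_county:
            inferred.add((s, LIVES_IN_COUNTY, tract_to_county[o]))
    # Report only triples that were not already asserted.
    return inferred - set(triples)


# Made-up sample data: patient2's tract has no county link, so nothing is inferred for it.
triples = {
    ("ex:patient1", LIVES_IN_TRACT, "ex:tractA"),
    ("ex:patient2", LIVES_IN_TRACT, "ex:tractB"),
    ("ex:tractA", TRACT_IN_COUNTY, "ex:countyX"),
}
inferred = infer_county_triples(triples)
print(sorted(inferred))
```

With the county triples materialized this way, a county-level public dataset (such as the unemployment data mentioned above) could be joined to patients known only by census tract. In a real triplestore the same rule would more likely be expressed as a SPARQL update or a reasoner rule.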
Received on Monday, 26 July 2021 20:56:02 UTC