- From: Jiang, Guoqian, M.D., Ph.D. <Jiang.Guoqian@mayo.edu>
- Date: Fri, 20 Mar 2020 17:24:57 +0000
- To: Chris Mungall <cjmungall@lbl.gov>
- Cc: W3c Semweb HCLS <public-semweb-lifesci@w3.org>, Harold Solbrig <solbrig@jhu.edu>, David Booth <david@dbooth.org>, Eric Prud'hommeaux <eric@w3.org>
- Message-Id: <28fddd$dfijl0@ironport10.mayo.edu>
Chris,

Thanks much for your notes.

We are using FHIR Composition resource to model the publication including title, abstract and full text. We are leveraging our open source NLP2FHIR pipeline (https://github.com/BD2KOnFHIR/NLP2FHIR) for generating semantic annotations. The underlying NLP engines are based on cTAKES, MedXN and MedTime. As you correctly indicated, the current pipeline can just process clinical concepts like conditions, procedures and medications.

We are looking for collaborations to leverage bioNLP tools within or outside Mayo to see if we can add semantic annotations for genes or mechanisms. Do you have any bioNLP tools we can leverage with OBO ontologies?

Thanks,
-Guoqian

From: Chris Mungall <cjmungall@lbl.gov>
Date: Friday, March 20, 2020 at 12:11 PM
To: "Jiang, Guoqian, M.D., Ph.D." <jiang.guoqian@mayo.edu>
Cc: W3c Semweb HCLS <public-semweb-lifesci@w3.org>, Harold Solbrig <solbrig@jhu.edu>, David Booth <david@dbooth.org>, Eric Prud'hommeaux <eric@w3.org>
Subject: [EXTERNAL] Re: Announcing CORD-19-on-FHIR: A FHIR RDF dataset for COVID-19 research

This looks great! Is there a dummies guide on how to use this?

It looks like there is a FHIR datamodel for modeling textual spans and you are using this to represent the results of NER. What NER tool was used?

It looks like snomed was the main vocabulary used - this makes sense for marking up conditions, medications, and procedures. But what about genes (human and viral), mechanisms (e.g. viral gene function) - do you plan to run this with any OBO ontologies?

On Thu, Mar 19, 2020 at 8:44 PM Jiang, Guoqian, M.D., Ph.D. <Jiang.Guoqian@mayo.edu> wrote:

> We are pleased to announce an initial version of the CORD-19-on-FHIR
> dataset for COVID-19 research, a dataset of 13202 journal articles
> relevant to novel coronavirus research. This dataset extends the
> CORD-19 dataset (on which it is based) by adding several semantic
> annotations. It is represented in FHIR RDF to facilitate semantic
> linkage with other biomedical datasets.
>
> CORD-19-on-FHIR dataset currently adds the following semantic annotations:
>
> - Conditions: 103,968 instances
> - Medications: 16,406 instances
> - Procedures: 54,720 instances
>
> CORD-19-on-FHIR is available on github, and collaboration is invited:
> https://github.com/fhircat/CORD-19-on-FHIR
>
> It is licensed to encourage open COVID-19 research. See specific terms:
> https://github.com/fhircat/CORD-19-on-FHIR/blob/master/LICENSE
>
> CORD-19-on-FHIR was funded by the FHIRCat research grant, which seeks to
> enable the semantics of FHIR and terminologies for clinical and
> translational research:
> https://github.com/fhircat/FHIRCat
>
> Sincerely,
> Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
> and FHIRCat team
Received on Friday, 20 March 2020 17:25:15 UTC