Re: Announcing CORD-19-on-FHIR: A FHIR RDF dataset for COVID-19 research

In collaboration with the NLM/NCBI BioNLP Research Group, we have
produced a linked data version of the LitCovid dataset with semantic
annotations from Pubtator. LitCovid
<https://www.ncbi.nlm.nih.gov/research/coronavirus/>
is a curated literature hub for tracking up-to-date scientific
information about the 2019 novel coronavirus. It is the most
comprehensive resource on the subject, providing central access to
3062 (and growing) relevant articles in PubMed. The counts of semantic
annotations for each data type are as follows.


* Species	34,945
* Disease	29,106
* Chemical	4,718
* Gene	43,82
* CellLine	460
* Mutation	117

The linked data LitCovid dataset, along with other datasets, is available
at
https://github.com/fhircat/CORD-19-on-FHIR

A number of SPARQL queries to identify the instances of different
data types are available at:
https://github.com/fhircat/CORD-19-on-FHIR/wiki/LitCovid-Dataset
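
For readers without a SPARQL endpoint at hand, here is a minimal sketch of
the same idea (counting instances per data type) using only the Python
standard library over N-Triples text. It is an illustration, not the
queries from the wiki page: the example.org IRIs, the sample triples, and
the function name count_instances_by_type are all made up here, and the
actual dataset may use different class IRIs and serializations. In SPARQL,
the equivalent generic query would be along the lines of
SELECT ?type (COUNT(DISTINCT ?s) AS ?n) WHERE { ?s a ?type } GROUP BY ?type.

```python
import re
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches one N-Triples statement whose subject, predicate and object
# are all IRIs. Statements with literal or blank-node terms do not match.
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def count_instances_by_type(ntriples_text):
    """Count distinct subjects per rdf:type object in N-Triples text."""
    subjects_by_type = {}
    for line in ntriples_text.splitlines():
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # blank line, comment, or non-IRI term
        subject, predicate, obj = match.groups()
        if predicate == RDF_TYPE:
            subjects_by_type.setdefault(obj, set()).add(subject)
    return Counter({t: len(s) for t, s in subjects_by_type.items()})


# Hypothetical sample data; these IRIs are NOT taken from the real dataset.
EX = "http://example.org/"
SAMPLE = "\n".join([
    f"<{EX}ann1> <{RDF_TYPE}> <{EX}Species> .",
    f"<{EX}ann2> <{RDF_TYPE}> <{EX}Species> .",
    f"<{EX}ann2> <{RDF_TYPE}> <{EX}Species> .",  # duplicate, counted once
    f"<{EX}ann3> <{RDF_TYPE}> <{EX}Disease> .",
    f'<{EX}ann3> <{EX}label> "fever" .',  # literal object, ignored
])

counts = count_instances_by_type(SAMPLE)
for rdf_type, n in counts.most_common():
    print(n, rdf_type)
```

Counting distinct subjects rather than raw triples keeps a repeated
statement from inflating the totals.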



Feel free to let us know if you have any questions.

Sincerely,

Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
and FHIRCat team



On 3/26/20, 5:40 PM, "Jiang, Guoqian, M.D., Ph.D."
<Jiang.Guoqian@mayo.edu> wrote:

>
>We have updated the release of CORD-19-on-FHIR in accordance with the
>latest CORD-19 dataset, and we have expanded the set of semantic
>annotations.
>
>The latest CORD-19 dataset contains metadata on 44,000
>coronavirus-related research articles through 2020-03-20.  Of these,
>29,000 are full text.  The latest CORD-19-on-FHIR release now has
>semantic annotations for:
>
>- Condition: 182231 instances
>- Medication: 32069 instances
>- Procedure: 100260 instances
>
>This release also adds semantic annotations produced by Pubtator:
>
>- Species       2030458 instances
>- Gene          1235829 instances
>- Disease       1036954 instances
>- Chemical      778872 instances
>- CellLine      76816 instances
>- Mutation      33413 instances
>- Strain        26573 instances
>
>More details and download URL are in the original announcement below.
>
>Sincerely,
>
>Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
>and FHIRCat team
>
>
>On 3/19/20, 10:42 PM, "Jiang, Guoqian, M.D., Ph.D."
><Jiang.Guoqian@mayo.edu> wrote:
>
>>
>>  We are pleased to announce an initial version of the CORD-19-on-FHIR
>>>dataset for COVID-19 research, a dataset of 13202 journal articles
>>>relevant to novel coronavirus research.  This dataset extends the
>>>CORD-19 dataset (on which it is based) by adding several semantic
>>>annotations.  It is represented in FHIR RDF to facilitate semantic
>>>linkage with other biomedical datasets.
>>>
>>>CORD-19-on-FHIR dataset currently adds the following semantic
>>>annotations:
>>>
>>>- Conditions: 103,968 instances
>>>- Medications: 16,406 instances
>>>- Procedures:  54,720 instances
>>>
>>>CORD-19-on-FHIR is available on github, and collaboration is invited:
>>>https://github.com/fhircat/CORD-19-on-FHIR
>>>
>>>It is licensed to encourage open COVID-19 research.  See specific terms:
>>>https://github.com/fhircat/CORD-19-on-FHIR/blob/master/LICENSE
>>>
>>>CORD-19-on-FHIR was funded by the FHIRCat research grant, which seeks to
>>>enable the semantics of FHIR and terminologies for clinical and
>>>translational research:
>>>https://github.com/fhircat/FHIRCat
>>>
>>>Sincerely,
>>
>>>Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
>>>and FHIRCat team
>>
>

Received on Wednesday, 8 April 2020 17:40:35 UTC