- From: Jiang, Guoqian, M.D., Ph.D. <Jiang.Guoqian@mayo.edu>
- Date: Wed, 08 Apr 2020 17:40:06 +0000
- To: W3c Semweb HCLS <public-semweb-lifesci@w3.org>
- Cc: Harold Solbrig <solbrig@jhu.edu>, David Booth <david@dbooth.org>, Eric Prud'hommeaux <eric@w3.org>, "Stone, Daniel J." <Stone.Daniel@mayo.edu>, Brian Alper <BAlper@EBSCO.COM>, "Lu, Zhiyong (NIH/NLM/NCBI) [E]" <luzh@ncbi.nlm.nih.gov>
In collaboration with the NLM/NCBI BioNLP Research Group, we have produced a linked data version of the LitCovid dataset with semantic annotations from PubTator.

LitCovid <https://www.ncbi.nlm.nih.gov/research/coronavirus/> is a curated literature hub for tracking up-to-date scientific information about the 2019 novel Coronavirus. It is the most comprehensive resource on the subject, providing central access to 3062 (and growing) relevant articles in PubMed.

The counts of semantic annotations for each data type are as follows:

* Species 34,945
* Disease 29,106
* Chemical 4,718
* Gene 43,82
* CellLine 460
* Mutation 117

The linked data LitCovid dataset, along with other datasets, is available at:
https://github.com/fhircat/CORD-19-on-FHIR

A number of SPARQL queries for identifying the instances of the different data types are available at:
https://github.com/fhircat/CORD-19-on-FHIR/wiki/LitCovid-Dataset

Feel free to let us know if you have any questions.

Sincerely,

Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University), and FHIRCat team

On 3/26/20, 5:40 PM, "Jiang, Guoqian, M.D., Ph.D." <Jiang.Guoqian@mayo.edu> wrote:

>
>We have updated the release of CORD-19-on-FHIR in accordance with the
>latest CORD-19 dataset, and we have expanded the set of semantic
>annotations.
>
>The latest CORD-19 dataset contains metadata on 44,000
>coronavirus-related research articles through 2020-03-20. Of these,
>29,000 are full text. The latest CORD-19-on-FHIR release now has
>semantic annotations for:
>
>- Condition: 182231 instances
>- Medication: 32069 instances
>- Procedure: 100260 instances
>
>This release also adds semantic annotations produced by Pubtator:
>
>- Species 2030458 instances
>- Gene 1235829 instances
>- Disease 1036954 instances
>- Chemical 778872 instances
>- CellLine 76816 instances
>- Mutation 33413 instances
>- Strain 26573 instances
>
>More details and download URL are in the original announcement below.
>
>Sincerely,
>
>Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
>and FHIRCat team
>
>
>On 3/19/20, 10:42 PM, "Jiang, Guoqian, M.D., Ph.D."
><Jiang.Guoqian@mayo.edu> wrote:
>
>>
>>We are pleased to announce an initial version of the CORD-19-on-FHIR
>>dataset for COVID-19 research, a dataset of 13202 journal articles
>>relevant to novel coronavirus research. This dataset extends the
>>CORD-19 dataset (on which it is based) by adding several semantic
>>annotations. It is represented in FHIR RDF to facilitate semantic
>>linkage with other biomedical datasets.
>>
>>CORD-19-on-FHIR dataset currently adds the following semantic
>>annotations:
>>
>>- Conditions: 103,968 instances
>>- Medications: 16,406 instances
>>- Procedures: 54,720 instances
>>
>>CORD-19-on-FHIR is available on github, and collaboration is invited:
>>https://github.com/fhircat/CORD-19-on-FHIR
>>
>>It is licensed to encourage open COVID-19 research. See specific terms:
>>https://github.com/fhircat/CORD-19-on-FHIR/blob/master/LICENSE
>>
>>CORD-19-on-FHIR was funded by the FHIRCat research grant, which seeks to
>>enable the semantics of FHIR and terminologies for clinical and
>>translational research:
>>https://github.com/fhircat/FHIRCat
>>
>>Sincerely,
>>
>>Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
>>and FHIRCat team
>>
Received on Wednesday, 8 April 2020 17:40:35 UTC