Announcing dataset "CORD-19 Named Entities KG": an RDF dataset of named entities identified in the CORD-19 corpus

From: Franck Michel <franck.michel@cnrs.fr>
Date: Thu, 2 Apr 2020 12:18:30 +0200
To: public-lod <public-lod@w3.org>, semantic-web@w3.org, rda-covid19@rda-groups.org
Message-ID: <5d22f409-e1f1-4b61-685a-4866ba4b3b05@cnrs.fr>
Dear colleagues,

In order to foster innovative work based on the cross-linking of 
COVID-19 literature with the Data Web, we (Wimmics team, Inria 
<https://team.inria.fr/wimmics/>) are in the process of generating an 
RDF dataset describing the named entities identified in the research 
papers of the CORD-19 
<https://pages.semanticscholar.org/coronavirus-research> corpus.

To identify and disambiguate the named entities, we are using NCBO 
BioPortal annotator <http://bioportal.bioontology.org/annotatorplus>, 
Entity-fishing <https://github.com/kermitt2/entity-fishing> (links to 
Wikidata) and DBpedia Spotlight <https://www.dbpedia-spotlight.org/> 
(links to DBpedia). We are also linking to related work such as 
CORD-19-on-FHIR <https://github.com/fhircat/CORD-19-on-FHIR> and 
COVID-19 Literature KG 
<https://www.kaggle.com/group16/covid19-literature-knowledge-graph>.

We will release this dataset soon, both as an RDF dump and through a 
dedicated SPARQL endpoint. Stay tuned!

Regards,
     Franck.
-- 
Franck MICHEL - CNRS research engineer
Université Côte d’Azur, CNRS, Inria
I3S laboratory (UMR 7271)
franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr> - +33 (0)4 8915 4277 	
Received on Thursday, 2 April 2020 10:18:48 UTC
