Re: [EXTERNAL] Re: Announcing CORD-19-on-FHIR: A FHIR RDF dataset for COVID-19 research

On 3/29/20 5:36 PM, Jiang, Guoqian, M.D., Ph.D. wrote:
> Hi, Kingsley,
>
> Thanks much for your notes.
>
> We have updated the README.md file, adding links to the turtle file
> examples and SPARQL query examples. Hope this would be helpful for
> identifying URLs for the datasets. Feel free to let us know if you need
> any specific information about the datasets.
>
> Thanks,
>
> -Guoqian


Hi Guoqian,

We've loaded the COVID-19 research dataset into our LOD Cloud cache.

Here are some query examples, based on tweaks of your sample queries [1]:

[1] Diseases and MESH Cross References
<http://lod.openlinksw.com/sparql?default-graph-uri=&query=PREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+fhir%3A+%3Chttp%3A%2F%2Fhl7.org%2Ffhir%2F%3E%0D%0APREFIX+pmc%3A+%3Chttps%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%23%3E%0D%0A%0D%0Aselect+distinct+iri%28concat%28%27https%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fmesh%2F%27%2Creplace%28%3Fpmc_identifier%2C%27MESH%3A%27%2C%27%27%29%2C%27%23%27%29%29+as+%3Fmesh_id+%0D%0A++++++++++++++++%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%28count%28%3Ftext%29+as+%3Fcount%29+%3Fg%0D%0Awhere+%7B+%0D%0A++++graph+%3Fg+%7B%3Fpmc+pmc%3Aannotations+%3Fannotations+.%0D%0A++++%3Fannotations+pmc%3Ainfons+%3Fpmc_infons+.%0D%0A++++%3Fpmc_infons+pmc%3Aidentifier+%3Fpmc_identifier+.%0D%0A++++%3Fpmc_infons+pmc%3Atype+%3Fpmc_type+.%0D%0A++++%3Fannotations+pmc%3Atext+%3Ftext+.%0D%0A++++FILTER+%28%3Fpmc_type%3D%27Disease%27%29.+%7D%0D%0A++++%0D%0A++++%0D%0A%7D+%0D%0Agroup+by+%3Fg+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%0D%0Aorder+by+DESC%28%3Fcount%29%0D%0Alimit+100&format=text%2Fhtml&timeout=30000&debug=on&run=+Run+Query+>
-- note that I had to mint a Linked Data URI for live MESH cross references
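
For anyone who wants to reproduce the minting step outside of SPARQL, here is a minimal Python sketch of what the iri(concat(..., replace(?pmc_identifier,'MESH:',''), '#')) expression in the query does. The function name and the sample identifier are illustrative assumptions, not part of the dataset or of any library.

```python
MESH_BASE = "https://www.ncbi.nlm.nih.gov/mesh/"


def mint_mesh_uri(pmc_identifier: str) -> str:
    """Mirror the query's concat(base, replace(id, 'MESH:', ''), '#').

    Hypothetical helper for illustration only.
    """
    # Drop the 'MESH:' prefix, then wrap the remainder in the base URI
    # and a trailing '#', as the SPARQL expression does.
    return MESH_BASE + pmc_identifier.replace("MESH:", "") + "#"


# Sample identifier chosen purely for illustration.
print(mint_mesh_uri("MESH:D007239"))
# prints https://www.ncbi.nlm.nih.gov/mesh/D007239#
```

The Gene and Mutation queries follow the same pattern with a different base URI.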

[2] Source Query Text for the above
<http://lod.openlinksw.com/sparql?default-graph-uri=&qtxt=PREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+fhir%3A+%3Chttp%3A%2F%2Fhl7.org%2Ffhir%2F%3E%0D%0APREFIX+pmc%3A+%3Chttps%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%23%3E%0D%0A%0D%0Aselect+distinct+iri%28concat%28%27https%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fmesh%2F%27%2Creplace%28%3Fpmc_identifier%2C%27MESH%3A%27%2C%27%27%29%2C%27%23%27%29%29+as+%3Fmesh_id+%0D%0A++++++++++++++++%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%28count%28%3Ftext%29+as+%3Fcount%29+%3Fg%0D%0Awhere+%7B+%0D%0A++++graph+%3Fg+%7B%3Fpmc+pmc%3Aannotations+%3Fannotations+.%0D%0A++++%3Fannotations+pmc%3Ainfons+%3Fpmc_infons+.%0D%0A++++%3Fpmc_infons+pmc%3Aidentifier+%3Fpmc_identifier+.%0D%0A++++%3Fpmc_infons+pmc%3Atype+%3Fpmc_type+.%0D%0A++++%3Fannotations+pmc%3Atext+%3Ftext+.%0D%0A++++FILTER+%28%3Fpmc_type%3D%27Disease%27%29.+%7D%0D%0A++++%0D%0A++++%0D%0A%7D+%0D%0Agroup+by+%3Fg+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%0D%0Aorder+by+DESC%28%3Fcount%29%0D%0Alimit+100&format=text%2Fhtml&timeout=30000&debug=on&run=+Run+Query+>
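
Since the links above carry the query text URL-encoded, a readable form may help. The sketch below holds the Disease query as decoded from the link (whitespace tidied), then shows one way to rebuild a results link and to recover the query text from a link, using only the Python standard library. The helper names are assumptions for illustration; the parameter names (default-graph-uri, query, qtxt, format, timeout) are the ones visible in the links.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://lod.openlinksw.com/sparql"

# Disease query as decoded from the link above (whitespace tidied).
DISEASE_QUERY = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX fhir: <http://hl7.org/fhir/>
PREFIX pmc: <https://www.ncbi.nlm.nih.gov/pmc/articles#>

select distinct iri(concat('https://www.ncbi.nlm.nih.gov/mesh/',replace(?pmc_identifier,'MESH:',''),'#')) as ?mesh_id
                ?pmc_type ?pmc_identifier ?text (count(?text) as ?count) ?g
where {
    graph ?g {?pmc pmc:annotations ?annotations .
    ?annotations pmc:infons ?pmc_infons .
    ?pmc_infons pmc:identifier ?pmc_identifier .
    ?pmc_infons pmc:type ?pmc_type .
    ?annotations pmc:text ?text .
    FILTER (?pmc_type='Disease'). }
}
group by ?g ?pmc_type ?pmc_identifier ?text
order by DESC(?count)
limit 100"""


def results_link(sparql: str) -> str:
    """Build a results link for the endpoint (hypothetical helper)."""
    params = {
        "default-graph-uri": "",
        "query": sparql,
        "format": "text/html",
        "timeout": "30000",
    }
    return ENDPOINT + "?" + urlencode(params)


def query_text(link: str) -> str:
    """Recover the SPARQL text from a link's `query` or `qtxt` parameter."""
    params = parse_qs(urlsplit(link).query, keep_blank_values=True)
    for name in ("query", "qtxt"):
        if name in params:
            # Browser-submitted forms use CRLF line endings; normalise them.
            return params[name][0].replace("\r\n", "\n")
    raise ValueError("no SPARQL text found in link")


link = results_link(DISEASE_QUERY)
print(link)
print(query_text(link))
```

Swapping 'Disease' for 'Gene' or 'Mutation' in the FILTER, along with the base URI in concat(), gives the other queries listed in this message.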


[3] Top Gene Instances and NCBI Cross References
<http://lod.openlinksw.com/sparql?default-graph-uri=&query=%0D%0A%0D%0APREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+fhir%3A+%3Chttp%3A%2F%2Fhl7.org%2Ffhir%2F%3E%0D%0APREFIX+pmc%3A+%3Chttps%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%23%3E%0D%0Aselect+distinct+iri%28concat%28%27https%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgene%2F%27%2Creplace%28%3Fpmc_identifier%2C%27MESH%3A%27%2C%27%27%29%2C%27%23%27%29%29+as+%3Fncbi_id%0D%0A++++++++++++++++%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%28count%28%3Ftext%29+as+%3Fcount%29+%0D%0Awhere+%7B+%0D%0A++++%3Fpmc+pmc%3Aannotations+%3Fannotations+.%0D%0A++++%3Fannotations+pmc%3Ainfons+%3Fpmc_infons+.%0D%0A++++%3Fpmc_infons+pmc%3Aidentifier+%3Fpmc_identifier+.%0D%0A++++%3Fpmc_infons+pmc%3Atype+%3Fpmc_type+.%0D%0A++++%3Fannotations+pmc%3Atext+%3Ftext+.%0D%0A++++FILTER+%28%3Fpmc_type%3D%27Gene%27%29.%0D%0A++++%0D%0A++++%0D%0A%7D+%0D%0Agroup+by+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%0D%0Aorder+by+DESC%28%3Fcount%29%0D%0Alimit+100&format=text%2Fhtml&timeout=30000&debug=on&run=+Run+Query+>
-- I had to mint a Linked Data URI for live NCBI cross references

[4] Source Query Text for the above
<http://lod.openlinksw.com/sparql?default-graph-uri=&qtxt=%0D%0A%0D%0APREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+fhir%3A+%3Chttp%3A%2F%2Fhl7.org%2Ffhir%2F%3E%0D%0APREFIX+pmc%3A+%3Chttps%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%23%3E%0D%0Aselect+distinct+iri%28concat%28%27https%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fgene%2F%27%2Creplace%28%3Fpmc_identifier%2C%27MESH%3A%27%2C%27%27%29%2C%27%23%27%29%29+as+%3Fncbi_id%0D%0A++++++++++++++++%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%28count%28%3Ftext%29+as+%3Fcount%29+%0D%0Awhere+%7B+%0D%0A++++%3Fpmc+pmc%3Aannotations+%3Fannotations+.%0D%0A++++%3Fannotations+pmc%3Ainfons+%3Fpmc_infons+.%0D%0A++++%3Fpmc_infons+pmc%3Aidentifier+%3Fpmc_identifier+.%0D%0A++++%3Fpmc_infons+pmc%3Atype+%3Fpmc_type+.%0D%0A++++%3Fannotations+pmc%3Atext+%3Ftext+.%0D%0A++++FILTER+%28%3Fpmc_type%3D%27Gene%27%29.%0D%0A++++%0D%0A++++%0D%0A%7D+%0D%0Agroup+by+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%0D%0Aorder+by+DESC%28%3Fcount%29%0D%0Alimit+100&format=text%2Fhtml&timeout=30000&debug=on&run=+Run+Query+>

[5] Top Mutation Instances and Gene Card References
<http://lod.openlinksw.com/sparql?default-graph-uri=&query=%0D%0A%0D%0APREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+fhir%3A+%3Chttp%3A%2F%2Fhl7.org%2Ffhir%2F%3E%0D%0APREFIX+pmc%3A+%3Chttps%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%23%3E%0D%0Aselect+distinct+iri%28concat%28%27https%3A%2F%2Fwww.genecards.org%2FSearch%2FKeyword%3FqueryString%3D%27%2Creplace%28%3Fpmc_identifier%2C%27MESH%3A%27%2C%27%27%29%2C%27%23%27%29%29+as+%3Fgene_card_id+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%28count%28%3Ftext%29+as+%3Fcount%29+where+%7B+%0D%0A++++%3Fpmc+pmc%3Aannotations+%3Fannotations+.%0D%0A++++%3Fannotations+pmc%3Ainfons+%3Fpmc_infons+.%0D%0A++++%3Fpmc_infons+pmc%3Aidentifier+%3Fpmc_identifier+.%0D%0A++++%3Fpmc_infons+pmc%3Atype+%3Fpmc_type+.%0D%0A++++%3Fannotations+pmc%3Atext+%3Ftext+.%0D%0A++++FILTER+%28%3Fpmc_type%3D%27Mutation%27%29.%0D%0A++++%0D%0A++++%0D%0A%7D+%0D%0Agroup+by+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%0D%0Aorder+by+DESC%28%3Fcount%29%0D%0Alimit+100&format=text%2Fhtml&timeout=30000&debug=on&run=+Run+Query+>
-- I had to mint a Linked Data URI for live Gene Card cross references

[6] Source Query Text for the above
<http://lod.openlinksw.com/sparql?default-graph-uri=&qtxt=%0D%0A%0D%0APREFIX+rdf%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23%3E%0D%0APREFIX+fhir%3A+%3Chttp%3A%2F%2Fhl7.org%2Ffhir%2F%3E%0D%0APREFIX+pmc%3A+%3Chttps%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%23%3E%0D%0Aselect+distinct+iri%28concat%28%27https%3A%2F%2Fwww.genecards.org%2FSearch%2FKeyword%3FqueryString%3D%27%2Creplace%28%3Fpmc_identifier%2C%27MESH%3A%27%2C%27%27%29%2C%27%23%27%29%29+as+%3Fgene_card_id+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%28count%28%3Ftext%29+as+%3Fcount%29+where+%7B+%0D%0A++++%3Fpmc+pmc%3Aannotations+%3Fannotations+.%0D%0A++++%3Fannotations+pmc%3Ainfons+%3Fpmc_infons+.%0D%0A++++%3Fpmc_infons+pmc%3Aidentifier+%3Fpmc_identifier+.%0D%0A++++%3Fpmc_infons+pmc%3Atype+%3Fpmc_type+.%0D%0A++++%3Fannotations+pmc%3Atext+%3Ftext+.%0D%0A++++FILTER+%28%3Fpmc_type%3D%27Mutation%27%29.%0D%0A++++%0D%0A++++%0D%0A%7D+%0D%0Agroup+by+%3Fpmc_type+%3Fpmc_identifier+%3Ftext+%0D%0Aorder+by+DESC%28%3Fcount%29%0D%0Alimit+100&format=text%2Fhtml&timeout=30000&debug=on&run=+Run+Query+>


Links:

[1] https://github.com/fhircat/CORD-19-on-FHIR/wiki/PubTator-Dataset --
Your Query Examples Page.


Kingsley

>
> On 3/28/20, 5:40 PM, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>
>> On 3/26/20 6:40 PM, Jiang, Guoqian, M.D., Ph.D. wrote:
>>> We have updated the release of CORD-19-on-FHIR in accordance with the
>>> latest CORD-19 dataset, and we have expanded the set of semantic
>>> annotations.
>>>
>>> The latest CORD-19 dataset contains metadata on 44,000
>>> coronavirus-related research articles through 2020-03-20.  Of these,
>>> 29,000 are full text.  The latest CORD-19-on-FHIR release now has
>>> semantic annotations for:
>>>
>>> - Condition: 182231 instances
>>> - Medication: 32069 instances
>>> - Procedure: 100260 instances
>>>
>>> This release also adds semantic annotations produced by Pubtator:
>>>
>>> - Species       2030458 instances
>>> - Gene          1235829 instances
>>> - Disease       1036954 instances
>>> - Chemical      778872 instances
>>> - CellLine      76816 instances
>>> - Mutation      33413 instances
>>> - Strain        26573 instances
>>>
>>> More details and download URL are in the original announcement below.
>>>
>>> Sincerely,
>>>
>>> Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
>>> and FHIRCat team
>>
>> Hi Guoqian,
>>
>> This is a great contribution!
>>
>> BTW -- would you be able to add a little clarity to the URLs that
>> identify RDF documents so that we can quickly load into various LOD
>> Cloud instances? 
>>
>> For example identify the RDF docs (or their zip archive) URLs associated
>> with:
>>
>> - Species       2030458 instances
>> - Gene          1235829 instances
>> - Disease       1036954 instances
>> - Chemical      778872 instances
>> - CellLine      76816 instances
>> - Mutation      33413 instances
>> - Strain        26573 instances
>>
>>
>> Kingsley
>>
>>
>>>
>>> On 3/19/20, 10:42 PM, "Jiang, Guoqian, M.D., Ph.D."
>>> <Jiang.Guoqian@mayo.edu> wrote:
>>>
>>>>> We are pleased to announce an initial version of the CORD-19-on-FHIR
>>>>> dataset for COVID-19 research, a dataset of 13202 journal articles
>>>>> relevant to novel coronavirus research.  This dataset extends the
>>>>> CORD-19 dataset (on which it is based) by adding several semantic
>>>>> annotations.  It is represented in FHIR RDF to facilitate semantic
>>>>> linkage with other biomedical datasets.
>>>>>
>>>>> CORD-19-on-FHIR dataset currently adds the following semantic
>>>>> annotations:
>>>>>
>>>>> - Conditions: 103,968 instances
>>>>> - Medications: 16,406 instances
>>>>> - Procedures:  54,720 instances
>>>>>
>>>>> CORD-19-on-FHIR is available on github, and collaboration is invited:
>>>>> https://github.com/fhircat/CORD-19-on-FHIR
>>>>>
>>>>> It is licensed to encourage open COVID-19 research.  See specific
>>>>> terms:
>>>>> https://github.com/fhircat/CORD-19-on-FHIR/blob/master/LICENSE
>>>>>
>>>>> CORD-19-on-FHIR was funded by the FHIRCat research grant, which seeks
>>>>> to enable the semantics of FHIR and terminologies for clinical and
>>>>> translational research:
>>>>> https://github.com/fhircat/FHIRCat
>>>>>
>>>>> Sincerely,
>>>>> Guoqian Jiang (Mayo Clinic), Harold Solbrig (Johns Hopkins University),
>>>>> and FHIRCat team

-- 
Regards,

Kingsley Idehen       
Founder & CEO 
OpenLink Software   
Home Page: http://www.openlinksw.com
Community Support: https://community.openlinksw.com
Weblogs (Blogs):
Company Blog: https://medium.com/openlink-software-blog
Virtuoso Blog: https://medium.com/virtuoso-blog
Data Access Drivers Blog: https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

Personal Weblogs (Blogs):
Medium Blog: https://medium.com/@kidehen
Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
              http://kidehen.blogspot.com

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
        : http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this

Received on Monday, 30 March 2020 16:04:38 UTC