- From: M. Scott Marshall <mscottmarshall@gmail.com>
- Date: Wed, 11 Aug 2010 12:32:46 -0700
- To: HCLS <public-semweb-lifesci@w3.org>
FYI

---------- Forwarded message ----------
From: M. Scott Marshall <mscottmarshall@gmail.com>
Date: Mon, Aug 9, 2010 at 10:26 AM
Subject: Fwd: RDF from Atlas
To: Christoph Grabmueller <grabmuel@ebi.ac.uk>, Kei Cheung <kei.cheung@yale.edu>, Satya Sahoo <satyasahoo@gmail.com>, Matthias Samwald <samwald@gmx.at>, crockey@io-informatics.com
Cc: Jun Zhao <jun.zhao@zoo.ox.ac.uk>, Helena Deus <helenadeus@gmail.com>, Rebholz <rebholz@ebi.ac.uk>

Note to BioRDF members:
Christoph Grabmueller provided us with example microarray RDF from Rebholz's text mining group at EBI using EFO (see below). Notice that Christoph would like feedback and guidance. It could be informative to compare our approaches.

Christoph,

[Would you mind if I CC the HCLS mailing list "HCLS" <public-semweb-lifesci@w3.org>? There are many others in HCLS who would like to know about this work and could contribute advice/opinions.]

Thanks for your example RDF. It looks like a good start. The teleconference that Jun and Lena and I had with James was very useful, but several in the BioRDF task force couldn't attend (I organized it during my vacation but several others including Kei were travelling). I'm looking forward to helping each other find a satisfying approach to microarray data in RDF and hopefully arriving at a consensus that results in similar RDF being served directly from ArrayExpress and GEO. It would then be possible to perform some basic bioinformatics work in SPARQL without having to create special ontologies and namespaces.

Maybe this link will help you to understand what we are doing:
http://esw.w3.org/HCLSIG_BioRDF_Subgroup/QueryFederation2

Any questions you may have will help us to improve our wiki page. Some of our latest work is being 'staged' in DropBox at the moment but should be available from the wiki soon.

Cheers,
Scott

-- 
M. Scott Marshall, W3C HCLS IG co-chair
Leiden University Medical Center / University of Amsterdam
http://staff.science.uva.nl/~marshall

---------- Forwarded message ----------
From: Christoph Grabmueller <grabmuel@ebi.ac.uk>
Date: Mon, Jul 19, 2010 at 10:34 AM
Subject: Re: RDF from Atlas
To: James Malone <malone@ebi.ac.uk>
Cc: Dietrich Rebholz-Schuhmann <rebholz@ebi.ac.uk>, "M. Scott Marshall" <mscottmarshall@gmail.com>, Jun Zhao <jun.zhao@zoo.ox.ac.uk>, Helena Deus <helenadeus@gmail.com>, Helen Parkinson <parkinson@ebi.ac.uk>, Misha Kapushesky <ostolop@ebi.ac.uk>

One subtask of SESL is the integration of Gene Expression Atlas data into the RDF-based, so-called information brokering system. I created a very simple representation that covers the needs of the project (differential gene expression under disease conditions) and by no means covers all existing information.

Here is one example in Notation3 (or rather Turtle). Please don't slam me too hard for the relations and name spaces, I'm pretty sure they are all wrong :) Any input as to how to do it properly is welcome.

@prefix ae: <http://www.ebi.ac.uk/gxa/> .
@prefix aeExp: <http://www.ebi.ac.uk/gxa/experiment/> .
@prefix efo: <http://www.ebi.ac.uk/efo/> .
@prefix skos: <http://www.w3.org/2008/05/skos#> .
@prefix uniprot: <http://purl.uniprot.org/uniprot/> .

aeExp:E-GEOD-1869 ae:hasGeneExpression
    [ae:condition efo:EFO_0000319;
     skos:exactMatch uniprot:P30542;
     ae:expression "DOWN";
     ae:pval 0.00503674] .

Doing this for all 343 disease factors (under EFO_0000408) produces around 180k triples. So far, so good.

Now I only have to link to UMLS, which is the basis for disease named entity recognition in our group. Should be simple since the disease bits of EFO are based on the Disease Ontology, and DO references UMLS.
But the efo.owl looks like this:

<efo:definition_citation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DOID:5485</efo:definition_citation>

and to be able to enjoy any semanticness, I have to convert those citations to this (probably not necessary for future versions of EFO):

<efo:definition_citation rdf:resource="http://purl.org/obo/owl/DOID#DOID_5485"/>

The do.owl can be used directly to get to the UMLS CUIs, and after creating my own RDF version of UMLS which also includes references in the format DO decided to use for UMLS (http://purl.org/obo/owl/UMLS_CUI#UMLS_CUI_C0010674), I can query for differentially expressed genes via UMLS CUIs and disease strings. A federated query example using Jena's ARQ engine is further below.

In this whole process only the DO could be used as it was, but only after adapting my UMLS representation to fit its needs; and I had to create two RDF representations from scratch. The life sciences semantic web certainly has room for improvement...

Christoph

PREFIX dc:<http://purl.org/dc/elements/1.1/>
PREFIX owl:<http://www.w3.org/2002/07/owl#>
PREFIX oboInOwl:<http://www.geneontology.org/formats/oboInOwl#>
PREFIX efo:<http://www.ebi.ac.uk/efo/>
PREFIX umls:<http://umlsks.nlm.nih.gov/>
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
PREFIX pdo:<http://purl.org/obo/owl/DOID#>
PREFIX ae:<http://www.ebi.ac.uk/gxa/>
PREFIX skos:<http://www.w3.org/2008/05/skos#>

select distinct ?experiment ?uniprot ?updown ?pval
where {
  service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/umls> {
    ?umls dc:name "Cystic Fibrosis" . #umls:C0010674
    ?umls owl:sameAs ?umlsuri .
  }
  service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/do> {
    ?do oboInOwl:hasDbXref ?refblank .
    ?refblank oboInOwl:hasURI ?umlsuri .
  }
  service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/efo> {
    ?efo efo:definition_citation ?do .
  }
  service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/arrayexpress> {
    ?expression ae:condition ?efo .
    ?expression skos:exactMatch ?uniprot .
    ?expression ae:expression ?updown .
    ?expression ae:pval ?pval .
    ?experiment ae:hasGeneExpression ?expression .
  }
}

James Malone wrote:
>
> Hi Dietrich, Christoph,
>
> On an HCLS [1] call today we discussed an RDF representation of some of the Gene Expression Atlas. Scott, Jun and Lena were very interested to hear you had been working on producing some of this already in one of your other projects and since this represents the most RDF we have about the Atlas right now I thought I would put you guys in touch with one another. Their use cases can be found here if you are interested [2]. They are particularly interested in obtaining any RDF that you may have extracted using EFO.
>
> Many thanks,
>
> James
>
> [1] http://www.w3.org/blog/hcls
> [2] http://esw.w3.org/HCLSIG_BioRDF_Subgroup/QueryFederation2
>
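
Note on the citation conversion described in Christoph's message: the step of turning string-typed efo:definition_citation values such as DOID:5485 into rdf:resource references might look roughly like the Python sketch below. This is only an illustration, not the script Christoph used. The URI pattern and the element shape are taken from his message; the function names are hypothetical, and the regex-based rewrite assumes the element is serialized exactly as quoted there. An RDF library would be a more robust choice for real data.

```python
import re

# Base URI taken from the example in the message; everything else is illustrative.
DOID_BASE = "http://purl.org/obo/owl/DOID#"


def doid_citation_to_uri(citation: str) -> str:
    """Map a literal such as 'DOID:5485' to the DO URI form shown in the message."""
    match = re.fullmatch(r"DOID:(\d+)", citation.strip())
    if match is None:
        raise ValueError(f"not a DOID citation: {citation!r}")
    return f"{DOID_BASE}DOID_{match.group(1)}"


# Assumption: the string-typed element appears exactly as quoted in the message.
_LITERAL_CITATION = re.compile(
    r'<efo:definition_citation '
    r'rdf:datatype="http://www\.w3\.org/2001/XMLSchema#string">'
    r'(DOID:\d+)</efo:definition_citation>'
)


def rewrite_citations(rdf_xml: str) -> str:
    """Replace DOID string literals with rdf:resource references.

    Citations that are not DOID identifiers are left untouched.
    """
    return _LITERAL_CITATION.sub(
        lambda m: '<efo:definition_citation rdf:resource="'
        + doid_citation_to_uri(m.group(1))
        + '"/>',
        rdf_xml,
    )
```

Calling rewrite_citations on the contents of efo.owl would return the text with only the DOID citations changed, so other definition_citation values (for example, references to other sources) stay as string literals.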
Received on Wednesday, 11 August 2010 19:33:19 UTC