- From: Sudeshna Das <sudeshna_das@harvard.edu>
- Date: Wed, 11 Aug 2010 17:23:11 -0400
- To: "M. Scott Marshall" <mscottmarshall@gmail.com>
- Cc: HCLS IG <public-semweb-lifesci@w3.org>, Stéphane Corlosquet <scorlosquet@gmail.com>
Hi Scott,

Thank you for forwarding this thread. We have developed a repository for stem cell microarray data and are currently in the process of exporting RDF data. We plan to export only the metadata of the experiment, not the expression data or fold change, although that data also exists in a structured format and could be exported as RDF as well. It would be a nice exercise to check the interoperability of the two data repositories.

We talked to Helen P and James M about using EFO to describe the biomaterial characteristics and submitting term requests for missing ones.

Sudeshna

On Aug 11, 2010, at 3:32 PM, M. Scott Marshall wrote:

> FYI
>
> ---------- Forwarded message ----------
> From: M. Scott Marshall <mscottmarshall@gmail.com>
> Date: Mon, Aug 9, 2010 at 10:26 AM
> Subject: Fwd: RDF from Atlas
> To: Christoph Grabmueller <grabmuel@ebi.ac.uk>, Kei Cheung <kei.cheung@yale.edu>, Satya Sahoo <satyasahoo@gmail.com>, Matthias Samwald <samwald@gmx.at>, crockey@io-informatics.com
> Cc: Jun Zhao <jun.zhao@zoo.ox.ac.uk>, Helena Deus <helenadeus@gmail.com>, Rebholz <rebholz@ebi.ac.uk>
>
> Note to BioRDF members: Christoph Grabmueller provided us with example microarray RDF from Rebholz's text mining group at EBI using EFO (see below). Notice that Christoph would like feedback and guidance. It could be informative to compare our approaches.
>
> Christoph,
>
> [Would you mind if I CC the HCLS mailing list "HCLS" <public-semweb-lifesci@w3.org>? There are many others in HCLS that would like to know about this work and could contribute advice/opinions.]
>
> Thanks for your example RDF. It looks like a good start. The teleconference that Jun and Lena and I had with James was very useful, but several in the BioRDF task force couldn't attend (I organized it during my vacation but several others including Kei were travelling).
> I'm looking forward to helping each other find a satisfying approach to microarray data in RDF and hopefully arriving at a consensus that results in similar RDF being served directly from ArrayExpress and GEO. It would then be possible to perform some basic bioinformatics work in SPARQL without having to create special ontologies and namespaces.
>
> Maybe this link will help you to understand what we are doing:
> http://esw.w3.org/HCLSIG_BioRDF_Subgroup/QueryFederation2
> Any questions you may have will help us to improve our wiki page. Some of our latest work is being 'staged' in DropBox at the moment but should be available from the wiki soon.
>
> Cheers,
> Scott
>
> --
> M. Scott Marshall, W3C HCLS IG co-chair
> Leiden University Medical Center / University of Amsterdam
> http://staff.science.uva.nl/~marshall
>
> ---------- Forwarded message ----------
> From: Christoph Grabmueller <grabmuel@ebi.ac.uk>
> Date: Mon, Jul 19, 2010 at 10:34 AM
> Subject: Re: RDF from Atlas
> To: James Malone <malone@ebi.ac.uk>
> Cc: Dietrich Rebholz-Schuhmann <rebholz@ebi.ac.uk>, "M. Scott Marshall" <mscottmarshall@gmail.com>, Jun Zhao <jun.zhao@zoo.ox.ac.uk>, Helena Deus <helenadeus@gmail.com>, Helen Parkinson <parkinson@ebi.ac.uk>, Misha Kapushesky <ostolop@ebi.ac.uk>
>
> One subtask of SESL is the integration of Gene Expression Atlas data into the RDF-based, so-called information brokering system. I created a very simple representation that covers the needs of the project (differential gene expression under disease conditions) and by far doesn't cover all existing information.
>
> Here is one example in Notation3 (or rather Turtle). Please don't slam me too hard for the relations and namespaces, I'm pretty sure they are all wrong :) Any input as to how to do it properly is welcome.
>
> @prefix ae: <http://www.ebi.ac.uk/gxa/> .
> @prefix aeExp: <http://www.ebi.ac.uk/gxa/experiment/> .
> @prefix efo: <http://www.ebi.ac.uk/efo/> .
> @prefix skos: <http://www.w3.org/2008/05/skos#> .
> @prefix uniprot: <http://purl.uniprot.org/uniprot/> .
>
> aeExp:E-GEOD-1869 ae:hasGeneExpression [ae:condition efo:EFO_0000319; skos:exactMatch uniprot:P30542; ae:expression "DOWN"; ae:pval 0.00503674] .
>
> Doing this for all 343 disease factors (under EFO_0000408) produces around 180k triples. So far, so good. Now I only have to link to UMLS, which is the basis for disease named entity recognition in our group. Should be simple since the disease bits of EFO are based on the Disease Ontology, and DO references UMLS.
>
> But the efo.owl looks like this:
> <efo:definition_citation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DOID:5485</efo:definition_citation>
> and to be able to enjoy any semanticness, I have to convert those citations to this (probably not necessary for future versions of EFO):
> <efo:definition_citation rdf:resource="http://purl.org/obo/owl/DOID#DOID_5485"/>
> The do.owl can be used directly to get to the UMLS CUIs, and after creating my own RDF version of UMLS which also includes references in the format DO decided to use for UMLS (http://purl.org/obo/owl/UMLS_CUI#UMLS_CUI_C0010674), I can query for differentially expressed genes via UMLS CUIs and disease strings. Federated query example using Jena's ARQ engine further below.
>
> In this whole process only the DO could be used as it was, but only after adapting my UMLS representation to fit its needs; and I had to create two RDF representations from scratch. The life sciences semantic web certainly has room for improvement...
>
> Christoph
>
> PREFIX dc:<http://purl.org/dc/elements/1.1/>
> PREFIX owl:<http://www.w3.org/2002/07/owl#>
> PREFIX oboInOwl:<http://www.geneontology.org/formats/oboInOwl#>
> PREFIX efo:<http://www.ebi.ac.uk/efo/>
> PREFIX umls:<http://umlsks.nlm.nih.gov/>
> PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
> PREFIX pdo:<http://purl.org/obo/owl/DOID#>
> PREFIX ae:<http://www.ebi.ac.uk/gxa/>
> PREFIX skos:<http://www.w3.org/2008/05/skos#>
>
> select distinct ?experiment ?uniprot ?updown ?pval where {
>   service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/umls> {
>     ?umls dc:name "Cystic Fibrosis" . #umls:C0010674
>     ?umls owl:sameAs ?umlsuri .
>   }
>   service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/do> {
>     ?do oboInOwl:hasDbXref ?refblank .
>     ?refblank oboInOwl:hasURI ?umlsuri .
>   }
>   service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/efo> {
>     ?efo efo:definition_citation ?do .
>   }
>   service <http://jweb-2b:21380/Rebholz-srv/openrdf-sesame/repositories/arrayexpress> {
>     ?expression ae:condition ?efo .
>     ?expression skos:exactMatch ?uniprot .
>     ?expression ae:expression ?updown .
>     ?expression ae:pval ?pval .
>     ?experiment ae:hasGeneExpression ?expression .
>   }
> }
>
> James Malone wrote:
>>
>> Hi Dietrich, Christoph,
>>
>> On an HCLS [1] call today we discussed an RDF representation of some of the Gene Expression Atlas. Scott, Jun and Lena were very interested to hear you had been working on producing some of this already in one of your other projects and since this represents the most RDF we have about the Atlas right now I thought I would put you guys in touch with one another. Their use cases can be found here if you are interested [2]. They are particularly interested in obtaining any rdf that you may have extracted using EFO.
>>
>> Many thanks,
>>
>> James
>>
>> [1] http://www.w3.org/blog/hcls
>> [2] http://esw.w3.org/HCLSIG_BioRDF_Subgroup/QueryFederation2
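
The citation rewrite Christoph describes in the quoted message, turning the string literal DOID:5485 into the resource http://purl.org/obo/owl/DOID#DOID_5485, could be sketched in Python roughly as follows. This is a minimal sketch and not the code he used: the function name and the regular expression are assumptions made for illustration, and only the input and output formats are taken from the message.

```python
import re

# Namespace as it appears in the quoted message; everything else is illustrative.
DOID_NS = "http://purl.org/obo/owl/DOID#"

# Hypothetical pattern for citation literals of the form "DOID:<digits>".
_DOID_CITATION = re.compile(r"^DOID:(\d+)$")


def doid_citation_to_uri(citation):
    """Return the DO resource URI for a 'DOID:<digits>' literal, else None."""
    match = _DOID_CITATION.match(citation.strip())
    if match is None:
        return None  # not a DOID citation, so the caller should leave it as a literal
    return f"{DOID_NS}DOID_{match.group(1)}"


print(doid_citation_to_uri("DOID:5485"))
```

Citations from other sources are returned as None so that a caller can keep them as plain string literals instead of guessing a URI scheme for them.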
Received on Wednesday, 11 August 2010 21:23:44 UTC