RE: GXA triples

hi helen,

good point!

the actual purpose of the note/paper, i think, is to delineate this.

i think the main use case for this is to provide provenance about the
experiment so that various searches on the facets of the experiment are
possible, such as from a similar lab, similar conditions, similar dates,
same species, same experimental factors and so on.  i actually don't see a
reason for not capturing all the idf and sdrf information through a
transformation to rdf (i certainly don't think you want to store two
representations).  you're right, it would be quite a project to
build a sparql endpoint that would take the sparql, translate it to a
query against the native store, then send the results back in a triple
representation, so thinking about use cases is the practical approach.
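
to make the idea of a transformation concrete, here is a minimal sketch in
python, under stated assumptions: the `http://example.org/mage-tab/` namespace,
the `partOf` predicate, the one-subject-per-row layout and the function name
`sdrf_to_ntriples` are all made up for illustration and are not an ArrayExpress
or GXA scheme.  it only shows that each sdrf column can become a predicate and
each cell a literal.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace, for illustration only; not a real ArrayExpress/GXA URI scheme.
BASE = "http://example.org/mage-tab/"


def sdrf_to_ntriples(sdrf_text, experiment_id):
    """Turn tab-delimited SDRF text into a list of N-Triples lines.

    Each data row becomes one subject, each column header one predicate,
    and each non-empty cell a plain string literal.
    """
    exp_uri = f"<{BASE}experiment/{quote(experiment_id)}>"
    triples = []
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{BASE}experiment/{quote(experiment_id)}/row/{row_number}>"
        # Link the row back to its experiment (assumed predicate).
        triples.append(f"{subject} <{BASE}partOf> {exp_uri} .")
        for column, value in row.items():
            # Skip empty cells and any overflow cells without a header.
            if column is None or not value:
                continue
            predicate = f"<{BASE}column/{quote(column, safe='')}>"
            # Escape backslashes and double quotes for an N-Triples literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


if __name__ == "__main__":
    sample = (
        "Source Name\tCharacteristics[Organism]\tFactor Value[Compound]\n"
        "s1\tHomo sapiens\taspirin\n"
    )
    for line in sdrf_to_ntriples(sample, "E-TEST-1"):
        print(line)
```

a real mapping would use ontology terms (e.g. from EFO) for the predicates and
values rather than column names and string literals; the sketch keeps literals
only so that it stays self-contained.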

but what i think we're looking for first for the note/paper is thoughts on
how the RDF representation would look.  the BioRDF group took a stab at
that with our provenance paper (the pdf power point is particularly good),
http://ceur-ws.org/Vol-670/paper_6.pdf,
https://wiki.nbic.nl/images/a/ab/W3C_BioRDF_microarray_provenance.pdf and
http://www.w3.org/wiki/HCLSIG_BioRDF_Subgroup/MicroarrayExperimentContext.

cheers,
michael

> -----Original Message-----
> From: Helen Parkinson [mailto:parkinson@ebi.ac.uk]
> Sent: Friday, July 08, 2011 7:44 AM
> To: Michael Miller
> Cc: tomasz.adamusiak@ebi.ac.uk; public-semweb-lifesci@w3.org
> Subject: Re: GXA triples
>
> Hi Michael
>
> it would help to prioritize what to represent if we had a set of
> concrete use cases for the points you mention. We can generate RDF for
> lots of elements of the experiments of course -
>
> cheers
>
> Helen
>
> On 08/07/2011 15:37, Michael Miller wrote:
> > hi tomasz,
> >
> > thanks for the update, nice!  and again to jun for hosting.
> >
> > the documentation is appreciated, interestingly enough, in the diagram, it
> > looks like it is only the relationship between the experiment, the set of
> > sequences and their differential measurements per experimental factor
> > value.  is there an overall, across the factors, differential measurement
> > (e.g. limma) or am i missing something?
> >
> > are you also planning on RDFizing the information on the experiment (the
> > idf and sdrf info) that is in ArrayExpress?
> >
> > cheers,
> > michael
> >
> > Michael Miller
> > Software Engineer
> > Institute for Systems Biology
> >
> >> -----Original Message-----
> >> From: public-semweb-lifesci-request@w3.org [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Tomasz Adamusiak
> >> Sent: Friday, July 08, 2011 3:57 AM
> >> To: public-semweb-lifesci@w3.org
> >> Subject: GXA triples
> >>
> >> Hi,
> >>
> >> [Resending as my original email seems to be lost in moderation.]
> >>
> >> Jun has kindly reloaded a new batch of Atlas triples that James and I
> >> generated. In this release there are up to 500 top differentially
> >> expressed genes per experiment available and a total number of triples
> >> increased from ~37 to ~170 million.
> >>
> >> More information is available at:
> >> http://www.ebi.ac.uk/efo/semanticweb/atlas
> >>
> >> SPARQL examples are collated at:
> >> http://code.google.com/p/open-biomed/wiki/GeneExpressionAtlas
> >>
> >> We've also deployed a secondary endpoint that mirrors the data:
> >> http://wwwdev.ebi.ac.uk/microarray-srv/openrdf-sesame/repositories/gxa
> >>
> >> Feedback very much welcome.
> >>
> >> Cheers
> >> Tomasz
> >>
> >>
> >> --
> >> Tomasz Adamusiak, MD, PhD
> >> European Bioinformatics Institute
> >> +44 (0) 1223 492 562
> >> tomasz.adamusiak@ebi.ac.uk
> >>
>
> --
> Helen Parkinson, PhD
> Team Leader
> Functional Genomics Group
> EBI
>
> EBI 01223 494672
> Skype: helen.parkinson.ebi
> www.ebi.ac.uk/fgpt/

Received on Friday, 8 July 2011 15:13:54 UTC