Re: GXA triples from Peter Ansell on 2011-07-10 (public-semweb-lifesci@w3.org from July 2011)

From: Peter Ansell <ansell.peter@gmail.com>
Date: Mon, 11 Jul 2011 09:09:46 +1000
To: Helena Deus <helenadeus@gmail.com>
Cc: tomasz.adamusiak@ebi.ac.uk, Jun Zhao <jun.zhao@zoo.ox.ac.uk>, Tomasz Adamusiak <tomasz@ebi.ac.uk>, James Malone <malone@ebi.ac.uk>, expressionrdf@googlegroups.com, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
Message-ID: <CAGYFOCRNZ6K8XDKNm8ODr1nY6q=YfrGZWSLKd4BRL2piawGbHw@mail.gmail.com>

On 11 July 2011 04:55, Helena Deus <helenadeus@gmail.com> wrote:
> Hi,
> I am trying to find where the description of the genes in the GXA SPARQL
> endpoint can be found (in RDF so that it can be pulled into the query):
> SPARQLing GXA, I get a list of genes from an experiment; each gene looks
> something like this:
> "http://www.ensembl.org/Gene/Summary?g=ENSMUSG00000058207"
> There does not seem to be more info about each of the genes in the
> endpoint.
> I can follow the link and find out that this is the same
> as http://www.ncbi.nlm.nih.gov/gene/20714 , with linked data
> equivalent http://bio2rdf.org/geneid:20714
> Is this information anywhere in a machine processable format?
> I am looking for a SPARQL endpoint/RDF doc (or even another format convertible
> to RDF) where I can discover that:
> ENSMUSG00000058207 == geneid:20714
> Thanks
> (sorry about the bluntness, trying to see if such mappings can be included
> in the methods for the paper);

You can try http://bio2rdf.org/ensembl:ENSMUSG00000058207

It is not from the actual Ensembl dataset, as we compile it on the fly
on a best-effort basis from other datasets, in this case NCBI Gene
and UniProt. The actual set of SPARQL queries that were used to make
up that page can be found as debug annotations in the HTML code for
the page.
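
To go from the gene URLs that GXA returns to the matching bio2rdf.org
URI, something like the following Python could work. It is only a
sketch: it assumes the Ensembl URL always carries the gene identifier
in the "g" query parameter, as in the example above, and the function
name ensembl_url_to_bio2rdf is made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Base of the bio2rdf.org URIs, as used in the example above.
BIO2RDF_BASE = "http://bio2rdf.org/"


def ensembl_url_to_bio2rdf(url: str) -> str:
    """Rewrite an Ensembl gene summary URL as a bio2rdf.org ensembl: URI.

    Hypothetical helper: assumes the gene identifier is in the "g"
    query parameter of the URL.
    """
    query = parse_qs(urlparse(url).query)
    gene_ids = query.get("g")
    if not gene_ids:
        raise ValueError(f"no 'g' parameter found in {url!r}")
    return f"{BIO2RDF_BASE}ensembl:{gene_ids[0]}"


if __name__ == "__main__":
    print(ensembl_url_to_bio2rdf(
        "http://www.ensembl.org/Gene/Summary?g=ENSMUSG00000058207"
    ))
```

The resulting URI can then be dereferenced to look for the link to the
corresponding geneid record.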

It contains a link to the bare record, with no semantic
annotations, http://bio2rdf.org/geneid_record:20714 .

It also contains a link to an annotation on geneid:20714 that wouldn't
otherwise have a URI, so we made one up by hashing the components of
the annotation and attaching the hash to the geneid:
http://bio2rdf.org/geneid_resource:20714-043190a58f86adcc1a1d8740a7ddd710

That URI is linked to from http://bio2rdf.org/geneid:20714 . The
bio2rdf:linkedToFrom annotation is added on the fly, based on a query
separate from the one that supplies the rest of the information in
that resource, so you won't find it using a SPARQL query on any of our
endpoints.
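
The general idea of minting a URI by hashing an annotation's components
can be sketched in Python as below. This is an illustration only, not
the code we run: MD5 is a guess based on the 32 hex digits in the URI
above, and the way the components are joined before hashing, the
function name and the sample components are all made up, so it will
not reproduce the hash shown.

```python
import hashlib
from typing import Sequence


def mint_annotation_uri(gene_id: str, components: Sequence[str]) -> str:
    """Build a stable URI for an annotation that has no identifier of its own.

    Assumptions (not taken from the real Bio2RDF code): MD5 as the hash,
    and components joined with newlines before hashing.
    """
    joined = "\n".join(components)
    digest = hashlib.md5(joined.encode("utf-8")).hexdigest()
    return f"http://bio2rdf.org/geneid_resource:{gene_id}-{digest}"


if __name__ == "__main__":
    # Hypothetical annotation components, for illustration only.
    print(mint_annotation_uri("20714", ["some predicate", "some value"]))
```

Because the hash depends only on the annotation's content, the same
annotation always gets the same URI, and different annotations on the
same geneid get different ones.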

Cheers,

Peter

Received on Sunday, 10 July 2011 23:10:13 UTC