Role of RDF with GO?

Hi,
 
It's apparent that the role of RDF as part of the use of GO has been growing
(based on email discussions), but I have not been able to find any clear
description of its intended or future uses on the GeneOntology web site.
Since there are a bunch of different applications possible by properly
utilizing an RDF framework, I wanted to find out if anyone would be willing
to discuss/develop some broader use cases (around GO) based on RDF
approaches. 
 
First, let me clarify that RDF is about specifying statements about facts,
relations and other statements (including hypotheses) in an XML-formatted
way--but it doesn't require DTDs or schemas to be specified anymore. RDF is
also useful for helping specify web-services in a fine-grained way (e.g.,
S-MOBY), but RDF should not be limited to only this kind of application (a la
tool stds, OMG etc.). An interest group has recently been formed at W3C to
address such potential uses: Semantic Web for Life Science IG (SWLS-IG). This
is NOT a standards group, nor will it define ontologies; its goal is to find timely
approaches/solutions for the best use of Semantic Web (SW) technology, and
will be a forum for discussion, demonstration, integration, and just good ole
fashioned brainstorming!
 
For example: Currently GO is used to annotate/classify genes and proteins,
but why not also consider using it to link two alternative splice variants
to subsets of the gene's GO ids using Affx's semantic namespace (I'm using N3
notation here since it is more legible than RDF/XML but isomorphic with it):
 
Gene5 a affx:Gene ;
    affx:chr "chr1" ;
    affx:hasVariant [ affx:representedBy :gi9887088 ;
                      affx:process GO:0006306 ;
                      affx:localized GO:0016021 ] ;
    affx:hasVariant [ affx:representedBy :gi9887090 ;
                      affx:process GO:0006218 ;
                      affx:localized GO:0005887 ] .
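
To make the variant-level idea a bit more concrete, here is a minimal Python
sketch that holds the same statements as plain (subject, predicate, object)
tuples and looks up the GO terms per splice variant. It deliberately uses no
RDF library; the blank-node labels "_:v1" and "_:v2" and the helper names
(objects, go_terms_for) are made up for illustration and are not part of any
GO or Affx tooling.

```python
# Triples mirroring the N3 example above, as (subject, predicate, object).
# "_:v1" and "_:v2" are hypothetical labels standing in for the blank nodes.
TRIPLES = {
    ("Gene5", "rdf:type", "affx:Gene"),
    ("Gene5", "affx:chr", "chr1"),
    ("Gene5", "affx:hasVariant", "_:v1"),
    ("_:v1", "affx:representedBy", ":gi9887088"),
    ("_:v1", "affx:process", "GO:0006306"),
    ("_:v1", "affx:localized", "GO:0016021"),
    ("Gene5", "affx:hasVariant", "_:v2"),
    ("_:v2", "affx:representedBy", ":gi9887090"),
    ("_:v2", "affx:process", "GO:0006218"),
    ("_:v2", "affx:localized", "GO:0005887"),
}


def objects(triples, subject, predicate):
    """Return all objects found for a given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def go_terms_for(triples, gene):
    """Map each variant's transcript id to its GO process and location terms."""
    result = {}
    for variant in objects(triples, gene, "affx:hasVariant"):
        for transcript in objects(triples, variant, "affx:representedBy"):
            result[transcript] = {
                "process": objects(triples, variant, "affx:process"),
                "localized": objects(triples, variant, "affx:localized"),
            }
    return result


if __name__ == "__main__":
    for transcript, terms in sorted(go_terms_for(TRIPLES, "Gene5").items()):
        print(transcript, sorted(terms["process"]), sorted(terms["localized"]))
```

The point of the sketch is only that the GO ids hang off the variant node
rather than off the gene, so a query can tell the two transcripts apart.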
 
Now that I can associate different GO process IDs that were associated with
the same gene with different splice variants, the semantics of variant function
can be explored using a "my:" ontology for relating process with location:
 
Gene5_Context my:for Gene5 ;
    my:withTranscript :gi9887088 ;  
    my:constrains {GO:0006306 my:occursIn GO:0016021 .} .
 
I can also make claims or hypotheses about additional functions (represented
here by the curly brackets, rdf:parseType="Quote"), which I can share without
the need for them to be curated first:
 
A9311 a Annotation ;
    dc:creator "Jonathan Smythe" ;
    affx:hypothesizes {
        Gene5 a affx:Gene ;
            affx:hasVariant [ affx:representedBy :gi9887088 ;
                              affx:process GO:0006306 ;
                              affx:associatedWith omim:209920 ] .
    } .
 
Note that the quoted content within the curly brackets is structured exactly
like the top-level fact above, but since it is quoted, it is not "reified" until
someone "accepts" the hypothesis using a rule (in RDF) to make it an explicit
assertion.
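
As a rough illustration of that quote-then-accept step, the Python sketch
below keeps quoted statements apart from asserted ones and only merges them
when an acceptance rule fires. Everything here is an assumption made for the
example: the data layout, the function name accept, and especially the rule
itself (accept if the creator is on a trusted list), which simply stands in
for whatever rule a group would actually write in RDF.

```python
# Asserted facts and quoted hypotheses are stored separately.
# All names and the acceptance rule are illustrative assumptions.
asserted = {
    ("Gene5", "rdf:type", "affx:Gene"),
}

# A quoted graph: statements attributed to an annotation, not yet asserted.
hypotheses = {
    "A9311": {
        "creator": "Jonathan Smythe",
        "quoted": frozenset({
            ("Gene5", "affx:hasVariant", "_:v1"),
            ("_:v1", "affx:representedBy", ":gi9887088"),
            ("_:v1", "affx:process", "GO:0006306"),
            ("_:v1", "affx:associatedWith", "omim:209920"),
        }),
    },
}


def accept(asserted, hypotheses, annotation_id, trusted_creators):
    """Return a new asserted set; quoted triples are added only if the
    annotation's creator is trusted (a stand-in for a real rule)."""
    annotation = hypotheses[annotation_id]
    if annotation["creator"] not in trusted_creators:
        return set(asserted)
    return set(asserted) | set(annotation["quoted"])


if __name__ == "__main__":
    merged = accept(asserted, hypotheses, "A9311", {"Jonathan Smythe"})
    print(len(asserted), "asserted before,", len(merged), "after acceptance")
```

Until accept is called, the OMIM association exists only as something A9311
claims; the original asserted set is never modified in place.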
 
We have specified several use cases like this one (e.g.,
http://www.affymetrix.com/community/publications/affymetrix/tmsplice/index.affx),
and there are dozens more use cases that can easily be considered and
tried to extend the descriptive use/power of GO-based annotations.  Of course,
once the document is in an RDF form, it is also accessible to reasoners and
aggregators... Here is where I think the GO community would benefit immensely
from such RDF applications.
 
I'd be interested to hear what plans the GeneOntology group has for utilizing
RDF towards GO's goals, and whether the SWLS-IG can help coordinate efforts
through the W3C technologies and forums.
 
best,
Eric
 
Eric Neumann, Ph.D.
Global Head of Knowledge Management
Aventis - DI&A
Tel: 908-231-3510
Fax: 908-231-3307
Eric.Neumann@Aventis.com

Received on Wednesday, 14 April 2004 14:34:45 UTC