Re: One ontology schema - heterogeneous instance bases

> Using RDF as an "exchange format" is just outright wrong.  How do  
> you decide if an RDF document is in BioPAX format or not? I don't  
> know how active BioPAX is now (their website shows the last  
> conference call was more than two years ago). But such line of  
> thought will doom (and have perhaps already doomed) their fate

Hi, let me add my two cents to this thread.

*) BioPAX is active, but its current "real" website is the wiki, not
the main site (link under "community"). Unfortunately, nobody has had
time until now to make this more explicit.

That RDF is "outright wrong" as an exchange format is questionable.  
At least we had the experience of RDF 1.0 in this direction. But I  
agree that the distinction between data and meta-data is not so clear.

As for checking that a BioPAX RDF fragment is "valid", this cannot
be achieved by RDF "per se", but this doesn't mean it cannot be
achieved through additional logic.

At the moment, BioPAX assigns its own interpretation to some OWL
constructs to perform validity checking (tools implementing these
semantics have been developed).
This is a temporary situation, but there are alternatives, such as using
new properties for this (with custom semantics), or
using a more standard Semantic Web approach (see this thread for
some discussion, with a lot of confusion:
http://lists.w3.org/Archives/Public/semantic-web/2007Jun/0171.html).

Anyway, I'm a supporter of the idea that an exchange format should  
address what a "valid unit of information" is, and not what a "valid  
set of assertions about the world" is.
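
To make the "valid unit of information" idea concrete, here is a minimal
sketch of the kind of additional logic I mean. It is not the BioPAX
validator: the rule table, the prefixes and the example fragment are all
assumptions made up for illustration. The fragment is treated as a closed
unit, and every typed instance is checked for the properties its class is
required to carry.

```python
# Triples are plain (subject, predicate, object) tuples.
# Prefixes and identifiers are illustrative, not real BioPAX URIs.
RDF_TYPE = "rdf:type"

# Hypothetical closed-world rules: properties each instance of a class must have.
REQUIRED = {
    "bp:pathway": {"bp:NAME", "bp:PATHWAY-COMPONENTS"},
}

def missing_properties(triples):
    """Return {subject: sorted list of required properties it lacks}."""
    types = {}    # subject -> set of classes asserted for it
    present = {}  # subject -> set of predicates used on it
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
        else:
            present.setdefault(s, set()).add(p)

    problems = {}
    for s, classes in types.items():
        for cls in classes:
            missing = REQUIRED.get(cls, set()) - present.get(s, set())
            if missing:
                problems.setdefault(s, set()).update(missing)
    return {s: sorted(m) for s, m in problems.items()}

# Made-up fragment: pathway1 has a name but no components.
fragment = [
    ("ex:pathway1", RDF_TYPE, "bp:pathway"),
    ("ex:pathway1", "bp:NAME", "Calcium signaling pathway"),
    ("ex:pathway2", RDF_TYPE, "bp:pathway"),
    ("ex:pathway2", "bp:NAME", "Glycolysis"),
    ("ex:pathway2", "bp:PATHWAY-COMPONENTS", "ex:reaction7"),
]
report = missing_properties(fragment)
```

Under plain open-world OWL semantics the missing property on ex:pathway1
would not be an inconsistency; the check only makes sense because the
fragment is read as a closed unit.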


> For example, the BioPAX ontology is a community effort to  
> standardize the representation of pathway data. But, there are  
> serious differences in the way concepts and relationships, defined  
> in the BioPAX ontology schema, are interpreted to create instances.  
> This leads to heterogeneous instance bases for the same ontology!
>
> Specifically, the BioPAX ontology concept 'pathway' is instantiated  
> in the following manner using data from three pathway databases  
> namely KEGG, Reactome and HumanCyc:
> 1. KEGG: <pathway  
> rdf:ID="BioPAX-30a84dcd-2a16-481d-9337-8185077c4658"> (source:  
> hsa04020.owl, as of July 2007)
> 2. Reactome: <bp:pathway
> rdf:ID="Glucose_6_phosphate_is_isomerized_to_form_fructose_6_phosphate">
> (source: Homo sapiens.owl, as of July 2007)
> 3. HumanCyc: <bp:pathway rdf:ID="pathway142286"> (source:  
> biopax.owl, as of July 2007)
>
> The only potential reconciliation approach we see, is by lexical  
> comparison of the textual description, typed as XML schema strings,  
> associated with each ‘pathway’ instance. For example, "Calcium  
> signaling pathway" which would be consistent across all the three  
> instance bases. But, this would constitute a purely syntactic  
> approach to reconcile the three instance bases.

BioPAX tries to standardize how pathways are represented, not pathway
names. For names, other resources are more appropriate.
A proper pathway reconciliation based on BioPAX (reconciliation
doesn't need to end in sameAs...) should make use of the structure of
pathways (reactions, elements, order...) more than lexical properties
and textual descriptions.
There was some experience in this direction
(http://bio.freelogy.org/wiki/Debugging_the_bug).
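
As a rough illustration of what structure-based comparison could look
like, the sketch below scores two pathways by the overlap of their
reactions rather than by their names. It assumes that participants have
already been mapped to shared identifiers, which is the hard part in
practice; the helper names and the toy data are invented for the example
and are not taken from KEGG, Reactome or HumanCyc exports.

```python
def reaction(substrates, products):
    """Represent a reaction by its participants only, ignoring its local ID."""
    return (frozenset(substrates), frozenset(products))

def jaccard(a, b):
    """Jaccard overlap of two collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Toy pathways; the two sources share one reaction out of three distinct ones.
source_a = {
    reaction(["glucose-6-phosphate"], ["fructose-6-phosphate"]),
    reaction(["fructose-6-phosphate", "ATP"],
             ["fructose-1,6-bisphosphate", "ADP"]),
}
source_b = {
    reaction(["glucose-6-phosphate"], ["fructose-6-phosphate"]),
    reaction(["glucose", "ATP"], ["glucose-6-phosphate", "ADP"]),
}

score = jaccard(source_a, source_b)  # 1 shared reaction / 3 distinct reactions
```

A score like this gives a degree of overlap instead of a yes/no answer,
which fits the point that reconciliation doesn't need to end in sameAs.
It ignores reaction order, which a fuller comparison would also use.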

best,
Andrea Splendiani

-----------
Andrea Splendiani
post-doc, bootstrep project (www.bootstrep.eu)

UPRES-EA 3888 - Laboratoire d'Informatique Médicale
CHU de Pontchaillou
2, rue Henri Le Guilloux
35033 Rennes - France
Tel : +33 2 99 28 92 45 / +33 2 99 28 42 15 (secr.)
Fax : +33 2 99 28 41 60

48° 07.275N
1° 41.643W

Received on Monday, 10 September 2007 08:41:53 UTC