
RE: One ontology schema - heterogeneous instance bases

From: Nigam Shah <nigam@stanford.edu>
Date: Fri, 7 Sep 2007 20:04:53 -0700
To: <satya30@uga.edu>, <public-semweb-lifesci@w3.org>, <public-semweb-lifesci-request@w3.org>
Message-ID: <000901c7f1c5$07e68e30$17b3aa90$@edu>

Hi Satya,

See below...

>For example, the BioPAX ontology is a community effort to standardize
>the representation of pathway data. But, there are serious differences
>in the way concepts and relationships, defined in the BioPAX ontology
>schema, are interpreted to create instances. This leads to heterogeneous
>instance bases for the same ontology!

Well, BioPAX is an exchange format as of now. It gives you a consistent way of describing a pathway structure. It does not claim to provide a consistent terminology of pathway names that works across all sources.
 
>Specifically, the BioPAX ontology concept 'pathway' is instantiated in
>the following manner using data from three pathway databases namely
>KEGG, Reactome and HumanCyc:
>1. KEGG: <pathway rdf:ID="BioPAX-30a84dcd-2a16-481d-9337-8185077c4658">
>(source: hsa04020.owl, as of July 2007)
>2. Reactome: <bp:pathway
>rdf:ID="Glucose_6_phosphate_is_isomerized_to_form_fructose_6_phosphate">
>(source: Homo sapiens.owl, as of July 2007)
>3. HumanCyc: <bp:pathway rdf:ID="pathway142286"> (source: biopax.owl, as
>of July 2007)
>
>The only potential reconciliation approach we see, is by lexical
>comparison of the textual description, typed as XML schema strings,
>associated with each pathway instance. 

(1) You could build the set of participants in a pathway (recursively expanding any member pathway until you ground out in proteins/small molecules) and then intersect those sets for a pathway from KEGG, Reactome, BioCyc, etc. to augment your string matching. (Also see the separate message about extensions to PKB.)

(2) You could use the GO biological process mapping of pathways as a means of reconciling pathways from different sources (assuming they provide a mapping into GO biological processes).
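
To make suggestion (1) concrete, here is a rough sketch of the recursive expansion and set comparison. It is only an illustration: the pathway IDs, member lists, and the function names expand_participants and jaccard are made up for this example, and it assumes the leaf participants have already been normalized to shared identifiers across sources, which is a separate problem. Jaccard overlap is one possible way to score the intersection, not the only one.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy data: each key is a pathway ID, each value lists its direct
# members. A member that is itself a key is a sub-pathway; anything else is
# treated as a leaf participant (protein or small molecule).
Members = Dict[str, List[str]]


def expand_participants(pathway_id: str, members: Members,
                        _seen: Optional[Set[str]] = None) -> Set[str]:
    """Recursively expand a pathway down to its leaf participants."""
    if _seen is None:
        _seen = set()
    if pathway_id in _seen:  # guard against cyclic pathway references
        return set()
    _seen.add(pathway_id)
    leaves: Set[str] = set()
    for member in members.get(pathway_id, []):
        if member in members:  # sub-pathway: expand it further
            leaves |= expand_participants(member, members, _seen)
        else:  # leaf: protein or small molecule
            leaves.add(member)
    return leaves


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up example records standing in for two sources.
source_a: Members = {
    "a:calcium_signaling": ["CALM1", "a:ip3_release"],
    "a:ip3_release": ["ITPR1", "Ca2+"],
}
source_b: Members = {
    "b:calcium_signaling": ["CALM1", "ITPR1", "PLCB1"],
}

set_a = expand_participants("a:calcium_signaling", source_a)
set_b = expand_participants("b:calcium_signaling", source_b)
print(jaccard(set_a, set_b))  # 0.5
```

The same jaccard function could be applied to suggestion (2) by comparing the sets of GO biological process terms each source maps a pathway to, instead of the participant sets.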

>For example, "Calcium signaling
>pathway" which would be consistent across all the three instance bases.
>But, this would constitute a purely syntactic approach to reconcile the
>three instance bases.

The real problem is that "pathways" can be described at various levels of granularity and it is hard to have consistent names that work across all sources. You might want to look at the Event Ontology from INOH as a source for consistent names for pathways and pathway steps.

Regards,
Nigam.
Received on Saturday, 8 September 2007 03:05:21 UTC
