Re: ontology specs for self-publishing experiment from Trish Whetzel on 2006-07-06 (public-semweb-lifesci@w3.org from July 2006)

From: Trish Whetzel <whetzel@pcbi.upenn.edu>
Date: Thu, 6 Jul 2006 14:27:05 -0400 (EDT)
To: William Bug <William.Bug@drexelmed.edu>
cc: "Miller, Michael D (Rosetta)" <Michael_Miller@Rosettabio.com>, Tim Clark <twclark@nmr.mgh.harvard.edu>, w3c semweb hcls <public-semweb-lifesci@w3.org>, SWAN Team <swan-team@mind-informatics.org>, chris mungall <cjm@fruitfly.org>
Message-ID: <Pine.LNX.4.61.0607061349050.23964@hera.pcbi.upenn.edu>

> Two quick questions:
>
> 	1) If two labs are doing microarray experiments and each seeks to 
> represent the data all the way back to the digital image acquired (so as to 
> enable others to reanalyze the data, and modify the pooling and/or statistics 
> applied in this new, shared context), if both are using the exact same assay, 
> instrument, and reagents but decide to specify the experimental observation 
> provenance via two separate ontologies, how can an algorithm unambiguously 
> determine even approximate semantic equivalence of things such as fluorescent 
> indicator, sequence probe, optical elements, image acquisition elements?
I don't think that it can be done unless the relationship/mapping between
the two ontologies is known and computationally accessible. For
microarray experiments, many of the annotation terms were provided by the
MGED Ontology (MO), and in cases where an existing, freely available
resource covered the terms needed, the MO pointed to or listed that
resource. Of course, this does not solve the problem of determining
semantic equivalence when, for example, two different anatomy ontologies
are used to annotate each experiment.
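
To make the point concrete, here is a minimal sketch in Python of what
"known and computationally accessible" could look like. The term
identifiers and the mapping table are made up for illustration; they are
not taken from the MO or any real ontology. The idea is only that
equivalence can be asserted where an explicit mapping exists, and stays
unknown everywhere else.

```python
from typing import Optional

# Hypothetical term identifiers from two hypothetical lab ontologies.
# Each known cross-ontology mapping is stored as an unordered pair.
KNOWN_MAPPINGS = {
    frozenset({"labA:Cy3_dye", "labB:cyanine3"}),
    frozenset({"labA:oligo_probe", "labB:sequence_probe"}),
}


def equivalent(term_a: str, term_b: str) -> Optional[bool]:
    """Return True if the terms are identical or explicitly mapped.

    Return None (unknown) otherwise: the absence of a mapping is not
    evidence that two terms mean different things.
    """
    if term_a == term_b:
        return True
    if frozenset({term_a, term_b}) in KNOWN_MAPPINGS:
        return True
    return None
```

With this table, equivalent("labA:Cy3_dye", "labB:cyanine3") is True in
either argument order, while a pair with no recorded mapping comes back
as None, which is the situation described above for two unmapped anatomy
ontologies.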

> 	2) Doesn't this lead down a road similar to that of MIAME, only now 
> you've shifted the border of incommensurateness beyond the level for data 
> format and into the semantic domain?  What I mean is, won't there still be 
> difficulty determining even approximate semantic equivalency for all of the 
> details of data provenance - many of which absolutely must be resolved in 
> order to perform large-scale re-pooling of related observations made in the 
> context of different studies - even if nearly identical 
> assays/instruments/reagents are used?
I agree, this seems to be the case, although the problem of differing
formats for functional genomics experiments is now resolved by FuGE.
As for the semantic meaning of the data, I would say this is where having 
ontologies that meet the OBO Foundry guidelines comes into play (oh, just 
noticed that the principles do not state that the reference ontologies 
should be orthogonal - will need to check that). In the case of different 
anatomy ontologies being used to annotate microarray experiments, SAEL 
(http://www.sofg.org/sael/) was developed as an entry mechanism for 
computational access to anatomy resources and provides a mapping 
between the various anatomy resources. So in this case, where overlaps 
existed, the ontology developers worked to resolve these as much as 
possible. FuGO is an effort to build an ontology in which the terms
needed, at least for functional genomics investigations and perhaps also
for high-level descriptors of investigations in general, exist in a
single resource without duplication.

Cheers,
Trish

Received on Thursday, 6 July 2006 18:27:20 UTC