RE: Experiment Ontology from Miller, Michael D (Rosetta) on 2007-12-11 (public-semweb-lifesci@w3.org from December 2007)

From: Miller, Michael D (Rosetta) <Michael_Miller@Rosettabio.com>
Date: Tue, 11 Dec 2007 14:06:21 -0800
To: "Susie M Stephens" <STEPHENS_SUSIE_M@LILLY.COM>
cc: public-semweb-lifesci@w3.org, "Bill Bug" <wbug@ncmir.ucsd.edu>
Message-ID: <E1J2DFR-000377-D9@maggie.w3.org>
hi susie,

> The folks at Lilly who developed the ontology did review a number of
> existing ontologies, but they didn't meet our needs. 

this is the hard part of getting standardization accepted.  "but they didn't meet our needs" will always seem to be true because the most expedient way to organize one's data is based on how it is already organized.  no standard will look exactly like the way a particular organization chooses to organize its information.

looking at the ExperimentOntology it is pretty easy to deduce how Lilly views experiment organization and i can tell you from experience that it is not like any of the pharmas' or biotechs' ways of doing things that i've seen in gene expression.  in fact, there are few details that overlap amongst any of them.

but there are common themes and we've been relatively successful in mapping to MAGE (which is UML, not an ontology, but that's a different discussion) for all these different organizations in order to import and export out of our product.

the trick is not in changing your ways but in mapping to a common language and then unmapping back into your datastore.  it actually looks like it wouldn't take much to map into FuGE with ontology terms coming from OBI for the most part.
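
as a rough sketch of that round trip (python, with made-up field names: none of these are real MAGE, FuGE or OBI identifiers, and an actual mapping would be far richer than a rename table), the idea is to translate local fields into the shared vocabulary on export, carry along whatever has no shared equivalent, and reverse the translation on import:

```python
# Hypothetical local-to-common field mapping; all names are illustrative only
# and are NOT taken from MAGE, FuGE or OBI.
LOCAL_TO_COMMON = {
    "chip_barcode": "array_identifier",
    "tissue": "biomaterial_source",
    "cmpd_dose": "treatment_dose",
}
COMMON_TO_LOCAL = {common: local for local, common in LOCAL_TO_COMMON.items()}


def to_common(record):
    """Rename local fields to the shared vocabulary.

    Fields with no shared equivalent are returned separately so that
    nothing is lost on the way out.
    """
    mapped, unmapped = {}, {}
    for key, value in record.items():
        if key in LOCAL_TO_COMMON:
            mapped[LOCAL_TO_COMMON[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped


def to_local(mapped, unmapped=None):
    """Reverse the renaming and restore the fields that were carried along."""
    record = {COMMON_TO_LOCAL[key]: value for key, value in mapped.items()}
    record.update(unmapped or {})
    return record


local_record = {
    "chip_barcode": "A-001",
    "tissue": "liver",
    "cmpd_dose": "10 mg/kg",
    "lab_notebook_page": 42,  # no shared equivalent in this toy mapping
}
common, leftover = to_common(local_record)
round_tripped = to_local(common, leftover)
```

the local datastore keeps its own shape; only the exchange format is shared.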

cheers,
michael

Michael Miller
Lead Software Developer
Rosetta Biosoftware Business Unit
www.rosettabio.com

 

> -----Original Message-----
> From: public-semweb-lifesci-request@w3.org 
> [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of 
> Susie M Stephens
> Sent: Tuesday, December 11, 2007 9:21 AM
> To: Bill Bug
> Cc: public-semweb-lifesci@w3.org hcls
> Subject: Re: Experiment Ontology
> 
> 
> Hi Bill,
> 
> Thanks for all of your great feedback. :-)
> 
> The folks at Lilly who developed the ontology did review a number of
> existing ontologies, but they didn't meet our needs. I don't have the
> full list of ontologies that they explored, but they definitely took a
> look at OBI. We are very interested in working with the community to
> further develop the ontology, and are in the process of scheduling a
> call with some of the OBI folks.
> 
> Cheers,
> 
> Susie
> 
> 
> From: Bill Bug <wbug@ncmir.ucsd.edu>
> Date: 12/06/2007 11:16 PM
> To: Susie Stephens <STEPHENS_SUSIE_M@LILLY.COM>
> cc: Matthias Samwald <samwald@gmx.at>,
>     "public-semweb-lifesci@w3.org hcls" <public-semweb-lifesci@w3.org>,
>     Kei Cheung <kei.cheung@yale.edu>,
>     "Karen (NIH/NIDA) [E] Skinner" <kskinner@nida.nih.gov>,
>     Alan Ruttenberg <alanruttenberg@gmail.com>
> Subject: Re: Experiment Ontology
> 
> Hi Susie,
> 
> We certainly do need an "Experiment Ontology" - or Ontology of Biomedical
> Investigation (OBI).
> 
> I believe Matthias, Michael, and Kei have all made exactly the points I
> think are most important to consider:
> 1) Matthias's comments
> Are you following "best practices" in creating the ontology?  I believe
> Matthias gives many instructive examples of how to adjust what is here
> to bring it much more in sync with the emerging "best practices" that
> are coming out of the community development surrounding a variety of OBO
> Foundry ontologies.  Matthias also makes the point that it's important
> to seek to re-use (or directly contribute to) the emerging community
> ontologies to cover the required domains.  In the case of this
> particular Experiment Ontology, the ontologies to consider are, at a
> minimum, the Ontology of Biomedical Investigation (OBI), the OBO
> Relations Ontology, the Gene Ontology (specifically the Molecular
> Function and Cellular Component branches, the latter of which is
> designed to capture components down to the level of macromolecular
> complexes), the Sequence Ontology, the Protein Ontology (nascent, but
> proceeding rapidly), and the Cell Ontology.  As many on this list know -
> and I'm certain the talented folks at Lilly who invested time in
> assembling this ontology also learned - many of these are not fully
> ready for prime-time, and/or may not FULLY cover the breadth and depth
> of the domains a specific application requires.  However, if you don't
> seek to work with these community efforts, you cannot expect to achieve
> the ultimate goal, which is to make your data maximally "semantically
> sticky", so as to ensure the least amount of custom logic and human
> effort will be required to get the most value from your data.
> Otherwise, you stand the chance of creating what may be a useful
> ontology that meets your specific requirements (as has been true of
> "investigation"-oriented ontologies that have come before such as the
> MAGE Ontology, ExperiBase, EXPO, myGRID KAVE, etc.), but doesn't help
> the community at large to appropriately re-use your data.  In each case,
> these ontologies or KR frameworks have been extremely useful in the
> local application context for which they were constructed, but they
> cannot be effectively employed as the basis for semantically-driven
> integration across data sets that may not be able to accept the
> constraints (or lack thereof) of this application-oriented ontology.
> Would you know off-hand, Susie, whether the folks who worked on this
> ontology at Lilly have reviewed the relevant community efforts cited
> above and/or have sought to interact with those groups to get some
> input on how best to meet the overall requirements that underlie this
> particular Experiment Ontology with the minimal required effort and in a
> manner that could help to ensure Lilly's sunk investment could be of
> benefit to us all?
> 
> 2) Michael's comments
> It's very helpful to know what the target is when it comes to
> exporting/exchanging the actual data.  As Michael points out, a great
> deal of work has gone into the production of FuGE (and MAGE before it)
> to come up with the appropriate division of labor between the
> semantically-opaque, syntactical requirements as represented in a data
> model such as MAGE or FuGE and the explicit semantics as captured in the
> ontology.  For those using FuGE, as Michael states, in the realm of
> syntax, the intention for FuGE is to provide a shared structure for
> universal elements such as biomaterials, experiment
> populations/pools/groups, protocol details, reagent details, etc.
> Built on that shared, generic foundation, any specific discipline -
> e.g., microarray expression, GC-MS, FISH, MRI, etc. - can sub-class FuGE
> components and add whatever additional detail is required in their
> discipline.  In parallel with this effort on data structure, the OBI
> ontology cooperative seeks to provide that same foundation for the
> shared semantic domains, and a clear set of recommended practices for
> how to re-use entities from other OBO Foundry ontologies such as ChEBI,
> Sequence Ontology, Protein Ontology, OBO Cell, Organism Taxonomy (OWL
> versions of NCBI Tax), etc. to specify the critical biomedical entities
> and their complex relations.  As I say above, these are works in
> progress.  For those of us who must have something working now, the
> recommended practice is to actively participate in these projects with
> an eye toward following their practice - and replacing any "proxy" you
> create in the interim with the community ontology, when it is ready for
> use.  This is what we have done in the BIRN ontology BIRNLex.  We
> actually have an OWL module called "BIRNLex-OBI-Proxy.owl" which we
> fully intend to replace with OBI entities, when they are ready for use.
> We also have "BIRNLex-Investigation.owl" that builds on this "proxy" to
> cover entities BIRN researchers must capture.  We expect to eventually
> see the contents of "BIRNLex-Investigation" in OBI in some form.  We
> intend to "contribute" those elements from this OWL file directly to
> OBI, when OBI is ready for them, and we have the time to work through
> this migration process.
> 
> 3) Kei's comments
> Examples - examples - examples.  This is critical.  Working through the
> example Kei cites from the NIH Neuroscience Microarray Consortium is a
> wonderful way to determine:
> - whether there are existing community ontologies that can meet the KR
> and processing requirements
> - where the gaps are in those community ontologies
> - whether the ontology you are creating effectively fills those gaps (if
> it does, that makes it very clear how the community effort can make
> effective use of your ontology)
> With regard to Gene Lists, Kei is certainly correct.  If these are
> captured through algorithmic means, it's critical to capture the details
> on that algorithm - typically both the version of the algorithm and the
> version of the data repository you ran it against.
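
to make the gene list point concrete, here is a minimal python sketch of a gene list that carries the provenance bill describes. the class and field names are assumptions made up for illustration (they come from neither the Lilly ontology nor OBI), and the values are placeholders:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneListProvenance:
    """How a gene list was derived (hypothetical field names)."""
    algorithm: str            # name of the analysis method or package
    algorithm_version: str    # version of that method or package
    repository: str           # data repository the analysis was run against
    repository_version: str   # release or snapshot of that repository


@dataclass
class GeneList:
    """A derived gene list together with the details needed to reproduce it."""
    genes: list
    provenance: GeneListProvenance


# Placeholder values only; no real tool or database release is implied.
example = GeneList(
    genes=["GENE_A", "GENE_B"],
    provenance=GeneListProvenance(
        algorithm="example-caller",
        algorithm_version="1.2",
        repository="example-repository",
        repository_version="2007-11",
    ),
)
```
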
> Also - where gene entities are concerned - there is ongoing work between
> the GO groups, the Sequence Ontology, and the Protein Ontology that is
> particularly targeted toward capturing the specific relations between
> types of genomic sequence elements and types of biologically active
> protein-based molecules (e.g., macromolecular complexes composed of a
> collection of proteins in a variety of post-translationally modified
> states - e.g., GPC receptors, ion channels, transporters, pathway
> enzymes, etc. - i.e., Rx drug targets).  These are the details we'll all
> require in order to do round-trip pharmacogenetics - i.e., effects of
> genetic constructs on target susceptibility to drugs - AND - the ways in
> which drugs ultimately alter macromolecular complexes by leading to
> changes in gene expression.
> 
> Just my $0.02 filtering on these helpful comments from Matthias,
> Michael, and Kei.
> 
> Cheers,
> Bill
> 
> On Dec 3, 2007, at 1:00 PM, Kei Cheung wrote:
> 
> 
>       This is great!
> 
>       I have a microarray experiment description (that has to do with
>       Alzheimer Disease) extracted from NINDS microarray consortium:
> 
>       http://arrayconsortium.tgen.org/np2/viewProject.do?action=viewProject&projectId=433773
> 
>       I just wonder how this example would fit this experiment ontology
>       (as well as others such as OBI). As shown in this example, we
>       record information such as organ type, organ region, cell type
>       (layer II pyramidal neuron), etc. NINDS microarray consortium uses
>       different array platforms (e.g., Agilent, Affymetrix, and cDNA)
>       for different organisms so one may need to divide chips into
>       groups corresponding to different platform types. Each group can
>       then be further divided into subgroups corresponding to different
>       organisms.
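
the two-level grouping kei describes (platform first, then organism) can be sketched in a few lines of python. the chip records and their keys are invented for illustration and are not taken from the consortium's data:

```python
from collections import defaultdict

# Invented example records; the keys "id", "platform" and "organism"
# are assumptions for this sketch.
chips = [
    {"id": "chip-1", "platform": "Affymetrix", "organism": "human"},
    {"id": "chip-2", "platform": "Agilent", "organism": "mouse"},
    {"id": "chip-3", "platform": "Affymetrix", "organism": "human"},
    {"id": "chip-4", "platform": "Affymetrix", "organism": "mouse"},
]


def group_chips(chips):
    """Group chip ids by platform, then by organism within each platform."""
    groups = defaultdict(lambda: defaultdict(list))
    for chip in chips:
        groups[chip["platform"]][chip["organism"]].append(chip["id"])
    # Convert to plain dicts so the result compares and prints cleanly.
    return {platform: dict(by_organism) for platform, by_organism in groups.items()}


grouped = group_chips(chips)
```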
> 
>       We also would like to capture gene lists (not the raw gene lists
>       but the (much shorter) ones that indicate what genes are
>       over/under expressed under certain experimental conditions). Such
>       gene lists would usually be extracted from the literature. Also
>       the analysis package (including version) that was used to generate
>       a gene list should be identified. One possible use of these gene
>       lists is to compare them to identify genes that are differentially
>       expressed under the same/similar experimental condition across
>       different microarray experiments. This would help distinguish true
>       signals from noise.
> 
>       Hope it helps.
> 
>       Cheers,
> 
>       -Kei
> 
> 
> 
>       Matthias Samwald wrote:
> 
>             Hi Susie,
> 
>             Susie wrote:
>                   It would be great if you could take a look at it and
>                   provide comments. The
>                   ontology is available at:
>                   http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/Experiment_Ontology
> 
>             * Some of the entities/properties are missing an rdfs:label
>             or have an empty label (a string with length 0).
>             * Some of the entities could be taken from existing
>             ontologies like OBI, RO or some of the OBO Foundry
>             ontologies. This would save work and make integration with
>             other data sources and ontologies much easier. By the way,
>             there seem to be several groups working on ontologies for
>             microarray experiments, or at least planning to do so. It
>             would be great if these groups could work together.
>             * The class 'Chip type' should be removed and replaced by
>             subclasses of 'chip', e.g., 'chip (human)', 'chip (mouse)'
>             etc.
>             * Some of the object properties look as if they are intended
>             to be datatype properties (e.g., 'has proteome id').
>             * Many of the datatype properties could be replaced with
>             object properties, possibly referring to third party
>             ontologies -- of course this would require a richer ontology
>             and more work spent on creating mappings. 'has molecular
>             function' could refer to entities from the gene ontology,
>             'has associated organ' could refer to an ontology about
>             anatomy and so on.
>             * Object properties and their ranges are quite redundant.
>             Property 'has reagent' has range 'Reagent', property 'has
>             treatment' has range 'Treatment' and so on. Maybe the
>             ontology could be designed in such a way that there are only
>             some generic properties such as 'has part'. This would make
>             the ontology much easier to maintain, query and understand
>             in the long term.
>             * It is unclear how 'Gene list' is intended to be used.
>             * 'Hardware' and 'Software' should not be subclasses of
>             'Protocol'.
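
the first item in matthias's list (missing or empty labels) is easy to check mechanically. a minimal python sketch, assuming the ontology has already been flattened into plain (subject, predicate, object) string tuples; the entity names below are toy data, and a real check would load the OWL file with an RDF library rather than hand-written tuples:

```python
RDFS_LABEL = "rdfs:label"  # prefixed name used as a plain string in this sketch


def entities_without_label(entities, triples):
    """Return the entities that have no rdfs:label or only empty labels."""
    labelled = set()
    for subject, predicate, obj in triples:
        # A label made up only of whitespace counts as empty.
        if predicate == RDFS_LABEL and obj.strip():
            labelled.add(subject)
    return sorted(entity for entity in entities if entity not in labelled)


# Toy data for illustration; not the contents of the actual ontology.
entities = ["Chip", "Reagent", "Treatment"]
triples = [
    ("Chip", "rdfs:label", "chip"),
    ("Reagent", "rdfs:label", ""),           # empty label
    ("Treatment", "rdf:type", "owl:Class"),  # no label at all
]
missing = entities_without_label(entities, triples)
```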
> 
> 
>             Many of the datatype properties in this ontology look very
>             interesting and might provide requirements for other
>             ontologies. It would be great if some of them could be
>             described/commented in more detail so that we know more
>             about the requirements that motivated the creation of these
>             properties.
> 
>             I hope that was somewhat helpful.
> 
>             cheers,
>             Matthias Samwald
> 
> 
> 
> 
> 
> 
> 
> 
> 
> William Bug, M.S., M.Phil.                      email: wbug@ncmir.ucsd.edu
> Ontological Engineer (Programmer Analyst III)   work: (610) 457-0443
> Biomedical Informatics Research Network (BIRN)
> and
> National Center for Microscopy & Imaging Research (NCMIR)
> Dept. of Neuroscience, School of Medicine
> University of California, San Diego
> 9500 Gilman Drive
> La Jolla, CA 92093
> 
> Please note my email has recently changed
> 
> 
> 
> 
>
Received on Tuesday, 11 December 2007 22:06:48 UTC