Re: Experiment Ontology from Tim Clark on 2007-12-11 (public-semweb-lifesci@w3.org from December 2007)

From: Tim Clark <twclark@nmr.mgh.harvard.edu>
Date: Tue, 11 Dec 2007 13:50:45 -0500
To: Susie M Stephens <STEPHENS_SUSIE_M@LILLY.COM>
Cc: Bill Bug <wbug@ncmir.ucsd.edu>, "public-semweb-lifesci@w3.org hcls" <public-semweb-lifesci@w3.org>
Message-Id: <3183B5D5-B50E-4663-83FF-304CA2A1A7E9@nmr.mgh.harvard.edu>
Hi Susie,

I think it might be worthwhile to arrange a discussion with the SWAN  
team about this ontology.  Could we invite you to one of our regular  
meetings in January to discuss?

Best

Tim


On Dec 11, 2007, at 12:21 PM, Susie M Stephens wrote:

>
> Hi Bill,
>
> Thanks for all of your great feedback. :-)
>
> The folks at Lilly who developed the ontology did review a number of
> existing ontologies, but they didn't meet our needs. I don't have the full
> list of ontologies that they explored, but they definitely took a look at
> OBI. We are very interested in working with the community to further
> develop the ontology, and are in the process of scheduling a call with some
> of the OBI folks.
>
> Cheers,
>
> Susie
>
> From: Bill Bug <wbug@ncmir.ucsd.edu>
> Date: 12/06/2007 11:16 PM
> To: Susie Stephens <STEPHENS_SUSIE_M@LILLY.COM>
> Cc: Matthias Samwald <samwald@gmx.at>, "public-semweb-lifesci@w3.org hcls"
>     <public-semweb-lifesci@w3.org>, Kei Cheung <kei.cheung@yale.edu>,
>     "Karen (NIH/NIDA) [E] Skinner" <kskinner@nida.nih.gov>,
>     Alan Ruttenberg <alanruttenberg@gmail.com>
> Subject: Re: Experiment Ontology
>
> Hi Susie,
>
> We certainly do need an "Experiment Ontology" - or Ontology of Biomedical
> Investigation (OBI).
>
> I believe Matthias, Michael, and Kei have all made exactly the points I
> think are most important to consider:
> 1) Matthias's comments
> Are you following "best practices" in creating the ontology?  I believe
> Matthias gives many instructive examples of how to adjust what is here to
> bring it much more in sync with the emerging "best practices" that are
> coming out of the community development surrounding a variety of OBO
> Foundry ontologies.  Matthias also makes the point that it's important to
> seek to re-use (or directly contribute to) the emerging community
> ontologies to cover the required domains.  In the case of this particular
> Experiment Ontology, the ontologies to consider are, at a minimum, the
> Ontology of Biomedical Investigation (OBI), the OBO Relations Ontology,
> the Gene Ontology (specifically the Molecular Function and Cellular
> Component branches, the latter of which is designed to capture components
> down to the level of macromolecular complexes), the Sequence Ontology, the
> Protein Ontology (nascent, but proceeding rapidly), and the Cell Ontology.
> As many on this list know - and I'm certain the talented folks at Lilly who
> invested time in assembling this ontology also learned - many of these are
> not fully ready for prime-time, and/or may not FULLY cover the breadth and
> depth of the domains a specific application requires.  However, if you
> don't seek to work with these community efforts, you cannot expect to
> achieve the ultimate goal, which is to make your data maximally
> "semantically sticky", so as to ensure the least amount of custom logic and
> human effort will be required to get the most value from your data.
> Otherwise, you stand the chance of creating what may be a useful ontology
> that meets your specific requirements (as has been true of
> "investigation"-oriented ontologies that have come before such as the MAGE
> Ontology, ExperiBase, EXPO, myGRID KAVE, etc.), but doesn't help the
> community at large to appropriately re-use your data.  In each case, these
> ontologies or KR frameworks have been extremely useful in the local
> application context for which they were constructed, but they cannot be
> effectively employed as the basis for semantically-driven integration
> across data sets that may not be able to accept the constraints (or lack
> thereof) of such an application-oriented ontology.
> Would you know off-hand, Susie, whether the folks who worked on this
> ontology at Lilly have reviewed the relevant community efforts cited above
> and/or sought to interact with those groups to get some input on how best
> to meet the overall requirements that underlie this particular Experiment
> Ontology with the minimal required effort, and in a manner that could help
> to ensure Lilly's sunk investment could be of benefit to us all?
>
> 2) Michael's comments
> It's very helpful to know what the target is when it comes to
> exporting/exchanging the actual data.  As Michael points out, a great deal
> of work has gone into the production of FuGE (and MaGE before it) to come
> up with the appropriate division of labor between the semantically-opaque,
> syntactical requirements as represented in a data model such as MaGE or
> FuGE and the explicit semantics as captured in the ontology.  For those
> using FuGE, as Michael states, in the realm of syntax, the intention for
> FuGE is to provide a shared structure for universal elements such as
> biomaterials, experiment populations/pools/groups, protocol details,
> reagent details, etc.  Built on that shared, generic foundation, any
> specific discipline - e.g., microarray expression, GC-MS, FISH, MRI, etc. -
> can sub-class FuGE components and add whatever additional detail is
> required in their discipline.  In parallel with this effort on data
> structure, the OBI ontology cooperative seeks to provide that same
> foundation for the shared semantic domains, and a clear set of recommended
> practices for how to re-use entities from other OBO Foundry ontologies such
> as ChEBI, Sequence Ontology, Protein Ontology, OBO Cell, Organism Taxonomy
> (OWL versions of NCBI Tax), etc. to specify the critical biomedical
> entities and their complex relations.  As I say above, these are works in
> progress.  For those of us who must have something working now, the
> recommended practice is to actively participate in these projects with an
> eye toward following their practice - and replacing any "proxy" you create
> in the interim with the community ontology, when it is ready for use.  This
> is what we have done in the BIRN ontology BIRNLex.  We actually have an OWL
> module called "BIRNLex-OBI-Proxy.owl" which we fully intend to replace with
> OBI entities, when they are ready for use.  We also have
> "BIRNLex-Investigation.owl" that builds on this "proxy" to cover entities
> BIRN researchers must capture.  We expect to eventually see the contents of
> "BIRNLex-Investigation" in OBI in some form.  We intend to "contribute"
> those elements from this OWL file directly to OBI, when OBI is ready for
> them, and we have the time to work through this migration process.
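
A minimal sketch of the "proxy" replacement step described above, written in
Python over plain (subject, predicate, object) tuples. Everything named here
is an assumption for illustration: the example.org IRIs, the mapping table,
and the function replace_proxy_terms are hypothetical and are not taken from
BIRNLex-OBI-Proxy.owl or from OBI.

```python
def replace_proxy_terms(triples, proxy_to_community):
    """Rewrite any term that has a community equivalent; leave the rest alone."""
    def swap(term):
        return proxy_to_community.get(term, term)
    return [(swap(s), swap(p), swap(o)) for s, p, o in triples]


# Placeholder IRIs (hypothetical, for illustration only).
PROXY = "http://example.org/proxy#"
COMMUNITY = "http://example.org/community#"
DATA = "http://example.org/data#"
REL = "http://example.org/rel#"

# Hand-maintained table: interim proxy term -> community ontology term.
mapping = {PROXY + "Protocol": COMMUNITY + "Protocol"}

data = [
    (DATA + "run1", REL + "follows", PROXY + "Protocol"),
    (DATA + "run1", REL + "uses", PROXY + "LocalOnlyTerm"),
]

migrated = replace_proxy_terms(data, mapping)
for triple in migrated:
    print(triple)
```

Terms with no community equivalent yet (the hypothetical "LocalOnlyTerm"
here) pass through unchanged, so the migration can be done incrementally as
the community ontology matures.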
>
> 3) Kei's comments
> Examples - examples - examples.  This is critical.  Working through the
> example Kei cites from the NIH Neuroscience Microarray Consortium is a
> wonderful way to determine:
> - whether there are existing community ontologies that can meet the KR and
> processing requirements
> - where the gaps are in those community ontologies
> - whether the ontology you are creating effectively fills those gaps (if it
> does, that makes it very clear how the community effort can make effective
> use of your ontology)
> With regard to Gene Lists, Kei is certainly correct.  If these are captured
> through algorithmic means, it's critical to capture the details of that
> algorithm - typically both the version of the algorithm and the version of
> the data repository you ran it against.
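
As a rough sketch of what "capturing the details" could look like, here is a
small Python record that keeps a gene list together with the algorithm
version and the repository version it was derived from. All class names,
field names, and values are hypothetical placeholders; they are not drawn
from the Lilly ontology or from any particular analysis package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneListProvenance:
    algorithm: str            # name of the analysis method (placeholder)
    algorithm_version: str    # version of that method
    repository: str           # data repository the method was run against
    repository_version: str   # release/version of that repository


@dataclass(frozen=True)
class GeneList:
    genes: tuple
    provenance: GeneListProvenance


def same_derivation(a, b):
    """True only if both lists record identical algorithm and repository versions."""
    return a.provenance == b.provenance


# Hypothetical example values.
prov_v1 = GeneListProvenance("example-caller", "1.0", "example-repo", "2007-11")
prov_v2 = GeneListProvenance("example-caller", "1.1", "example-repo", "2007-11")

gl_a = GeneList(("GENE_A", "GENE_B"), prov_v1)
gl_b = GeneList(("GENE_B", "GENE_C"), prov_v1)
gl_c = GeneList(("GENE_A", "GENE_B"), prov_v2)

print(same_derivation(gl_a, gl_b))
print(same_derivation(gl_a, gl_c))
```

With the versions recorded explicitly, two lists with identical genes but
different algorithm versions can still be told apart.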
> Also - where gene entities are concerned - there is ongoing work between
> the GO groups, the Sequence Ontology, and the Protein Ontology that is
> particularly targeted toward capturing the specific relations between types
> of genomic sequence elements and types of biologically active protein-based
> molecules (e.g., macromolecular complexes composed of a collection of
> proteins in a variety of post-translationally modified states, such as GPC
> receptors, ion channels, transporters, pathway enzymes, etc. - i.e., Rx
> drug targets).  These are the details we'll all require in order to do
> round-trip pharmacogenetics - i.e., the effects of genetic constructs on
> target susceptibility to drugs - AND - the ways in which drugs ultimately
> alter macromolecular complexes by leading to changes in gene expression.
>
> Just my $0.02 filtering on these helpful comments from Matthias, Michael,
> and Kei.
>
> Cheers,
> Bill
>
> On Dec 3, 2007, at 1:00 PM, Kei Cheung wrote:
>
>
>      This is great!
>
>      I have a microarray experiment description (that has to do with
>      Alzheimer Disease) extracted from NINDS microarray consortium:
>
>      http://arrayconsortium.tgen.org/np2/viewProject.do?action=viewProject&projectId=433773
>
>      I just wonder how this example would fit this experiment ontology
>      (as well as others such as OBI). As shown in this example, we record
>      information such as organ type, organ region, cell type (layer II
>      pyramidal neuron), etc. The NINDS microarray consortium uses
>      different array platforms (e.g., Agilent, Affymetrix, and cDNA) for
>      different organisms, so one may need to divide chips into groups
>      corresponding to different platform types. Each group can then be
>      further divided into subgroups corresponding to different organisms.
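
The two-level grouping described above (platform first, then organism) can be
sketched in a few lines of Python. The chip records and their ids are made up
for illustration, and the function name group_chips is hypothetical; only the
platform names come from the message.

```python
from collections import defaultdict


def group_chips(chips):
    """Group chip ids by platform type, then by organism within each platform."""
    groups = defaultdict(lambda: defaultdict(list))
    for chip in chips:
        groups[chip["platform"]][chip["organism"]].append(chip["id"])
    # Convert to plain dicts so the result prints and compares cleanly.
    return {platform: dict(by_org) for platform, by_org in groups.items()}


# Hypothetical chip records.
chips = [
    {"id": "chip-1", "platform": "Affymetrix", "organism": "human"},
    {"id": "chip-2", "platform": "Affymetrix", "organism": "mouse"},
    {"id": "chip-3", "platform": "Agilent", "organism": "human"},
    {"id": "chip-4", "platform": "Affymetrix", "organism": "human"},
]

grouped = group_chips(chips)
print(grouped)
```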
>
>      We would also like to capture gene lists (not the raw gene lists,
>      but the much shorter ones that indicate which genes are over/under
>      expressed under certain experimental conditions). Such gene lists
>      would usually be extracted from the literature. The analysis package
>      (including version) that was used to generate a gene list should
>      also be identified. One possible use of these gene lists is to
>      compare them to identify genes that are differentially expressed
>      under the same/similar experimental condition across different
>      microarray experiments. This would help distinguish true signals
>      from noise.
>
>      Hope it helps.
>
>      Cheers,
>
>      -Kei
>
>
>
>      Matthias Samwald wrote:
>
>            Hi Susie,
>
>            Susie wrote:
>                  It would be great if you could take a look at it and
>                  provide comments. The
>                  ontology is available at:
>                  http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/Experiment_Ontology
>
>            * Some of the entities/properties are missing an rdfs:label
>            or have an empty label (a string with length 0).
>            * Some of the entities could be taken from existing
>            ontologies like OBI, RO or some of the OBO Foundry ontologies.
>            This would save work and make integration with other data
>            sources and ontologies much easier. By the way, there seem to
>            be several groups working on ontologies for microarray
>            experiments, or at least planning to do that. It would be
>            great if these groups could work together.
>            * The class 'Chip type' should be removed and replaced by
>            subclasses of 'chip', e.g., 'chip (human)', 'chip (mouse)'
>            etc.
>            * Some of the object properties appear to be intended as
>            datatype properties (e.g., 'has proteome id').
>            * Many of the datatype properties could be replaced with
>            object properties, possibly referring to third party
>            ontologies -- of course this would require a richer ontology
>            and more work spent on creating mappings. 'has molecular
>            function' could refer to entities from the gene ontology, 'has
>            associated organ' could refer to an ontology about anatomy and
>            so on.
>            * Object properties and their ranges are quite redundant.
>            Property 'has reagent' has range 'Reagent', property 'has
>            treatment' has range 'Treatment' and so on. Maybe the ontology
>            could be designed in such a way that there are only some
>            generic properties such as 'has part'. This would make the
>            ontology much easier to maintain, query and understand in the
>            long term.
>            * It is unclear how 'Gene list' is intended to be used.
>            * 'Hardware' and 'Software' should not be subclasses of
>            'Protocol'.
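
The first point in the list above (missing or empty rdfs:label) is easy to
check mechanically. The following is only a sketch in plain Python over
(subject, predicate, object) tuples; the example.org IRIs and the function
name entities_missing_label are hypothetical, and the sample entities are
not taken from the actual Experiment Ontology file.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def entities_missing_label(triples, entities):
    """Return entities with no rdfs:label, or with only empty/whitespace labels."""
    labelled = {s for s, p, o in triples if p == RDFS_LABEL and o.strip()}
    return sorted(e for e in entities if e not in labelled)


# Placeholder namespace and entities (hypothetical).
EX = "http://example.org/experiment#"

triples = [
    (EX + "Chip", RDFS_LABEL, "Chip"),
    (EX + "GeneList", RDFS_LABEL, ""),  # empty label: a string of length 0
]
entities = [EX + "Chip", EX + "GeneList", EX + "Reagent"]

missing = entities_missing_label(triples, entities)
print(missing)
```

In this made-up input, 'GeneList' is reported because its label is empty and
'Reagent' because it has no label at all.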
>
>
>            Many of the datatype properties in this ontology look very
>            interesting and might provide requirements for other
>            ontologies. It would be great if some of them could be
>            described/commented in more detail so that we know more about
>            the requirements that motivated the creation of these
>            properties.
>
>            I hope that was somewhat helpful.
>
>            cheers,
>            Matthias Samwald
>
> William Bug, M.S., M.Phil.                     email: wbug@ncmir.ucsd.edu
> Ontological Engineer (Programmer Analyst III)  work: (610) 457-0443
> Biomedical Informatics Research Network (BIRN)
> and
> National Center for Microscopy & Imaging Research (NCMIR)
> Dept. of Neuroscience, School of Medicine
> University of California, San Diego
> 9500 Gilman Drive
> La Jolla, CA 92093
>
> Please note my email has recently changed
Received on Tuesday, 11 December 2007 18:51:14 UTC