Re: protein entities (was Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1) from Darren Natale on 2007-07-19 (public-semweb-lifesci@w3.org from July 2007)

From: Darren Natale <dan5@georgetown.edu>
Date: Thu, 19 Jul 2007 11:23:59 -0400
To: Eric Jain <Eric.Jain@isb-sib.ch>
CC: Alan Ruttenberg <alanruttenberg@gmail.com>, Chris Mungall <cjm@fruitfly.org>, Bijan Parsia <bparsia@cs.man.ac.uk>, public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>
Message-ID: <469F820F.6090109@georgetown.edu>

We don't yet have formal definitions for many of the classes and 
relations (the effort only began in earnest a few months ago).  But, 
basically, a distinction is made between the full-length protein (in 
terms of amino acid sequence) and the sub-length parts of proteins 
(commonly called domains by protein scientists, unfortunately).  The 
term "whole protein" is something of a placeholder; it signifies the 
evolutionary classes (families) of full-length proteins, as opposed to 
the evolutionary classes of domains.  "Sequence form" is likewise a 
placeholder term, used to denote the initial translation product from an 
mRNA, which itself might be based on a "normal" gene or a mutant 
thereof, or which might be one of several possible alternatively spliced 
transcripts from the normal or mutant gene.  The "cleaved or modified 
product" is a further breakdown of those initial translation products; 
it allows one to distinguish, for example, between a phosphorylated 
version of a protein and the non-phosphorylated version.  The need for 
this distinction derives from the fact that the two versions might have 
different functions.
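
As a rough illustration of the three levels described above for 
full-length proteins (domains are left out), the nesting can be sketched 
as a small parent/child structure.  This is only a sketch under stated 
assumptions, not PRO's actual schema: the class name ProteinEntity, its 
fields, and the "example kinase" entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ProteinEntity:
    """One node in an illustrative hierarchy (hypothetical, not PRO's schema)."""
    name: str
    level: str  # "whole protein", "sequence form", or "cleaved or modified product"
    parent: Optional["ProteinEntity"] = None
    modifications: Tuple[str, ...] = ()

    def lineage(self):
        """Return names from the most general class down to this entity."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return list(reversed(chain))


# Hypothetical entries: an evolutionary family, one translation product,
# and two versions of that product that differ only in phosphorylation.
family = ProteinEntity("example kinase family", "whole protein")
isoform = ProteinEntity("example kinase isoform 1", "sequence form", parent=family)
unmodified = ProteinEntity(
    "isoform 1, unphosphorylated", "cleaved or modified product", parent=isoform
)
phospho = ProteinEntity(
    "isoform 1, phosphorylated",
    "cleaved or modified product",
    parent=isoform,
    modifications=("phosphorylation",),
)
```

The two bottom-level entries share the same sequence form as parent but 
remain separate entities, which is what would let different functions be 
attached to each.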

Eric Jain wrote:
> Darren Natale wrote:
>> We recently began a new Protein Ontology (PRO) effort geared precisely 
>> toward the formal definition of the "smaller entities" referred to by 
>> Alan.  By "we" I mean the PRO Consortium, comprising the PIs Cathy Wu 
>> of PIR (which is also a member organization of the UniProt 
>> Consortium), Barry Smith of SUNY Buffalo, and Judy Blake of Jackson 
>> Labs.  PRO is being developed within the framework of the OBO Foundry, 
>> and aims to specify protein entities at the level mentioned by Chris 
>> (accounting for splice variation and post-translational modification 
>> and cleavage). Where appropriate, PRO will indeed make reference to 
>> both other ontologies and to UniProt Knowledgebase (UniProtKB) 
>> records. Furthermore, we are also undertaking the "wildly ambitious" 
>> job of representing broader, more-inclusive classes of similar 
>> proteins based on evolutionary relatedness.
>>
>> A further description of PRO (with examples and link to a paper) can 
>> be found at http://pir.georgetown.edu/pro
> 
> This will no doubt be interesting to quite a few people here! For the 
> sake of this discussion, could you elaborate a bit more on how the 
> different concepts in PRO are defined, i.e. what is a "protein", "whole 
> protein", "sequence form" and "cleaved and/or modified product"?

Received on Thursday, 19 July 2007 19:14:01 UTC