Re: protein entities (was Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

We don't yet have formal definitions for many of the classes and
relations (the effort began in earnest only a few months ago).  But,
basically, a distinction is made between the full-length protein (in
terms of amino acid sequence) and the sub-length parts of proteins
(commonly called domains by protein scientists, unfortunately).  The
term "whole protein" is something of a placeholder; it signifies the
evolutionary classes (families) of full-length proteins, as opposed
to the evolutionary classes of domains.  "Sequence form" is likewise a
placeholder term; it denotes the initial translation product of an
mRNA, which might itself derive from a "normal" gene or a mutant
thereof, and which might be one of several possible alternatively
spliced transcripts of the normal or mutant gene.  The "cleaved and/or
modified product" is a further subdivision of those initial translation
products; it allows one to distinguish, for example, between the
phosphorylated and non-phosphorylated versions of a protein.  The need
for this last level derives from the fact that the two versions might
have different functions.
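
To make the nesting concrete, here is a rough Python sketch of the
levels described above.  It is only an illustration: the level names
come from this thread, but the strict one-level-at-a-time nesting, the
Entity class, and every example name are assumptions made for the
sketch, not actual PRO classes, relations, or identifiers.  Domains
would form a separate branch and are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Level names are taken from the thread; the strict nesting order is an
# assumption made for illustration only.
LEVELS = [
    "protein",
    "whole protein",
    "sequence form",
    "cleaved and/or modified product",
]


@dataclass
class Entity:
    """Hypothetical node in the hierarchy; not an actual PRO class."""
    name: str
    level: str
    # Excluded from repr/compare so parent/child cycles cause no trouble.
    parent: Optional["Entity"] = field(default=None, repr=False, compare=False)
    children: List["Entity"] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, name: str) -> "Entity":
        """Create a child exactly one level below this entity."""
        depth = LEVELS.index(self.level)
        if depth + 1 >= len(LEVELS):
            raise ValueError(f"{self.level!r} is the most specific level")
        child = Entity(name, LEVELS[depth + 1], parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        """Return the names from the root down to this entity."""
        node: Optional["Entity"] = self
        names: List[str] = []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


# Made-up example names, not real PRO entries.
root = Entity("protein", "protein")
family = root.add_child("example kinase family")
isoform = family.add_child("example kinase, splice variant 1")
unmodified = isoform.add_child("example kinase variant 1, unmodified")
phospho = isoform.add_child("example kinase variant 1, phosphorylated")
```

The point of the last two lines is that the phosphorylated and
unmodified versions are siblings under the same sequence form, so
different functions can be attached to each one separately.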

Eric Jain wrote:
> Darren Natale wrote:
>> We recently began a new Protein Ontology (PRO) effort geared precisely 
>> toward the formal definition of the "smaller entities" referred to by 
>> Alan.  By "we" I mean the PRO Consortium, comprising the PIs Cathy Wu 
>> of PIR (which is also a member organization of the UniProt 
>> Consortium), Barry Smith of SUNY Buffalo, and Judy Blake of Jackson 
>> Labs.  PRO is being developed within the framework of the OBO Foundry, 
>> and aims to specify protein entities at the level mentioned by Chris 
>> (accounting for splice variation and post-translational modification 
>> and cleavage). Where appropriate, PRO will indeed make reference to 
>> both other ontologies and to UniProt Knowledgebase (UniProtKB) 
>> records. Furthermore, we are also undertaking the "wildly ambitious" 
>> job of representing broader, more-inclusive classes of similar 
>> proteins based on evolutionary relatedness.
>> A further description of PRO (with examples and link to a paper) can 
>> be found at
> This will no doubt be interesting to quite a few people here! For the 
> sake of this discussion, could you elaborate a bit more on how the 
> different concepts in PRO are defined, i.e. what is a "protein", "whole 
> protein", "sequence form" and "cleaved and/or modified product"?

Received on Thursday, 19 July 2007 19:14:01 UTC