Re: protein entities (was Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1) from June Kinoshita on 2007-07-19 (public-semweb-lifesci@w3.org from July 2007)

From: June Kinoshita <junekino@media.mit.edu>
Date: Thu, 19 Jul 2007 16:48:14 -0400
To: Darren Natale <dan5@georgetown.edu>
Cc: Eric Jain <Eric.Jain@isb-sib.ch>, Alan Ruttenberg <alanruttenberg@gmail.com>, Chris Mungall <cjm@fruitfly.org>, Bijan Parsia <bparsia@cs.man.ac.uk>, public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>
Message-Id: <43828DD1-148D-47E0-8510-06E30D8C405D@media.mit.edu>

If I may put forward a key protein in Alzheimer disease as an example
that we are grappling with: there is full-length APP (which itself
has a number of forms as well as mutations); various peptides derived  
from cleavage of APP; and then multimeric forms of the peptides,  
particularly Abeta42, which is known to form soluble dimers, trimers,
tetramers, hexamers, and dodecamers, each of which may have different
functions or toxicities, as well as "misfolded" protofibrillar and  
insoluble fibrillar forms, and possibly a pore-like form consisting  
of I-forget-how-many Abetas. In addition, proteins form complexes
whose functions differ from those of the non-complexed
protein. I look forward to seeing how the Protein Ontology
unfolds, so to speak! - June

On Jul 19, 2007, at 11:23 AM, Darren Natale wrote:

>
> We don't yet have formal definitions for many of the classes and  
> relations (the effort only began in earnest a few months ago).   
> But, basically, there is a distinction made between the full-length  
> (in terms of amino acid sequence) protein and the sub-length parts  
> of proteins (commonly called domains by protein scientists,  
> unfortunately).  The term "whole protein" is somewhat of a  
> placeholder; it is used to signify the evolutionary classes  
> (families) of full-length proteins as opposed to the evolutionary  
> classes of domains.  "Sequence form" is again a placeholder term used
> to denote the initial translation product from an mRNA, which  
> itself might be based on a "normal" gene or a mutant thereof, or  
> which might be one of several possible alternatively spliced  
> transcripts from the normal or mutant gene.  The cleaved or  
> modified product is a further breakdown of those initial  
> translation products, and allows one to distinguish between a  
> phosphorylated version of a protein and the non-phosphorylated  
> version (as an example).  The need for the latter derives from the  
> fact that the two versions might have different functions.
>
> Eric Jain wrote:
>> Darren Natale wrote:
>>> We recently began a new Protein Ontology (PRO) effort geared  
>>> precisely toward the formal definition of the "smaller entities"  
>>> referred to by Alan.  By "we" I mean the PRO Consortium,  
>>> comprising the PIs Cathy Wu of PIR (which is also a member  
>>> organization of the UniProt Consortium), Barry Smith of SUNY  
>>> Buffalo, and Judy Blake of Jackson Labs.  PRO is being developed  
>>> within the framework of the OBO Foundry, and aims to specify  
>>> protein entities at the level mentioned by Chris (accounting for  
>>> splice variation and post-translational modification and  
>>> cleavage). Where appropriate, PRO will indeed make reference to  
>>> both other ontologies and to UniProt Knowledgebase (UniProtKB)  
>>> records. Furthermore, we are also undertaking the "wildly  
>>> ambitious" job of representing broader, more-inclusive classes of  
>>> similar proteins based on evolutionary relatedness.
>>>
>>> A further description of PRO (with examples and link to a paper)  
>>> can be found at http://pir.georgetown.edu/pro
>> This will no doubt be interesting to quite a few people here! For  
>> the sake of this discussion, could you elaborate a bit more on how  
>> the different concepts in PRO are defined, i.e. what is a  
>> "protein", "whole protein", "sequence form" and "cleaved and/or  
>> modified product"?
>
>

Received on Thursday, 19 July 2007 20:48:45 UTC