- From: Alan Ruttenberg <alanruttenberg@gmail.com>
- Date: Mon, 16 Jul 2007 11:37:30 -0400
- To: Phillip Lord <phillip.lord@newcastle.ac.uk>
- Cc: public-semweb-lifesci <public-semweb-lifesci@w3.org>
On Jul 16, 2007, at 10:19 AM, Phillip Lord wrote:

> >>>>> "MK" == Marijke Keet <keet@inf.unibz.it> writes:
>
>   MK> Lack of sufficient knowledge about a particular (biological) entity is
>   MK> a sideshow, not an argument, to the issue of distinguishing real proteins from
>   MK> their records.
>
> I agree. The argument is that it's very hard to describe what you mean by a
> "protein". We almost certainly don't mean a protein molecule. We might mean a type of
> protein. But then we don't know whether two protein molecule are actually of a given
> type.

I'm confused. I think we all would agree that there are instances of proteins and we have a good idea of what they are. We also know that there are groups of proteins that are built off the same template and share certain properties. If we define classes using such properties, then we can, in principle, decide whether these proteins are members of a given class (subject to experimental limitations). For instance, we can define a class of proteins that have a certain primary structure (aa sequence), and then, via assay, measure what fraction of the proteins in some sample have that structure.

> My questions are how often do we want to refer to a protein, rather than a record
> about a protein?

Any time we want to make a scientific statement about proteins. In my work, that means virtually all the time. For example, I have a body of work that is the target of text mining at the moment - if the text mining worked well enough to understand the articles, what should it generate for semantic web consumption?

> And who is responsible for ascribing a ID to a specific type of
> protein. In practice, in bioinformatics, the answer to this is a)
> we don't and b) uniprot.

I agree with a) - we mostly don't, and when we do, we do it in an unclear and nonstandard way.

I disagree with b). Exactly what the class of proteins described by a uniprot record is, is not clear (though Eric started to make a theory of what it could be). I have seen uniprot ids used even to identify antibodies to a protein.

As for who is responsible, I would say that our community is responsible. I expect that there will be efforts along this line in the OBO Foundry and I would hope that there would be broad participation from the people who are interested in following this list.

> So, while distinguishing between a uniprot record and a protein seems like a good
> idea, I'm not convinced it brings you anything. What are you going to do with your
> protein ID?

I would like to be able to have Invitrogen be able to say that product xxxyyy is an antibody to some specific class of phosphoproteins in a way that a semantic web agent could do some shopping for me if I needed such a reagent. I could go on and name a long list of such cases, but I'm pretty sure you could do the same thing, notwithstanding your playing dumb here.

-Alan

ps. Hi Phil - glad you're joining the party!

> Phil
>
> --
> Phillip Lord, Phone: +44 (0) 191 222 7827
> Lecturer in Bioinformatics, Email: phillip.lord@newcastle.ac.uk
> School of Computing Science, http://homepages.cs.ncl.ac.uk/phillip.lord
> Claremont Tower Room 909, skype: russet_apples
> Newcastle University,
> NE1 7RU
Received on Monday, 16 July 2007 15:37:42 UTC