Re: Ambiguous names. was: Re: URL +1, LSID -1

Summary: Continued discussion of whether we need identifiers
for protein classes in addition to those for records. I give an example
finding to support my view that we do need them, in response
to Phil's suggestion that I examine my scenarios.
[yah, I know I'm not being consistent about the summaries yet]

On Jul 18, 2007, at 2:22 PM, Phillip Lord wrote:

>>>>>> "Alan" == Alan Ruttenberg <alanruttenberg@gmail.com> writes:
>
>   Alan> Summary: Answering Phil's questions, and clarifying one thing he
>   Alan> asserts about what I said.
>
>>> What if they have a polymorphism?
>   Alan> No.
>>> Are two isoforms from an alternate splice the same protein?
>   Alan> No.
>
> In both of these you differ from uniprot.

Well, if I am restricted to using such Uniprot classes I will have
trouble representing important scientific findings. If Uniprot has only
one name for two molecules, one of which has a SNP that leads
to a loss of function that is the initiating factor of a disease,
then we have a problem, no? How do we say things about the
disease-related form?

>
>>> Unsatisfying, maybe. Clear definitions are important. But
>>> interoperability, and the lack of duplication are more so.
>
>   Alan> Forgive my confusion, but how exactly will we achieve interoperability
>   Alan> and lack of duplication if we don't have definitions? How would we
>   Alan> know that we don't have duplication, for example?
>
> If you create identifiers to describe proteins rather than protein records
> (like uniprot) then you have created a whole new set of IDs. When anyone wants
> to talk about a protein, they will have to look up the ID.

As they will when they want to talk about a record. Of course, perhaps
we will all add links saying that a record is about some set of classes
of proteins, and that aspects of the proteins in a class can be
described by pieces of the record.

But at least we'll know what we are talking about.
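
To make the kind of record-to-class link I have in mind concrete, here is a minimal sketch. It is only an illustration under stated assumptions: the Record type, the link() helper, and every identifier and section name are invented for this example and are not taken from Uniprot or any existing scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A database record, identified separately from the proteins it is about."""
    record_id: str
    # IDs of the protein classes this record is about.
    about: set = field(default_factory=set)
    # Protein class ID -> names of the record sections that describe it.
    described_by: dict = field(default_factory=dict)


def link(record, class_id, sections):
    """State that `record` is about `class_id`, described by the given sections."""
    record.about.add(class_id)
    record.described_by[class_id] = list(sections)


# All identifiers and section names below are hypothetical.
rec = Record("record:EXAMPLE")
link(rec, "class:EXAMPLE_wild_type", ["sequence"])
link(rec, "class:EXAMPLE_snp_variant", ["sequence", "variant annotation"])

print(sorted(rec.about))
print(rec.described_by["class:EXAMPLE_snp_variant"])
```

One record can then be about several protein classes, while a statement about the disease-related form attaches to its own class ID rather than to the record.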

>
>>> <snip>
>
>>> And, yet, you just told me that you could buy a antibody with just a
>>> swissprot ID. So, let me restate the question, what are you going to do
>>> with a protein ID that you are not going to do with a swissprot ID, or
>>> "the protein formally known as OPSD_HUMAN".
>
>   Alan> I did not say that. I've said some people have identified antibodies
>   Alan> by such ids. Unfortunately this information is of limited use when
>   Alan> actually ordering an antibody, where I am interested in much more
>   Alan> information, such as how specific it is, how it has been validated,
>   Alan> and other properties related to how it behaves in certain experimental
>   Alan> settings. I *want* to be able to have identifiers(URIs) that are up to
>   Alan> the job of ordering reagents.
>
> Well, I am not sure that you are going to achieve this with an identifier. You
> need significant extra amounts of metadata.

By that reasoning I don't need DOIs for publications. All I need is  
the URI for the journal and some metadata.

> My point here is simple. Separating out the informatics and biology conform
> better to our notion of reality, sure. But you are talking about modelling
> what makes a protein and, more, a type of protein. Work through your scenarios
> and see whether you need a protein ID for this. If not, you are introducing a
> layer of abstraction that you don't need.

I'm trying to be able to make statements that capture, among other
things, the conclusions that one finds in journal articles. In
http://www.nature.com/onc/journal/v21/n46/full/1205845a.html there is
a description of different isoforms of BAG-1. The different isoforms
have names, e.g. "BAG-1 p29". This name indicates a class of protein
instances. I expect I need a name and a definition for "BAG-1 p29"
and the others, so that I don't get confused and think there is a
contradiction between the statement that "BAG-1 p29 failed to protect
the transfected cells from apoptosis" and the statement that "BAG-1
p50, p46 and p33 isoforms enhanced the resistance to apoptosis".
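
Here is a rough sketch of how the apparent contradiction arises. It rests on assumptions I should flag: the identifiers are invented for illustration, I assume purely for the sake of argument that all four isoforms would be filed under a single record ID, and I reduce both quoted findings to a yes/no "protects against apoptosis" value, which is a simplification of what the article says.

```python
from collections import defaultdict

# Hypothetical identifiers, invented for illustration only.
RECORD = "record:BAG1"  # stands in for a single record ID covering all isoforms
ISOFORM_CLASS = {
    "BAG-1 p29": "class:BAG1_p29",
    "BAG-1 p33": "class:BAG1_p33",
    "BAG-1 p46": "class:BAG1_p46",
    "BAG-1 p50": "class:BAG1_p50",
}

# The two quoted findings as (isoform name, protects against apoptosis?).
FINDINGS = [
    ("BAG-1 p29", False),
    ("BAG-1 p50", True),
    ("BAG-1 p46", True),
    ("BAG-1 p33", True),
]


def conflicts(findings, subject_of):
    """Return subjects that have both True and False asserted about them."""
    seen = defaultdict(set)
    for isoform, protects in findings:
        seen[subject_of(isoform)].add(protects)
    return sorted(s for s, values in seen.items() if len(values) > 1)


# Every finding attached to the one record ID: the statements collide.
by_record = conflicts(FINDINGS, lambda isoform: RECORD)
# Each finding attached to its own isoform class: no collision.
by_class = conflicts(FINDINGS, lambda isoform: ISOFORM_CLASS[isoform])

print(by_record)  # ['record:BAG1']
print(by_class)   # []
```

With one subject per isoform class, both findings can be stated side by side without either appearing to contradict the other.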

But I'm open to discussing suggestions for representing these
statements using only the Uniprot record IDs, if you have
any.

-Alan

Received on Friday, 20 July 2007 05:51:29 UTC