Re: Ambiguous names. was: Re: URL +1, LSID -1

Summary: Answering Phil's questions, and clarifying one thing he  
asserts about what I said.


On Jul 16, 2007, at 12:22 PM, Phillip Lord wrote:

>>>>>> "Alan" == Alan Ruttenberg <alanruttenberg@gmail.com> writes:
>
> Take these rhetorical questions:
I am interpreting these as questions of fact, taking "same" to mean
instances of the same class, with the classes you name narrowly
construed. That doesn't mean that we can't define broader classes of
which instances of these two types are both members.
>
> Is Red Opsin in human the same as Red Opsin in Cattle?
No.
> Is Red Opsin in me, necessarily the same as Red Opsin in you?
No.
> What if they have a polymorphism?
No.
> Are two isoforms from an alternate splice the same protein?
No.
> If a protein has been partly digested, is it still the same?
No.
> Are haemoglobin alpha and beta the same?
No.
> The point is that you can't deal with a protein computationally.  
> You can't
> resolve it, analyze it computationally. It's always second hand  
> information
> that you want to deal with.

Yes, but we generalize and boldly make statements about what we can  
directly see, and find that these are supported by further  
experiments or not, and possibly revise our statements. I *think* we  
want to be able to capture such statements on the semantic web, no?

> Yes, exactly. A uniprot record defines a class of proteins  
> extensionally. This
> means, antibodies to the proteins described by OPSD_HUMAN (for  
> example).

Well, if I tell my agent to go order some OPSD_HUMAN from Invitrogen,
what will you expect to get back? Or do you deny that I will want to
use identifiers such as this for this kind of purpose?

> <snip in the interest of brevity>

> If we have the ability to express "the class of protein molecules  
> defined by the swissprot record OPSD_HUMAN"
> then I think we have all we need.

That would be a good start. How will we see if we've succeeded? I  
have some ideas, like picking two people who work in the field,  
asking them to describe the set of proteins that is described by the
swissprot record OPSD_HUMAN, and then comparing what
they say. How would you know when we've succeeded at this?
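
A rough sketch of that comparison, assuming each person's answer can be
reduced to a set of short labels. The function name, the labels, and the
choice of Jaccard overlap as the measure are all illustrative
assumptions, not anything taken from UniProt:

```python
def agreement(a: set[str], b: set[str]) -> float:
    """Jaccard overlap: items named by both people / items named by either."""
    union = a | b
    if not union:
        # Two empty answers trivially agree.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical answers; these labels are made up for illustration only.
expert_1 = {"canonical sequence", "known polymorphic variants"}
expert_2 = {
    "canonical sequence",
    "known polymorphic variants",
    "partly digested fragments",
}

score = agreement(expert_1, expert_2)
print(f"agreement: {score:.2f}")
```

A score well below 1 would suggest the record is not yet functioning as
a shared definition, whatever we choose to call it.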

I think that if we were there, then we could effectively start to  
build formal statements.

> If we make our own definitions, all that we have done is duplicate  
> what the uniprot team are already doing. And we will, almost  
> inevitably, do it somewhat differently. All we would do is create  
> confusion. The only way that we ensure that we do the same thing as  
> uniprot is say "yeah, what they said".
>
> Unsatisfying, maybe. Clear definitions are important. But  
> interoperability, and the lack of duplication are more so.

Forgive my confusion, but how exactly will we achieve  
interoperability and lack of duplication if we don't have  
definitions? How would we know that we don't have duplication, for  
example?

> <snip>

> And, yet, you just told me that you could buy an antibody with just
> a swissprot ID. So, let me restate the question, what are you going
> to do with a protein ID that you are not going to do with a
> swissprot ID, or "the protein formally known as OPSD_HUMAN"?

I did not say that. I've said some people have identified antibodies  
by such ids. Unfortunately this information is of limited use when  
actually ordering an antibody, where I am interested in much more  
information, such as how specific it is, how it has been validated,  
and other properties related to how it behaves in certain  
experimental settings. I *want* to be able to have identifiers (URIs)
that are up to the job of ordering reagents.

-Alan

Received on Monday, 16 July 2007 18:16:55 UTC