Re: Less strong equivalences (was Re: blog: semantic dissonance in uniprot)

On Mar 26, 2009, at 11:28 AM, eric neumann wrote:

> Pat,
>
> Basically I'm in agreement with all of your points, but need to  
> correct some mis-interpretations you made of my comments...

Sure, thanks for clarifying.

>
> On Thu, Mar 26, 2009 at 3:13 AM, Pat Hayes <phayes@ihmc.us> wrote:
>
> On Mar 25, 2009, at 5:27 PM, eric neumann wrote:
>
> Several different issues here.
>
>>
>>
>> On Wed, Mar 25, 2009 at 5:47 PM, Bijan Parsia <bparsia@cs.manchester.ac.uk> wrote:
>> Eric,
>>
>> Thanks for the use case!
>>
>>
>> On 25 Mar 2009, at 21:31, eric neumann wrote:
>> <snip>
>>
>>
>> This is the kind of "similar" used in most internal genomic/ 
>> compound systems...
>>
>> <http://myOrg.com/sw/mxid/PHLP0005>  :isIdentifiedwith  <http://www.uniprot.org/uniprot/P16233>
>>
>> Can you explicate this a bit more for me? I.e., could you present  
>> what you expect this to do or not do?
>>
>> Certainly... I want to look up what myOrg knows about a uniprot  
>> protein, but since they do their own internal data-keeping on  
>> things like "druggability" which aren't included (yet) in uniprot,  
>> I need to make sure my extra data is mapped to the public protein  
>> object.
>>
>> Does this help you?
>
> It doesn't help me. We need to have a 'semantic' answer. What kinds  
> of thing are being talked about here? What do the URIs refer to?  
> (Records or chemicals?) Because the use of sameAs depends on the  
> answer to this question very crucially.
>
> In my company I have a ProteinDictionary table populated with all  
> 'known human proteins' (this is the conceptual part that is easy for  
> all biologists, but is causing some confusion in the thread); each  
> entry is identified (sameAs?) with a protein in Uniprot (as well as  
> a protein in NCBI-Entrez)
>
> In ProteinDictionary I include a lot of additional data (not found  
> in Uniprot) on what antibodies exist for that protein (structure).  
> Therefore, the records "refer" to the same protein, but do not have  
> identical properties

The _records_ don't have identical properties, sure. But (if I follow  
you), the names in the table refer to the proteins, not to the  
records. Therefore, no problem. It is fine for you to have more  
information about the same thing that someone else is talking about.

Well, maybe a practical problem. After reading Michel's post, it's not  
at all obvious that Uniprot and NCBI-Entrez are actually talking about  
the same kinds of thing, which is the practical reason why using  
sameAs might be problematic. I'd suggest (inventing and) using  
something like sameProteinAs, which is reflexive and symmetric and  
probably transitive but not substitutive. Think of it as a topic- 
specific version of seeAlso.
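
To make "not substitutive" concrete, here is a minimal sketch in plain Python (no RDF library). The property names and values are invented for illustration, and sameProteinAs itself is the hypothetical relation suggested above, not an existing vocabulary term. Links of that kind group names into equivalence classes, but only sameAs pairs license copying properties from one name to the other.

```python
# Hypothetical data, for illustration only; the two URIs are the ones quoted
# earlier in the thread, the properties and values are made up.
MYORG = "http://myOrg.com/sw/mxid/PHLP0005"
UNIPROT = "http://www.uniprot.org/uniprot/P16233"

triples = {
    (MYORG, "druggability", "high"),    # assumed local property
    (UNIPROT, "recordVersion", "57"),   # assumed record-level property
}


def equivalence_classes(pairs):
    """Reflexive, symmetric, transitive closure of the given pairs."""
    classes = {}
    for a, b in pairs:
        merged = classes.get(a, {a}) | classes.get(b, {b})
        for member in merged:
            classes[member] = merged
    return classes


def properties_of(node, triples, same_as=()):
    """Properties stated for node, substituted across same_as pairs only."""
    names = equivalence_classes(same_as).get(node, {node})
    return {(p, o) for s, p, o in triples if s in names}


# sameProteinAs links relate the two names...
same_protein_as = [(MYORG, UNIPROT)]
related = equivalence_classes(same_protein_as)[UNIPROT]

# ...but are never passed as same_as, so nothing is copied across.
without_substitution = properties_of(UNIPROT, triples)

# Asserting sameAs instead would merge everything said about either name.
with_substitution = properties_of(UNIPROT, triples, same_as=[(MYORG, UNIPROT)])
```

With sameAs, the record-level property ends up attached to the myOrg name as well, which is exactly the risk if the two sources turn out not to be talking about the same kind of thing.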

> My company has more knowledge about the protein, but it is not  
> common to everyone; case of Open World assumptions...

Exactly. But that is not a problem (unless you identify proteins with  
table entries... nah, bad idea.)
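
A rough sketch of why the extra local knowledge is harmless, again in plain Python with made-up property names (the `myorg:druggability` predicate and the placeholder public triple are assumptions, not real Uniprot data): when both graphs use the same URI for the protein, merging is just set union, and a missing value in the public graph reads as "not stated", not as "false".

```python
UNIPROT = "http://www.uniprot.org/uniprot/P16233"

# Placeholder for what the public source states (assumed content).
public_graph = {(UNIPROT, "rdf:type", "Protein")}

# Local, company-only statements about the same protein (hypothetical).
local_graph = {(UNIPROT, "myorg:druggability", "high")}

# Same subject URI in both graphs, so merging is plain set union.
merged = public_graph | local_graph


def known_values(graph, subject, predicate):
    """Values stated in graph; an empty set means unknown, not false."""
    return {o for s, p, o in graph if s == subject and p == predicate}


public_view = known_values(public_graph, UNIPROT, "myorg:druggability")
merged_view = known_values(merged, UNIPROT, "myorg:druggability")
```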

>
> Is that clearer?
>>
>> (Of course, in a SW world this could have all been done with  
>> internal triples added to the uniprot URI locally...)
>>
>>
>>
>> It really isn't probabilistic anymore since the scientists have all  
>> agreed and defined their entry based on some of the info from the  
>> public entity; for most situations it is an 'exact mapping' to the  
>> referred molecules.
>>
>> Is it that most, but not all of the time, you can treat it as  
>> sameAs but sometimes you don't want to?
>>
>> Well, the question we ask of experts like you is: should we or  
>> should we not use owl:sameAs for exact mappings to entities with  
>> different records?
>
> If your URIs are referring to the entities, then use sameAs when you  
> are sure you are talking about the same entity, no matter what your  
> records say about it. If they are referring to the records, then I  
> would guess that sameAs would be true only when two URIs resolve to  
> the same resource using GET.
>
> If we all agree we are referring to the protein in question, but the  
> Uniprot and Entrez URIs may have different (hopefully consistent up  
> to open-world assumptions) information.

Of course, but that's not an ontological problem. In fact, it is to be  
expected.

>>
>> I agree owl:sameAs was not intended for this kind of relation, but  
>> it is extremely common, and a specialized relation for this would  
>> be very much desired. : )
>>
>> We need to make me understand the relation :)
>>
>> There are other "identity" or "similar" relations
>
> Braaagh! Semantic alarm!  Identity is NOT similarity. Identity  
> really does mean being EXACTLY the same thing. If A similarTo B,  
> then we are talking about two things which are similar. If A sameAs  
> B, then we are talking about ONE THING which happens to have two  
> names.
>
> I did not intend to equate   "identity" and "similar"

Sorry, I have a hair trigger on that issue. Think of me as an annoying  
car alarm.

<snip>

> Certainly, but how best should we apply OWL so that this can be well  
> represented?

Good question. I need to know more than I do about protein chemistry,  
but from reading the stuff on this thread, seems to me that being  
explicit about proteins being a kind of substance or material would be  
a good start, and then asking what kinds of mass-term relationships  
one might need (mixture, L-isomeric form of, whatever) between protein- 
stuffs.

> Dare we promote meta-classing at this point?

It's in OWL 2, and I think many tools allow it already or will very  
soon. So yes, dare :-) But be very clear that it really does what you  
want. I'm not yet convinced, myself.

> I'd rather use OWL to accurately represent "a Molecule Class means  
> this...., and an instance means that ...." whether it's structure  
> patterns, property groupings, or mind-conceptual objects ("I can  
> create a specific and novel chemical with this structure and these  
> properties")
>
>
> ...If this discussion is beginning to settle onto a commonly agreed  
> set of principles, I'd like to suggest we capture it and circulate  
> for comment, perhaps through HCLS.

Even if it's only a draft for discussion, sounds like a good idea.

>
> cheers,
> -Eric

Pat

Received on Thursday, 26 March 2009 18:18:28 UTC