Re: blog: semantic dissonance in uniprot

Oliver Ruebenacker <curoli@gmail.com> writes:
>   Is it possible that referring to records instead of things is not
> the result of confusion, but rather of cost-benefit considerations -
> that records are cheap and identification is costly and open-ended?
> What is it that can not be achieved by having better records instead?


I never said confusing, I said conflating. I also gave a reason why this
conflation was, at times, a good thing. 


>   And what does it take to identify something? We may have thought we
> know what a couch is, until we realize that we have no consensus over
> whether the pillows are part of the couch or not, and that it would be
> more accurate to distinguish between bare couches (without pillows)
> and fully featured couches (with pillows). How far are we going to go?

I don't know, but I do think that we have a reasonable handle on the
engineering decisions that we need to make; the problem is that these
are application dependent. This is a problem if you want to integrate
data for a purpose for which it was not originally intended.

My own feeling is that "identifying the underlying biology" is
attractive, but not that plausible, because we have no good way of
understanding identity at this level without a record. So, when talking
about proteins, how do we know when we have one and when we have two?
It's not obvious. But uniprot have a mechanism for making this
judgement. 

So when referring to a uniprot record we mostly mean "the record, and
the extensional set of proteins that are defined by it". 

Phil

Received on Tuesday, 24 March 2009 16:53:36 UTC