RE: blog: semantic dissonance in uniprot

Following up on my earlier email, and in light of several other
comments: if our goal is now to pin down what Uniprot:Protein
_actually_ means in our domain, and how it can be semantically mapped
to other bio-ontologies, then I would also suggest that instances of
Uniprot:Protein are aggregates of proteins (err... :ProteinAggregate
anyone?), possibly separated in both space and time, having a similar
composition (base sequence + mutations / PTMs), sharing certain
characteristics (e.g. functionality, domains) and observed to
participate in biological processes. That is clearly not a protein in
the single-molecule sense, but, again, it is certainly not a Record.
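
To make that concrete, here is a rough sketch in plain Python (not
real UniProt tooling, and not a proposal for the actual ontology).
Only the Q16665 URI and core:Protein come from the UniProt RDF; the
example.org namespace, the class names ProteinAggregate,
ProteinMolecule and Record, and the subClassOf link are hypothetical
placeholders I made up for illustration. It treats the three senses as
mutually distinct classes and flags any node that ends up typed into
two classes declared disjoint.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library needed.
UNIPROT = "http://purl.uniprot.org/uniprot/"
CORE_PROTEIN = "http://purl.uniprot.org/core/Protein"
EX = "http://example.org/"  # hypothetical namespace, an assumption for this sketch

triples = {
    # From the UniProt RDF: Q16665 is typed as core:Protein.
    (UNIPROT + "Q16665", "rdf:type", CORE_PROTEIN),
    # Assumption: core:Protein is read as a kind of protein aggregate.
    (CORE_PROTEIN, "rdfs:subClassOf", EX + "ProteinAggregate"),
    # Assumption: an aggregate is neither a record nor a single molecule.
    (EX + "ProteinAggregate", "owl:disjointWith", EX + "Record"),
    (EX + "ProteinAggregate", "owl:disjointWith", EX + "ProteinMolecule"),
}


def types_of(node, graph):
    """Return asserted types plus superclasses reached via rdfs:subClassOf."""
    found = {o for s, p, o in graph if s == node and p == "rdf:type"}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in graph:
            if s == cls and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def disjointness_clashes(node, graph):
    """Return pairs of disjoint classes that the node belongs to at once."""
    types = types_of(node, graph)
    return {
        (a, b)
        for a, p, b in graph
        if p == "owl:disjointWith" and a in types and b in types
    }
```

With the triples as given, disjointness_clashes(UNIPROT + "Q16665",
triples) comes back empty. Add one more triple typing the same URI as
EX + "Record" and the clash shows up, which is exactly the kind of
trouble a single URI for both senses invites.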

-=Michel=-



> 
> If, however, what we've been talking about is that identifiers like
>     http://purl.uniprot.org/uniprot/Q16665
> 
> are actually database records, and not molecular entities, then we can
> settle this quickly:
> 
> Uniprot RDF file: http://www.uniprot.org/uniprot/Q16665.rdf
> (is this what people were referring to as a Record???)
> 
> Contains:
> 
> <rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
>  <rdf:type rdf:resource="http://purl.uniprot.org/core/Protein" />
> 
> 
> It's clear that the entity denoted by :Q16665 is rdf:type :Protein and
> is the subject of statements that are biological in nature, such as
> being located in sub-cellular compartments or being involved in
> biochemical reactions. It is clearly not a Record. This is generally
> the case for nearly all entries in biomolecular databases.
> 
> Cheers,
> 
> -=Michel=-
> 
> Anxiously waiting to see if this clears things up or generates
> controversy... it's hard to predict!
> 
> 
> 
> > If nobody ever wants to use the same property to talk about the
> > database record as was used to talk about the molecule, and nobody
> > ever makes an assertion that implies that the class of database
> > records is disjoint from the class of molecules, then I don't see
> > any harm in using the same URI to ambiguously denote both.  But if
> > one is trying to design data to be reusable by others in unforeseen
> > ways, there clearly *is* a risk that someone will want to make such
> > assertions in conjunction with the data, and if that happens there
> > is a clear harm.  This risk is easy to avoid by using separate URIs.
> >
> > There *are* trade-offs.  Minting two URIs instead of one *does* add
> > some complexity, though as I pointed out that additional complexity
> > can be mitigated to the point that it is a *very* low cost.  Still,
> > different people will weigh these trade-offs differently, and what's
> > best for one situation may not be best for another, as I indicated
> > in my original post.
> >
> > Furthermore, even if one does use the same URI to ambiguously denote
> > both a database record and a molecule, that is not the end of the
> > world either.  It is possible (though more difficult) to later
> > separate out and relate the different senses of an ambiguous URI, as
> > I have described:
> > http://dbooth.org/2007/splitting/
> > Ambiguity is inescapable, and ambiguity between a thing and a page
> > that describes that thing is not fundamentally different from other
> > kinds of ambiguity (except perhaps that we are aware of it in
> > advance and it can be easily avoided), as explained here:
> > http://dbooth.org/2007/splitting/#httpRange-14
> >
> > Finally, although it is flattering that you have named this
> > suggestion after me, I cannot take credit.  As I pointed out in my
> > original post, the suggestion to differentiate between a molecule
> > and the database record that describes that molecule originates with
> > the Architecture of the World Wide Web:
> > http://www.w3.org/TR/webarch/#URI-collision
> > and best practices for implementing this distinction are described
> > in Cool URIs for the Semantic Web:
> > http://www.w3.org/TR/cooluris
> >
> > David Booth
> >
> >
> 

Received on Thursday, 26 March 2009 13:30:26 UTC