Re: blog: semantic dissonance in uniprot from eric neumann on 2009-03-26 (public-semweb-lifesci@w3.org from March 2009)

From: eric neumann <ekneumann@gmail.com>
Date: Thu, 26 Mar 2009 11:59:43 -0400
To: Michel_Dumontier <Michel_Dumontier@carleton.ca>
Cc: W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
Message-ID: <92e86c7d0903260859n6b1f2f98od859dfdf0684c50e@mail.gmail.com>
Michel's point resonates with my experience as well, though I hesitate to
try to push the definition of 'ProteinAggregate' to the rest of the bio
world... but it's in the right spirit. : )
-Eric

On Thu, Mar 26, 2009 at 9:28 AM, Michel_Dumontier <Michel_Dumontier@carleton.ca> wrote:

> Pursuant to my email, and in light of several other comments, if our
> goal is to now rectify what Uniprot:Protein _actually_ means in our
> domain, and how it can be semantically mapped to other bio-ontologies,
> then I might also suggest that instances of Uniprot:Protein are
> aggregates of proteins (err... :ProteinAggregate anyone?), possibly
> separated by both space and time, having a similar (base sequence +
> mutations / ptms) composition, sharing certain characteristics (e.g.
> functionality, domains) and observed to participate in biological
> processes. Clearly not a type of protein of the single molecule form,
> but again, certainly not a Record.
>
> -=Michel=-
>
>
>
> >
> >  If however, what we've been talking about is that identifiers like
> >       http://purl.uniprot.org/uniprot/Q16665
> >
> > are actually database records, and not molecular entities, then we can
> > settle this quickly:
> >
> > Uniprot RDF file: http://www.uniprot.org/uniprot/Q16665.rdf
> > (is this what people were referring to as a Record???)
> >
> > Contains:
> >
> > <rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
> >  <rdf:type rdf:resource="http://purl.uniprot.org/core/Protein" />
> >
> >
> > It's clear that the entity denoted by :Q16665 is rdf:type :Protein and
> > is the subject of statements that are biological in nature such as
> > being located in sub-cellular compartments or being involved in
> > biochemical reactions. It is clearly not a Record. This is generally
> > the case for nearly all entries in biomolecular databases.
> >
> > Cheers,
> >
> > -=Michel=-
> >
> > Anxiously waiting to see if this clears things up or generates
> > controversy .. it's hard to predict!
> >
> >
> >
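Michel's reading of the RDF excerpt can be checked mechanically. The following is a minimal sketch, not the UniProt tooling: it takes the two RDF lines quoted above, wraps them in an rdf:RDF root and closes the open element so they parse (the wrapper and closing tags are assumptions, since the message quotes only an excerpt), and lists the rdf:type declared for each subject. The names `SNIPPET`, `declared_types`, and `TYPES` are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The two quoted lines, wrapped and closed so they form well-formed XML.
# The rdf:RDF wrapper and the closing tags are assumptions for illustration;
# the real Q16665.rdf file contains far more than this.
SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF_NS}">
<rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
 <rdf:type rdf:resource="http://purl.uniprot.org/core/Protein" />
</rdf:Description>
</rdf:RDF>"""


def declared_types(rdf_xml):
    """Map each rdf:about subject to the set of rdf:type URIs declared for it."""
    root = ET.fromstring(rdf_xml)
    types = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        if subject is None:
            continue  # skip blank nodes and rdf:ID subjects in this sketch
        for type_el in desc.findall(f"{{{RDF_NS}}}type"):
            resource = type_el.get(f"{{{RDF_NS}}}resource")
            if resource:
                types.setdefault(subject, set()).add(resource)
    return types


TYPES = declared_types(SNIPPET)
for subject, type_uris in TYPES.items():
    print(subject, "is typed as", sorted(type_uris))
```

In this excerpt the only type declared for the Q16665 URI is the core Protein class; nothing in the two quoted lines types it as a record.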
> > > If nobody ever wants to use the same property to talk about the
> > > database record as was used to talk about the molecule, and nobody
> > > ever makes an assertion that implies that the class of database
> > > records is disjoint from the class of molecules, then I don't see
> > > any harm in using the same URI to ambiguously denote both.  But if
> > > one is trying to design data to be reusable by others in unforeseen
> > > ways, there clearly *is* a risk that someone will want to make such
> > > assertions in conjunction with the data, and if that happens there
> > > is a clear harm.  This risk is easy to avoid by using separate URIs.
> > >
> > > There *are* trade-offs.  Minting two URIs instead of one *does* add
> > > some complexity, though as I pointed out that additional complexity
> > > can be mitigated to the point that it is a *very* low cost.  Still,
> > > different people will weigh these trade-offs differently, and what's
> > > best for one situation may not be best for another, as I indicated
> > > in my original post.
> > >
> > > Furthermore, even if one does use the same URI to ambiguously denote
> > > both a database record and a molecule, that is not the end of the
> > > world either.  It is possible (though more difficult) to later
> > > separate out and relate the different senses of an ambiguous URI, as
> > > I have described:
> > > http://dbooth.org/2007/splitting/
> > > Ambiguity is inescapable, and ambiguity between a thing and a page
> > > that describes that thing is not fundamentally different from other
> > > kinds of ambiguity (except perhaps that we are aware of it in
> > > advance and it can be easily avoided), as explained here:
> > > http://dbooth.org/2007/splitting/#httpRange-14
> > >
> > > Finally, although it is flattering that you have named this
> > > suggestion after me, I cannot take credit.  As I pointed out in my
> > > original post, the suggestion to differentiate between a molecule
> > > and the database record that describes that molecule originates
> > > with the Architecture of the World Wide Web:
> > > http://www.w3.org/TR/webarch/#URI-collision
> > > and best practices for implementing this distinction are described
> > > in Cool URIs for the Semantic Web:
> > > http://www.w3.org/TR/cooluris
> > >
> > > David Booth
> > >
> > >
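The harm David describes can be illustrated with a toy check. This is a minimal sketch under stated assumptions, not an OWL reasoner: triples are plain Python tuples, and the `ex:` class names, the `ex:describes` property, and the `ex:record/Q16665` URI are all hypothetical. It flags any subject typed into two classes that someone has declared disjoint, first with one URI denoting both record and molecule, then with separate URIs.

```python
# Hypothetical class URIs, used only for illustration.
RECORD = "ex:DatabaseRecord"
MOLECULE = "ex:Molecule"

# A third party asserts that records and molecules are disjoint classes.
DISJOINT = {frozenset({RECORD, MOLECULE})}


def clashes(triples):
    """Return the subjects that are typed into two classes declared disjoint."""
    types = {}
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            types.setdefault(subject, set()).add(obj)
    return sorted(
        subject
        for subject, classes in types.items()
        if any(pair <= classes for pair in DISJOINT)
    )


Q = "http://purl.uniprot.org/uniprot/Q16665"

# One URI used ambiguously for both the molecule and its database record.
ambiguous = [
    (Q, "rdf:type", MOLECULE),
    (Q, "rdf:type", RECORD),
]

# Separate URIs: a hypothetical record URI that describes the molecule.
Q_RECORD = "ex:record/Q16665"
separate = [
    (Q, "rdf:type", MOLECULE),
    (Q_RECORD, "rdf:type", RECORD),
    (Q_RECORD, "ex:describes", Q),
]

print("ambiguous URI clashes:", clashes(ambiguous))
print("separate URI clashes:", clashes(separate))
```

With the single ambiguous URI the disjointness assertion produces a clash on Q16665; with two URIs linked by a describes-style property the same assertion is harmless, which is the trade-off weighed in the message above.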
> >
>
>
>
Received on Thursday, 26 March 2009 16:00:21 UTC