RE: blog: semantic dissonance in uniprot

Eric,

On Sat, 2009-03-21 at 13:49 -0400, Michel_Dumontier wrote:
> Eric and friends,
>
>  I'm very sympathetic to the simplifying assumption of not
> distinguishing between a record and the molecular entity it
> represents, but . . . .

I do not think this would be a wise "simplification".  It simplifies
matters from only one perspective: it avoids having to mint and maintain
pairs of URIs instead of a single URI.  But the downstream
cost is that it creates an ambiguity (or "URI collision")
http://www.w3.org/TR/webarch/#URI-collision
that may cause trouble and be difficult to untangle later as the data is
used in more and more ways.  For example, if any of the same predicates
need to be used on both the record and the molecular entity, the
resulting assertions will become hopelessly confused.  Also, if
disjointness assertions are included, this overloading may cause
logical contradictions.
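
To make the collision concrete, here is a minimal sketch in Python that
models triples as plain tuples.  Everything in it is an assumption made
up for illustration: the ex: prefix, the accession X00000, the ex:source
predicate, and the object values are hypothetical and do not come from
UniProt.  It only shows the two failure modes described above, not how
any real store behaves.

```python
# Triples are (subject, predicate, object) tuples.
# All ex: names and the accession X00000 are hypothetical.
RECORD = "ex:Record"
PROTEIN = "ex:Protein"

# Pairs of classes declared disjoint (nothing may be an instance of both).
DISJOINT = {frozenset({RECORD, PROTEIN})}


def objects(triples, subject, predicate):
    """Return every object asserted for (subject, predicate), sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def disjointness_violations(triples):
    """Return subjects typed with both members of a disjoint class pair."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return sorted(
        s for s, classes in types.items()
        if any(pair <= classes for pair in DISJOINT)
    )


# One URI standing for both the record and the molecular entity.
collided = [
    ("ex:X00000", "rdf:type", PROTEIN),
    ("ex:X00000", "rdf:type", RECORD),
    ("ex:X00000", "ex:source", "Homo sapiens"),   # meant for the protein
    ("ex:X00000", "ex:source", "curation team"),  # meant for the record
]

# Separate URIs for the molecular entity and for its record.
separate = [
    ("ex:protein/X00000", "rdf:type", PROTEIN),
    ("ex:record/X00000", "rdf:type", RECORD),
    ("ex:protein/X00000", "ex:source", "Homo sapiens"),
    ("ex:record/X00000", "ex:source", "curation team"),
]

# With one URI, nothing says which value belongs to which thing.
print(objects(collided, "ex:X00000", "ex:source"))
# With one URI, the disjointness declaration is violated.
print(disjointness_violations(collided))
# With two URIs, both problems disappear.
print(objects(separate, "ex:protein/X00000", "ex:source"))
print(disjointness_violations(separate))
```

With the single URI, a consumer asking for ex:source gets two values and
has no way to tell which describes the protein and which the record.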

Cool URIs for the Semantic Web
http://www.w3.org/TR/cooluris 
describes best practices for minting URIs using 303 redirects to enable
the record to be obtained (indirectly) by following the URI for a
molecular entity.  If minting a separate URI for the molecular entity
seems onerous, it is trivial to use a 303-redirect service such as
http://thing-described-by.org/ 
to do the job for you.  And if you want to set up your own 303-redirect
service, that site will even show you the exact files that are used to
implement it:
http://thing-described-by.org/#What_This_Site_Does_ 
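
For readers who have not seen one, the core of a 303-redirect service
is very small.  The sketch below is not the code from that site; it
assumes a service that treats the query string of the requested URI as
the URL of the describing record and answers with "303 See Other".  The
function name see_other and the example.org URL are hypothetical.

```python
def see_other(request_target):
    """Map a request target to an HTTP (status, headers) pair.

    Assumed convention: the URI of the thing is
        <service base>?<URL of the record describing it>
    so the record URL is whatever follows the first '?'.
    """
    _, _, record_url = request_target.partition("?")
    if not record_url.startswith(("http://", "https://")):
        # No usable record URL was supplied.
        return 400, {}
    # 303 tells the client: the thing itself is not a document,
    # but a description of it can be found at Location.
    return 303, {"Location": record_url}


# The URI of the molecular entity redirects to the URI of its record.
status, headers = see_other("/?http://example.org/record/X00000")
print(status, headers["Location"])
```

The client ends up with two distinct URIs: the one it asked for, which
names the molecular entity, and the one in the Location header, which
names the record.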

Provenance (who said what) is extremely important in scientific analysis
-- explicitly tracking the evidence leading to scientific assertions.
It is easy for me to envision applications that will both use assertions
about a molecular entity *and* assertions about the records that
describe those molecular entities.

If you are just minting disposable URIs that aren't intended to be very
reusable anyway, then this ambiguity is not a problem, and it may be the
quickest solution to your problem.  But if you want your URIs to be
long-lived and used by others for other applications, I think it would be a
mistake.

David Booth

Received on Tuesday, 24 March 2009 14:38:16 UTC