Re: proposal for standard NCBI database URI from Phillip Lord on 2006-05-10 (public-semweb-lifesci@w3.org from May 2006)

From: Phillip Lord <phillip.lord@newcastle.ac.uk>
Date: Wed, 10 May 2006 15:27:06 +0100
To: <public-semweb-lifesci@w3.org>
Message-ID: <ud5emj6px.fsf@newcastle.ac.uk>

>>>>> "MS" == Matthias Samwald <samwald@gmx.at> writes:

  >>  Also, it's not clear what is meant by "same thing".
  >> 
  >> 
  >>  A GenBank record and an EMBL record identifying the same piece of
  >> DNA are not the same thing; they are different records.

  >> Or probably "different record, but same gene, according to some
  >> criteria"

  MS> Clearly, there should be different URIs for different
  MS> records. However, the problem discussed higher up in this thread
  MS> was that there could be different URIs for EXACTLY the same
  MS> record, simply because a different namespace is used, for
  MS> instance. Just a single differing character is sufficient. If we
  MS> do not try to make an effort to avoid even this simple problem,
  MS> our ultimate goal (data/information integration) seems
  MS> unreachable.

I am just a little unconvinced that we should try to shove enough
information into a URI to avoid having to do identity (or identifier)
resolution elsewhere.

This is, I think, the reason that the LSID spec went the "byte
identity" way that it did -- it's a relatively straightforward and
simple test to determine whether you have the same thing.
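
As a rough illustration of why a byte identity test is simple, here is
a minimal Python sketch. It is not taken from the LSID spec; the
function names and the sample payloads are made up for the example. It
also shows the flip side: two payloads carrying the same sequence but
wrapped differently do not count as the same thing under this test.

```python
import hashlib


def same_bytes(a: bytes, b: bytes) -> bool:
    """Byte identity: True only if the payloads match byte for byte."""
    return a == b


def digest(data: bytes) -> str:
    """SHA-256 hex digest, usable as a compact stand-in for the bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical payloads: the same sequence, wrapped differently.
record = b">example\nACGTACGT\n"
rewrapped = b">example\nACGT\nACGT\n"

identical = same_bytes(record, bytes(record))   # same bytes
differs = same_bytes(record, rewrapped)         # same sequence, other bytes
```

Comparing digests instead of the full payloads gives the same answer
for identical bytes, without having to hold both copies side by side.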

Of course, I agree that a single byte-identical entity having
multiple URIs is a pain, but even if you try to avoid it (which is
wise), avoiding it should not be mandated.
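
To sketch what doing the resolution elsewhere might look like: if
duplicate URIs are tolerated rather than forbidden, they can be found
after the fact by grouping URIs on a digest of whatever they resolve
to. This is only an illustration in Python; the urn:example URIs and
the group_by_content helper are hypothetical, and the payloads are
assumed to have been fetched already.

```python
from collections import defaultdict
import hashlib


def group_by_content(resolved: dict[str, bytes]) -> list[list[str]]:
    """Return groups of URIs whose resolved payloads are byte identical."""
    groups = defaultdict(list)
    for uri, payload in resolved.items():
        groups[hashlib.sha256(payload).hexdigest()].append(uri)
    # Keep only digests shared by more than one URI.
    return [sorted(uris) for uris in groups.values() if len(uris) > 1]


# Hypothetical URIs and payloads, assumed already resolved.
resolved = {
    "urn:example:ns-a:rec1": b"ACGT",
    "urn:example:ns-b:rec1": b"ACGT",
    "urn:example:ns-a:rec2": b"TTTT",
}
aliases = group_by_content(resolved)
```

Here aliases holds one group, the two rec1 URIs, because they differ
only in namespace and resolve to the same bytes.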

  MS> Ultimately, we should only have to talk about the biological
  MS> things (and apply URIs to real world resources)

Aye, right, laudable aim. But 150 years down the line, we still don't
have a workable definition of gene, species, organism or even
life. 

  MS> The question should be 'what is the URI of the class of human
  MS> insulin molecules?', not 'what is the URI of the database entry
  MS> about human insulin molecules in the Uniprot database?'. Down
  MS> with the unnecessary abstractions!

Yep. Unnecessary abstractions are a pain. I think it's fairly easy to
argue, though, that "human insulin" is much more of an abstraction
than "uniprot record".

Phil

Received on Wednesday, 10 May 2006 14:27:25 UTC