Re: ontology specs for self-publishing experiment from Larry Hunter on 2006-07-11 (public-semweb-lifesci@w3.org from July 2006)

From: Larry Hunter <Larry.Hunter@uchsc.edu>
Date: Tue, 11 Jul 2006 14:09:55 -0600
To: Phillip Lord <phillip.lord@newcastle.ac.uk>
Cc: w3c semweb hcls <public-semweb-lifesci@w3.org>
Message-Id: <1152648595.19401.31.camel@fast.UCHSC.EDU>

On Mon, 2006-07-10 at 11:42 +0100, Phillip Lord wrote:

> 
> My own feeling is that the fly people got it right years ago. Their
> gene identifiers had meaning, but not too much. So, for example,
> sevenless is a mutant lacking the 7th cell in the eye. Clear,
> straightforward and memorable. And if the world changes under you, the name
> could be left the same because it doesn't really matter that much. 

And hugely, miserably ambiguous.  The use of regular English words to
represent Drosophila gene names has significantly held back the
application of information extraction technology for that model
organism, and wreaked all sorts of other havoc.  You can't even look up
the "to" gene in NCBI -- it's filtered out of the query as a stop word --
but it's in there:
(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=full_report&list_uids=43036)

I would say that the yeast people got it right.  Unambiguous identifiers
that can clearly be recognized as such.  The amazing thing was that the
community agreed to use those names in papers, rather than reserve
"naming rights" for the "discoverer" of the gene.  As usual, the trick
is social, not technological.
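
To make the contrast concrete, here is a rough sketch of why the two
naming styles behave so differently for text mining.  It assumes the
usual yeast systematic ORF format (Y, chromosome letter A-P, arm L or R,
three digits, strand W or C, optional suffix such as -A).  The stop-word
list and both function names are made up for illustration and are not
NCBI's actual list or code.

```python
import re

# Yeast systematic ORF names follow a fixed shape, e.g. YAL001C:
# Y, chromosome letter A-P, arm L/R, three digits, strand W/C,
# and an optional suffix such as -A.
YEAST_ORF = re.compile(r"\bY[A-P][LR]\d{3}[WC](?:-[A-Z])?\b")

# Hypothetical stop-word list (an assumption, not the real NCBI list).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def find_yeast_orfs(text):
    """Return every token in text that looks like a yeast systematic name."""
    return YEAST_ORF.findall(text)


def survives_stop_word_filter(query):
    """Return the query terms left after a naive stop-word filter."""
    return [term for term in query.lower().split() if term not in STOP_WORDS]


if __name__ == "__main__":
    print(find_yeast_orfs("Deletion of YAL001C and YGR192C reduced growth."))
    print(survives_stop_word_filter("to gene"))
```

The yeast names can be pulled out of running text with one pattern,
while a query for the fly gene "to" loses its only informative term
before the search even runs.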

Larry

Received on Tuesday, 11 July 2006 20:11:05 UTC