Re: A precedent suggesting a compromise for the SWHCLS IG Best Practices (ARK)

On Sun, 30 Jul 2006 16:46:21 -0700, Alan Ruttenberg  
<alanruttenberg@gmail.com> wrote:

> One of the things that I have a concern over with LSIDs is what to  
> do with versioned identifiers.
>
> Sometimes it is important to have the version - like when you are doing  
> some sequence based analysis and you need to remember the specific  
> sequence you used. But in many other cases, when you make some assertion  
> about the gene you either don't know the version that the statement was  
> based on, or you don't care because the intention is to refer to the  
> concept of whatever the exact sequence turns out to be.


I may be speaking out of turn here, and should probably let Sean answer  
this one since he has no doubt thought it through more deeply than  
I have; however, I think you may be mixing up several different entities  
here (as so often happens in a URL world ;-) )

In the case you cite above you are likely talking about a "gene", not a  
"sequence".  A "gene" will have its own LSID, and it is (even by the  
strict genetic definition) a conceptual entity defined by  
complementation.  A "gene" and its "sequence" are not the same thing!   
So... I don't see a problem.  When you need to refer to the gene in the  
abstract, you can refer to the gene's LSID.  When you need to talk about a  
concrete sequence, you refer to *its* LSID.  The metadata of the gene  
will (in a sensible world) include triples that describe its possible  
sequences, and these will have versions.

Genes have many many many properties, so we cannot munge them all into  
"sequence".  Certainly, this is how we are modelling our data locally...
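
To make the separation concrete, here is a rough sketch in Python. It is
not our actual local model: the authority "example.org", the "gene" and
"sequence" namespaces, the object ids, and the "hasSequence" predicate
are all made up for illustration.  The only thing taken as given is the
general LSID shape, urn:lsid:authority:namespace:object[:revision].

```python
from typing import List, NamedTuple, Optional, Tuple


class LSID(NamedTuple):
    """An LSID split into its parts; revision is optional."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None

    def __str__(self) -> str:
        base = f"urn:lsid:{self.authority}:{self.namespace}:{self.object_id}"
        return f"{base}:{self.revision}" if self.revision else base


def parse_lsid(text: str) -> LSID:
    """Parse urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn"
            or parts[1].lower() != "lsid"):
        raise ValueError(f"not an LSID: {text!r}")
    return LSID(*parts[2:])


# Hypothetical predicate name, used only for this illustration.
HAS_SEQUENCE = "hasSequence"

# The gene is the abstract entity: its LSID carries no revision here.
gene = parse_lsid("urn:lsid:example.org:gene:abc1")

# Each concrete sequence is its own entity with a versioned LSID.
seq_v1 = parse_lsid("urn:lsid:example.org:sequence:abc1_cds:1")
seq_v2 = parse_lsid("urn:lsid:example.org:sequence:abc1_cds:2")

Triple = Tuple[LSID, str, LSID]

# The gene's metadata points at its possible sequences.
triples: List[Triple] = [
    (gene, HAS_SEQUENCE, seq_v1),
    (gene, HAS_SEQUENCE, seq_v2),
]


def sequences_of(gene_lsid: LSID, graph: List[Triple]) -> List[LSID]:
    """Return every sequence LSID the metadata attaches to a gene."""
    return [o for s, p, o in graph if s == gene_lsid and p == HAS_SEQUENCE]
```

An assertion about the gene in the abstract would use str(gene), while a
sequence-based analysis would record one specific entry from
sequences_of(gene, triples), revision included.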

I stand by LSIDs :-)

Mark

Received on Monday, 31 July 2006 01:06:13 UTC