RE: [BioRDF] global uniqueness requirement of LSIDs and RDF from Miller, Michael D (Rosetta) on 2006-08-14 (public-semweb-lifesci@w3.org from August 2006)

From: Miller, Michael D (Rosetta) <Michael_Miller@Rosettabio.com>
Date: Mon, 14 Aug 2006 08:25:59 -0700
To: "Sean Martin" <sjmm@us.ibm.com>, public-semweb-lifesci@w3.org
Message-ID: <E1GCeKY-0003aJ-CD@maggie.w3.org>

Hi Sean,
 
Thanks for your clarification.  It is exactly what John's e-mail brought
to my mind, but much better explained.
 
A similar use case might be a gene expression experiment that is sent
into ArrayExpress.  At some point someone who downloads the experiment
discovers that one of the hybridizations is clustering entirely with a
different set of replicates than the one it was assigned to.  The
original investigator takes a look and discovers that the lab technician
had grabbed the sample aliquot from the wrong shelf but recorded the
original sample's LSID.
 
So to update ArrayExpress, the Hybridization is still the same object,
but it needs a new version and needs to be associated with the proper
sample LSID.  The experiment itself needs a new version, with the
Hybridization moved to the proper set of replicates.  The data also need
new versions, and the DataCubes need to be updated with the new,
recalculated replicate DataCubes.
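
To make the versioning step concrete, here is a minimal sketch in Python.
It is not anything ArrayExpress actually runs.  It assumes the usual LSID
layout, urn:lsid:authority:namespace:object with an optional trailing
revision, and it assumes revisions are plain integers, which is only a
convention chosen for this example.  The example.org identifiers, the
Hybridization record and the correct_sample helper are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Lsid:
    """An LSID split into its parts; the revision part is optional."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None

    @classmethod
    def parse(cls, text: str) -> "Lsid":
        parts = text.split(":")
        if (len(parts) not in (5, 6)
                or parts[0].lower() != "urn"
                or parts[1].lower() != "lsid"
                or any(p == "" for p in parts[2:])):
            raise ValueError(f"not an LSID: {text!r}")
        revision = parts[5] if len(parts) == 6 else None
        return cls(parts[2], parts[3], parts[4], revision)

    def __str__(self) -> str:
        base = f"urn:lsid:{self.authority}:{self.namespace}:{self.object_id}"
        return base if self.revision is None else f"{base}:{self.revision}"

    def next_revision(self) -> "Lsid":
        # Assumption for this sketch: revisions are integers, and a
        # missing revision counts as 0, so the first bump yields "1".
        current = 0 if self.revision is None else int(self.revision)
        return replace(self, revision=str(current + 1))


@dataclass(frozen=True)
class Hybridization:
    """Hypothetical record: a hybridization and the sample it used."""
    lsid: Lsid
    sample: Lsid


def correct_sample(hyb: Hybridization, right_sample: Lsid) -> Hybridization:
    # Same object, new revision, now pointing at the proper sample.
    # The old record is left untouched, so the old LSID still names it.
    return Hybridization(lsid=hyb.lsid.next_revision(), sample=right_sample)


old = Hybridization(
    lsid=Lsid.parse("urn:lsid:example.org:hybridization:H42:1"),
    sample=Lsid.parse("urn:lsid:example.org:sample:S7:1"),
)
fixed = correct_sample(old, Lsid.parse("urn:lsid:example.org:sample:S9:1"))
print(fixed.lsid)  # urn:lsid:example.org:hybridization:H42:2
```

The experiment and the affected data would get the same treatment: a
bumped revision on the same object identifier, with the old revision
still resolvable under its old name.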
 
cheers,
Michael

	-----Original Message-----
	From: public-semweb-lifesci-request@w3.org [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Sean Martin
	Sent: Monday, August 14, 2006 5:47 AM
	To: public-semweb-lifesci@w3.org
	Subject: Re: [BioRDF] global uniqueness requirement of LSIDs and RDF
	
	

	Hello John, 
	
	> 
	> > How I've come to think about this is that some properties are
	> > intrinsic to the type of record, for a person, perhaps their SSN
	> > if American, and some are not, such as a person's age.  But even
	> > this becomes context dependent if one wishes to track the state
	> > of the person once a year.
	> 
	> If I understand the uniqueness requirement of LSIDs, then a new
	> LSID for "Michael Miller" must be created every year when the age
	> property changes.
	
	This is not quite how it is meant to work. You would only create a
	new LSID for Michael Miller each year if he were a data file and
	somehow his bytes changed :-)  In the case you describe, Michael is
	more of an idea (sorry Michael!) with many facets. Some facets can be
	concretely represented as bytes, in which case the LSID names those
	bytes. Others are conceptual: they can be described in metadata that
	further describes the named concept, and they have no unique data
	bytes of their own to name.
	
	You could use an LSID (or any kind of URI) without any directly
	associated data bytes to represent Michael as a central concept. Then
	a metadata graph associated with this conceptual URI might tell you
	his date of birth. It might also contain links to LSIDs and other URIs
	that name separate concrete representations of Michael, for example
	x-ray images, MRIs, his DNA sequence or results of other tests that
	have a binary representation and where it makes sense to uniquely name
	each as a discrete data item. These different representations may even
	be made available in different contexts/formats (e.g. images of
	differing size, resolution or binary format like PNG and GIF), each
	with its own LSID. Similarly, if for some reason one of these images is
	changed later (say, by a better sharpening algorithm), that new image
	instance could be made available as an LSID revision by incrementing
	the version area of the LSID name.
	
	Kindest regards, Sean 
	
	-- 
	Sean Martin 
	IBM Corp.

Received on Monday, 14 August 2006 15:26:27 UTC