W3C home > Mailing lists > Public > public-semweb-lifesci@w3.org > April 2004

Re: Fw: Use of LSIDs in RDF (fwd)

From: Greg Tyrelle <greg@tyrelle.net>
Date: Thu, 29 Apr 2004 16:53:56 +1000
To: Sean Martin <sjmm@us.ibm.com>
Cc: public-semweb-lifesci@w3.org
Message-ID: <20040429065356.GA26357@nodalpoint.org>

*** Sean Martin wrote: 
  | BG>This leads me to a question about "persistent" URI's and URL's
  | BG>(PURLS's): How do you ensure that two URI's are pointing at the same
  | BG>object (bytes)? 
  |
  |My question is how does one programmatically identify a persistent HTTP 
  |URI, as opposed to one that will retrieve tomorrow's weather or perhaps 
  |retrieve a file from a P2P network or one that returns dynamically 
  |changing content? Apologies in advance if there is an obvious answer to 
  |this question. 

I am not aware of a way to programmatically identify a persistent HTTP
URI. Making URIs persistent is largely a function of who is responsible
for maintaining that URI's authority.

If I understand correctly, the question you are asking is "tell me
something about the resource identified by this URI". There are a
number of approaches to this. In the case of LSID this would be the
getMetadata interface. For HTTP URIs my current favourite is URIQA
(the MGET HTTP method extension, i.e. metadata GET) [1]. RDDL [2] is
intended for this purpose, but mainly for namespaces.
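As a rough sketch of the URIQA idea (not a full client): an MGET request differs from a plain GET only in the method name, so assembling one is straightforward. The helper name and the Accept header value below are my own choices; nothing here touches the network.

```python
from urllib.parse import urlsplit

def build_mget_request(uri, accept="application/rdf+xml"):
    """Assemble the text of a URIQA-style MGET request for an HTTP URI.

    MGET asks the server to *describe* the resource (return metadata
    about it) rather than return the resource itself.  This function
    only builds the request string; it does not send anything.
    """
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        f"MGET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        f"Accept: {accept}\r\n"
        f"\r\n"
    )

print(build_mget_request("http://sw.nokia.com/uriqa/URIQA.html"))
```

The appeal is that any HTTP URI gets a metadata interface without minting a second identifier for the description.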

  |HTTP URI's as probably the primary method of retrieval of the data object 
  |or meta-data about that object - after all much of the public LS data is 
  |actually out there on the web already retrievable by HTTP URI. If HTTP 
  |URI's were sufficient today, we would not have need of the LSID. So 
  |perhaps the question you should ask your self is why are people not 
  |already widely using URL's for LS naming?

People are already using URLs (HTTP URIs) for naming, for example:

http://www.biomedcentral.com/pubmed/12225585 

is a 302 redirect to the NCBI URL 

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12225585&dopt=Abstract&holding=f1000

Again, I believe it is how HTTP URIs are used or managed which is the
problem, not that they are broken or insufficient technology for the
purpose of naming.
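The redirect chain above can be followed mechanically. A minimal sketch, in which the mapping simply mirrors the BMC-to-NCBI 302 (a real resolver would issue HTTP requests and read Location headers instead):

```python
def resolve(uri, redirects, max_hops=5):
    """Follow a chain of 3xx-style redirects, expressed here as a
    mapping from source URI to target URI, refusing loops and
    over-long chains."""
    hops = 0
    seen = {uri}
    while uri in redirects:
        hops += 1
        if hops > max_hops:
            raise RuntimeError("too many redirects")
        uri = redirects[uri]
        if uri in seen:
            raise RuntimeError("redirect loop")
        seen.add(uri)
    return uri

# Mirrors the 302 from BioMed Central to NCBI shown above.
REDIRECTS = {
    "http://www.biomedcentral.com/pubmed/12225585":
        "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
        "?cmd=Retrieve&db=PubMed&list_uids=12225585"
        "&dopt=Abstract&holding=f1000",
}
```

The point being: the extra layer of indirection LSID promises is already available to a plain HTTP URI via 3xx responses.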

  |For me the main points are: 
  |Location independence of the object named - the extra layer of indirection 
  |makes this flexibility possible - there is a starting assumption that 
  |users will make/exchange local copies of the objects and also that 
  |authority entities will at some point want to transfer the authority over 
  |a LSID to another authority entity - while potentially maintaining control 
  |of their domain name, sometimes the same data is served from more than one 
  |"official" place on the web(e.g. Swiss-Prot - Marja, how does Annotea deal 
  |with this situation?), having the option of not using domain names in the 
  |identifier at all; 

Good points. However, HTTP has the 3xx status codes to provide
redirection etc.; why invent a new protocol when these mechanisms
already exist?

  |Providing/using LSID's for one's data establishes a "contract" in which 
  |certain properties can be assumed (beyond those of the HTTP URI 
  |"contract") of an LSID named object:
  |defines what can safely be assumed about multiple copies of objects which 
  |have the same LSID name - i.e. that they are identical; clear definition 
  |of what persistence means [both availability and never modifying a named 
  |object]; 

This "contract" is a social contract; persistence based on a social
contract can equally hold for HTTP URIs.

  |a formal mechanism for retrieving data [never ever changes] over multiple 
  |protocols and discovering and retrieving meta-data [which can change] 
  |about that object and its relationship to other objects [from the original 
  |source of the object or from a third-party who has something to add of 
  |their own] all using a single globally unique name.

I think the selling point of LSID (for me) is a standard interface for
life sciences metadata. One aspect of this that bothers me, though, is
the partitioning of the semantic web into domains based on their
metadata access interfaces. Access to metadata based on URIs alone
only makes sense to me if the mechanism for getting the metadata is
general across the web.

  |One parting thought.. widespread adoption of LSID spec. across the 
  |industry will at the same time create a very large semantic web. 

It is true that only widespread adoption of LSID will make it useful
to the semantic web. I am guessing that by default (laziness?) HTTP
URIs will be used as resource identifiers if an LSID is not *easily*
usable, i.e. tools, tools, tools...
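On the tooling point, even the basic step of taking an LSID apart (to find which authority to resolve against) is something every client has to repeat. A minimal sketch, assuming the urn:lsid:authority:namespace:object[:revision] form from the LSID specification; the example LSID in the usage note is illustrative only, not one any authority necessarily serves:

```python
def parse_lsid(lsid):
    """Split an LSID URN of the form
    urn:lsid:authority:namespace:object[:revision]
    into its components; raise ValueError for anything else."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or \
            [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],   # who to ask for getData/getMetadata
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }
```

For example, parse_lsid("urn:lsid:ncbi.nlm.nih.gov:pubmed:12225585") would yield the authority ncbi.nlm.nih.gov and a missing revision, which a resolver would then use to locate the authority's resolution service.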
 
In my limited testing of the Perl LSID client implementation, the only
LSIDs I was able to resolve were from the North Temperate Lakes [3]
authority. Both the PDB and NCBI authority URLs were not working (or I
couldn't get them to work with the Perl client).

_greg

[1] http://sw.nokia.com/uriqa/URIQA.html
[2] http://www.rddl.org/
[3] http://lsid.limnology.wisc.edu/ 

-- 
Greg Tyrelle
Received on Thursday, 29 April 2004 03:04:15 UTC
