RE: Prototype URL to Life Science Identifier (LSID) gateway now available from Booth, David (HP Software - Boston) on 2006-10-13 (public-semweb-lifesci@w3.org from October 2006)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Fri, 13 Oct 2006 00:46:18 -0400
To: "Sean Martin" <sjmm@us.ibm.com>, <public-semweb-lifesci@w3.org>
Cc: <Lsid-developer@lists.sourceforge.net>
Message-ID: <EBBD956B8A9002479B0C9CE9FE14A6C20155C7A8@tayexc19.americas.cpqcorp.net>
Sean,

Great work!  If these http URIs are generally used, 
 
    - Client software that is *not* aware of LSID-specific dereferencing
mechanisms can use these URIs to find useful information; and
 
    - Client software that *is* aware of LSID-specific dereferencing
mechanisms may choose to optimize dereferences by syntactically
recognizing the http URI patterns that you specify below and converting
them to equivalent LSID dereferences.
 
 
A few suggestions (since you asked for feedback):
 
1. I think the default http GET action applied to the object-identifying
URI (such as
http://lsid-info.org/urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank:30350027
in your example below) should return metadata about that
object instead of a copy of the object itself, so that software can
learn about the object before actually trying to retrieve a copy.  Why?
A copy of the object does not tell me how to get the other associated
metadata, but the metadata can (and should) tell me how to get a copy of
the object.  In other words, providing the metadata as the default
provides a clear, well-defined chain of discovery.

2. Content negotiation can (should?) be used to provide both RDF and
HTML from the same generic URI.   

3. If content negotiation is used, and the requesting client does not
indicate what kind of media types it likes, then IMO it is best to
default to HTML so that a human can read it.  RDF-aware software should
know to ask for RDF, but a human first discovering the URI may not know
to specifically ask for HTML.
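
A minimal sketch of suggestions 2 and 3, assuming the gateway offers
exactly two representations (text/html and application/rdf+xml) from
the generic URI.  The function name and the simplified Accept parsing
are my own illustration, not part of the prototype:

```python
SUPPORTED = ["text/html", "application/rdf+xml"]  # HTML first, so it wins ties


def choose_media_type(accept_header):
    """Pick a representation; default to HTML when no preference is stated."""
    if not accept_header or not accept_header.strip():
        return "text/html"

    # Parse "type/subtype;q=0.5, ..." into {media range: quality}.
    prefs = {}
    for part in accept_header.split(","):
        fields = part.split(";")
        media_range = fields[0].strip().lower()
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        prefs[media_range] = quality

    def quality_for(media_type):
        # The most specific matching range decides: exact, type/*, then */*.
        for pattern in (media_type, media_type.split("/")[0] + "/*", "*/*"):
            if pattern in prefs:
                return prefs[pattern]
        return 0.0

    best, best_q = None, 0.0
    for media_type in SUPPORTED:
        q = quality_for(media_type)
        if q > best_q:
            best, best_q = media_type, q

    # Assumption: fall back to HTML rather than answering 406 Not Acceptable.
    return best or "text/html"
```

An RDF-aware client would send "Accept: application/rdf+xml"; a browser
or a bare request with no Accept header would get HTML.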
 
4. Optimization (as mentioned above) would be easier if the URI patterns
were based on simple prefixes, rather than having to look both at the
beginning and the end of the URI string for http://lsid-info.org (and
possibly also at the middle, to see if it is not "rdf2html/", though you
could consider that a part of the prefix).
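
To illustrate the point, here is a rough sketch of how an LSID-aware
client might recognize the four URI patterns as they stand today.  The
function name and the kind labels are hypothetical; note that the
metadata case can only be detected by looking at the end of the string:

```python
GATEWAY = "http://lsid-info.org/"


def classify(uri):
    """Return (kind, lsid) for a gateway URI, or None if it is not one."""
    if not uri.startswith(GATEWAY):
        return None
    rest = uri[len(GATEWAY):]

    # Two of the patterns are distinguished by what follows the host name ...
    for prefix, prefix_kind in (("rdf2html/", "html"), ("host/", "authority")):
        if rest.startswith(prefix):
            kind, lsid = prefix_kind, rest[len(prefix):]
            break
    else:
        # ... but metadata versus data is decided by a trailing "?".
        if rest.endswith("?"):
            kind, lsid = "metadata", rest[:-1]
        else:
            kind, lsid = "data", rest

    if not lsid.lower().startswith("urn:lsid:"):
        return None
    return kind, lsid
```

With purely prefix-based patterns, the trailing "?" branch would
disappear and a single prefix lookup would be enough.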

Finally, if you accept suggestions #1 and #4 above:

5. To conform to the TAG's httpRange-14 guidance[5], an http GET on the
object-identifying URI  (such as
http://lsid-info.org/urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank:30350027
in your example below) should return a 303 "See Other"
response code, forwarding to another URI that actually serves the
metadata about the object.  
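
As a rough sketch of what that could look like on the server side,
here is a tiny WSGI application.  The "/metadata/" path is purely a
hypothetical placeholder for wherever the metadata would actually be
served; it is not part of the prototype:

```python
# Hypothetical location of the metadata documents (an assumption).
METADATA_PREFIX = "http://lsid-info.org/metadata/"


def app(environ, start_response):
    """Answer a GET on an object-identifying URI with 303 See Other."""
    lsid = environ.get("PATH_INFO", "").lstrip("/")
    if lsid.lower().startswith("urn:lsid:"):
        # The URI names the object itself, so redirect to a document about it.
        start_response("303 See Other", [("Location", METADATA_PREFIX + lsid)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found\n"]
```

For a quick local try, it can be served with
wsgiref.simple_server.make_server("", 8000, app).serve_forever().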

You might also want to take a look at my write-up on "Converting New URI
Schemes or URN Sub-Schemes to HTTP"[6].
 
[5] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
[6] http://dbooth.org/2006/urn2http/

Thanks!

David Booth, Ph.D.
HP Software
dbooth@hp.com
Phone: +1 617 629 8881
  

 


________________________________

	From: public-semweb-lifesci-request@w3.org
[mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Sean Martin
	Sent: Thursday, October 12, 2006 5:51 PM
	To: public-semweb-lifesci@w3.org
	Cc: Lsid-developer@lists.sourceforge.net
	Subject: Prototype URL to Life Science Identifier (LSID) gateway
now available
	
	

	Hello all, 
	A little while back the W3C's HCLS BioRDF group (in the form of
Susie Stephens) asked if I would look into establishing a prototype of a
gateway service that allows the mapping of LSIDs to URLs. The initial
suggestion came from Henry Thompson, who asked that we take a look at
how the ARK system does something similar.  He and members of the W3C
TAG, as well as others on the public-semweb-lifesci list, have indicated
how important they feel it is that one should be able to dereference URIs
using the HTTP URI scheme, and I believe this prototype succeeds in doing
just that for LSIDs. You can read that conversation in the archives of
the mailing list [1]. 
	
	The OMG were both interested and willing to cooperate with this
effort and they have established a new DNS domain lsid-info.org for the
purpose. I am pleased to announce that the prototype gateway is now
operational. [2] 
	
	 Here is our first stab at the syntax of the mapping. To form
your LSID access URL, replace the string <LSID> with an actual
resolvable LSID. 
	
	To retrieve the named bytes use http://lsid-info.org/<LSID> 
	        Example:
http://lsid-info.org/urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank:30350027 
	
	To retrieve the RDF metadata (effectively a named graph) for the
named bytes or concept use http://lsid-info.org/<LSID>? 
	        Example:
http://lsid-info.org/urn:lsid:gdb.org:GenomicSegment:GDB132938? 
	
	To retrieve and format the RDF metadata for the named bytes or
concept as human-readable HTML use http://lsid-info.org/rdf2html/<LSID> 
	        Example:
http://lsid-info.org/rdf2html/urn:lsid:gene.ucl.ac.uk.lsid.biopathways.org:hugo:MVP 
	
	To retrieve information about the authority for the named bytes
or concept use http://lsid-info.org/host/<LSID> 
	        Example:
http://lsid-info.org/host/urn:lsid:gdb.org:GenomicSegment:GDB132938 
	
	Obviously this syntax can evolve, so we would like feedback.

	
	I hope that others will find this gateway as useful as I believe
we will.  It may be that early work of this group should focus on
establishing recommendations for and possibly even implementations of
how to bridge useful information that is provided using standards
created by non-W3C organizations and communities to the Semantic Web.
It seems to me that perhaps we are going to need a number of similar
gateways to be established permanently for other URI schemes like Handle
DOIs[3] and OASIS XRIs[4], as well as non-URI-based identifier schemes
which provide named bytes and metadata that are or can be transformed to
RDF.  In particular, a gateway to the DOI system would be useful to us at
this time because it is widely adopted in the scientific publishing
community and we need a means to uniquely identify a paper (and indeed
an offset into that paper) for the purposes of annotations stored in
our RDF-backed systems. 
	
	Kindest regards, Sean 
	
	
	[1] http://lists.w3.org/Archives/Public/public-semweb-lifesci/2006Jul/0213.html 
	[2] http://lsid-info.org/ 
	[3] http://www.doi.org/ 
	[4] http://en.wikipedia.org/wiki/Extensible_Resource_Identifier 
	
	-- 
	Sean Martin 
	IBM Corp
Received on Friday, 13 October 2006 04:46:34 UTC