Re: [BioRDF] All about the LSID URI/URN from Carole Goble on 2006-07-28 (public-semweb-lifesci@w3.org from July 2006)

From: Carole Goble <carole@cs.man.ac.uk>
Date: Fri, 28 Jul 2006 19:24:58 +0100
To: Sean Martin <sjmm@us.ibm.com>
CC: public-semweb-lifesci@w3.org, connolly@w3.org, ht@inf.ed.ac.uk, noah_mendelsohn@us.ibm.com
Message-ID: <44CA567A.3090509@cs.man.ac.uk>
Sean

We will be joining you on Monday for the telecon. As you know, our 
projects use LSIDs heavily, and we find them invaluable.
Our practical experiences are, in a nutshell:

Conceptually we like
1. decoupled naming from physical location (essential)
2. versioning (very useful)
3. separate data from metadata (very useful)
4. foreign authorities can add metadata to an LSID in a transparent way 
(useful)
5. can be retrofitted (useful - advantage over PURLs)
6. metadata is in RDF (useful)
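
To make points 1 and 2 concrete, here is a minimal sketch of splitting an LSID into its parts. It assumes the usual layout urn:lsid:<authority>:<namespace>:<object>[:<revision>]. The names LSID and parse_lsid and the example identifier are hypothetical illustrations, not part of any LSID toolkit.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of an LSID (hypothetical helper type, for illustration only)."""
    authority: str           # who issued the name, not where the data lives
    namespace: str
    object_id: str
    revision: Optional[str]  # optional version component


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty component in LSID: {text!r}")
    # A missing or empty sixth component means "no revision given".
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


# Hypothetical identifier: the same object named with and without a revision.
versioned = parse_lsid("urn:lsid:example.org:proteins:P12345:2")
unversioned = parse_lsid("urn:lsid:example.org:proteins:P12345")
```

Nothing in the name carries a protocol, host path or file location, which is the decoupling in point 1. The trailing revision is what gives us point 2.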

However, we have problems with the implementation, specifically the use 
of SOAP within the resolution
system, because:
1. it's not needed conceptually
2. it's costly
3. it's overkill, which affects performance
4. the main implementation is Axis based - not suitable for phones, PDAs 
and other thin clients

And we hardly ever type an LSID into a browser :-) We use them as 
distributed object ids that are rapidly adoptable by a very distributed 
service base.

Chat on Monday.

By the way, I have already lodged an objection with Susie to holding such 
a telecon when many people who actually, like, use the stuff for, like, 
real are at ISMB2006 in Brazil and will not be able to participate. Like 
Doh!

Carole

Professor Carole Goble
Director, myGrid project (http://www.mygrid.org.uk)
Chair, Open Middleware Infrastructure Institute-UK (http://www.omii.ac.uk)

>
> Hello Dan,
>
> > Thanks for continuing to explain the requirements. I haven't seen
> > LSID requirements that can't be met with http/DNS yet, but that
> > doesn't mean they're not there.
> >
> > Yes, it's easy to see how starting fresh simplified some things.
> > But I am not convinced that starting fresh is the only option,
> > nor that working within the constraints of http/DNS won't give
> > a lot more benefit for approximately the same investment.
> >
> >
>
> I don’t know exactly why a URN style identifier was chosen over an http 
> style URI for LSIDs, as I was not involved at the time. My educated 
> guess is that, for a number of the reasons I have detailed in my 
> earlier posts, http URLs, as understood in common practice then & 
> indeed now, were not seen to fit all the requirements exactly, and an 
> exact fit was required for them to actually be useful to their fairly 
> fractured community. Nuanced detail is important in this problem, and 
> I can see from your last reply that we are not meeting there at many 
> levels.
>
> In particular though I suspect the highly distributed nature of the 
> Life Science community, the perceived fragility of URL links and the 
> URL's ambiguous technical/social contracts were major motivating 
> factors and also the successful application of such a scheme 
> internally by a number of the standard initiators. In circumstances 
> where there were (and perhaps are still) not enough obvious standards, 
> guidance & best practices available to show how to make http URLs fit 
> this particular bill, it is perfectly understandable why they fell 
> back on the URN specifications. In those there was clear guidance 
> about how to make persistent identifiers that would work. Given the 
> existence of other substantial persistent identifier efforts (e.g. ARK 
> and DOI), it seems to me they were not alone nor unreasonable in their 
> thinking.
>
> I was involved in the decision making surrounding the choice of the 
> dereferencing protocol for the LSID standard and know that it was 
> purposely based on as many preexisting standards as possible, 
> namely the DDDS RFCs (for URN dereferencing), DNS SRV records (for 
> service discovery) and SOAP/WSDL (for communicating metadata/data end 
> points) as well as the common web protocols for transport (http, ftp, 
> file://, SOAP) given that much data that required naming was already 
> accessible online. It was entirely deliberate that as much existing 
> precedent be used as possible and consequently little of the LSID 
> protocol specification is completely new invention.
>
> That said, I can also see the great benefits to allowing direct http 
> URL style access to information named using the LSID scheme, which is 
> why I personally would be much inclined to back the suggestion by 
> Henry Thompson that the LS community go the extra step in the standard 
> to establish the necessary mechanisms to link LSIDs to the web using 
> the pattern he suggested from the ARK group.
>
> From my point of view, we finally have an extremely hard won LSID 
> standard which is already being usefully put into practice by various 
> groups in the Life Sciences community. This is no small achievement. 
> It has a fairly clear social/technical contract, although I believe 
> there are improvements that can be made in the area of metadata. It 
> seems to me that it is best used for uniquely naming LS digital 
> objects - anything one can think of today as a file - but there are 
> also other valid uses. In addition it seems perfectly valid to use an LSID 
> as a URI in RDF because it is a URN and URNs are URIs. We intend to go 
> on doing so while we find it useful for our purposes. What would be 
> marvelous would be to start defining the scope of the metadata 
> returned so we can take the existing usefulness to the next level.
>
> I will look forward to talking with you next week. Have a great 
> weekend everyone.
>
> Kindest regards, Sean
>
> -- 
> Sean Martin
> IBM Corp
>
>
>
Received on Friday, 28 July 2006 18:25:26 UTC