Re: [BioRDF] All about the LSID URI/URN from Henry S. Thompson on 2006-07-26 (public-semweb-lifesci@w3.org from July 2006)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Wed, 26 Jul 2006 14:44:58 +0100
To: Sean Martin <sjmm@us.ibm.com>
Cc: public-semweb-lifesci@w3.org, noah_mendelsohn@us.ibm.com
Message-ID: <f5b8xmgbhut.fsf@erasmus.inf.ed.ac.uk>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean Martin writes:

> Apologies if at any point I appear to be shifting the goal posts on you, 
> but the LSID scheme seems to have been developed in response to a 
> significant list of requirements gathered from a large number of 
> stakeholders across the Life Science industry, so there is a fair amount of 
> thinking behind what it has to do which may not be immediately apparent.

Understood.  As I hope my message earlier today makes clear, this is
an information-gathering exercise above all, so more information is
always good.

>> So, register one of lsids.org, lsids.net, lsids.name or lsids.info,
>> and use e.g. http://lsids.org/xxx instead of URN:LSID:xxx.  Bingo -- no
>> new tools required, works in all modern browsers :-).  Implement as
>> much or as little redirection, caching etc. as you wish in the server
>> you run at lsids.info:80, just as you would using DDDS.
>
> The problem with this approach is that individual Life Science 
> organizations want to create their own LSIDs for private and sometimes 
> public consumption as well as consume those provided publicly or privately 
> by their partners, colleagues and government. Organizations may easily be 
> responsible for creating many hundreds of thousands of identifier name 
> digital objects (gene sequences, image/x-ray/cat/mri scans, spreadsheets, 
> intermediate 'in-silico' experiment processing results, 'in-silico' 
> experimental provenance etc). Sometimes they will do this offline or 
> behind corporate firewalls. Many are likely to be extremely unwilling to 
> use a centralized approach to naming and resolution for privacy as well as 
> scaling & availability concerns.

Not a problem, I don't think.  You sub-bind lso1.lsids.org to Life
Sciences Org. 1's IP address, lso2.lsids.org to Life Sciences
Org. 2's, etc.  They owe you obedience to the rules in return for that
binding, pretty much parallel to the way you must do things with LSIDs.
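
To make the sub-binding idea concrete, here is a rough sketch of how an
LSID of the usual form urn:lsid:authority:namespace:object[:revision]
might be rewritten as an http URI under a delegated subdomain.  It is
an illustration only: the registry contents, the example.org authority
names and the path layout are all assumptions, not anything that
lsids.org actually runs.

```python
from urllib.parse import quote

# Hypothetical registry kept by whoever runs lsids.org: each LSID naming
# authority is bound to one subdomain label (these entries are made up).
SUBDOMAINS = {
    "lso1.example.org": "lso1",
    "lso2.example.org": "lso2",
}


def lsid_to_http(lsid: str, domain: str = "lsids.org") -> str:
    """Rewrite urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    as an http URI on the authority's delegated subdomain."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    if any(not p for p in parts[2:]):
        raise ValueError(f"empty component in LSID: {lsid!r}")
    authority = parts[2].lower()
    if authority not in SUBDOMAINS:
        raise LookupError(f"no subdomain bound for authority {authority!r}")
    # Namespace, object and optional revision become path segments.
    path = "/".join(quote(p, safe="") for p in parts[3:])
    return f"http://{SUBDOMAINS[authority]}.{domain}/{path}"


print(lsid_to_http("urn:lsid:lso1.example.org:genes:ABC123:2"))
# prints http://lso1.lsids.org/genes/ABC123/2
```

Each organisation still mints and serves its own names behind its own
subdomain; only the binding of the subdomain itself is central, which
is the same position the authority part of an LSID is in today.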

>> > LSIDs are independent of any particular transport protocol and
>> > indeed already make use of any of the commonly used ones
>> > simultaneously (ftp, http, SOAP, file:// etc). The thing to remember
>> > here is that we are not thinking about URIs in the abstract here,
>> > but rather a 'living, breathing system' intended for naming digital
>> > objects that will be copied/archived far and wide. It was deemed
>> > important to support as many mechanisms as possible (including
>> > future ones) to support that copying/archiving process without
>> > losing track of the unique name.
>> 
>> So all LSID clients have to support all those protocols?  Doesn't
>> sound like a likely route to wide deployment. . .  Or are you proxying
>> all requests through a few central servers, who choose what protocols
>> to use for the initial fetch?  If so, no problem doing that with
>> 'http'-scheme URIs either. . .
>
> Both of these approaches are in practice today. The two major browsers 
> have plug-ins that allow them to directly resolve LSIDs and there is also 
> software available that allows anyone to quickly implement their own web 
> gateway like the one at http://lsid.biopathways.org/resolver/ 
>
> However another and perhaps more common use at least in the early adopter 
> community is to access LSID dereferenced information programmatically. 
> More likely than not, there is nothing much useful to actually 
> see/manipulate in a web browser when looking at the results of an LSID 
> dereference - at least not for a human! There are at least three or four 
> software language library client stacks freely available that anyone can 
> use in their programs for automation of processes that include data with 
> LSID naming. When you get down to it, it does not take much to get client 
> library coverage across the languages that are commonly used to program 
> Life Sciences applications, especially since the base libraries (DNS, WWW, 
> and Web Services) are already ubiquitous. 

Fair enough, but note that essentially _all_ language libraries
support http URIs already.  To pick an example, suppose someone wants
to use LSIDs in a Web 2.0 app built using the flavour of the month,
Ruby on Rails -- http support is a foregone conclusion there, but as
far as a quick check can tell me, no LSID support exists.
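
By way of illustration, here is a sketch (in Python rather than Ruby,
and using nothing beyond the standard library) of what dereferencing
an http-scheme name involves for a client.  The little local server
and the /genes/ABC123 path are stand-ins I have invented so the
example runs without touching the network; a real authority would
serve real metadata.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class MetadataHandler(BaseHTTPRequestHandler):
    """Stand-in for an authority's server: echoes the requested path."""

    def do_GET(self):
        body = f"metadata for {self.path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(url: str) -> str:
    """Dereference an http URI with the standard library only."""
    # An empty ProxyHandler ignores proxy environment settings, so the
    # request to the local stand-in server really stays local.
    opener = build_opener(ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.read().decode("utf-8")


def demo() -> str:
    server = HTTPServer(("127.0.0.1", 0), MetadataHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch(f"http://127.0.0.1:{server.server_port}/genes/ABC123")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo(), end="")
```

The client half is the four lines in fetch(); everything else is
scaffolding for the demonstration.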

Furthermore, many large companies with tight central control over
employee desktops have very rigid policies which effectively rule out
plugins altogether. . .

As Noah said, it's hard to measure, but likewise hard to exaggerate,
the value for any Web-orientated technology of exploiting the
installed base.

ht
- -- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (GNU/Linux)

iD8DBQFEx3HakjnJixAXWBoRAjUKAJ9/FtUIHi3yibwJ/DNGNu5iduyiQgCggJvX
FDVjmgx69qE/+gKBZDmfEIU=
=mZ5r
-----END PGP SIGNATURE-----
Received on Wednesday, 26 July 2006 13:45:16 UTC