Re: identifier to use from Hilmar Lapp on 2007-08-21 (public-semweb-lifesci@w3.org from August 2007)

From: Hilmar Lapp <hlapp@duke.edu>
Date: Tue, 21 Aug 2007 12:17:12 -0400
To: Eric Jain <Eric.Jain@isb-sib.ch>
Cc: public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>
Message-Id: <CBF2A5B9-F9F5-4F11-8434-D1D24B874BCF@duke.edu>
Hi - I'm new to this whole discussion, and I'm relatively naive so  
please forgive me if what I'll say sounds terribly stupid. I also  
apologize in advance if I seem to be reiterating points that have  
long been settled - again I've just started to watch this forum  
recently.

On Aug 21, 2007, at 5:12 AM, Eric Jain wrote:

> 3. If you do want to dereference, and do so with a generic tool  
> that wasn't specially written to handle life sciences data (most  
> won't), you are likely to be out of luck if you encounter some  
> domain-specific resolution system.

It seems to me that domain-specific resolution systems are rather a  
fact and we deal with them all the time.

For example, articles are referenced by DOI, entries in most  
institutional repositories are referenced by Handles, and GenBank  
sequences are referenced by a GI number. Any generic tool that wants  
to deal with statements made about or to articles (presumably almost  
all will want to) will need to know how to dereference a DOI.  
Alternatively, for the time being we can prefix the DOI with  
http://dx.doi.org/ and have a dereferenceable HTTP URI.
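
To make the prefixing idea concrete, here is a minimal sketch in Python. It is an illustration only: the function name doi_to_http_uri, the handling of a leading "doi:" label, and the DOI in the usage line are my own assumptions and not real data; the only thing taken from the above is the http://dx.doi.org/ prefix.

```python
from urllib.parse import quote

# Resolver prefix as mentioned in the message.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_http_uri(doi: str) -> str:
    """Turn a bare DOI into an HTTP URI by prefixing the resolver (hypothetical helper)."""
    doi = doi.strip()
    # Assumption: tolerate the common "doi:" label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Very loose sanity check: DOIs start with "10." and have a prefix/suffix slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep "/" as is; percent-encode characters that are unsafe in a URL path.
    return DOI_RESOLVER + quote(doi, safe="/")


# Placeholder DOI, made up for illustration.
print(doi_to_http_uri("doi:10.1234/example.5678"))
```

A generic tool only needs to follow the resulting HTTP URI; it does not have to know anything about the DOI system itself, which is the point of the workaround.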

I'm not sure why we can't apply the same principle to LSIDs. The life  
science field isn't necessarily a small one, and it seems like a  
small price to pay for a tool creator to implement a single  
resolution system to resolve any life science identifier. Is this  
being naive?

>
> If the W3C can encourage life science databases to provide stable  
> URLs (which is simple enough that it shouldn't be a technical  
> problem for any of them, don't even need to buy into any of the  
> semantic web stuff to see that this is useful), this would already  
> make the world a better place (TM).

I may be missing something but I thought one of the main reasons we  
are in this identifiability mess in the life sciences is because  
ostensibly stable URLs in reality aren't stable.

There seems to be a notion that all "life science databases" will be  
there in perpetuity, but in reality there are plenty of examples of  
databases that lost funding and went "out of business", with PIR or  
BIND being some of the better known ones. I'm not quite following why  
after all these years of discussion the validity of URIs should again  
be subject to the vagaries of funding, or the business acumen of  
commercial enterprises.

Domain names are quickly bought, used, and sold to someone else, and  
this is not just theoretical. The proposed "ease" with which HTTP  
URIs can be stably maintained first of all is clearly contradicted by  
the empirical evidence that it's not happening right now (why would a  
W3C recommendation change that? That we want stable HTTP URIs can't  
be new to anyone), and second requires continued ownership of the  
domain name. This seems like a trivial issue but in reality it's not  
once funding is cut off.

For example, the journal Phyloinformatics was discontinued recently and  
the domain name phyloinformatics.org is now for sale. If they had  
used HTTP URIs using their domain name, the next owner of the domain  
would probably choose not to maintain any of those, or worse,  
reassign them to something else.

What am I missing?

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================
Received on Wednesday, 22 August 2007 03:58:35 UTC