RE: BioRDF: URI Best Practices from Sean Martin on 2006-07-21 (public-semweb-lifesci@w3.org from July 2006)

From: Sean Martin <sjmm@us.ibm.com>
Date: Fri, 21 Jul 2006 14:53:57 -0400
To: public-semweb-lifesci@w3.org
Message-ID: <OF59D11214.2853C810-ON852571B2.005EF30C-852571B2.0067D12F@us.ibm.com>
> SM> as names for things that have a digital existence. The issues 
> SM> of broken links is a difficult one because once the primary 
> SM> source at a particular location disappears you have nothing 
> SM> left to go on to find a copy of the thing named besides what 
> SM> you can find in the WayBack machine or perhaps a Google 
> SM> cache. 
> 
XW> Should a LSID resolver decide not to resolve a particular LSID,
XW> wouldn't it be the same effect as a broken link?

Not really, for a number of reasons. The first is that you may well already 
have it stored somewhere accessible if it has ever been seen by you or 
anyone in your organization before, since even the first version of the 
resolution software supports local archiving/caching. If you don't find it 
at home, you or your machine can ask whether one of your friends or 
colleagues has a copy, or whether some other third party does. You have a 
unique name and a means of asking them, either formally (protocol) or 
informally (email).
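
As a rough illustration of that lookup order, here is a minimal Python 
sketch. It is not the LSID resolution software or protocol: the resolve() 
helper, the dictionaries standing in for archives, and the example.org 
LSID are all hypothetical.

```python
from typing import Callable, Iterable, Optional

# A source is any callable that takes an LSID and returns its bytes, or None.
Source = Callable[[str], Optional[bytes]]


def resolve(lsid: str, sources: Iterable[Source]) -> Optional[bytes]:
    """Try each source in order and return the first copy found."""
    for source in sources:
        data = source(lsid)
        if data is not None:
            return data
    return None


# Hypothetical stand-ins: an empty local archive, a colleague's cache that
# holds a copy, and an issuing authority that declines to resolve the name.
local_archive = {}
colleague_cache = {"urn:lsid:example.org:proteins:P12345:1": b"MKT..."}

sources = [
    local_archive.get,
    colleague_cache.get,
    lambda lsid: None,  # the authority no longer resolves this LSID
]

data = resolve("urn:lsid:example.org:proteins:P12345:1", sources)
```

Here the authority's refusal does not matter, because the colleague's copy 
is found first under the same unchanged name.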

Because the "contract" understood by those issuing/accessing LSIDs is that 
data once named by them may never change, the bits retrieved are 
archive-able for eternity. And given that LSIDs also provide a means of 
versioning each revision, you can be sure that you are always talking 
about the exact version you need and no other. 
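
To make the versioning point concrete, here is a small Python sketch that 
splits an LSID into its parts, following the 
urn:lsid:authority:namespace:object[:revision] form. The parse_lsid() 
helper and the example.org names are hypothetical, and the check is 
deliberately simpler than full validation against the spec.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]  # None when the LSID carries no revision


def parse_lsid(text: str) -> LSID:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if (
        len(parts) not in (5, 6)
        or parts[0].lower() != "urn"
        or parts[1].lower() != "lsid"
    ):
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    revision = parts[5] if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, revision)


# Two revisions of the same (hypothetical) object are two distinct names.
v1 = parse_lsid("urn:lsid:example.org:proteins:P12345:1")
v2 = parse_lsid("urn:lsid:example.org:proteins:P12345:2")
```

The two names share an object identifier but differ in revision, so a 
reference to one can never silently come to mean the other.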

Contrast this with the current efforts for archiving URLs over time at the 
WayBack Machine [1] and ask yourself if this is really going to be a 
sufficient mechanism. How would you go about asking for a particular 
version of something named by a URL? URL links are a problem not only 
because they break, which is bad, but worse, because there is no way to 
tell whether they continue to name the same thing.

Secondly, in the LSID scheme you have a significant level of indirection 
between the name and the data services. This means that the data provider 
has much flexibility to change both who provides the service (you do not 
even have to control DNS for that domain) and what means (protocols) they 
use to provide the service. This makes it easier to continue to provide 
service.
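
The effect of that indirection can be sketched in a few lines of Python. 
This is only a toy model: the services table, the Endpoint type and the 
endpoints_for() helper are hypothetical, and the real discovery mechanism 
is the one defined in the LSID spec, not a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Endpoint:
    protocol: str  # e.g. "http" or "ftp"
    location: str


# Hypothetical lookup table: authority name -> current data services.
services: Dict[str, List[Endpoint]] = {
    "example.org": [Endpoint("http", "http://data.example.org/lsid")],
}


def endpoints_for(lsid: str) -> List[Endpoint]:
    """Return whatever services currently serve the LSID's authority."""
    authority = lsid.split(":")[2]
    return services.get(authority, [])


name = "urn:lsid:example.org:proteins:P12345:1"
before = endpoints_for(name)

# The provider hands hosting to a third party and switches protocol.
# Only the table changes; the name itself is untouched.
services["example.org"] = [Endpoint("ftp", "ftp://mirror.example.net/lsid")]
after = endpoints_for(name)
```

The same name is answered by a different host over a different protocol, 
and nothing that refers to the name has to be rewritten.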

> So, this is again more of an
> implementation issue than naming issue.
>

Perhaps it is, but here we are in mid-implementation and actually need 
something right now with properties similar to those provided in the LSID 
spec. URLs as URIs only cut it for some things we are doing (and we do use 
them for those), but for naming objects we use LSIDs. 

>
> Web has many broken links, because they are insignificant to our life and
> let to die.  No one can say that our library would have chronicled every
> facets of human life even if it is possible.  But the important ones will be
> preserved.  So, if a HTTP URI is important enough, it will survive and
> persist.  If not, it will perish or be used to represent something else.
> Same goes to LSID.  The key is in our effort but not the name.
> 

Actually, LSIDs should never be allowed to represent something else... that 
is the point: have the right tools for the job, as opposed to everything 
looking a bit like a nail if you have a hammer ;-)

Kindest regards, Sean

-- 
Sean Martin
IBM Corp.

[1] http://web.archive.org/web/*/http://www.i3c.org
Received on Friday, 21 July 2006 18:54:19 UTC