Re: [BioRDF] All about the LSID URI/URN from Sean Martin on 2006-07-25 (public-semweb-lifesci@w3.org from July 2006)

From: Sean Martin <sjmm@us.ibm.com>
Date: Tue, 25 Jul 2006 10:21:25 -0400
To: ht@inf.ed.ac.uk (Henry S. Thompson), public-semweb-lifesci@w3.org
Message-ID: <OF5A98E7E9.96B1891E-ON852571B6.004C694A-852571B6.004EF920@us.ibm.com>
> Well, either your scheme is intended to be dereferenceable, or it
> isn't.
> 
>  If it is, then instances are likely/virtually certain to contain some
>  kind of named starting point, which needs to be looked up and
>  resolved to an IP address to start the dereferencing process.  Domain
>  names and DNS are by far the best available implementation of this
>  step, with excellent performance, widespread deployment and
>  considerable flexibility.

As it is a URN, the starting point for dereferencing is urn.arpa. The 
specification [1] details the use of the DDDS system (RFCs 3401-3405), which 
uses the existing DNS system (for the very reasons you detail) but 
maintains a level of abstraction between the authority name in the 
identifier and the data service location that can provide a copy of what 
was named, as is proper for URNs.
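
To make that indirection concrete, here is a minimal Python sketch, not 
the specification's reference code. It assumes the usual LSID layout 
(urn:lsid:authority:namespace:object[:revision]) and assumes that a client 
finds the authority's resolution service through a DNS SRV query under 
_lsid._tcp. The helper names, the sample LSID and example.org are all 
hypothetical, and no network lookup is performed.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID into its parts; raise ValueError if it is malformed."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty LSID component in {text!r}")
    # The revision is optional; treat a missing or empty one as None.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(authority, namespace, object_id, revision)


def srv_query_name(lsid: Lsid) -> str:
    """DNS name a client might query to locate the resolution service.

    Assumption: the service is advertised as an SRV record under
    _lsid._tcp.<authority>. The SRV answer, not the authority string in
    the name, says which host and port currently serve the data.
    """
    return f"_lsid._tcp.{lsid.authority.lower()}"


lsid = parse_lsid("urn:lsid:example.org:genes:12345:2")
print(lsid)
print(srv_query_name(lsid))
```

The point of the sketch is that the authority in the name is only a 
lookup key: the host that answers can change without the name changing.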

> >> as well as the only means by which one may 
> >> retrieve it (the protocol, usually http, https or ftp).
> 
> Not so.  The URI RFC [1] makes clear that it is up to protocols to
> specify what URIs they interpret and how, not the other way around.
> It is entirely reasonable, and indeed expected, that new protocols may
> specify interpretations of 'old' URI schemes, including 'http'.
> 
> >> The first question to ask yourself here is that when you are
> >> uniquely naming (in all of space and time!) a file/digital object
> >> which will be usefully copied far and wide, does it make sense to
> >> include as an integral part of that name the only protocol by which
> >> it can ever be accessed and the only place where one can find that
> >> copy?
> 
> I hope the above clarifies that this is not the case for names using the
> 'http' scheme.  Indeed they are much more likely to do so for 'http'
> than for almost any other scheme.

Assuming that a new http protocol replaces the existing one, how does this 
change things? Surely the name is still tied to a single protocol (HTTP) 
even if the underlying implementation of that protocol has changed? LSIDs 
are independent of any particular transport protocol and indeed already 
make use of any of the commonly used ones simultaneously (ftp, http, SOAP, 
file://, etc.). The thing to remember is that we are not thinking about 
URIs in the abstract, but rather about a 'living, breathing system' 
intended for naming digital objects that will be copied/archived far and 
wide. It was deemed important to support as many mechanisms as possible 
(including future ones) to support that copying/archiving process without 
losing track of the unique name.

> 
> >> Unfortunately when it comes to URLs there is no way to know that
> >> what is served one day will be served out the next simply by
> >> looking at the URL string. There is no social convention or
> >> technical contract to support the behavior that would be required.

> 
> True for some 'http' URIs, false for others.  The owners of a group of
> names, whether they use 'http' or not, are responsible for
> documenting, implementing and enforcing usage conventions.  I
> absolutely agree that for your purposes you need to take this very
> seriously, but using 'http' doesn't make this any harder (or, of
> course, any easier).
> 

I am not sure that I can agree with you on this point. How does one go 
about differentiating programmatically between one http:// URI and 
another in order to know which conventions apply to it, as opposed to 
using a scheme that has only one established convention?
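
As a rough illustration of the question, consider what a program can 
decide from the identifier string alone. The following Python sketch is 
only that, a sketch: the function name and the sample identifiers are 
hypothetical, and it assumes that "urn:lsid:" is the one prefix that 
signals the LSID conventions.

```python
from typing import Optional


def convention_from_string(identifier: str) -> Optional[str]:
    """Return the naming convention that the string itself announces.

    Returns "lsid" when the identifier starts with urn:lsid:, and None
    when nothing in the string says which conventions its owner follows.
    """
    if identifier.lower().startswith("urn:lsid:"):
        return "lsid"
    # An http:// URI may or may not follow a persistence convention;
    # that is documented out of band, so the string cannot tell us.
    return None


print(convention_from_string("urn:lsid:example.org:genes:12345"))
print(convention_from_string("http://example.org/genes/12345"))
```

For the http:// case the program would need some external registry or 
documentation to learn the owner's conventions.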


Kindest regards, Sean

[1] http://www.omg.org/cgi-bin/doc?dtc/04-05-01

--
Sean Martin
IBM Corp.
Received on Tuesday, 25 July 2006 15:04:11 UTC