Re: [BioRDF] All about the LSID URI/URN from Dan Connolly on 2006-07-07 (public-semweb-lifesci@w3.org from July 2006)

From: Dan Connolly <connolly@w3.org>
Date: Fri, 07 Jul 2006 07:57:33 -0500
To: public-semweb-lifesci@w3.org
Cc: "Henry S. Thompson" <ht@inf.ed.ac.uk>
Message-Id: <1152277053.1191.248.camel@dirk.w3.org>

http://lists.w3.org/Archives/Public/public-semweb-lifesci/2006Jun/0210.html

> The root of the problem is that the URL 
> contains in it more than just a name. It also contains the network 
> location where the only copy of the named object can be found (this is the 
> hostname or ip address) 

Which URL is that? It's not true of all URLs. Take, for example,
  http://www.w3.org/TR/2006/WD-wsdl20-rdf-20060518/

That URL does not contain the network location where the only
copy can be found; there are several copies on mirrors around the
globe.

$ host www.w3.org
www.w3.org has address 128.30.52.46
www.w3.org has address 193.51.208.69
www.w3.org has address 193.51.208.70
www.w3.org has address 128.30.52.31
www.w3.org has address 128.30.52.45
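
The same check can be scripted. Below is a minimal Python sketch, assuming only the standard `socket` module; `addresses_for` is a hypothetical helper name, and the `resolver` parameter is an illustrative addition so the function can be exercised without network access. It lists the distinct IPv4 addresses behind one hostname, which is the point above: one name in the URL, several network locations.

```python
import socket

def addresses_for(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so a
    stand-in can be supplied when no network is available (an assumption of
    this sketch, not something from the original message).
    """
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})
```

Calling `addresses_for("www.w3.org")` on a networked machine should print a list comparable to the `host` output above, though the actual addresses will differ over time.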


FYI, the TAG is working on a finding on URNs, Namespaces, and Registries;
the current draft has a brief treatment of this issue of location (in)dependence...
http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html#loc_independent


> as well as the only means by which one may 
> retrieve it (the protocol, usually http, https or ftp). The first question 
> to ask yourself here is that when you are uniquely naming (in all of space 
> and time!) a file/digital object which will be usefully copied far and 
> wide, does it make sense to include as an integral part of that name the 
> only protocol by which it can ever be accessed and the only place where 
> one can find that copy?

If a better protocol comes along, odds are good that it will be usable
with names starting with http: .

See section 2.3 Protocol Independence
http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html#protocol_independent


> Unfortunately when it 
> comes to URL's there is no way to know that what is served one day will be 
> served out the next simply by looking at the URL string. There is no 
> social convention or technical contract to support the behavior that would 
> be required.

Again, that's not true for all URLs. There are social and technical
means to establish that

  http://www.w3.org/TR/2006/WD-wsdl20-rdf-20060518/

can be cached for a long time.

The social mechanism includes published policies such as...

"As of this note, persistent resources include:
     1. ...
     2. Those which start "http://www.w3.org/TR/" immediately followed
        by four decimal digits."
 --- http://www.w3.org/Consortium/Persistence

and the technical mechanisms include HTTP caching headers:
  Expires: Sat, 07 Jul 2007 12:51:56 GMT

  (RFC 2616 says servers should not send an Expires date more than
  one year in the future, so one year is the practical maximum)
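
To make the two mechanisms concrete, here is a minimal Python sketch using only the standard library. The function names `is_persistent_tr` and `expires_header` are hypothetical, and the regular expression encodes only item 2 of the policy as quoted above (the elided item 1 is not modelled). The first function tests whether a URL falls under that rule; the second formats an Expires header one year ahead of a given time.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Item 2 of the quoted policy: "http://www.w3.org/TR/" immediately
# followed by four decimal digits.
PERSISTENT_TR = re.compile(r"^http://www\.w3\.org/TR/[0-9]{4}")

def is_persistent_tr(url):
    """True if the URL matches the quoted persistence rule (item 2 only)."""
    return PERSISTENT_TR.match(url) is not None

def expires_header(now, lifetime=timedelta(days=365)):
    """Format an Expires header `lifetime` after `now` (a UTC-aware datetime).

    The 365-day default reflects the one-year guidance in RFC 2616.
    """
    return "Expires: " + format_datetime(now + lifetime, usegmt=True)
```

For example, `is_persistent_tr("http://www.w3.org/TR/2006/WD-wsdl20-rdf-20060518/")` is true, and `expires_header(datetime(2006, 7, 7, 12, 51, 56, tzinfo=timezone.utc))` gives the header shown above.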

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

Received on Friday, 7 July 2006 12:57:47 UTC