RE: Handles and PURLs from Bass, Mick on 2003-05-22 (www-rdf-dspace@w3.org from May 2003)

From: Bass, Mick <mick.bass@hp.com>
Date: Thu, 22 May 2003 07:16:46 -0700
To: "Butler, Mark" <Mark_Butler@hplb.hpl.hp.com>, " (www-rdf-dspace@w3.org)" <www-rdf-dspace@w3.org>
Message-ID: <40700B4C02ABD5119F000090278766440656B704@hplex1.hpl.hp.com>
I think the topic is more general than Handles and PURLs: it is systems for naming and resolution.

In particular: are abstractions or services at a higher level than vanilla URIs (some subset of which are GETable URLs), and thus systems/standards/frameworks for providing them, useful or _required_? The question matters particularly once issues of persistence and preservation are brought into scope and need to be addressed given the real organizational, social, political, and economic constraints that Mark accurately points out.

> 5. Due to 3, URIs tend to mix identity and version (i.e. 
> date, time). There are some disadvantages...
...
> this may be easier than keeping track of a bunch of metadata 
> that says all these identifiers refer to versions of the same 
> resource.
> 
> 6. The concept behind PURLs and Handles is good, i.e. when a
> resource moves you don't need to worry about it.
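
Point 5 quoted above can be made concrete with a small sketch. It assumes a purely hypothetical convention in which the version is appended to the identity part after a `;v=` separator; neither the separator nor the example.org URIs come from this thread or from any standard. The only point is that, once the two axes are separable, "are these versions of the same thing?" can be answered from the identifiers alone.

```python
from dataclasses import dataclass

# Hypothetical separator between the identity part and the version part.
SEP = ";v="


@dataclass(frozen=True)
class VersionedId:
    identity: str  # which resource this is
    version: str   # which snapshot of it ("" if the URI carries no version)


def parse(uri: str) -> VersionedId:
    """Split a URI into identity and version at the first SEP, if present."""
    identity, _, version = uri.partition(SEP)
    return VersionedId(identity, version)


def same_resource(a: str, b: str) -> bool:
    """True when two URIs share an identity part, whatever their versions."""
    return parse(a).identity == parse(b).identity
```

With this convention, `same_resource("http://example.org/report;v=2003-05-21", "http://example.org/report;v=2003-05-22")` holds without consulting any extra metadata. URIs that mix identity and version in ad hoc ways offer no such shortcut.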

A little over a year ago, in the context of framing naming decisions for DSpace, I wrote a paper [1] that tries to tease apart the various functions that need to be covered by a naming/resolution system that must guarantee persistent global uniqueness of names and/or persistent resolution of the named resources.  Note that it does have a bibliography, which includes the Larry Stone paper as well as some other excellent work, in particular that on "ark:" URNs.

Now it is possible that this is a crap paper, so it may be useful if those on the list could comment on

1) does the described functional breakdown make sense?
2) how would it be covered by the semantic web architecture?
And maybe
3) how close/distant do you believe these issues are to the remit of SIMILE?

My perception is that the Handle System is attempting to address some shortcomings or difficulties in covering this functional space using URIs and GETable http: URLs exclusively.
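
As a rough illustration of the kind of indirection at issue (not of how the Handle System is actually implemented), the sketch below keeps naming and resolution apart: a name is bound once and stays fixed, while the location it resolves to can be updated. The `Resolver` class, the `hdl:0000/example-1` name, and the example.org URLs are all hypothetical.

```python
class Resolver:
    """Toy name-to-location table: names persist, locations may change."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, name: str, url: str) -> None:
        """Bind a new name; refuse to rebind so names stay globally unique."""
        if name in self._locations:
            raise ValueError(f"name already registered: {name}")
        self._locations[name] = url

    def move(self, name: str, new_url: str) -> None:
        """Record that the named resource now lives at new_url."""
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = new_url

    def resolve(self, name: str) -> str:
        """Return the current location for a persistent name."""
        return self._locations[name]


resolver = Resolver()
resolver.register("hdl:0000/example-1", "http://old.example.org/item/1")
resolver.move("hdl:0000/example-1", "http://new.example.org/archive/1")
current = resolver.resolve("hdl:0000/example-1")
```

Anything that cites `hdl:0000/example-1` keeps working after the move, because only the table entry changes. A bare GETable URL has no such table unless the publisher maintains redirects indefinitely.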

We can test this when we meet the CNRI folks a couple of weeks hence.
 

- Mick

[1] Note on DSpace and the Handle System
	http://web.mit.edu/simile/www/resources/naming-resolution/handles-naming/handles-naming.pdf
	http://web.mit.edu/simile/www/resources/naming-resolution/handles-naming/handles-naming.htm
    


> -----Original Message-----
> From: Butler, Mark [mailto:Mark_Butler@hplb.hpl.hp.com] 
> Sent: Thursday, May 22, 2003 6:39 AM
> To: (www-rdf-dspace@w3.org)
> Subject: Handles and PURLs
> 
> 
> 
> Hi team,
> 
> There has been a bit of discussion going on internally in HP 
> about whether to use Handles in the history system. I'm 
> hoping this discussion is going to be forwarded to this list, 
> because I'm sure this is something the rest of the team will 
> have an interest in. Also Eric Miller has stated that he 
> would prefer discussions are sent to an archived email list 
> and I think this is good advice.  
> 
> People may be familiar with it already, but I found "A 
> competitive evaluation of Handles and PURLs" by Larry Stone 
> useful: http://web.mit.edu/handle/www/purl-eval.html
> 
> In essence the argument has been about whether Handles, 
> because they do not support HTTP GET, are compliant with the 
> semantic web architecture. 
> 
> My position on this is that we really shouldn't worry about 
> "compliance" in this way. I have a nice quote about this in 
> my cube: "Dogmatic attachment to the supposed merits of a 
> particular structure hinders the search for an appropriate 
> structure". 
> 
> Also one thing I've been interested in for a while is whether 
> a fundamental rethink about the way we use URIs can enhance 
> the web architecture. A lot of the current discussions about 
> web architecture, and the semantic web for that matter, are 
> constrained by backward compatibility issues. However with 
> any IT system, we often have to make decisions about when it 
> is worth preserving the architecture we already have and when 
> we need to sacrifice backward compatibility in order to move 
> to a completely new architecture because it has compelling 
> advantages. I have a name for this - "Web Version 2.0" - and 
> a mission statement i.e. "We've got a bunch of technologies 
> that form the current web and we've learnt a lot creating 
> those technologies. If we could start again from a blank 
> sheet of paper, unconcerned about backward compatibility, 
> what would we do differently, what could we simplify, and 
> where would it take us". I think this is quite an interesting 
> thought experiment, and I note that conducting thought 
> experiments like this is a cornerstone of the extreme 
> programming methodology. 
> 
> It also seems to me that Handles are attempting to do 
> something like this, but we can easily postulate other 
> approaches. I think there are a number of issues here, and 
> there has been quite a bit of discussion about this, 
> particularly within the W3C-TAG, but I haven't seen a document 
> that gives an adequate summary of all the issues, so essentially:
> 
> 1. URLs are a form of URIs.
> 
> 2. URLs are used by people to locate things. Therefore they 
> should be optimized to be user friendly e.g.
>    http://www.hp.com/ is good
>    http://www.somenewssite.com/news/lots/of/directory/structure/?somequery=fred&anotherquery=flintstone is bad
> 
> 3. URIs are used to identify resources. Due to the "cool URIs 
> don't change" principle, once resources are created they are 
> immutable.
> 
> 4. There is a tension between 2 and 3. For example the 
> contents of a site may change, but I still want a 
> user-friendly short-cut to a site as well as a perma-link. It 
> feels like we need some level of dereferencing or indirection 
> here, i.e. typing in http://www.hp.com/ takes us to a 
> particular version of the HP website and the browser then 
> informs the user of a permalink which we can use to retrieve 
> that particular version in the future if we need. 
> 
> 5. Due to 3, URIs tend to mix identity and version (i.e. 
> date, time). There are some disadvantages to mixing these two 
> different axes, particularly as different URIs mix them in 
> different ways so they are not algorithmically separable. 
> Perhaps it might be useful to separate these axes, as then it 
> would be possible to determine from the URIs alone that two 
> resources are versions of the same thing. Now this is 
> controversial, as we've already discussed an opposing view 
> e.g. identifiers must be random. But from the CC/PP work, I'm 
> conscious things are much easier for processor developers as 
> this may be easier than keeping track of a bunch of metadata 
> that says all these identifiers refer to versions of the same 
> resource. For more details see 
> http://www.hpl.hp.com/techreports/2003/HPL-2003-31.html
> 
> 6. The concept behind PURLs and Handles is 
> good, i.e. when a resource moves you don't need to worry 
> about it. DNS already has a level of indirection built in, so 
> why not do this for retrievable resources? This is discussed 
> in the Stone paper cited above.
> 
> 7. Although the "cool URIs don't change" advice seems good, 
> as Cory Doctorow's Metacrap paper points out web techniques 
> have to exist in a world where people are subject to social, 
> political and economic pressures. Companies in particular 
> want to be able to control what information they disseminate 
> at a particular time, and they reserve the right to try to 
> remove or obscure information from the public domain, so it 
> is very rare to see companies follow the "cool URIs don't 
> change" advice. Therefore my position is I would like URIs to 
> give some indication about whether they refer to a 
> retrievable resource and if they are likely to be permanent 
> or not. This is similar to my position on the relationships 
> between namespaces and schemas or RDDL documents - I would 
> like them to indicate the same information. This information 
> allows processors or search engines to deal with these links 
> in a more intelligent way.
> 
> Comments?
> 
> br,
> 
> Dr Mark H. Butler
> Research Scientist                HP Labs Bristol
> mark-h_butler@hp.com
> Internet: http://www-uk.hpl.hp.com/people/marbut/
> 
> 
>
Received on Thursday, 22 May 2003 10:16:49 UTC