- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Thu, 22 May 2003 11:38:45 +0100
- To: " (www-rdf-dspace@w3.org)" <www-rdf-dspace@w3.org>
Hi team,

There has been a bit of discussion going on internally in HP about whether to use Handles in the history system. I'm hoping this discussion is going to be forwarded to this list, because I'm sure it is something the rest of the team will have an interest in. Also, Eric Miller has stated that he would prefer discussions to be sent to an archived email list, and I think this is good advice.

People may be familiar with it already, but I found "A competitive evaluation of Handles and PURLs" by Larry Stone useful: http://web.mit.edu/handle/www/purl-eval.html

In essence the argument has been about whether Handles, because they do not support HTTP GET, are compliant with the semantic web architecture. My position on this is that we really shouldn't worry about "compliance" in this way. I have a nice quote about this in my cube: "Dogmatic attachment to the supposed merits of a particular structure hinders the search for an appropriate structure."

Also, one thing I've been interested in for a while is whether a fundamental rethink about the way we use URIs could enhance the web architecture. A lot of the current discussions about web architecture, and the semantic web for that matter, are constrained by backward compatibility issues. However, with any IT system we often have to decide when it is worth preserving the architecture we already have and when we need to sacrifice backward compatibility in order to move to a completely new architecture because it has compelling advantages. I have a name for this - "Web Version 2.0" - and a mission statement: "We've got a bunch of technologies that form the current web and we've learnt a lot creating those technologies. If we could start again from a blank sheet of paper, unconcerned about backward compatibility, what would we do differently, what could we simplify, and where would it take us?" I think this is quite an interesting thought experiment, and I note that conducting thought experiments like this is a cornerstone of the extreme programming methodology. It also seems to me that Handles are attempting to do something like this, but we can easily postulate other approaches.

I think there are a number of issues here. There has been quite a bit of discussion about this, particularly within the W3C TAG, but I haven't seen a document that gives an adequate summary of all the issues, so essentially:

1. URLs are a form of URI.

2. URLs are used by people to locate things. Therefore they should be optimized to be user friendly, e.g.

   http://www.hp.com/ is good

   http://www.somenewssite.com/news/lots/of/directory/structure/?somequery=fred&anotherquery=flintstone is bad

3. URIs are used to identify resources. Due to the "cool URIs don't change" principle, once resources are created they are immutable.

4. There is a tension between 2 and 3. For example, the contents of a site may change, but I still want a user-friendly short-cut to the site as well as a perma-link. It feels like we need some level of dereferencing or indirection here, i.e. typing in http://www.hp.com/ takes us to a particular version of the HP website, and the browser then informs the user of a permalink which we can use to retrieve that particular version in the future if we need to. (See the first sketch after point 6 below.)

5. Due to 3, URIs tend to mix identity and version (i.e. date, time). There are some disadvantages to mixing these two different axes, particularly as different URIs mix them in different ways, so they are not algorithmically separable. Perhaps it might be useful to separate these axes, as then it would be possible to determine from the URIs alone that two resources are versions of the same thing. (See the second sketch below.) Now this is controversial, as we've already discussed an opposing view, e.g. that identifiers must be random. But from the CC/PP work, I'm conscious that this would make things much easier for processor developers than keeping track of a bunch of metadata that says all these identifiers refer to versions of the same resource. For more details see http://www.hpl.hp.com/techreports/2003/HPL-2003-31.html

6. The concept behind PURLs and Handles is good, i.e. when a resource moves you don't need to worry about it. DNS already has a level of indirection built in, so why not do this for retrievable resources? This is discussed in the Stone paper cited above. (See the third sketch below.)
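To make the indirection in point 4 concrete, here is a minimal sketch in Python. Everything in it is invented for illustration (the paths, the dates, the idea of keeping snapshots in a dictionary); the one real piece of machinery is the standard HTTP Content-Location header, which a server can use to tell the client the specific URI of the entity it just returned:

    # Sketch of point 4: a friendly URL serves the current version of a
    # page, and the response advertises an immutable permalink for that
    # version via Content-Location. All paths/dates are hypothetical.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical version store: permalink path -> snapshot content.
    VERSIONS = {
        "/2003-05-22/index.html": b"<html>front page, 22 May 2003</html>",
        "/2003-05-21/index.html": b"<html>front page, 21 May 2003</html>",
    }
    LATEST = "/2003-05-22/index.html"

    class PermalinkHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/":
                # Friendly URL: serve today's version, but tell the
                # client the permalink so it can cite this version later.
                body = VERSIONS[LATEST]
                self.send_response(200)
                self.send_header("Content-Location", LATEST)
            elif self.path in VERSIONS:
                # Permalink: always returns the same bytes.
                body = VERSIONS[self.path]
                self.send_response(200)
            else:
                body = b"not found"
                self.send_response(404)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), PermalinkHandler).serve_forever()

A request for / gets the current content plus the permalink; a request for the permalink itself always retrieves that particular version.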
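For point 5, here is a small sketch of what "algorithmically separable" could mean. The ";version=" convention is entirely hypothetical - the point is only that once identity and version occupy fixed positions in the URI, "are these two URIs versions of the same thing?" becomes a string operation rather than a metadata lookup:

    # Sketch of point 5, assuming a made-up URI convention of the form
    #   http://example.org/doc/annual-report;version=2003-05-22
    # where identity and version sit in fixed, separable positions.

    def split_axes(uri):
        """Split a URI into (identity, version) under the assumed
        ';version=' convention; version is None if absent."""
        if ";version=" in uri:
            identity, version = uri.split(";version=", 1)
            return identity, version
        return uri, None

    def same_resource(uri_a, uri_b):
        """True if the URIs identify versions of the same resource."""
        return split_axes(uri_a)[0] == split_axes(uri_b)[0]

    a = "http://example.org/doc/annual-report;version=2003-05-21"
    b = "http://example.org/doc/annual-report;version=2003-05-22"
    print(same_resource(a, b))  # True: same identity, different versions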
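And for point 6, a sketch of the indirection that PURLs provide: a resolver maps persistent identifiers to current locations and answers with an HTTP redirect, so when a resource moves, only the table needs updating. The identifiers and locations here are made up; a real PURL server is far more involved, and Handles resolve through their own protocol rather than HTTP GET, which is where this thread started:

    # Sketch of point 6: PURL-style resolution by HTTP redirect.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Persistent identifier -> wherever the resource currently lives.
    # When a resource moves, only this table changes; citations of the
    # persistent identifier keep working.
    LOCATIONS = {
        "/id/1721.1/1234": "http://repository.example.org/items/1234",
    }

    class ResolverHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            target = LOCATIONS.get(self.path)
            if target:
                self.send_response(302)  # redirect to current location
                self.send_header("Location", target)
            else:
                self.send_response(404)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("localhost", 8001), ResolverHandler).serve_forever()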
7. Although the "cool URIs don't change" advice seems good, as Cory Doctorow's Metacrap paper points out, web techniques have to exist in a world where people are subject to social, political and economic pressures. Companies in particular want to be able to control what information they disseminate at a particular time, and they reserve the right to try to remove or obscure information from the public domain, so it is very rare to see companies follow the "cool URIs don't change" advice.

Therefore my position is that I would like URIs to give some indication of whether they refer to a retrievable resource and whether they are likely to be permanent or not. This is similar to my position on the relationship between namespaces and schemas or RDDL documents - I would like them to indicate the same information. This information would allow processors or search engines to deal with these links in a more intelligent way.

Comments?

br,

Dr Mark H. Butler
Research Scientist
HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
Received on Thursday, 22 May 2003 06:39:05 UTC