- From: Aaron Swartz <aswartz@swartzfam.com>
- Date: Wed, 11 Apr 2001 18:59:46 -0500
- To: Lee Jonas <ljonas@acm.org>, RDF Interest <www-rdf-interest@w3.org>
Lee Jonas <ljonas@acm.org> wrote:

> The notion of URLs identifying representations seems a little trite to me.
> It indicates the nature of the true problem, without fully addressing it: a
> resource at the end of a location is not consistent over time.
>
> This is for at least two good reasons: resources evolve, and resources move
> / disappear / or worse, a second resource ousts the first at a particular
> location.

Resources do not evolve -- their representations (or network entities) do.

> The first issue could have been addressed more formally (and hence
> consistently) with a simple versioning scheme.

What about ETags?

> This would have alleviated
> the problem of instantly breaking third party links (or invalidating
> metadata semantics) when you change a resource. Yes your links must change
> to reflect new versions of things you reference, but these changes could be
> a graceful migration, not an abrupt crash.

How do versions fix changes in resources? It seems they just break things
for the 94% (as previously cited) of links that actually work correctly.

> The second is the main bugbear of using a resource's location to identify
> it. This phenomenon is well known in distributed object technology.
> Superior solutions leave the actual resolution of an object's location to
> some distributed service when a client wants to interact with it.

Again, URLs don't have to be used this way, but we do. You can try and redo
URLs (which are widely-used, implemented, understood, etc.) or you can fix
the other parts of the system (some of which can probably be upgraded with
little headache). In fact there are projects which are trying to do that:
World Free Web, PURL, Alexa/Internet Archive, Google caches, etc.

http://wfw.sourceforge.net/
http://purl.org/
http://archive.org/
http://www.alexa.com/company/technology.html
http://www.google.com/help/features.html#cached

I'll keep track of others at:
http://logicerror.com/alternateURLResolution

> These are compounded with the fact that the resource can be one of many
> formats and there is no clear way to distinguish them from the URL iself. A
> resource such as http://mydomain/mypic.png may safely be assumed to be a png
> graphic, but what about the resource at the end of http://mydomain/mydir/ ?

Resources don't usually have formats. That's why there's content
negotiation.

> Mime types have become pervasive for identifying a resource's type, yet URLs
> predate MIME by years. If you want to know its type you have to make a
> request to some server process.

Is that true? Hmm, looks like it...

> 1) It may become more common to reason about abstract resources whose
> identifiers may not be readily representable as a location. It would be
> better to identify these with a URN. Hence URNs may be more widely used
> than at present.

Why should things without a "location" use a URN? They can still be
described can't they. Folks! Just because it's a URN doesn't mean it's
anything special. It still represents a resource, even if it's in the
Fooawackyak scheme.

> 3) Data quality will be poorer if it is hard for software to detect a
> resource change. Transience is bad news if you are going to store facts
> about something that subsequently changes.

Yes, RDF does not deal with time very well, but this is, IMO, an RDF problem
not a URI one.

> What the solution to all this is I don't know. I just can't help feeling
> that as the semantic web progresses things are about to get a lot more
> complicated unless these issues are addressed.

The Semantic Web is going to be complicated no matter what we do. ;-)

--
[ Aaron Swartz | me@aaronsw.com | http://www.aaronsw.com ]
Received on Wednesday, 11 April 2001 19:59:59 UTC