- From: Lee Jonas <lee.jonas@cakehouse.co.uk>
- Date: Thu, 12 Apr 2001 14:07:37 +0100
- To: "'Aaron Swartz'" <aswartz@swartzfam.com>, "@homeLee" <ljonas@acm.org>, RDF Interest <www-rdf-interest@w3.org>
Aaron Swartz [mailto:aswartz@swartzfam.com] wrote:
>Lee Jonas <ljonas@acm.org> wrote:
>
>> The notion of URLs identifying representations seems a little trite to me.
>> It indicates the nature of the true problem, without fully addressing it: a
>> resource at the end of a location is not consistent over time.
>>
>> This is for at least two good reasons: resources evolve, and resources move
>> / disappear / or worse, a second resource ousts the first at a particular
>> location.
>
>Resources do not evolve -- their representations (or network entities) do.

My interpretation is: resources are things that can evolve; representations
are distinct "snapshots" of a particular resource state, conceptually taken
at the point of access (this then includes representations of resources
provided by CGI scripts, etc).

A W3C Working Draft evolves; the HTML document retrieved from its "latest
version" URL is a representation of the latest version of the Working Draft.

>> The first issue could have been addressed more formally (and hence
>> consistently) with a simple versioning scheme.
>
>What about ETags?

I am not familiar with these. Can you give me some pointers?

>> This would have alleviated
>> the problem of instantly breaking third party links (or invalidating
>> metadata semantics) when you change a resource. Yes your links must change
>> to reflect new versions of things you reference, but these changes could be
>> a graceful migration, not an abrupt crash.
>
>How do versions fix changes in resources? It seems they just break things
>for the 94% (as previously cited) of links that actually work correctly.
>

They don't fix changes in resources (and hence changes to their
representations); they make it less destructive for others to have links to
fragments in your documents, which you may subsequently change / delete.

Why would this break things for links that work correctly?

>> The second is the main bugbear of using a resource's location to identify
>> it. This phenomenon is well known in distributed object technology.
>> Superior solutions leave the actual resolution of an object's location to
>> some distributed service when a client wants to interact with it.
>
>Again, URLs don't have to be used this way, but we do. You can try and redo
>URLs (which are widely-used, implemented, understood, etc.) or you can fix
>the other parts of the system (some of which can probably be upgraded with
>little headache). In fact there are projects which are trying to do that:
>World Free Web, PURL, Alexa/Internet Archive, Google caches, etc.
>
>http://wfw.sourceforge.net/
>http://purl.org/
>http://archive.org/
>http://www.alexa.com/company/technology.html
>http://www.google.com/help/features.html#cached
>
>I'll keep track of others at: http://logicerror.com/alternateURLResolution
>

I am not proposing any changes to URLs. This is more of an argument for using
URNs to identify resources (in a more abstract fashion), where appropriate.
Then the mapping to a URL locating a specific representation can be performed
dynamically.

>> These are compounded with the fact that the resource can be one of many
>> formats and there is no clear way to distinguish them from the URL itself. A
>> resource such as http://mydomain/mypic.png may safely be assumed to be a png
>> graphic, but what about the resource at the end of http://mydomain/mydir/ ?
>
>Resources don't usually have formats. That's why there's content
>negotiation.
>

Although it would sometimes be unavoidable, wouldn't it be nice to find out
the type of a representation without having to negotiate every time?

>> Mime types have become pervasive for identifying a resource's type, yet URLs
>> predate MIME by years. If you want to know its type you have to make a
>> request to some server process.
>
>Is that true? Hmm, looks like it...
>
>> 1) It may become more common to reason about abstract resources whose
>> identifiers may not be readily representable as a location. It would be
>> better to identify these with a URN. Hence URNs may be more widely used
>> than at present.
>
>Why should things without a "location" use a URN? They can still be
>described can't they. Folks! Just because it's a URN doesn't mean it's
>anything special. It still represents a resource, even if it's in the
>Fooawackyak scheme.
>

Reserving URLs to identify things that you can access representations of has
certain advantages. Not least of these is keeping at least 94% of them
dereferenceable.

It seems like a simple distinction to me. In an ideal world, URLs are always
dereferenceable; URNs may be so, but not necessarily.

>> 3) Data quality will be poorer if it is hard for software to detect a
>> resource change. Transience is bad news if you are going to store facts
>> about something that subsequently changes.
>
>Yes, RDF does not deal with time very well, but this is, IMO, an RDF problem
>not a URI one.
>

It is a fundamental aspect of the way URLs are defined to be used. They
*locate* (note I did not say *identify*) representations (snapshots of state)
of underlying resources, not the resources themselves. When resources change,
new representations may appear at the same and/or different locations.

The only way RDF could satisfactorily deal with this is if it described the
resources directly by using URN identifiers, which could be subsequently
mapped to a URL locating an appropriate representation.

>> What the solution to all this is I don't know. I just can't help feeling
>> that as the semantic web progresses things are about to get a lot more
>> complicated unless these issues are addressed.
>
>The Semantic Web is going to be complicated no matter what we do. ;-)
>
>--
>[ Aaron Swartz | me@aaronsw.com | http://www.aaronsw.com ]

regards

Lee
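
The idea argued for above, describing a resource by a URN and mapping it to a URL for a particular representation only when needed, might be sketched roughly as follows. This is an illustration only, not a scheme proposed in this thread: the `UrnResolver` class, the `urn:example:` identifier, and the `example.org` URLs are all hypothetical, and a real resolution service would be distributed rather than an in-memory table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Representation:
    """One snapshot of a resource: where it lives and what type it is."""
    version: int
    url: str
    media_type: str


class UrnResolver:
    """Hypothetical in-memory table mapping a URN to versioned locations."""

    def __init__(self) -> None:
        self._table: Dict[str, List[Representation]] = {}

    def register(self, urn: str, version: int, url: str, media_type: str) -> None:
        # Keep each URN's representations ordered by version number.
        reps = self._table.setdefault(urn, [])
        reps.append(Representation(version, url, media_type))
        reps.sort(key=lambda r: r.version)

    def resolve(self, urn: str, version: Optional[int] = None) -> Representation:
        # With no version given, return the latest known representation;
        # otherwise return the exact version asked for.
        reps = self._table.get(urn)
        if not reps:
            raise KeyError(f"unknown URN: {urn}")
        if version is None:
            return reps[-1]
        for rep in reps:
            if rep.version == version:
                return rep
        raise KeyError(f"no version {version} for {urn}")


# Assumed example data: two versions of one abstract resource.
resolver = UrnResolver()
resolver.register("urn:example:working-draft", 1,
                  "http://example.org/drafts/wd-1.html", "text/html")
resolver.register("urn:example:working-draft", 2,
                  "http://example.org/drafts/wd-2.html", "text/html")

latest = resolver.resolve("urn:example:working-draft")
pinned = resolver.resolve("urn:example:working-draft", version=1)
```

Metadata that refers to the URN stays valid when a new version appears, a link that pins version 1 keeps resolving to the same snapshot, and the media type is available from the table without a round of content negotiation.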
Received on Thursday, 12 April 2001 09:07:59 UTC