- From: Harry Halpin <hhalpin@ibiblio.org>
- Date: Sat, 28 Jul 2007 11:43:05 -0400
- To: Pat Hayes <phayes@ihmc.us>
- Cc: Sandro Hawke <sandro@w3.org>, John Black <JohnBlack@kashori.com>, 'Linking Open Data' <linking-open-data@simile.mit.edu>, SW-forum <semantic-web@w3.org>, www-tag@w3.org
Pat Hayes wrote:
[snip]
> But when you want to refer to something that cannot possibly be
> accessed (because it isn't the kind of thing that one can transmit
> HTTP protocols to: a book, say, or a galaxy, or a dead Roman emperor,
> or... well, just about anything, actually) then what is accessed, via
> a 303 redirect, is not the thing referred to (of course) but rather
> one of the kinds of thing that can be accessed, which should send you
> back some description of, or information about, the thing that the URI
> you started with is supposed to refer to. Got that?

The *whole* point of the Web is the access relationship, and its major distinguishing characteristic over previous communication systems (like natural language, printing and T.V.) is twofold: its ability to have a (fairly) decentralized yet universal space of names (URIs), and the fact that these names can *access* more information. The Web could have two different types of names, one for reference (URNs) and one for access (URLs), and that's been tried. And it was more or less a failure, precisely because the whole advantage of the Web is that if one does not know what a name (URI) means, then one can access "more data" to help one disambiguate or discover what the referent is. Since the experiment of having two different types of names (URNs and URLs) failed, it makes some measure of sense to elide the distinction and have just one type of name - URIs - that has the access relationship.

> When a URI refers to something inaccessible, then what it eventually
> accesses will send you back not a 'representation' of the referent,
> but a >>description<< of the referent. (We can't say 'representation
> of', which would seem to be the rational thing to say, because what
> its a 'representation' of is, by TAG definition, the thing the URI
> eventually accesses, which has to be an HTTP endpoint of some kind.)
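To make the 303 pattern under discussion concrete, here is a minimal sketch in Python. It is only an illustration: the `example.org` URIs, the in-memory `SERVER` table, and the helper names are all hypothetical, and no real network traffic is involved. The rule encoded in `what_can_be_concluded` follows the TAG's httpRange-14 resolution as I understand it: a 2xx response means the URI identifies an information resource, a 303 means it could identify any resource, and a 4xx leaves its nature unknown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    location: Optional[str] = None
    body: str = ""


# Hypothetical server state: one URI meant to name a dead emperor,
# and one URI that is just an ordinary moved page.
SERVER = {
    "http://example.org/id/augustus": Response(303, location="http://example.org/doc/augustus"),
    "http://example.org/doc/augustus": Response(200, body="Augustus was a Roman emperor."),
    "http://example.org/old-page": Response(303, location="http://example.org/new-page"),
    "http://example.org/new-page": Response(200, body="A plain old web page."),
}


def get(uri: str) -> Response:
    """Simulated HTTP GET against the in-memory table (404 if absent)."""
    return SERVER.get(uri, Response(404))


def what_can_be_concluded(uri: str) -> str:
    """What a client may infer about the URI from the first response alone."""
    status = get(uri).status
    if 200 <= status < 300:
        return "information resource"
    if status == 303:
        return "could be any resource"
    return "unknown"


def describe(uri: str, max_hops: int = 5) -> Optional[str]:
    """Follow 303 redirects and return whatever is eventually retrieved."""
    for _ in range(max_hops):
        response = get(uri)
        if response.status == 303 and response.location:
            uri = response.location
            continue
        return response.body if 200 <= response.status < 300 else None
    return None
```

In this sketch `describe("http://example.org/id/augustus")` ends up at the document about Augustus, while `what_can_be_concluded` gives the same answer for `/id/augustus` and for `/old-page`: the 303 status by itself does not tell a client whether the URI names something outside the Web or is simply a redirected page.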
Here's a problem - the 303 redirection trick basically uses the URI for the "inaccessible" resource as some sort of URN, and then allows you to follow-your-nose through the redirect to find out more information in order to pin down the reference. But then, a 303 redirection is *not necessarily* a sign that something is being used as a name to refer to something outside the Web that can only be referenced. It could be, but it could be just a plain old redirection. One could imagine a number of ways besides going back to URNs to state that a URI is being used primarily to refer rather than to access. One could have a new type of redirect, or even some sort of graphical "logo" on a web-page to say that the URI is being used to refer to something rather than just a web-page.

> Trying to distinguish these two cases is what has given rise to the
> distinction between 'information resource' and the other kind. The TAG
> documents try to do this in a theoretically satisfying way by talking
> about information that completely characterizes it, or some such. But
> there's a much simpler and more down-to-earth way to characterize the
> distinction. An information resource is anything that can act as an
> HTTP (or, if you want to be more general, some Web transfer protocol
> xxTP) endpoint, i.e. can respond appropriately to xxTP requests by
> emitting xxTP responses. A non-information resource is anything else.
> That's it: end of story.

Isn't 303 a response? :) Regardless, I think that's one reading, and a pretty sensible one. However, is the only distinguishing characteristic of an information resource that it can respond to HTTP? A resource, I think, was originally defined not as just a single representation, but as the sum of all possible representations emitted over time (and probably with various contexts, like cookies, as Sandro pointed out, taken into account).
So, an information resource is something that exists only as a set of accessible representations through an HTTP endpoint given by a URI. Some people have removed the "HTTP endpoint" clause, and I think that's what causes the confusion over the "writing on the wall" example.

Here's the problem - there's no standard way to know if a given resource is the sum of the representations you can access (i.e. an information resource), or if those representations are merely associated descriptions of something that one can only refer to (a non-information resource), which is being described by an HTTP endpoint but is *not the HTTP endpoint itself*.

So one pragmatic solution is probably to take a holistic viewpoint and just say that if a URI is used to refer to something inaccessible (a non-information resource), it should clearly attempt to say so, and the onus is on the author to provide associated descriptions to pin down what exactly is being referred to.

Another pragmatic solution is to say that the distinction really doesn't matter, and that - to steal a phrase from Ted Nelson - the Web and reality are increasingly "intertwingled" such that it's hard to say what's inaccessible on the Web versus what is not. One would normally think that a person's web-page is distinct from them, and that a web-page is accessible through the Web in a way a person isn't. Yet, looking at someone's Myspace account, it's amazing how much of the person themselves is embodied in these representations - and that very real friendships can exist primarily through these representations. So, maybe while the person is somewhat inaccessible through the Web, they are not entirely inaccessible.

> Note that this is an architectural kind of criterion, not a
> semantic/information-theoretic kind. I doubt if it is really possible
> to make the distinction in other than architectural terms.
> But in any case, this is a hell of a lot simpler (and I suggest,
> more accurate) than the way the TAG currently tries to do it. And it
> makes it clear why writing on a wall isn't an information resource
> but the same writing on a Web server (not a Web page) is: because the
> wall can't respond to HTTP GET and the server can.

Agreed.

> Pat

-- 
-harry

Harry Halpin, University of Edinburgh
http://www.ibiblio.org/hhalpin 6B522426
Received on Saturday, 28 July 2007 15:43:53 UTC