
Re: Uniform access to descriptions

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 8 Apr 2008 23:59:01 -0500
Message-Id: <p0623090cc421a203db67@[]>
To: wangxiao@musc.edu
Cc: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>, Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org WG" <www-tag@w3.org>, Phil Archer <parcher@icra.org>
At 10:28 PM +0100 4/8/08, Xiaoshu Wang wrote:
>Pat Hayes wrote:
>>OK, I take your point: that you don't disagree with me here, but 
>>instead take the view that /nothing/ satisfies the TAG criterion
>>for an information resource, at least as stated by them: "its 
>>essential characteristics can be conveyed by a message"; so the 
>>entire discussion is moot, since by the published criterion, 
>>according to http-range-14, /no/ http request should /ever/ return
>>a 200 code. Which is ridiculous, etc. But here, I simply disagree
>>with you. At best, I think what you have shown in 
>>http://dfdf.inesc-id.pt/tr/web-arch is that the published 
>>definition of 'information resource' is faulty. Perhaps so: along 
>>with a lot of other people, I also don't like it. But surely the
>>/intention/ of the TAG was reasonably clear. When a browser accesses
>>something we informally call a 'web site' and gets a 'web page' 
>>back to display to its user, clearly said 'page' is a reasonably 
>>close facsimile (which is, again informally, what 
>>webarch-representation seems to mean, most of the time) of
>>/something/, and moreover that something has a number of familiar
>>properties: it is located on its server, it can respond to http 
>>requests, it might well have been written in HTML, and so on. We 
>>all know this and also know, in a pre-philosophical sense, what is 
>>being spoken about. We also know that it is very hard to give a 
>>single definitive characterization of these things that we can 
>>engage with over the internet, partly because the technology keeps 
>>changing under our feet and extending the possibilities in new 
>>ways. Nevertheless, without splitting hairs about the exact 
>>boundaries of this elusive concept, we can all recognize and 
>>largely agree on a number of clear-cut example and non-example 
>>cases. Webpages and jpeg images are examples. HTML documents are 
>>examples. Non-electronic physical objects and abstract or fictional 
>>entities are clearly non-examples. The TAG bravely, or perhaps 
>>foolishly, tried to give an actual definition; but to seize on
>>this and use it to attack the intention seems, even if it's
>>good philosophy, not to be the most useful way to proceed. Better
>>to take the intention and use it to attack the definition :-)
>Sure, we can take an informal definition.  But the issue becomes 
>serious when the TAG intends to invoke such logic.  That is: if _:x
>HTTP-200 => _:x a webarch:IR.  The question was raised with good
>intention because, without such logic, what is the point of

Fair enough.

>Then, here is the dilemma: If I publish something in RDF, I must 
>worry if something is an IR or not.  Because in logic, a single 
>contradiction will invalidate the entire theory.

Not on the main path, but I disagree with you here also. Logic 
basically gives up when faced with an inconsistency, but that does 
not imply that one inconsistency makes the entire theory useless. 
There are many ways to isolate or ignore contradictions which can be 
deployed in practice. But still, I agree we should try to get it 
right wherever possible.

>  Hence, I must know precisely the definition of IR in order to 
>ensure that my data won't (unnecessarily) lead to such a
>contradiction. But I cannot do that with the current definition.

But you can in many (most?) cases. Here's what you do. You have a URI 
and something in mind you want it to denote. Now, give that URI to a 
browser and see what happens. If you get a 404, the URI is yours to 
command. If you get a 303 then it's still yours, but it would be good
manners to make it denote something connected with the result of the 
303 somehow. If you get a 200, then you ask yourself the following 
question: is whatever sent me this response the very thing I have in 
mind that I want my URI to denote? If the answer is yes, go ahead. If 
not, use a different URI or (if you can) alter the response it 
returns when GETted.
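
The procedure just described can be sketched in Python. This is only an
illustrative sketch under assumptions, not part of http-range-14 itself:
the names `NoRedirect`, `fetch_status` and `advise` are hypothetical, and
the advice strings merely paraphrase the text above. Redirects are
deliberately not followed, so that a 303 is seen as a 303 rather than as
the status of whatever it redirects to.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Hypothetical helper: refuse to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def fetch_status(uri: str) -> int:
    """GET `uri` once and return the HTTP status code it answers with."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(uri, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx responses arrive here; the code is what we want.
        return err.code


def advise(status: int) -> str:
    """Map a status code to the advice given in the text (paraphrased)."""
    if status == 404:
        return "yours to command"
    if status == 303:
        return "yours, but denote something connected with the 303 result"
    if status == 200:
        return ("ask yourself: is whatever sent this response the thing "
                "I want the URI to denote? If not, use a different URI "
                "or alter the response")
    # Assumption: other codes are simply outside the procedure.
    return "not covered by this procedure"


# Usage (needs network access): print(advise(fetch_status(some_uri)))
```

Note that in the 200 case the code can only pose the question; answering
it remains a judgment for whoever is minting the URI.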

My point is that this is all that http-range-14 really requires you 
to actually do. You can ignore the metaphysics and the confusion and 
the definition-soup and so on.

>Then, since I can always be accused of being wrong regardless of my best intentions,

Well, you can be accused, but it seems to me that the above algorithm 
also gives you a good defense. Remember, you never have to justify a 
claim that something is an information resource, only that it isn't.

>  what is the point for me to use 303 instead of 200?  The latter is 
>cheaper and easier. If eventually everyone feels the same
>dilemma that I feel and chooses 200 anyway, then by public verdict,
>httpRange-14 becomes useless.

Well, true, but I don't see that happening, in fact. My own private 
conclusion is that http-range-14 makes sense, but the moral I draw 
from it is to never, under any circumstances, use an unhashed URI to 
denote anything but a webpage or a deliverable document of some kind. 
Which is a kind of conformity by avoidance, but much simpler to 
satisfy than trying to mess around with 303 codes.

>On the other hand, if we don't invoke the earlier introduced logic, 
>what is the point of httpRange-14?

See above.

>>>In this sense, any HTML page is abstract with respect to the web. 
>>>What is concrete to the web is the "representation" of the 
>>Not in the webarch sense of 'representation'. Those 
>>'representations' are transient entities which exist only as they 
>>move across the physical Web in an http (or xxxtp) response 
>>message. They are like the photons of the Web: we become aware of 
>>them only when they have already ceased to exist, by causing 
>>changes to our relatively static data structures.
>It is!  What is parsed in your browser is the *representation* of 
>the resource denoted by that URI, which is always abstract w.r.t.
>the web-architecture.

My browser is sent a representation, but it then creates something 
else in my RAM which I get to view and store away for later. Or at 
least that is my understanding of the official story about 

>  We always understand reality from its representation. What I
>perceive of you is not you but your representation in my brain.

Well now we are doing real philosophy, but that is wrong. My 
perception (of, say, a tree I am looking at) is maybe constituted by 
representations in my brain, but what I perceive is the actual tree. 
We cannot, in fact, perceive our own mental representations: if we 
could, cognitive science would be a lot easier than it in fact is.

>  When you dereference http://dfdf.inesc-id.pt/tr/web-arch, you 
>don't get that resource; you get a *representation* or
>*description* of that resource.

Not a description, but a webarch:representation of it, yes. But I got 
that from the actual resource. It was the resource that sent it to 
me. (If there hadn't been a resource there to send it, I wouldn't 
have got it; just as if there hadn't been a tree to bounce photons 
off, I wouldn't have my mental representation of the tree.) And it is 
the resource which sent me the representation that (in the case of a 
200 code, according to http-range-14) the URI denotes, not the 
representation that it sent.

>>>Or we can take TBL's viewpoint: to treat every slash URI as an
>>>information resource.
>>The URIs aren't the information resources themselves: the i.r.'s 
>>are what are/ denoted/ by the slash URIs. And that's not really an 
>>architectural decision: its more of a rule of semantic 
>>interpretation. And I (now) think that its the only practical rule 
>>to adopt in this case, if we have to have any rules at all. And all 
>>the rest of the decision follows from that.
>>>  This again gives IR a syntactic definition, which is O.K. and 
>>>usable.  But the reality is that many hash URIs are used in a way
>>>that makes TBL's position difficult to accept.
>>? Really? But it seems to me that this position is entirely
>>/agnostic/ about the meaning of hash URIrefs. Which indeed is one of
>>its strengths.
>Do we know what a #URI denotes?


>Is it an IR or not?

We have no idea. It could be anything, just as a 303 redirect tells 
us nothing about what the URI is obliged to denote. Http-range-14 is 
silent on both of these cases. It only specifies that in the case of 
an unhashed URI returning a 200 response, the URI is understood to 
denote the resource that emits the response.
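
The scope of the rule as stated here is narrow enough to write down as a
single predicate. The following is a minimal sketch, assuming Python; the
function name `range14_applies` and the example.org URIs are hypothetical
and serve only as illustration.

```python
from urllib.parse import urldefrag


def range14_applies(uri: str, status: int) -> bool:
    """True only for the one case the rule speaks to:
    an unhashed URI whose GET returns 200."""
    has_fragment = bool(urldefrag(uri).fragment)
    return status == 200 and not has_fragment


# Unhashed URI + 200: the URI denotes the resource that emits the response.
print(range14_applies("http://example.org/doc", 200))
# Hash URI, or a 303: the rule is silent about what the URI denotes.
print(range14_applies("http://example.org/doc#thing", 200))
print(range14_applies("http://example.org/doc", 303))
```

A False result does not mean the URI denotes a non-document; it means only
that the rule places no constraint either way.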

>Or something else? For instance, what does this URI denote?

The only way to tell is to find out what assertions are made using
this URI by sources you are inclined to trust, and go from there. Not 
a very informative answer, I know, but I don't think that it is 
possible to give a more informative one, so we will all have to try 
to get along with this.

>Just to remind you, there are a few MIME types for
>http://dfdf.inesc-id.pt/tr/doc/web-arch/img/fig2 (text/plain,
>text/html, image/svg+xml, image/jpeg, and image/gif).  This is an
>area that TAG hasn't been able to address yet.  But again, it can be
>consistent if we assume that a URI always denotes an abstract
>resource and a *representation* is what you get.

I'm not sure what you mean by 'abstract resource'. For example, I am
pretty sure that I am myself not very abstract at all.


>  Hence, #URI can be explained in the same way.  What I mean is that
>if we take TBL's position, the nature of what a #URI denotes must
>be taken into consideration as well.

IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
http://www.ihmc.us/users/phayes      phayesAT-SIGNihmc.us
Received on Wednesday, 9 April 2008 04:59:46 GMT
