Re: Review of new HTTPbis text for 303 See Other from Xiaoshu Wang on 2009-07-13 (www-tag@w3.org from July 2009)

From: Xiaoshu Wang <xiao@renci.org>
Date: Sun, 12 Jul 2009 20:58:33 -0400
To: Pat Hayes <phayes@ihmc.us>
CC: Richard Cyganiak <richard@cyganiak.de>, "Roy T. Fielding" <fielding@gbiv.com>, Jonathan Rees <jar@creativecommons.org>, Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>, www-tag@w3.org
Message-ID: <4A5A86B9.2000402@renci.org>
Pat Hayes wrote:
>
> On Jul 11, 2009, at 5:27 AM, Richard Cyganiak wrote:
>
>> Pat,
>>
>> On 10 Jul 2009, at 01:32, Pat Hayes wrote:
>>>> If the server has a transferable representation, it would
>>>> respond to the GET with the appropriate status code (200 or 304).
>>>
>>> Well, yes, IF it were driven solely by what one might call rational 
>>> HTTP architectural principles. But surely the whole issue about 
>>> httprange14 is that it introduces new principles which on their face 
>>> have nothing to do with http architecture as such, but to do with 
>>> denotation and naming.
>>
>> Not as far as HTTP is concerned. HTTP is just a transfer protocol. 
>> The HTTP world is really simple:
>>
>> 1. There are URIs. URIs are thought to identify things called resources.
>
> OK, stop there and tell me what you mean here by "identifies". Because...
>
>> As far as HTTP is concerned, it does not matter much what the 
>> resource actually is -- a document, a file on a server, a person, 
>> whatever.
>
> ... in the usual sense of 'identifies' that one might expect to be used 
> in the context of a network transfer protocol, which is similar to the 
> use one might expect when talking about programming language 
> identifiers and what they identify, it most certainly does matter. In 
> particular, it simply does not make sense to speak, using that normal 
> terminology, of 'identifying' a person (or a galaxy or a sodium atom, 
> etc.); in fact, it does not make sense to talk of identifying anything 
> much beyond some kind of data structure or data object. So if HTTP 
> claims to be able to make sense of talking of 'identifying' people 
> (for example), it must be in a wholly different space than all 
> previous computationally based notational systems, and be using the 
> word "identify" in a wholly different sense. And, to repeat, can you 
> tell me what that sense is?
>
>>
>> 2. Resources (whatever they are) are thought to have things called 
>> representations. As far as HTTP is concerned, it is totally up to the 
>> server owner to decide what's a representation of what. After the 
>> server owner has made their decision, a resource either has a 
>> representation or not.
>
> Really? OK, I will take you at your word. I am a server owner, and I 
> will decide that a certain resource, to wit, me, has a thing called a 
> representation of me. This representation of me is in fact a portrait, 
> painted using acrylic paints on a piece of masonite approximately 30 
> cm square almost exactly a month ago: but let us not go into details, 
> as you tell me that such details are none of HTTP's business. Still, 
> the representation exists, and the resource has it. OK, let us proceed.
>
>>
>> 3. If a resource has a representation, then a GET to its URI should 
>> be answered by 200. If not, then 303, 404 or 410 would be fine choices.
>
> So, HTTP must reply to a GET on my URI with a 200. OK, what should it 
> put as the payload of this 200 response, attached to the code 
> information? How do I get acrylic-coated masonite into an http 
> response? There is no representation which can be transmitted in bits. 
> You did not mention this aspect in your above summary: was that an 
> omission?
>
>>
>> I repeat: For the operation of the HTTP protocol, IT DOES NOT MATTER 
>> what exactly a resource is and what the exact relationship between 
>> resources and representations is.
>
> As you can see, I took advantage of this freedom in my example.
>
>> All these matters of denotation, information resources and so on are 
>> introduced by higher layers of the architecture.
>
> Wrong. Denotation is not introduced by a higher level, and even if it 
> were, it would not be higher in an architectural sense. You, in this 
> very message, in fact brought denotation into the picture, by telling 
> me that a URI can "identify" a person. URIs are symbol strings, and 
> the ONLY POSSIBLE SEMANTIC RELATIONSHIP between ANY symbol and a 
> physical object, is denotation. Sorry to shout there a little, but the 
> point needs to be made strongly. That is what "denotation" means: it 
> is all that is left of "identifying" when you take away the actual 
> network machinery, the computational byte-transferring. And you have 
> to take this away when you start claiming to talk of relationships 
> between names (of any kind) and non-computational entities such as 
> people (or indeed of any kind), simply because computational 
> byte-transfer talk is COMPLETELY IRRELEVANT to semantic relationships 
> (such as "identification") between symbols (of any kind) and 
> non-computational entities. The fact, if it is a fact, that this word 
> is not in your technical vocabulary is entirely irrelevant. By 
> claiming that your symbols "identify" non-computational entities such 
> as people or books (or the weather in Oaxacala, to take another random 
> example), you are no longer playing in the network-architectural 
> sandbox, precisely because these kinds of things simply are not 
> connected to networks in the same functional sense that things like 
> web servers are. Either HTTP is a computational notion or it isn't. If 
> it is, then it is indeed quite simple. And I would be delighted if the 
> HTTP literature simply restricted itself to the computational world. 
> But it does not, and never has: HTTP has ALWAYS had these claims to 
> semantic grandeur: it has ALWAYS claimed to be not just about web 
> sites and web servers and files and documents, but about the whole 
> grand span of symbol usage to refer to absolutely anything in any 
> possible universe. And if indeed that is what HTTP is claiming to be 
> able to talk about, then it is about denotation, right out of the box.
>
>>
>> Yes, it would be useful to provide guidance to publishers about how 
>> best to model their information space as resources and 
>> representations. But this is out of scope for the HTTP protocol.
>
> See above. If indeed it is out of scope, so is any talk of URIs 
> "identifying" people. You can't have it both ways. Either you are 
> doing real semantics or you aren't. If you aren't, then don't make 
> ridiculous claims about "identifying" things that have no possible 
> connection to any physical network, or of "representations" that 
> cannot be sent in a byte stream.
>
>> The HTTP protocol kicks in AFTER the publisher has made up their mind 
>> about what resources they have and whether they have representations 
>> or not.
>
> OK, please tell me how to use HTTP to send my piece of masonite 
> attached to a 200 code. I've made up MY mind: over to you.
>
>>
>> Now, different subcommunities have different opinions on how to model 
>> resources and representations. That's not a good thing, and it would 
>> be good for interoperability if everyone agreed. However, this is 
>> pretty much orthogonal to any discussion of the HTTP protocol. As 
>> long as the subcommunities subscribe to the basic 
>> "URI-identifies-resource-which-can-have-representations" model, HTTP 
>> can accommodate them.
>>
>> Now let me take off my RDF hat for a bit.
>>
>> The suggested change for the 303 text came about because one 
>> subcommunity had the funny idea that some resources SHOULD have URIs 
>> but NO representations and it should STILL be possible to get 
>> information about them via HTTP.
>
> No, that is not the primary reason. Http-range-14 is not about 
> resources, it is about URIs and what they denote. The dilemma is that 
> people want 'normal' URIs to denote what it is that HTTP thinks of them 
> as identifying, the "information resource" (not that that matters). 
> Which would be fine, except that there are some URIs which people want 
> to denote something else. And still, actually for different ('linked 
> data', Timblish) reasons, people want a GET on those URIs to finish 
> up, one way or another, with useful information being returned. This 
> is a problem. It would be ugly to have two 'kinds' of URI, and 
> impossible to change the millions of 'normal' URIs in any way at all. 
> The decision allows the few non-normal URIs to take part in a slightly 
> irrational HTTP dance which allows everyone to say: look, since it 
> didn't return a 200 code, it's not 'normal', and HTTP says it doesn't 
> identify anything at all; so the 'normal' assumptions about what it 
> denotes are cancelled. And that cancellation is the entire content of 
> the decision: it has no other purpose. The nature of the entity which 
> handles the GET, and the presence or absence of 'representations' of 
> it, are irrelevant.
>
>> It beats me why anyone would want to do that
>
> The reason is that there are, believe it or not, entities in the 
> universe other than web servers; and people want to refer to them 
> using URIs.
>
>> ; but if we can make them happy with a minimal tweak to the language 
>> of an existing status code, then why not. HTTP is for everyone.
>>
>>> If the URI in the GET request is not intended to denote the resource 
>>> to which the GET is directed, then that resource must issue a 303 
>>> redirection, and must not return a representation using a 200 status 
>>> code.
>>
>> There is no such thing as denotation in HTTP. The only relation 
>> between URIs and resources in HTTP is "identifies".
>
> Which, if it means anything at all when used between a symbol and a 
> non-computational entity, means 'denote' (or, if you prefer, 'refers 
> to' or 'is a name for'; they are all equivalent usages.) And again, I 
> challenge you (or anyone else) to tell me what "identifies" can 
> possibly mean, in these circumstances, other than this.
>
>> If you care about other relations, you have to figure out how to 
>> translate them into the 
>> "URI-identifies-resource-which-can-have-representations" model of HTTP.
>
> That model is either (1) already about denotation, or (2) utterly 
> broken, or (3) meaningless as stated.
>
>>
>>> That has nothing to do with the existence or not of such a 
>>> representation. Even if the representation exists and the server has 
>>> access to it, it cannot return it with a 200 code when the URI is 
>>> intended to denote some other thing, in particular a non-information 
>>> resource of some kind.
>>
>> Whether a representation exists or not for a particular kind of 
>> resource is entirely up to the server owner, as far as HTTP is 
>> concerned. If you subscribe to a religion that says, "Thou shall not 
>> make a representation of me, for I am not an information resource", 
>> then that's great, and let me shake your hand brother, but this has 
>> no effect on HTTP.
>
> But that's the easy case. The hard case, for you, is when I use that 
> very handy English word "representation" in one of its normal senses, 
> not when I refuse to use it at all. There are many, many kinds of 
> representations of things, and only a minuscule proportion of them 
> have anything even remotely to do with computers or network transfer 
> protocols.
>
>>
>>> If we follow your rule, above, and also httprange14, then a server 
>>> can be placed in an impossible position. If it has a representation 
>>> of itself which could be put into a 200-code response, and it 
>>> receives a GET request with a URI which it knows (somehow, perhaps 
>>> by some externally agreed convention) is being used to denote a 
>>> non-information resource; what should it do? HTTPrange14 requires it 
>>> to not deliver a 200-coded reply, but your criterion requires that 
>>> it must. This is why I think the wording should make absolutely 
>>> minimal assumptions about what exactly the 303 means.
>>
>> (RDF hat back on) Any sensible definition of "non-information 
>> resource" obviously MUST entail "does not have representations in the 
>> HTTP sense". In fact, that IS the definition of "non-information 
>> resource", in my book.
>>
>
> Of course, but that is completely irrelevant to my point. The server, 
> in my example, is not the non-information resource that the URI refers 
> to; that is precisely why httprange14 requires it, the server, to emit 
> a 303 code rather than a 200 code. It is merely the servant whose job 
> it is to emit the appropriate code to make everything work properly. 
> But it is AN information resource, and it may well have a 
> representation (in the http sense) of itself. It's just a different 
> resource than the one the URI denotes/refers to.
>
>> Wrapping up:
>>
>> For the function of the HTTP transfer protocol, it does not matter 
>> what exactly the nature of the things identified by URIs is.
>
> Oh, but it does. Because HTTP talks about information transfer between 
> entities which can transfer information, but it talks of 
> 'identification' of ANY THINGS WHATSOEVER, whether they can or even 
> possibly could transfer information. For example, a numeral identifies 
> a number, and also is a representation of it. So HTTP should apply to 
> this case as well, according to what you say here. I should be able to 
> send a GET request to the number seventeen and expect to get sent back 
> a 200-coded response with a suitable numeral in its body, say "17". I 
> know that is ridiculous: but it FOLLOWS FROM WHAT YOU ARE SAYING; 
> ergo, what you are saying is ridiculous.  So you ought to modify what 
> you are saying, so that it makes more sense.
>
>>
>> For the function of the HTTP transfer protocol, it does not matter 
>> whether the things you serve as representations on your server make 
>> particularly good representations of the resources.
>>
>> There are different schools of thought that try to clarify the nature 
>> of the "identifies" and "has representation" relationships, and this 
>> is critically important if we want to use HTTP URIs as identifiers 
>> for things that exist outside of the Web. But the HTTP protocol 
>> itself is and should be agnostic with regard to your position in 
>> these debates. That's layering.
>
> No, it is a poisonous combination of semantic (or maybe philosophical 
> or semiotic) ignorance, and hubris. You want http to be universal, but 
> you are claiming a kind of universality which goes way beyond anything 
> to do with network architecture, and so you can't escape the 
> consequences by appealing to network design principles.  Maybe you 
> don't intend to be doing this, but it is being done by what you (and I 
> should cast this in a kind of anonymous plural, as the excellent 
> southern phrase y'all, as I don't intend this rant to be directed at 
> you in particular) are actually saying.

I like this -- "it is a poisonous combination of *philosophical* 
ignorance with hubris".  I have had this sentiment for quite a while.

But between Alan and Pat, I actually think that you two agree, as opposed 
to disagree.  The problem is, however, the dual sense of an HTTP-URI, 
which is used both as a name, (which is all about denotation) and a 
locator, (which is all about transportation).  I believe if you two 
reword your arguments by clarifying the exact use, you will find that 
there is not much that you two disagree on philosophically.

TAG, take my suggestion -- make *one* pure URN for the Web. It doesn't 
have to be in the shape of the schemeless-one as I have proposed.  As 
long as it is a pure name, in the sense that there is no predefined binding 
of a transportation protocol, it will do the job.

Let's not use the cost-excuse.  I just don't buy it.  First, I don't 
believe the cost is high, because how can adding a new URI scheme affect any 
existing schemes?  Second, even if it is high, it will be worth it, so that 
we do not keep spending time on these endless discussions, all caused by 
ambiguous wording.

Xiaoshu
Received on Monday, 13 July 2009 00:59:24 UTC