- From: <noah_mendelsohn@us.ibm.com>
- Date: Fri, 31 Jul 2009 16:14:46 -0400
- To: Pat Hayes <phayes@ihmc.us>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>, "www-tag@w3.org WG" <www-tag@w3.org>
I'm not sure whether the TAG is interested in spending time on this question in the near future, as it's taken quite a bit in the past, but I will put an item on an upcoming agenda to at least get the sense of the group. Given that some members with important perspectives on this are gone a lot in August, I'm not sure whether we'll wind up doing more this month than deciding to await their return. In any case, I'll schedule an initial, brief discussion.

Noah

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Pat Hayes <phayes@ihmc.us>
Sent by: www-tag-request@w3.org
07/31/2009 03:25 PM

To: "www-tag@w3.org WG" <www-tag@w3.org>, HTTP Working Group <ietf-http-wg@w3.org>
cc: (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: Fwd: Review of new HTTPbis text for 303 See Other

Folks

I do not expect a reply, but I put it to y'all: is this stance (below) in fact consistent with what the HTTP and TAG groups have published concerning URIs and what they are intended to identify? In particular, is it consistent with http-range-14? It seems to me it is clearly not, and that this fact is important to what both groups publish in their specifications and recommendations.

As a concrete point to focus discussion, I gather that Henrik's view is that in the case where an HTTP URI identifies a non-information resource, but resolves to an HTTP endpoint, it must follow that the "requested resource" (in the sense of HTTPbis) of the URI in the GET request is an information resource interfaced to the HTTP endpoint, and so cannot be the same as the non-information resource which the URI "identifies" in the sense of RFC 3986. I also gather, from off-line emails, that Richard Cyganiak would disagree with this interpretation. (I hope I do not misrepresent anyone here.)
Apparently, therefore, two people both quite expert in reading the HTTP spec do not interpret the phrase "requested resource" in the same way, leaving me and I suspect others in a state of complete confusion.

Pat Hayes

--------

Begin forwarded message:

From: Henrik Nordstrom <henrik@henriknordstrom.net>
Date: July 31, 2009 1:38:18 PM CDT
To: Pat Hayes <phayes@ihmc.us>
Subject: Re: Review of new HTTPbis text for 303 See Other

I am not even going to answer you this time. Go back, read the HTTP specifications, and come back when you have something concrete which actually relates to the specifications as such to talk about. If there is something you want to change then make concrete suggestions on how (and make sure to base it on current drafts).

As already said HTTP does not care and has no intention to ever care what kind of "resource" a URI maps to, semantics of that or what it denotes. All HTTP specifies is an interface language for talking to the server publishing this over HTTP, anything else is irrelevant to HTTP.

HTTP has its definition of the term "resource", like it or not. Within the HTTP specification the word "resource" has the meaning as defined by HTTP. Any meaning defined elsewhere is irrelevant as far as HTTP is concerned.

But to address your concerns the term resource will quite likely barely be used at all in the revised HTTP specifications, or at least much less than it is today.

I have said what I have to say to you on the subject. Further responses talking about semantics, connections to real-world or even abstract things, or taking statements in the specifications outside the context of the specification where they are written will be silently ignored.

Regards
Henrik

Fri 2009-07-31 at 13:00 -0500, Pat Hayes wrote:

On Jul 20, 2009, at 8:37 PM, Henrik Nordstrom wrote:

Mon 2009-07-20 at 13:16 -0500, Pat Hayes wrote:

Apparently you have not understood my point, above.
There are cases where NO implementation of ANY KIND can POSSIBLY map a URI to the resource it identifies. So one cannot simply toss this issue over the wall to some other, unspecified, "implementer". It's nothing to do with implementation.

For the kinds of URIs that HTTP deals with it can, as far as HTTP is concerned with the definition of "resource" as used by http which for technical specification writing reasons is slightly narrower than the general URI definition of resource.

It is not 'slightly' narrower. The general definition of 'resource' has it meaning absolutely anything, real or imaginary, concrete or abstract, that can be referred to and distinguished from other things (the last five words inserted to make sense of the 'identify' language). This is not a 'slightly' wider sense than the one that you apparently have in mind. It is a spectacularly, transfinitely, almost cosmically wider sense. It is as wide as human language knows how wide to make a distinction.

I understand, but I am not talking about 'effects', but about semantics.

And HTTP is completely ignorant of any semantics that the URIs accessed via HTTP may have. What HTTP cares about is if there may be effects on the resource state by actions requested by HTTP. (i.e. DELETE is assumed to have certain effect when executed on the http resource)

My point is that you cannot completely ignore the rest of the world.

When writing a technical specification you can, as the relevant part of the world is then the parts that the specification intends to cover and only those parts.

But when your specification is about a language or a notation, and in part about what that notation means, and when in fact that notation is being used to mean things in a certain (wide) category, then such usage does fall within the scope of your specification, and you should deal with it, if only by stating explicitly that you are not going to consider it.
But to ignore it and pretend that it isn't there, by redefining an existing terminology so as to avoid interfacing with other specifications, is both intellectually dishonest and socially irresponsible. Sorry, strong language, but I really do feel strongly about this, having had to face up to this issue myself when writing specifications.

But you yourself said that I was thinking about the wrong kind of meaning, not the kind of meaning intended by the spec. Really, you cannot have it both ways. Please make up your mind which is your position, and stick to it.

HTTP places absolutely no meaning at all on the general term "resource" as used in English

Never mind the English meaning, which is now lost to history in these debates.

or even the "resource" as defined by URI specifications.

But does it not strike you as inappropriate to simply ignore the normative definitions used in defining technical terms which you yourself use? What is the point of writing specifications if other specification writers are free to redefine the terminology which my specification defines normatively?

The only kind of resource HTTP places any meaning on at all is the very much narrowed down "resource" as defined by the HTTP specifications, and even then it's just as an abstract concept to simplify the world description somewhat. To HTTP it does not matter at all what those resources are, only if they can be accessed and/or transmitted via HTTP

I understand all this. But there are cases where the resource identified by the HTTP URI is, in fact, not one of these. That is - regardless of its true metaphysical nature, which I agree we will not delve into - whatever it really is - it is not something that can be accessed and/or transmitted. Such cases are REAL, they are out there in the actual world. If your spec refuses to acknowledge this, then it is simply an incomplete specification; and as such, it is less useful than it can and should be.
or not as defined by whoever "owns" the resource and who also defines their intended URI semantics (again completely outside of HTTP specifications).

I know it does not wish to, but http-range-14 has left it no choice but to care about it, at least a little.

Has it? Care to explain that again then, using the term meanings as defined by HTTP.

http-range-14 specifies an HTTP-defined action (the use of a 303 redirect) be used under circumstances which arise when the URI in question identifies a thing which is not a resource according to the narrow sense of 'resource' which you are arguing HTTP should restrict itself to.

The semantics of URIs has nothing at all to do with layering. It is part of the specification **of URIs themselves**. When anyone talks about the relationship between a URI and the resource it identifies, or denotes, or refers to, or is used to request, or indeed pretty much any relationship between a URI and a resource, they are talking about semantics.

Ok. My point here is that HTTP does not care about those semantics.

And my point is that it must, at least to the minimal extent required to state a normatively required action under circumstances which can only be described by referring to those semantics. (And also - though this is more controversial - I would argue that in fact, HTTP is already concerned with the semantics of URIs, even though it refuses to acknowledge this elementary fact.)

All it possibly cares about is that the server is ultimately responsible for executing that semantic mapping

This is a conceptual mistake. Semantic mappings are not executable.

of URI to resource (in URI terms), and that this mapping results in HTTP network accessible resources (which you seem to sometimes call a representation where HTTP calls it a resource

I hope not. I try to keep the resource/representation distinction clear.
There are however two or more notions of what counts as a 'representation': when in doubt, I use the now-standard circumlocution awww:representation to refer to the narrow sense used in REST and (I assume) HTTP.

) and their possible representations as defined by HTTP.

Because the HTTP specs also talk about this. And it is generally a good idea, when two specs talk about the same thing using the same language, that some effort is expended to make sure they are intending to use this language in the same way.

Unfortunately if a new term is to be defined for every slight variation there is of the term "resource" in this I am afraid it would be even more confusing.

As I have tried to emphasize, this is not a 'slight variation', and in any case I doubt if there are going to be any more changes once we have established that a resource can be absolutely anything.

There are very good reasons why "resource" in the URI specifications is broader than "resource" in HTTP specifications and both being narrower than the general English "resource".

No, the English meaning is actually narrower than the URI specifications sense, which is highly idiosyncratic and of fairly recent coinage (see the Wikipedia entry for 'resource' for a good history.)

I understand, but it refers to resources. If for example the spec says (as I believe it does, currently) that if the server has available a transmittable representation of the requested resource, then it must return that with a 200 code, this statement makes no reference to the URI that was used to identify the resource.

The URI reference is implicit as the whole text is in the context of building a response to a request for a specific URI. Trying to read the text outside that context is nonsense.

Please read what I wrote more carefully. To say that the server has available a transmittable representation of the requested resource, without referring to the URI that was used to request the resource, is not nonsensical in any way at all.
It implies, as I read it, that this condition holds independently of the URI, so that if the same resource is requested by different URIs then this condition either holds for both of them or for neither of them. So it rules out the possible case where the condition holds for one URI request but not for the other URI request, with a different URI but the same resource.

....

No, it is quite on the point. If the server can respond differently to different URIs which both identify the same resource, that changes the game.

If the defined semantics of the URIs says the server should respond differently then they in the world as defined by HTTP refer to different resources, but possibly very closely related such. It all boils down to the definition of what a resource is, and the HTTP resource is as I already explained NOT as general as the URI resource.

No, the situation is far worse than this. According to your previous paragraph, we can have a situation where two URIs identify the same resource according to the URI spec, but must be understood by HTTP as corresponding to different resources. Just narrowing the sense of 'resource' will not get you this horrible situation. This, if indeed you are right (nobody else has suggested this idea, so I hope you are wrong) makes the HTTP and URI specifications sharply **incompatible** with one another.

In the terminology defined by HTTP the difference between an (HTTP-)URI and resource is more of a special case, and not related to any of what you talk about.

It is related. In fact it is critical.

To me when talking about HTTP it's not.

Ah. That certainly makes sense, and indeed is what I understood when I first became involved in these URI-meaning debates. But this position is not consistent with what is said about resources in other standards. And moreover, if this is true, then the http-range-14 decision is simply untenable. For in that case, the 'requested resource' is something that cannot possibly be inside a server.
Julius Caesar, let us say, might be the requested resource.

And is what we have been saying all along. Trying to use Julius Caesar as an example when talking about HTTP resources just does not make any sense as the two by definition can not be the same thing.

And yet, there are HTTP URIs which identify Julius Caesar, in the sense of "identify" used in the URI specs. And, moreover, http-range-14 actually places some conditions on what HTTP must do with such a URI, **because** it identifies a resource of that 'off-Web' kind. So the behavior of HTTP depends, in part, and can only be accurately specified by mentioning, the situation where a URI identifies a "non-HTTP" resource. And this DOES make sense. In fact, it is actually TRUE.

Yes it's a simplification, but defining or assuming anything about resources anywhere beyond that is outside of HTTP scope and nothing HTTP cares about and is left to the application of HTTP and/or URIs.

No, sorry, that position is simply untenable. See my earlier replies to Richard on this point. HTTP cannot hide inside a 'layer' and pretend it is only dealing with computational identifiers which 'map' to computational artifacts. Both the uses and the specifications of http URIs have extended its scope beyond that narrow purview.

And I disagree. The semantics of the application of HTTP is and should be much broader than the semantics as used by the HTTP wire protocol.

The operation of HTTP, according to http-range-14, is ALREADY concerned with how URIs denote real-world entities beyond the operation of http.

And my viewpoint is that that's completely outside of what the HTTP specifications or operations is concerned about. In fact it intentionally does not care about any such concerns and leaves that to the application of HTTP to any such entities.

And, to repeat, that view is untenable, precisely because semantics is not about computation.
Your notions of layering simply do not apply when you are purporting to make decisions based upon meanings: which you are, whether you like it or not. HTTP-range-14 has made this choice for you. Don't argue with me, if you want to keep your nice tidy 'layering': go back and argue with whoever made the http-range-14 ruling.

Anyone is free to define HTTP applications for such entities, by defining HTTP resources mapping to such entities as they please. HTTP only defines how one may interface with those once defined in terms of HTTP resources. What relations those HTTP resources have to any real-world entities is defined by that application, not by HTTP.

(Not, by the way, with how *resources* map to real-world resources. In the cases in question, the relationship between the URI and the real-world entity is direct, not mediated through some other resource inside a server.)

And in my world that's an impossible condition, as those real-world resources do not exist in HTTP terms

They do exist, you are just refusing to look at them.

and need to be mediated via some server defined HTTP resource to be accessible via HTTP, or requests for that HTTP-URI would simply result in a 404 until such an HTTP resource is implemented for mapping to the real-world resource.

But the phrase "that can be used to interact with a resource" ALREADY limits what a resource can be. You cannot interact with the number 27 or with Julius Caesar.

Please note that this part is just explanatory text trying to explain the relationship between HTTP and URI specifications, not a normative definition. The definition of "resource" in the HTTP specifications is found in the terminology section.

resource
A network data object or service

That is not the definition of resource used in RFC3986, however.

What I said, and why I highlighted it here. The definitions are different, and you need to use the right definition for each specification or you'll get confused when discussing borderline issues like this.
For most practical considerations in the use of HTTP the difference is negligible however.

Not any more. That's why I'm making such a fuss about it. And BTW, these are not 'borderline' issues. HTTP URIs can identify resources in the broader RFC3986 sense; and for those URIs, there may well not be any resource in this narrow sense identified by the URI at all. And yet, still, a GET on them might resolve to an http endpoint. What does the http spec say about such a case? What is the endpoint to do?

Yes it's correct that HTTP URIs can identify resources in the broader sense, but that is not something the HTTP specifications as such concern themselves with. HTTP specifications end at the http endpoint and its http mapped resource.

Hmm, so in these cases, the HTTP URI identifies **two** different resources? The URI one and the HTTP one? Is that what you are saying? I doubt if many people on the TAG would like this.

And my point was only that in this case, it is at best confusing and maybe actually wrong to say that IF the server has a transmittable representation available then it must send it with a 200 code.

And we don't. We say "suitable to be transmitted", which is quite different from "transmittable" as there are representations that MAY be transmittable in theory but which are still deemed unsuitable (by the http server endpoint or its policy)

OK, I wasn't meaning to confuse this issue, just using 'transmittable' as a shorthand. Sorry.

For what are we to say about the second case? It all depends on what is meant by the "requested resource".

The difference between a "resource" (as identified by a specific URI) and an HTTP "requested resource" is not what you think. The two differ when there are multiple independent representations available by the exact same URI, such as content in different language based on the language preferences of the client etc.

But they also differ, presumably, when the identified resource is Julius Caesar. Or do they? I really have no way to know.
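
The case Henrik describes, one URI with several language variants chosen from the client's stated preferences, can be sketched roughly as follows. This is an illustrative Python sketch only and is not taken from any HTTP draft: the variant table, the function names, and the simplified treatment of the Accept-Language header (no wildcard handling) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language value, best first.

    Simplified: tags with q=0 are dropped, ties keep header order.
    """
    prefs: List[Tuple[float, str]] = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            prefs.append((q, tag))
    prefs.sort(key=lambda p: -p[0])  # stable sort, highest q first
    return [tag for _, tag in prefs]


# Hypothetical variants available at one and the same URI.
VARIANTS: Dict[str, str] = {"en": "Hello", "sv": "Hej"}


def select_variant(header: str, variants: Dict[str, str],
                   default: str = "en") -> Tuple[str, str]:
    """Pick the variant matching the client's preferences, else the default."""
    for tag in parse_accept_language(header):
        if tag in variants:
            return tag, variants[tag]
        primary = tag.split("-")[0]  # "sv-se" falls back to "sv"
        if primary in variants:
            return primary, variants[primary]
    return default, variants[default]
```

In this sketch the URI stays fixed while the selected variant changes with the request headers, which is the narrow situation Henrik has in mind; it says nothing about what the URI identifies in the RFC 3986 sense.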
(It seems to me that HTTP rather shoots itself in the foot by this insistence that its specs must not refer to or even acknowledge the existence of resources that are other than network data or services, since it has defined out of existence the very case that it should be able to refer to, if only to explicitly say that it's not going to specify what happens in it. This is rather an ostrich way of writing specs, to pretend that all of the world that you don't like doesn't exist, so that you aren't obliged to say anything about it.)

I don't agree here. HTTP specifications place a technical limit on what the word "resource" means within the HTTP specifications, which is purely a technical definition.

And says nothing about the cases when HTTP URIs are used to refer to other kinds of resource. Which is an ostrich way of writing specifications.

My response is that it's the server's role to select a suitable representation of the resource based on the meaning of the URI.

Does that mean, of the resource that the URI identifies? And does "identify" mean, denote?

Sorry if I am unclear sometimes. English is not at all my native language, and the word "denote" is not really part of my limited English vocabulary.

Sorry. 'denotes' AKA 'refers to', 'identifies', 'is a name for', 'is used as a name for'. I will try to remember to say 'refers to' or 'identifies'.

From my understanding of "denote" it's: Of the HTTP resource the HTTP-URI identifies. Where "identifies" is in the sense of how a Uniform Resource Identifier identifies a network-accessible resource, ignoring completely what that resource denotes in the broader sense.

But you cannot ignore this completely when the URI does *in fact* identify something other than a network-accessible resource.

??!!? Of course two different URIs can refer to the same resource. If HTTP is built on a different supposition, then HTTP is simply wrong.

Sure they can. The points here are:
* that HTTP does not care if they do

OK, but...
* and that HTTP has the view that if the semantics of those URIs is different then they do in fact NOT refer to the same resource

That simply does not make sense. What you say here (seem to say here) is logical nonsense. Look, if two names refer to the same thing (call it a resource if you like) then there is only one thing that they both refer to. So to say that 'as far as X is concerned' they refer to different things, is simply meaningless. There aren't two things there to be referred to, in this case. So, sorry: they DO IN FACT refer to the same resource. If HTTP thinks otherwise, then HTTP is simply WRONG. There is no finer-grained identity than identity itself. If you think I am technically mistaken on this topic, please refer me to some published work which makes semantic sense of the view of identity that you are basing this claim upon. (And as I have had this discussion many times before, if you are going to cite LISP at me: identity in LISP is EQ, not EQUAL.)

They may refer to different facets of some larger/broader resource but not the same.

I have no idea what you mean by a facet of a resource. What 'facets' do Richard or J.C. have?

If those URIs happen to really refer to the same resource both URIs will respond identically, and further are indistinguishable from two identical copies of the same resource.

?? I am trying to make sense of this, and not sure I have it right.

Take the case in my email to Richard, where there is a URI denoting him, Richard C., the actual person. (Note, this is not a topic that HTTP gets to rule out or refuse to acknowledge, because this can in fact happen. My question is about what HTTP should do in such a case.)

HTTP handles the case by restricting its notion of resource to the network-accessible resource used for interfacing with Richard C.

First, there is no such resource: Richard C. isn't the kind of thing that you can 'interface' with over a network.
(Well, maybe by email, but then we would be talking about his emailbox.) Second, it's not important what HTTP 'restricts' itself to: the fact remains that (in the case described) the URI does **in fact** identify Richard, not some network-accessible thingie that stands in some relationship to him. (That thingie might have its own URI, of course, which does identify it.) So if what you say here is correct, I presume that HTTP simply treats the URI as not having a corresponding http:resource at all. Right? Because it is a basic assumption of the whole Web architecture that the resource identified by a URI is unique. So if the URI identifies Richard, it can't also identify the thingie.

That resource MAY or MAY NOT have an actual interface with Richard C, HTTP does not care and need not care for its operations.

In this case, according to Richard, he is the 'requested resource'. The GET request is directed to a server which has some other resource inside it, call this resource R. R is a resource in your narrower sense (a network data object or service), but this is *not* the requested resource in this case, even though the URI resolves to (the server containing) R.

In terms of HTTP R is the requested resource.

I thought you might say that. So what then is the relationship between a requested resource and the resource identified by a URI? Apparently they can be different, so we have at least two resources somehow connected with a URI. Are there any more?

(Do you agree?) In this case, http-range-14 requires that the server emit a 303 coded response, because even though there may well be a transmittable (awww-) representation of R, there is none of Richard C., and he is the requested resource.

That's up to R (or whoever/whatever defines R) to decide.

No, it is not. It is simply a fact that there is no transmittable awww:representation of Richard. He isn't the kind of thing that has such representations.
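
As a rough illustration of the behaviour Pat attributes to http-range-14 in this example (not code from either correspondent or from any specification), the following Python sketch models a server that answers 200 for a URI naming a document it can transmit, and 303 with a pointer to a describing document for a URI naming something it cannot transmit, such as a person. The paths, the two lookup tables, and the `Response` type are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Response:
    status: int
    location: Optional[str] = None  # set only for the 303 case
    body: Optional[str] = None      # set only for the 200 case


# Hypothetical server state: paths naming transmittable documents ...
DOCUMENTS: Dict[str, str] = {
    "/doc/richard": "A page describing Richard C.",
}

# ... and paths naming things with no transmittable representation,
# each mapped to a document that describes the thing.
THINGS: Dict[str, str] = {
    "/id/richard": "/doc/richard",
}


def respond_to_get(path: str) -> Response:
    """Sketch of the 200 / 303 / 404 split discussed in the thread."""
    if path in DOCUMENTS:
        # The URI names the document itself: send a representation.
        return Response(200, body=DOCUMENTS[path])
    if path in THINGS:
        # The URI names the person, not a document: redirect to a
        # description instead of claiming to represent the person.
        return Response(303, location=THINGS[path])
    return Response(404)
```

Note that the decision in `respond_to_get` depends on which table the path is in, that is, on what the URI is taken to identify. That dependency is the point in dispute: Pat argues HTTP cannot state this rule without referring to what the URI identifies, while Henrik treats the choice of table as an application matter outside HTTP.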
But in any case, it appears that, on your account, the whole action of HTTP need have **absolutely nothing** to do with the resource that the URI identifies (in this case, Richard.) So tell me: here I am with a URI, and in order to find out more about what it identifies, I use it in an HTTP GET, and something happens. What, if anything, can I conclude about the resource that my URI identifies? AFAIK, the only possible answer is, on your account: nothing at all. It's all going to be mediated by the resource that the URI requests, and that need have nothing to do with what it identifies. Nor need the response codes have any connection with the resource identified by the URI: indeed, if the requested (not identified) resource has a 200-level-suitable awww:representation, then that is what the server must send me back, even though neither it nor its source (that is, in the above example, neither the awww:representation of R nor R itself) need have anything whatever to do with the identified resource (Richard). Right?

I agree this picture has a certain elegance and simplicity, but it makes complete nonsense of almost everything that has been said and written about URIs and resources for the past decade. It means that the picture of Web architecture promoted by the TAG is sharply and fatally different from that supported by HTTP.

Anyone else like to comment on this?

Pat

------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 mobile
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Friday, 31 July 2009 20:15:40 UTC