Re: Review of new HTTPbis text for 303 See Other from Pat Hayes on 2009-07-31 (www-tag@w3.org from July 2009)

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 31 Jul 2009 13:00:05 -0500
To: Henrik Nordstrom <henrik@henriknordstrom.net>
Cc: "www-tag@w3.org WG" <www-tag@w3.org>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <CB363854-DAE1-46A6-B9E5-CABE9CB20B6A@ihmc.us>
On Jul 20, 2009, at 8:37 PM, Henrik Nordstrom wrote:

> Mon 2009-07-20 at 13:16 -0500, Pat Hayes wrote:
>
>> Apparently you have not understood my point, above. There are cases
>> where NO implementation of ANY KIND can POSSIBLY map a URI to the
>> resource it identifies. So one cannot simply toss this issue over the
>> wall to some other, unspecified, "implementer". It's nothing to do with
>> implementation.
>
> For the kinds of URIs that HTTP deals with it can, as far as HTTP is
> concerned, given the definition of "resource" used by HTTP, which for
> technical specification writing reasons is slightly narrower than the
> general URI definition of resource.

It is not 'slightly' narrower. The general definition of 'resource'  
has it meaning absolutely anything, real or imaginary, concrete or  
abstract, that can be referred to and distinguished from other things  
(the last five words inserted to make sense of the 'identify'  
language). This is not a 'slightly' wider sense than the one that you  
apparently have in mind. It is a spectacularly, transfinitely, almost  
cosmically wider sense. It is as wide as human language knows how to
make a distinction.
>
>> I understand, but I am not talking about 'effects', but about semantics.
>
> And HTTP is completely ignorant of any semantics that the URIs accessed
> via HTTP may have.
>
> What HTTP cares about is if there may be effects on the resource state
> by actions requested by HTTP. (i.e. DELETE is assumed to have certain
> effect when executed on the http resource)
>
>> My point is that you cannot completely ignore the rest of the world.
>
> When writing a technical specification you can, as the relevant part of
> the world is then the parts that the specification intends to cover and
> only those parts.

But when your specification is about a language or a notation, and in
part about what that notation means, and when in fact that notation is  
being used to mean things in a certain (wide) category, then such  
usage does fall within the scope of your specification, and you should  
deal with it, if only by stating explicitly that you are not going to  
consider it.  But to ignore it and pretend that it isn't there, by
redefining an existing terminology so as to avoid interfacing with other
specifications, is both intellectually dishonest and socially  
irresponsible. Sorry, strong language, but I really do feel strongly  
about this, having had to face up to this issue myself when writing  
specifications.

>
>> But you yourself said that I was thinking about the wrong kind of
>> meaning, not the kind of meaning intended by the spec. Really, you
>> cannot have it both ways. Please make up your mind which is your
>> position, and stick to it.
>
> HTTP places absolutely no meaning at all on the general term "resource"
> as used in English

Never mind the English meaning, which is now lost to history in these  
debates.

> or even the "resource" as defined by URI
> specifications.

But does it not strike you as inappropriate to simply ignore the  
normative definitions used in defining technical terms which you  
yourself use? What is the point of writing specifications if other  
specification writers are free to redefine the terminology which my  
specification defines normatively?

>
> The only kind of resource HTTP places any meaning on at all is the very
> much narrowed down "resource" as defined by the HTTP specifications, and
> even then it's just as an abstract concept to simplify the world
> description somewhat. To HTTP it does not matter at all what those
> resources are, only if they can be accessed and/or transmitted via HTTP

I understand all this. But there are cases where the resource  
identified by the HTTP URI is, in fact, not one of these. That is -  
regardless of its true metaphysical nature, which I agree we will not  
delve into - whatever it really is - it is not something that can be  
accessed and/or transmitted. Such cases are REAL, they are out there  
in the actual world. If your spec refuses to acknowledge this, then it  
is simply an incomplete specification; and as such, it is less useful
than it can and should be.

> or not as defined by whoever "owns" the resource and who also defines
> their intended URI semantics (again completely outside of HTTP
> specifications).
>
>> I know it does not wish to, but http-range-14 has left it no choice
>> but to care about it, at least a little.
>
> Has it? Care to explain that again then, using the term meanings as
> defined by HTTP.

http-range-14 specifies that an HTTP-defined action (the use of a 303
redirect) be used under circumstances which arise when the URI in  
question identifies a thing which is not a resource according to the  
narrow sense of 'resource' which you are arguing HTTP should restrict  
itself to.
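
As a rough sketch of the rule as it is described here: a server that knows what each URI identifies answers 200 only when the identified thing is a network-retrievable document, and answers 303 when it is not. Everything in this example is hypothetical (the URIs, the lookup table, the field names), and it illustrates the reading given in this message, not the text of the http-range-14 resolution or of any HTTP specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Resource:
    name: str
    # True for things that can be conveyed in a message (a page, an image);
    # False for things such as a person or a number.
    is_network_object: bool
    # Hypothetical: URI of a document that describes a non-network thing.
    description_uri: Optional[str] = None


# Hypothetical table: what each URI identifies, in the URI-spec sense.
IDENTIFIES = {
    "http://example.org/doc/caesar": Resource("a page about Julius Caesar", True),
    "http://example.org/id/caesar": Resource(
        "Julius Caesar", False, "http://example.org/doc/caesar"
    ),
}


def respond(uri: str) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) for a GET on uri."""
    resource = IDENTIFIES.get(uri)
    if resource is None:
        return 404, None
    if resource.is_network_object:
        return 200, None
    # The identified thing cannot itself be transmitted, so point the
    # client at a description of it instead of answering 200.
    if resource.description_uri is not None:
        return 303, resource.description_uri
    return 404, None
```

Note that the branch taken depends on what the URI identifies, which is exactly the information the narrow HTTP definition of "resource" leaves out.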
>
>> The semantics of URIs has nothing at all to do with layering. It is
>> part of the specification **of URIs themselves**. When anyone talks
>> about the relationship between a URI and the resource it identifies,
>> or denotes, or refers to, or is used to request, or indeed pretty much
>> any relationship between a URI and a resource, they are talking about
>> semantics.
>
> Ok. My point here is that HTTP does not care about those semantics.

And my point is that it must, at least to the minimal extent required  
to state a normatively required action under circumstances which can  
only be described by referring to those semantics. (And also - though  
this is more controversial - I would argue that in fact, HTTP is  
already concerned with the semantics of URIs, even though it refuses  
to acknowledge this elementary fact.)

> All
> it possibly cares about is that the server is ultimately responsible
> for executing that semantic mapping

This is a conceptual mistake. Semantic mappings are not executable.

> of URI to resource (in URI terms),
> and that this mapping results in HTTP network accessible resources
> (which you seem to sometimes call a representation where HTTP calls it a
> resource

I hope not. I try to keep the resource/representation distinction  
clear. There are however two or more notions of what counts as a  
'representation': when in doubt, I use the now-standard circumlocution  
awww:representation to refer to the narrow sense used in REST and (I  
assume) HTTP.

> ) and their possible representations as defined by HTTP.
>
>> Because the HTTP specs also talk about this. And it is generally a
>> good idea, when two specs talk about the same thing using the same
>> language, that some effort is expended to make sure they are intending
>> to use this language in the same way.
>
> Unfortunately if a new term is to be defined for every slight variation
> there is of the term "resource" in this I am afraid it would be even
> more confusing.

As I have tried to emphasize, this is not a 'slight variation', and in  
any case I doubt if there are going to be any more changes once we  
have established that a resource can be absolutely anything.

>
> There are very good reasons why "resource" in the URI specifications is
> broader than "resource" in HTTP specifications, both being narrower
> than the general English "resource".

No, the English meaning is actually narrower than the URI  
specifications sense, which is highly idiosyncratic and of fairly  
recent coinage (see the Wikipedia entry for 'resource' for a good
history.)

>
>> I understand, but it refers to resources. If for example the spec says
>> (as I believe it does, currently) that if the server has available a
>> transmittable representation of the requested resource, then it must
>> return that with a 200 code, this statement makes no reference to the
>> URI that was used to identify the resource.
>
> The URI reference is implicit as the whole text is in the context of
> building a response to a request for a specific URI. Trying to read the
> text outside that context is nonsense.

Please read what I wrote more carefully. To say that the server has
available a transmittable representation of the requested resource,
without referring to the URI that was used to request the resource,
is not nonsensical in any way at all. It implies, as I read it, that  
this condition holds independently of the URI, so that if the same  
resource is requested by different URIs then this condition either  
holds for both of them or for neither of them. So it rules out the  
possible case where the condition holds for one URI request but not  
for the other URI request, with a different URI but the same resource.
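
The reading described here can be put as a small sketch: if the 200 condition is stated purely in terms of the requested resource, then it is a function of the resource and cannot come out differently for two URIs that identify the same resource. The URIs, the resource label and both tables below are invented for illustration only.

```python
# Hypothetical data: two URIs that identify one and the same resource.
IDENTIFIES = {
    "http://example.org/report": "annual-report",
    "http://example.org/latest": "annual-report",
}

# Resources for which the server holds a representation suitable for sending.
HAS_REPRESENTATION = {"annual-report"}


def condition_holds(uri: str) -> bool:
    """The 200 condition, read as a property of the resource, not of the URI."""
    resource = IDENTIFIES[uri]  # the URI is used only to find the resource
    return resource in HAS_REPRESENTATION


def same_resource(uri_a: str, uri_b: str) -> bool:
    """True when both URIs identify the same resource."""
    return IDENTIFIES[uri_a] == IDENTIFIES[uri_b]
```

Under this reading, whenever `same_resource(a, b)` is true, `condition_holds(a)` and `condition_holds(b)` necessarily agree; a server that answers differently for `a` and `b` must be consulting something other than the resource.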

>

....

>
>> No, it is quite on the point. If the server can respond differently to
>> different URIs which both identify the same resource, that changes the
>> game.
>
> If the defined semantics of the URIs says the server should respond
> differently, then in the world as defined by HTTP they refer to different
> resources, though possibly very closely related ones.
>
> It all boils down to the definition of what a resource is, and the HTTP
> resource is as I already explained NOT as general as the URI resource.

No, the situation is far worse than this. According to your previous  
paragraph, we can have a situation where two URIs identify the same  
resource according to the URI spec, but must be understood by HTTP as  
corresponding to different resources. Just narrowing the sense of  
'resource' will not get you this horrible situation. This, if indeed  
you are right (nobody else has suggested this idea, so I hope you are  
wrong), makes the HTTP and URI specifications sharply **incompatible**
with one another.

>
>>> In the terminology defined by HTTP the difference between an (HTTP-)URI
>>> and resource is more of a special case, and not related to any of what
>>> you talk about.
>>
>> It is related. In fact it is critical.
>
> To me when talking about HTTP it's not.
>
>> Ah. That certainly makes sense, and indeed is what I understood when I
>> first became involved in these URI-meaning debates. But this position
>> is not consistent with what is said about resources in other
>> standards.  And moreover, if this is true, then the http-range-14
>> decision is simply untenable. For in that case, the 'requested
>> resource' is something that cannot possibly be inside a server. Julius
>> Caesar, let us say, might be the requested resource.
>
> And that is what we have been saying all along. Trying to use Julius Caesar
> as an example when talking about HTTP resources just does not make any
> sense as the two by definition can not be the same thing.

And yet, there are HTTP URIs which identify Julius Caesar, in the  
sense of "identify" used in the URI specs. And, moreover,
http-range-14 actually places some conditions on what HTTP must do with
such a URI, **because** it identifies a resource of that 'off-Web'
kind. So the behavior of HTTP depends in part on, and can only be
accurately specified by mentioning, the situation where a URI
identifies a "non-HTTP" resource. And this DOES make sense. In fact,
it is actually TRUE.
>
>>> Yes it's a simplification, but defining or assuming anything about
>>> resources anywhere beyond that is outside of HTTP scope and nothing HTTP
>>> cares about and is left to the application of HTTP and/or URIs.
>>
>> No, sorry, that position is simply untenable. See my earlier replies
>> to Richard on this point. HTTP cannot hide inside a 'layer' and
>> pretend it is only dealing with computational identifiers which 'map'
>> to computational artifacts. Both the uses and the specifications of
>> http URIs have extended its scope beyond that narrow purview.
>
> And I disagree. The semantics of the application of HTTP is and should
> be much broader than the semantics as used by the HTTP wire protocol.
>
>> The operation of HTTP, according to http-range-14, is ALREADY
>> concerned with how URIs denote real-world entities beyond the
>> operation of http.
>
> And my viewpoint is that that's completely outside of what the HTTP
> specifications or operations are concerned about. In fact it
> intentionally does not care about any such concerns and leaves that to
> the application of HTTP to any such entities.

And, to repeat, that view is untenable, precisely because semantics is  
not about computation. Your notions of layering simply do not apply  
when you are purporting to make decisions based upon meanings: which  
you are, whether you like it or not. HTTP-range-14 has made this  
choice for you. Don't argue with me, if you want to keep your nice  
tidy 'layering': go back and argue with whoever made the http-range-14  
ruling.


> Anyone is free to define
> HTTP applications for such entities, by defining HTTP resources mapping
> to such entities as they please. HTTP only defines how one may interface
> with those once defined in terms of HTTP resources. What relations those
> HTTP resources have to any real-world entities is defined by that
> application, not by HTTP.
>
>> (Not, by the way, with how *resources* map to real-world resources.
>> In the cases in question, the relationship between the URI and the
>> real-world entity is direct, not mediated through some other resource
>> inside a server.)
>
> And in my world that's an impossible condition, as those real-world
> resources do not exist in HTTP terms

They do exist, you are just refusing to look at them.

> and need to be mediated via some
> server defined HTTP resource to be accessible via HTTP, or requests for
> that HTTP-URI would simply result in a 404 until such an HTTP resource is
> implemented for mapping to the real-world resource.
>
>
>> But the phrase "that can be used to interact with a resource" ALREADY
>> limits what a resource can be. You cannot interact with the number 27
>> or with Julius Caesar.
>
> Please note that this part is just explanatory text trying to explain
> the relationship between HTTP and URI specifications, not a normative
> definition.
>
> The definition of "resource" in the HTTP specifications is found in the
> terminology section.
>
>
>>>       resource
>>>
>>>               A network data object or service
>>
>> That is not the definition of resource used in RFC3986, however.
>
> What I said, and why I highlighted it here. The definitions are
> different, and you need to use the right definition for each
> specification or you'll get confused when discussing borderline issues
> like this.
>
> For most practical considerations in the use of HTTP the difference is
> negligible however.

Not any more. That's why I'm making such a fuss about it. And BTW,
these are not 'borderline' issues.

>
>> HTTP
>> URIs can identify resources in the broader RFC3986 sense; and for
>> those URIs, there may well not be any resource in this narrow sense
>> identified by the URI at all. And yet, still, a GET on them might
>> resolve to an http endpoint. What does the http spec say about such a
>> case? What is the endpoint to do?
>
> Yes it's correct that HTTP URIs can identify resources in the broader
> sense, but that is not something the HTTP specifications as such concern
> themselves with. HTTP specifications end at the http endpoint and its
> http mapped resource.

Hmm, so in these cases, the HTTP URI identifies **two** different  
resources? The URI one and the HTTP one? Is that what you are saying?  
I doubt if many people on the TAG would like this.

>
>> And my point was only
>> that in this case, it is at best confusing and maybe actually wrong to
>> say that IF the server has a transmittable representation available
>> then it must send it with a 200 code.
>
> And we don't. We say "suitable to be transmitted", which is quite
> different from "transmittable" as there are representations that MAY be
> transmittable in theory but which are still deemed unsuitable (by the
> http server endpoint or its policy)

OK, I wasn't meaning to confuse this issue, just using 'transmittable'
as a shorthand. Sorry.

>
>> For what are we to say about the
>> second case? It all depends on what is meant by the "requested
>> resource".
>
> The difference between a "resource" (as identified by a specific URI)
> and an HTTP "requested resource" is not what you think. The two differ when
> there are multiple independent representations available by the exact
> same URI, such as content in different languages based on the language
> preferences of the client etc.

But they also differ, presumably, when the identified resource is  
Julius Caesar. Or do they? I really have no way to know.

>
>> (It seems to me that HTTP rather shoots itself in the foot by this
>> insistence that its specs must not refer to or even acknowledge the
>> existence of resources that are other than network data or services,
>> since it has defined out of existence the very case that it should be
>> able to refer to, if only to explicitly say that it's not going to
>> specify what happens in it. This is rather an ostrich way of writing
>> specs, to pretend that all of the world that you don't like doesn't
>> exist, so that you aren't obliged to say anything about it.)
>
> I don't agree here. HTTP specifications place a technical limit on what
> the word "resource" means within the HTTP specifications, which is
> purely a technical definition.

And says nothing about the cases when HTTP URIs are used to refer to  
other kinds of resource. Which is an ostrich way of writing  
specifications.

>
>
>>> My response is that
>>> it's the server's role to select a suitable representation of the
>>> resource based on the meaning of the URI.
>>
>> Does that mean, of the resource that the URI identifies? And does
>> "identify" mean, denote?
>
> Sorry if I am unclear sometimes. English is not at all my native
> language, and the word "denote" is not really part of my limited English
> vocabulary.

Sorry. 'denotes' AKA 'refers to', 'identifies', 'is a name for', 'is
used as a name for'. I will try to remember to say 'refers to' or
'identifies'.

>
>> From my understanding of "denote" it's:
>
> Of the HTTP resource the HTTP-URI identifies.
>
> Where "identifies" is in the sense of how a Universal Resource
> Identifier identifies a network-accessible resource, ignoring completely
> what that resource denotes in the broader sense.

But you cannot ignore this completely when the URI does *in fact*  
identify something other than a network-accessible resource.

>
>> ??!!? Of course two different URIs can refer to the same resource. If
>> HTTP is built on a different supposition, then HTTP is simply wrong.
>
> Sure they can. The points here are:
> * that HTTP does not care if they do

OK, but...

> * and that HTTP has the view that if the semantics of those URIs is
> different then they do in fact NOT refer to the same resource

That simply does not make sense. What you say here (seem to say here)  
is logical nonsense. Look, if two names refer to the same thing (call  
it a resource if you like) then there is only one thing that they both  
refer to. So to say that 'as far as X is concerned' they refer to  
different things, is simply meaningless. There aren't two things there  
to be referred to, in this case. So, sorry: they DO IN FACT refer to  
the same resource. If HTTP thinks otherwise, then HTTP is simply  
WRONG. There is no finer-grained identity than identity itself.

If you think I am technically mistaken on this topic, please refer me  
to some published work which makes semantic sense of the view of  
identity that you are basing this claim upon. (And as I have had this  
discussion many times before, if you are going to cite LISP at me:  
identity in LISP is EQ, not EQUAL.)
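
The EQ/EQUAL distinction has a close analogue in Python, sketched here with made-up variable names: `is` tests identity (roughly Lisp's EQ), while `==` tests structural equality (roughly EQUAL). Two names that refer to the same thing share every change; two merely equal things do not.

```python
caesar = ["Gaius", "Julius", "Caesar"]
dictator = caesar                           # a second name for the same object
lookalike = ["Gaius", "Julius", "Caesar"]   # a distinct object, equal contents

# Identity: both names refer to one thing.
names_corefer = caesar is dictator

# Equality without identity: two things that happen to look alike.
merely_equal = caesar == lookalike and caesar is not lookalike

# A change made through one name is visible through the other name,
# because there is only one thing there to change.
caesar.append("Imperator")
```

After the `append`, `dictator` also ends in `"Imperator"` while `lookalike` is unchanged, which is the sense in which co-referring names cannot be treated as picking out different things.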

> They may
> refer to different facets of some larger/broader resource but not the
> same.

I have no idea what you mean by a facet of a resource. What 'facets'  
does Richard or J.C. have?

>
> If those URIs happen to really refer to the same resource both URIs
> will respond identically, and further are indistinguishable from two
> identical copies of the same resource.
>
>> ?? I am trying to make sense of this, and not sure I have it right.
>> Take the case in my email to Richard, where there is a URI denoting
>> him, Richard C., the actual person. (Note, this is not a topic that
>> HTTP gets to rule out or refuse to acknowledge, because this can in
>> fact happen. My question is about what HTTP should do in such a case.)
>
> HTTP handles the case by restricting its notion of resource to the
> network-accessible resource used for interfacing with Richard C.

First, there is no such resource: Richard C. isn't the kind of thing  
that you can 'interface' with over a network. (Well, maybe by email,  
but then we would be talking about his emailbox.)  Second, it's not
important what HTTP 'restricts' itself to: the fact remains that (in  
the case described) the URI does **in fact** identify Richard, not  
some network-accessible thingie that stands in some relationship to  
him. (That thingie might have its own URI, of course, which does  
identify it.) So if what you say here is correct, I presume that HTTP  
simply treats the URI as not having a corresponding http:resource at  
all. Right? Because it is a basic assumption of the whole Web  
architecture that the resource identified by a URI is unique. So if  
the URI identifies Richard, it can't also identify the thingie.

> That
> resource MAY or MAY NOT have an actual interface with Richard C, HTTP
> does not care and need not care for its operations.
>
>> In this case, according to Richard, he is the 'requested resource'.
>> The GET request is directed to a server which has some other resource
>> inside it, call this resource R. R is a resource in your narrower
>> sense (a network data object or service), but this is *not* the
>> requested resource in this case, even though the URI resolves to (the
>> server containing) R.
>
> In terms of HTTP R is the requested resource.

I thought you might say that. So what then is the relationship between  
a requested resource and the resource identified by a URI? Apparently  
they can be different, so we have at least two resources somehow  
connected with a URI. Are there any more?

>
>> (Do you agree?) In this case, http-range-14
>> requires that the server emit a 303 coded response, because even
>> though there may well be a transmittable (awww-) representation of R,
>> there is none of Richard C., and he is the requested resource.
>
> That's up to R (or whoever/whatever defines R) to decide.

No, it is not. It is simply a fact that there is no transmittable  
awww:representation of Richard. He isn't the kind of thing that has  
such representations.

But in any case, it appears that, on your account, the whole action of  
HTTP need have **absolutely nothing** to do with the resource that the  
URI identifies (in this case, Richard.) So tell me: here I am with a  
URI, and in order to find out more about what it identifies, I use it  
in an HTTP GET, and something happens. What, if anything, can I  
conclude about the resource that my URI identifies? AFAIK, the only  
possible answer is, on your account: nothing at all. It's all going to
be mediated by the resource that the URI requests, and that need have  
nothing to do with what it identifies. Nor need the response codes  
have any connection with the resource identified by the URI: indeed,  
if the requested (not identified) resource has a 200-level-suitable  
awww:representation, then that is what the server must send me back,  
even though neither it nor its source (that is, in the above example,
neither the awww:representation of R nor R itself) need have anything
whatever to do with the identified resource (Richard). Right?

I agree this picture has a certain elegance and simplicity, but it  
makes complete nonsense of almost everything that has been said and  
written about URIs and resources for the past decade. It means that  
the picture of Web architecture promoted by the TAG is sharply and  
fatally different from that supported by HTTP.

Anyone else like to comment on this?

Pat

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 31 July 2009 18:01:13 UTC