Re: Review of new HTTPbis text for 303 See Other from Pat Hayes on 2009-07-20 (www-tag@w3.org from July 2009)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 20 Jul 2009 13:16:24 -0500
To: Henrik Nordstrom <henrik@henriknordstrom.net>
Cc: "www-tag@w3.org WG" <www-tag@w3.org>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <2DB43B78-030C-4F6E-BF55-0465CA6EDDEB@ihmc.us>

To cut to the chase, scroll down to [[**]]

Pat

On Jul 18, 2009, at 3:30 PM, Henrik & Anna Nordstrom wrote:

> On Sat, 2009-07-18 at 12:55 -0500, Pat Hayes wrote:
>
>> Unfortunately, this use of 'map' terminology only confuses things even
>> further. Many cases of the denotation relationship between URIs and
>> resources cannot be mapped by any server, if 'map' refers to any kind
>> of computable operation.
>
> And the point is?
>
> HTTP does not care how URIs map to resources (in any terminology, but
> see the end for what resource is in HTTP terminology), it's the role of
> the server implementers and webmasters/authors to define such
> relationships if any.

Apparently you have not understood my point, above. There are cases  
where NO implementation of ANY KIND can POSSIBLY map a URI to the  
resource it identifies. So one cannot simply toss this issue over the  
wall to some other, unspecified, "implementer". It's nothing to do with
implementation.

>
> For some URIs the mapping is trivial and often even 1-1 (but may be
> n-m), for some it's less obvious, and for some it may even be
> impossible. HTTP does not care. What HTTP cares about is to provide
> means for representing this at suitable level of detail as needed for
> proper operation of HTTP where the proposed 303 response to GET is one
> possible outcome.
>
> HTTP operates with URIs and the representations of resources servers
> return in response to requests to those URIs. End of story. Such
> accesses MAY render effects outside HTTP (such as items being shipped to
> you from a web shop, robots making some move, some electron bouncing
> around, a valve opening/closing somewhere etc) but those effects are
> outside of HTTP specifications.

I understand, but I am not talking about 'effects', but about semantics.

>
> In addition it places certain protocol level restrictions on how servers
> may behave when there are many possible representations accessible via
> the same URI, but that's a different topic.

I am also not talking about behaviors.

>
>> No, that is not what I said, and your misunderstanding of what I said
>> is typical of the communication problems we find in these discussions.
>> The relationship of denoting (referring to, naming, being a name of)
>> is, prima facie at least, completely unrelated to issues of resolving
>> on servers. Servers and networks and transport protocols simply have
>> nothing to do with naming,
>
> Good. So what is your argument exactly? At the level, view and context
> of HTTP, ignoring the rest of the world.

My point is that you cannot completely ignore the rest of the world.

>
>> The picture given by Richard, and to which I was reacting in the above
>> quote, was this. URIs denote things. The HTTP GET takes a URI and
>> resolves it to a server (not a resource) which responds with a
>> **representation of the denoted resource** attached to a 200 code, if
>> it has a suitable such representation. ("Suitable representation" here
>> has to be given further gloss, as it is a nonstandard usage.)
>
> In the context it simply means that the server is responsible to
> determine if there is a representation available suitable to be used
> when building the response and what that is if any. HTTP does not really
> care how the server decides that.

Well, OK, that is fair enough, and I can live with that, provided we  
are clear about what exactly the 'requested resource' is. See [[**]]  
below.

>
>> That
>> makes sense, given that we further specify that some resources just
>> don't have representations, so the server must issue a 303 code in
>> those cases.
>
> Yes, or to be exact when there is no suitable representation available
> to be used for the HTTP response to this request for a valid URI which
> does refer to some kind of resource as identified by the server.
>
>> Note that the represented, requested resource itself
>> plays no part in this picture: it is all about servers and
>> representations of resources.
>
> Fully agree.
>
>> Well, part of the passion in these exchanges arose from Richard C's
>> insistence that HTTP said *absolutely nothing* about meanings of URIs,
>
> And it doesn't. That's part of the application of URIs to something, not
> HTTP itself. For most the semantics of such application of URIs is even
> outside the URI specifications and solely left to the actual
> implementation/application of URIs.
>
>> so I am surprised to hear that the HTTP protocol intends any notion of
>> meaning at all.
>
> Not in my world.

But you yourself said that I was thinking about the wrong kind of
meaning, not the kind of meaning intended by the spec. Really, you  
cannot have it both ways. Please make up your mind which is your  
position, and stick to it.

>
>> But more to the point, it is now simply a fact that
>> URIs are used as denoting names on the Web.
>
> HTTP does not really care about this either.

I know it does not wish to, but http-range-14 has left it no choice  
but to care about it, at least a little.

>
>> Published W3C specs require them to play this role.
>
> At the level of HTTP, or at the level of user experience of URIs when
> accessed via some user agent?

The semantics of URIs has nothing at all to do with layering. It is  
part of the specification **of URIs themselves**. When anyone talks  
about the relationship between a URI and the resource it identifies,  
or denotes, or refers to, or is used to request, or indeed pretty much  
any relationship between a URI and a resource, they are talking about  
semantics.

>
> And what do those unnamed W3C specs have to do with the HTTP
> specifications?

Because the HTTP specs also talk about this. And it is generally a  
good idea, when two specs talk about the same thing using the same  
language, that some effort is expended to make sure they are intending  
to use this language in the same way.
>
>> OK, that is good. It would help if that were stated explicitly, since
>> the same server and the same resource can be located/identified by a
>> different URI also. The wording which I was objecting to refers only
>> to resources, so this point is not clear, as the same resource might
>> be identified by several URIs.
>
> The wording is in the context of how a server can build a response to a
> request for a specific URI, with no relevance to requests for other
> URIs.

I understand, but it refers to resources. If for example the spec says  
(as I believe it does, currently) that if the server has available a  
transmittable representation of the requested resource, then it must  
return that with a 200 code, this statement makes no reference to the  
URI that was used to identify the resource. The only way I can read it  
is as saying that this applies **independently of the URI**, and hence  
to any URI which identifies the same requested resource. So the  
'context' you mention is irrelevant.


>
> HTTP does not care much if the same resource has multiple URIs, with
> just one minor optional exception (Content-Location).
>
>> The suggested wording refers to the 'requested resource'.
>
> Yes, as defined by HTTP. Not in the general English meaning (whatever
> that is, I would not know what a "requested resource" is in general
> English, or even Swedish for that matter).

In English, it would be the topic of a request for something that  
could be used for some purpose, as in "Please pass me the salt."

>
>> You here are talking about the 'requested URI'.
>
> Yes, as a slight simplification I choose to ignore server driven
> negotiation in this discussion, but that's beside the point.

No, it is quite on the point. If the server can respond differently to  
different URIs which both identify the same resource, that changes the  
game.

>
>> These are not the same. Which is correct?
>
> In the terminology defined by HTTP the difference between an (HTTP-)URI
> and resource is more of a special case, and not related to any of what
> you talk about.

It is related. In fact it is critical.

>
> resource in HTTP terminology is NOT the general resource of anything you
> seem to refer to, but in specific the resource within the server which
> holds the information and/or processing required to build a suitable
> representation to be used within HTTP (or as close to that you can get).

Ah. That certainly makes sense, and indeed is what I understood when I  
first became involved in these URI-meaning debates. But this position  
is not consistent with what is said about resources in other  
standards.  And moreover, if this is true, then the http-range-14  
decision is simply untenable. For in that case, the 'requested  
resource' is something that cannot possibly be inside a server. Julius  
Caesar, let us say, might be the requested resource.

> Yes it's a simplification, but defining or assuming anything about
> resources anywhere beyond that is outside of HTTP scope and nothing HTTP
> cares about and is left to the application of HTTP and/or URIs.

No, sorry, that position is simply untenable. See my earlier replies
to Richard on this point. HTTP cannot hide inside a 'layer' and  
pretend it is only dealing with computational identifiers which 'map'  
to computational artifacts. Both the uses and the specifications of  
http URIs have extended its scope beyond that narrow purview.

> HTTP
> does not care or imply anything about how those resources map to any
> real-world or abstract resources beyond the operation of HTTP.

The operation of HTTP, according to http-range-14, is ALREADY  
concerned with how URIs denote real-world entities beyond the  
operation of http. (Not, by the way, with how *resources* map to
real-world resources. In the cases in question, the relationship between
the URI and the real-world entity is direct, not mediated through some  
other resource inside a server.)

>
> To quote the specifications:
>
> p1-messaging, 2.1 Uniform Resource Identifiers
>
>        HTTP does not limit what a resource may be; it merely defines an
>        interface that can be used to interact with a resource via HTTP.

But the phrase "that can be used to interact with a resource" ALREADY  
limits what a resource can be. You cannot interact with the number 27  
or with Julius Caesar.

>
>        [and reference to RFC3986 for further details about URI and
>        resource in general]
>
> p1-messaging, C. Terminology
>
>        resource
>
>                A network data object or service

That is not the definition of resource used in RFC3986, however. HTTP  
URIs can identify resources in the broader RFC3986 sense; and for  
those URIs, there may well not be any resource in this narrow sense  
identified by the URI at all. And yet, still, a GET on them might  
resolve to an http endpoint. What does the http spec say about such a  
case? What is the endpoint to do?

> that can be identified
>                by a URI, as defined in Section 2.1. Resources may be
>                available in multiple representations (e.g. multiple
>                languages, data formats, size, and resolutions) or vary
>                in other ways.
>
> Note: "resource" in 2.1 above refers to the more general RFC3986
> meaning, in the rest of the HTTP documents it generally refers to the
> HTTP definition of resource.
>
>>> which includes matching the intended
>>> meaning of that URI by whatever name scheme the server implements
>>
>> No, it cannot possibly do that, in many cases. No implementation of
>> anything is ever going to match anything to Julius Caesar, who has not
>> existed for around 2000 years.
>
> Not what I was talking about, and I don't see what your point with this
> is either.

My point was only that the task which you said above is included,  
cannot in general always be included, since this task may be  
impossible in some cases.

>
> This response was in response to your talk about a resource (in general
> terms) having multiple URIs with different meanings.

I don't know what 'meaning of a URI' means. I think of URIs as names  
which refer, and that's all the semantic meaning they have. The case in
question was one in which the same HTTP endpoint - server - might have  
to process one URI which denotes an http-resource, and also another  
URI which has no such associated resource, and hence for which it, the  
endpoint, is obliged to emit a 303 response. And my point was only  
that in this case, it is at best confusing and maybe actually wrong to
say that IF the server has a transmittable representation available  
then it must send it with a 200 code. For what are we to say about the  
second case? It all depends on what is meant by the "requested  
resource". If, in the second case, that can be the denoted (non-http:
but legally awww:) resource, then all is well. But if it simply means  
'an http:resource attached to the URI endpoint', then this requirement  
is inconsistent, in such a case, with http-range-14.

(It seems to me that HTTP rather shoots itself in the foot by this  
insistence that its specs must not refer to or even acknowledge the  
existence of resources that are other than network data or services,  
since it has defined out of existence the very case that it should be  
able to refer to, if only to explicitly say that it's not going to
specify what happens in it. This is rather an ostrich way of writing  
specs, to pretend that all of the world that you don't like doesn't  
exist, so that you aren't obliged to say anything about it.)

> My response is that
> it's the server's role to select a suitable representation of the
> resource based on the meaning of the URI.

Does that mean, of the resource that the URI identifies? And does  
"identify" mean, denote?

> A server ignoring such meaning
> as defined by the server would be in error with itself. In the terms of
> HTTP each of those URIs actually refer to a resource of its own as they
> have different URIs and meaning, even if those resources perhaps are
> very closely related.

??!!? Of course two different URIs can refer to the same resource. If  
HTTP is built on a different supposition, then HTTP is simply wrong.

>
>> Resources aren't sent in response, representations are. Did you mean
>> representation?
>
> Yes and no. The resource to be used when making an HTTP representation
> of the requested resource.

[[**]]
?? I am trying to make sense of this, and not sure I have it right.  
Take the case in my email to Richard, where there is a URI denoting  
him, Richard C., the actual person. (Note, this is not a topic that  
HTTP gets to rule out or refuse to acknowledge, because this can in  
fact happen. My question is about what HTTP should do in such a case.)  
In this case, according to Richard, he is the 'requested resource'.  
The GET request is directed to a server which has some other resource  
inside it, call this resource R. R is a resource in your narrower  
sense (a network data object or service), but this is *not* the  
requested resource in this case, even though the URI resolves to (the  
server containing) R. (Do you agree?) In this case, http-range-14  
requires that the server emit a 303 coded response, because even  
though there may well be a transmittable (awww-) representation of R,  
there is none of Richard C., and he is the requested resource.

From what you say here, I think you may have a different picture in
mind, where the "requested resource" in this case must be R itself.  
That indeed is the picture I had originally, when I entered this  
thread. But in that case, Roy's suggested wording is inconsistent with  
http-range-14, because in this case it would prevent the server  
issuing a 303, as required by that decision.

Now, just to clarify, it seems to me that this case could arise even  
when R itself does not exist at all: there is simply a server which is  
able to recognize, somehow, that the URI in question identifies  
something other than a network data object or service, and so it  
cannot return a transmittable awww:representation of it. In HTTP  
terms, no http:resource exists at that endpoint to construct a  
transmittable representation from. Do you agree that this is a  
possibility, and is consistent with a 303 response? Or would you say  
that in this case, the only acceptable response would be a 400-code?
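
To make the two cases concrete, here is a minimal sketch in Python of an
endpoint that behaves the way http-range-14 seems to require. It is only
an illustration under stated assumptions: the paths, the table, the
Entry type and the respond() function are all hypothetical, the choice
of 404 for an unknown URI is my assumption, and none of it is taken from
the HTTP spec text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Entry:
    # True when the URI denotes a network data object or service
    # (the narrow HTTP sense of 'resource'); False when it denotes
    # something else, such as a person.
    is_network_object: bool
    # A transmittable representation of the denoted thing, if any.
    representation: Optional[bytes] = None
    # URI of some other document describing the denoted thing.
    see_other: Optional[str] = None


# Hypothetical table: one URI denotes a document, the other a person.
TABLE: Dict[str, Entry] = {
    "/doc/richard": Entry(True, representation=b"A page about Richard C."),
    "/id/richard": Entry(False, see_other="/doc/richard"),
}


def respond(path: str, table: Dict[str, Entry] = TABLE) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET on path."""
    entry = table.get(path)
    if entry is None:
        # Assumption: an unknown URI gets a 404.
        return 404, {}, b""
    if entry.is_network_object and entry.representation is not None:
        # The requested resource itself has a representation: send it.
        return 200, {}, entry.representation
    if entry.see_other is not None:
        # The requested resource cannot be represented on the wire,
        # so point at a description of it instead.
        return 303, {"Location": entry.see_other}, b""
    return 404, {}, b""
```

Note that the entry for "/id/richard" holds no representation at all,
which is the case where no R exists at the endpoint, and the sketch
still answers with a 303 rather than an error code.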

> But see above for the HTTP meaning of
> resource in this context.

Frankly, I'm not very interested in any definition of 'resource' other
than that in the URI specification, and I don't think that any  
published specification which refers to URIs should use any other  
definition. HTTP URIs are, in fact, being used to refer to resources  
in this broader sense. If the HTTP spec refuses to acknowledge this,  
then the world will simply go on and ignore the http spec in critical  
cases. Which would be a pity, but would not be a complete disaster.

Pat

>
> Regards
> Henrik
>
>
>

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 20 July 2009 18:18:52 UTC