- From: <noah_mendelsohn@us.ibm.com>
- Date: Fri, 31 Jul 2009 16:14:46 -0400
- To: Pat Hayes <phayes@ihmc.us>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>, "www-tag@w3.org WG" <www-tag@w3.org>
I'm not sure whether the TAG is interested in spending time on this
question in the near future, as it's taken quite a bit in the past, but I
will put an item on an upcoming agenda to at least get the sense of the
group. Given that some members with important perspectives on this are
gone a lot in August, I'm not sure whether we'll wind up doing more this
month than deciding to await their return. In any case, I'll schedule an
initial, brief, discussion.
Noah
--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Pat Hayes <phayes@ihmc.us>
Sent by: www-tag-request@w3.org
07/31/2009 03:25 PM
To: "www-tag@w3.org WG" <www-tag@w3.org>, HTTP Working Group
<ietf-http-wg@w3.org>
cc: (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: Fwd: Review of new HTTPbis text for 303 See Other
Folks
I do not expect a reply, but I put it to y'all, is this stance (below) in
fact consistent with what the HTTP and TAG groups have published
concerning URIs and what they are intended to identify? In particular, is
it consistent with http-range-14? It seems to me it is clearly not, and
that this fact is important to what both groups publish in their
specifications and recommendations.
As a concrete point to focus discussion, I gather that Henrik's view is
that in the case where an HTTP URI identifies a non-information resource,
but resolves to an HTTP endpoint, it must follow that the "requested
resource" (in the sense of HTTPbis) of the URI in the GET request is an
information resource interfaced to the HTTP endpoint, and so cannot be the
same as the non-information resource which the URI "identifies" in the
sense of RFC 3986. I also gather, from off-line emails, that Richard
Cyganiak would disagree with this interpretation. (I hope I do not
misrepresent anyone here.) Apparently, therefore, two people both quite
expert in reading the HTTP spec do not interpret the phrase "requested
resource" in the same way, leaving me and I suspect others in a state of
complete confusion.
Pat Hayes
--------
Begin forwarded message:
From: Henrik Nordstrom <henrik@henriknordstrom.net>
Date: July 31, 2009 1:38:18 PM CDT
To: Pat Hayes <phayes@ihmc.us>
Subject: Re: Review of new HTTPbis text for 303 See Other
I am not even going to answer you this time. Go back, read the HTTP
specifications, and come back when you have something concrete which
actually relates to the specifications as such to talk about. If there is
something you want to change then make concrete suggestions on how (and
make sure to base it on current drafts).
As already said, HTTP does not care and has no intention to ever care
what kind of "resource" a URI maps to, the semantics of that, or what it
denotes. All HTTP specifies is an interface language for talking to the
server publishing this over HTTP, anything else is irrelevant to HTTP.
HTTP has its definition of the term "resource", like it or not. Within
the HTTP specification the word "resource" has the meaning as defined by
HTTP. Any meaning defined elsewhere is irrelevant as far as HTTP is
concerned.
But to address your concerns the term resource will quite likely barely
be used at all in the revised HTTP specifications, or at least much less
than it is today.
I have said what I have to say to you on the subject. Further responses
talking about semantics, connections to real-world or even abstract
things, or taking statements in the specifications outside the context
of the specification where they are written will be silently ignored.
Regards
Henrik
Fri 2009-07-31 at 13:00 -0500, Pat Hayes wrote:
On Jul 20, 2009, at 8:37 PM, Henrik Nordstrom wrote:
Mon 2009-07-20 at 13:16 -0500, Pat Hayes wrote:
Apparently you have not understood my point, above. There are cases
where NO implementation of ANY KIND can POSSIBLY map a URI to the
resource it identifies. So one cannot simply toss this issue over the
wall to some other, unspecified, "implementer". It's nothing to do with
implementation.
For the kinds of URIs that HTTP deals with it can, as far as HTTP is
concerned with the definition of "resource" as used by http which for
technical specification writing reasons is slightly narrower than the
general URI definition of resource.
It is not 'slightly' narrower. The general definition of 'resource'
has it meaning absolutely anything, real or imaginary, concrete or
abstract, that can be referred to and distinguished from other things
(the last five words inserted to make sense of the 'identify'
language). This is not a 'slightly' wider sense than the one that you
apparently have in mind. It is a spectacularly, transfinitely, almost
cosmically wider sense. It is as wide as human language knows how wide
to make a distinction.
I understand, but I am not talking about 'effects', but about
semantics.
And HTTP is completely ignorant of any semantics that the URIs accessed
via HTTP may have.
What HTTP cares about is if there may be effects on the resource state
by actions requested by HTTP. (i.e. DELETE is assumed to have certain
effect when executed on the http resource)
My point is that you cannot completely ignore the rest of the world.
When writing a technical specification you can, as the relevant part of
the world is then the parts that the specification intends to cover and
only those parts.
But when your specification is about a language or a notation, and in
part about what that notation means, and when in fact that notation is
being used to mean things in a certain (wide) category, then such
usage does fall within the scope of your specification, and you should
deal with it, if only by stating explicitly that you are not going to
consider it. But to ignore it and pretend that it isn't there, by
redefining an existing terminology so as to avoid interfacing with other
specifications, is both intellectually dishonest and socially
irresponsible. Sorry, strong language, but I really do feel strongly
about this, having had to face up to this issue myself when writing
specifications.
But you yourself said that I was thinking about the wrong kind of
meaning, not the kind of meaning intended by the spec. Really, you
cannot have it both ways. Please make up your mind which is your
position, and stick to it.
HTTP places absolutely no meaning at all on the general term "resource"
as used in English
Never mind the English meaning, which is now lost to history in these
debates.
or even the "resource" as defined by URI
specifications.
But does it not strike you as inappropriate to simply ignore the
normative definitions used in defining technical terms which you
yourself use? What is the point of writing specifications if other
specification writers are free to redefine the terminology which my
specification defines normatively?
The only kind of resource HTTP places any meaning on at all is the very
much narrowed down "resource" as defined by the HTTP specifications, and
even then it's just as an abstract concept to simplify the world
description somewhat. To HTTP it does not matter at all what those
resources are, only if they can be accessed and/or transmitted via HTTP
I understand all this. But there are cases where the resource
identified by the HTTP URI is, in fact, not one of these. That is -
regardless of its true metaphysical nature, which I agree we will not
delve into - whatever it really is - it is not something that can be
accessed and/or transmitted. Such cases are REAL, they are out there
in the actual world. If your spec refuses to acknowledge this, then it
is simply an incomplete specification; and as such, it is less useful
than it can and should be.
or not as defined by whoever "owns" the resource and who also defines
their intended URI semantics (again completely outside of HTTP
specifications).
I know it does not wish to, but http-range-14 has left it no choice
but to care about it, at least a little.
Has it? Care to explain that again then, using the term meanings as
defined by HTTP.
http-range-14 specifies an HTTP-defined action (the use of a 303
redirect) be used under circumstances which arise when the URI in
question identifies a thing which is not a resource according to the
narrow sense of 'resource' which you are arguing HTTP should restrict
itself to.
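The 303 behaviour described in the paragraph above can be sketched roughly as follows. This is only an illustration of the rule as this thread reads it, not text from HTTPbis or from the TAG finding; the `Entry` type, the `TABLE` contents, the `/id/caesar` and `/doc/caesar` paths, and the `respond` function are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Entry:
    # Assumption: the publisher records whether the URI identifies something
    # whose content can be conveyed in a message (a "narrow sense" resource).
    is_information_resource: bool
    body: Optional[str] = None          # representation, if one can be sent
    described_by: Optional[str] = None  # URI of a document about the thing


# Hypothetical URI table: one URI for a document, one for the man himself.
TABLE: Dict[str, Entry] = {
    "/doc/caesar": Entry(True, body="Julius Caesar was a Roman general."),
    "/id/caesar": Entry(False, described_by="/doc/caesar"),
}


def respond(path: str) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a GET on path."""
    entry = TABLE.get(path)
    if entry is None:
        return 404, {}, ""
    if entry.is_information_resource:
        # A representation of the identified thing itself can be sent.
        return 200, {}, entry.body or ""
    # The identified thing cannot be transmitted; point at a description.
    return 303, {"Location": entry.described_by or ""}, ""
```

Note that the branch on `is_information_resource` is a decision about what the URI identifies, which is the point under dispute: the status code cannot be chosen without consulting that fact.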
The semantics of URIs has nothing at all to do with layering. It is
part of the specification **of URIs themselves**. When anyone talks
about the relationship between a URI and the resource it identifies,
or denotes, or refers to, or is used to request, or indeed pretty much
any relationship between a URI and a resource, they are talking about
semantics.
Ok. My point here is that HTTP does not care about those semantics.
And my point is that it must, at least to the minimal extent required
to state a normatively required action under circumstances which can
only be described by referring to those semantics. (And also - though
this is more controversial - I would argue that in fact, HTTP is
already concerned with the semantics of URIs, even though it refuses
to acknowledge this elementary fact.)
All it possibly cares about is that the server is ultimately responsible
for executing that semantic mapping
This is a conceptual mistake. Semantic mappings are not executable.
of URI to resource (in URI terms),
and that this mapping results in HTTP network accessible resources
(which you seem to sometimes call a representation where HTTP calls it a
resource
I hope not. I try to keep the resource/representation distinction
clear. There are however two or more notions of what counts as a
'representation': when in doubt, I use the now-standard circumlocution
awww:representation to refer to the narrow sense used in REST and (I
assume) HTTP.
) and their possible representations as defined by HTTP.
Because the HTTP specs also talk about this. And it is generally a
good idea, when two specs talk about the same thing using the same
language, that some effort is expended to make sure they are intending
to use this language in the same way.
Unfortunately if a new term is to be defined for every slight variation
there is of the term "resource" in this I am afraid it would be even
more confusing.
As I have tried to emphasize, this is not a 'slight variation', and in
any case I doubt if there are going to be any more changes once we
have established that a resource can be absolutely anything.
There are very good reasons why "resource" in the URI specifications is
broader than "resource" in HTTP specifications, and both are narrower
than the general English "resource".
No, the English meaning is actually narrower than the URI
specifications sense, which is highly idiosyncratic and of fairly
recent coinage (see the Wikipedia entry of 'resource' for a good
history.)
I understand, but it refers to resources. If for example the spec says
(as I believe it does, currently) that if the server has available a
transmittable representation of the requested resource, then it must
return that with a 200 code, this statement makes no reference to the
URI that was used to identify the resource.
The URI reference is implicit as the whole text is in the context of
building a response to a request for a specific URI. Trying to read the
text outside that context is nonsense.
Please read what I wrote more carefully. To say that the server has
available a transmittable representation of the requested resource,
without referring to the URI that was used to request the resource,
is not nonsensical in any way at all. It implies, as I read it, that
this condition holds independently of the URI, so that if the same
resource is requested by different URIs then this condition either
holds for both of them or for neither of them. So it rules out the
possible case where the condition holds for one URI request but not
for the other URI request, with a different URI but the same resource.
....
No, it is quite on the point. If the server can respond differently to
different URIs which both identify the same resource, that changes the
game.
If the defined semantics of the URIs says the server should respond
differently, then in the world as defined by HTTP they refer to different
resources, though possibly very closely related ones.
It all boils down to the definition of what a resource is, and the HTTP
resource is as I already explained NOT as general as the URI resource.
No, the situation is far worse than this. According to your previous
paragraph, we can have a situation where two URIs identify the same
resource according to the URI spec, but must be understood by HTTP as
corresponding to different resources. Just narrowing the sense of
'resource' will not get you this horrible situation. This, if indeed
you are right (nobody else has suggested this idea, so I hope you are
wrong) makes the HTTP and URI specifications sharply **incompatible**
with one another.
In the terminology defined by HTTP the difference between an (HTTP-)URI
and resource is more of a special case, and not related to any of what
you talk about.
It is related. In fact it is critical.
To me when talking about HTTP it's not.
Ah. That certainly makes sense, and indeed is what I understood when I
first became involved in these URI-meaning debates. But this position
is not consistent with what is said about resources in other
standards. And moreover, if this is true, then the http-range-14
decision is simply untenable. For in that case, the 'requested
resource' is something that cannot possibly be inside a server. Julius
Caesar, let us say, might be the requested resource.
And is what we have been saying all along. Trying to use Julius Caesar
as an example when talking about HTTP resources just does not make any
sense as the two by definition can not be the same thing.
And yet, there are HTTP URIs which identify Julius Caesar, in the
sense of "identify" used in the URI specs. And, moreover,
http-range-14 actually places some conditions on what HTTP must do with
such a URI, **because** it identifies a resource of that 'off-Web'
kind. So the behavior of HTTP depends, in part, and can only be
accurately specified by mentioning, the situation where a URI
identifies a "non-HTTP" resource. And this DOES make sense. In fact,
it is actually TRUE.
Yes it's a simplification, but defining or assuming anything about
resources anywhere beyond that is outside of HTTP scope and nothing HTTP
cares about and is left to the application of HTTP and/or URIs.
No, sorry, that position is simply untenable. See my earlier replies
to Richard on this point. HTTP cannot hide inside a 'layer' and
pretend it is only dealing with computational identifiers which 'map'
to computational artifacts. Both the uses and the specifications of
http URIs have extended its scope beyond that narrow purview.
And I disagree. The semantics of the application of HTTP is and should
be much broader than the semantics as used by the HTTP wire protocol.
The operation of HTTP, according to http-range-14, is ALREADY
concerned with how URIs denote real-world entities beyond the
operation of http.
And my viewpoint is that that's completely outside of what the HTTP
specifications or operations are concerned with. In fact it
intentionally does not care about any such concerns and leaves that to
the application of HTTP to any such entities.
And, to repeat, that view is untenable, precisely because semantics is
not about computation. Your notions of layering simply do not apply
when you are purporting to make decisions based upon meanings: which
you are, whether you like it or not. HTTP-range-14 has made this
choice for you. Don't argue with me, if you want to keep your nice
tidy 'layering': go back and argue with whoever made the http-range-14
ruling.
Anyone is free to define
HTTP applications for such entities, by defining HTTP resources mapping
to such entities as they please. HTTP only defines how one may interface
with those once defined in terms of HTTP resources. What relations those
HTTP resources have to any real-world entities is defined by that
application, not by HTTP.
(Not, by the way, with how *resources* map to real-world
resources. In the cases in question, the relationship between
the URI and the real-world entity is direct, not mediated through some
other resource inside a server.)
And in my world that's an impossible condition, as those real-world
resources do not exist in HTTP terms
They do exist, you are just refusing to look at them.
and need to be mediated via some
server defined HTTP resource to be accessible via HTTP, or requests for
that HTTP-URI would simply result in a 404 until such an HTTP resource is
implemented for mapping to the real-world resource.
But the phrase "that can be used to interact with a resource" ALREADY
limits what a resource can be. You cannot interact with the number 27
or with Julius Caesar.
Please note that this part is just explanatory text trying to explain
the relationship between HTTP and URI specifications, not a normative
definition.
The definition of "resource" in the HTTP specifications is found in the
terminology section.
resource
A network data object or service
That is not the definition of resource used in RFC3986, however.
What I said, and why I highlighted it here. The definitions are
different, and you need to use the right definition for each
specification or you'll get confused when discussing borderline issues
like this.
For most practical considerations in the use of HTTP the difference is
negligible however.
Not any more. That's why I'm making such a fuss about it. And BTW,
these are not 'borderline' issues.
HTTP
URIs can identify resources in the broader RFC3986 sense; and for
those URIs, there may well not be any resource in this narrow sense
identified by the URI at all. And yet, still, a GET on them might
resolve to an http endpoint. What does the http spec say about such a
case? What is the endpoint to do?
Yes it's correct that HTTP URIs can identify resources in the broader
sense, but that is not something the HTTP specifications as such concern
themselves with. HTTP specifications end at the http endpoint and its http
mapped resource.
Hmm, so in these cases, the HTTP URI identifies **two** different
resources? The URI one and the HTTP one? Is that what you are saying?
I doubt if many people on the TAG would like this.
And my point was only
that in this case, it is at best confusing and maybe actually wrong to
say that IF the server has a transmittable representation available
then it must send it with a 200 code.
And we don't. We say "suitable to be transmitted", which is quite
different from "transmittable" as there are representations that MAY be
transmittable in theory but which are still deemed unsuitable (by the
http server endpoint or its policy)
OK, I wasn't meaning to confuse this issue, just using 'transmittable'
as a shorthand. Sorry.
For what are we to say about the
second case? It all depends on what is meant by the "requested
resource".
The difference between a "resource" (as identified by a specific URI)
and an HTTP "requested resource" is not what you think. The two differ when
there are multiple independent representations available by the exact
same URI, such as content in different languages based on the language
preferences of the client etc.
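The situation described here, several representations behind one URI chosen by the client's language preference, can be sketched as below. This is a simplified illustration and not how any particular server does it: the `VARIANTS` table, the `/greeting` path, and `select_representation` are hypothetical, and the client's preferences are assumed to arrive as an already ordered list of language tags rather than a parsed Accept-Language header with quality values.

```python
from typing import Dict, List, Tuple

# Hypothetical store: one URI path, several language variants.
VARIANTS: Dict[str, Dict[str, str]] = {
    "/greeting": {"en": "Hello", "sv": "Hej"},
}


def select_representation(
    path: str, preferred: List[str], default: str = "en"
) -> Tuple[str, str]:
    """Pick the first variant matching the client's ordered preferences.

    Falls back to the default language when nothing matches.
    """
    variants = VARIANTS[path]
    for tag in preferred:
        if tag in variants:
            return tag, variants[tag]
    return default, variants[default]
```

The URI is the same in every call; only the selected representation differs.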
But they also differ, presumably, when the identified resource is
Julius Caesar. Or do they? I really have no way to know.
(It seems to me that HTTP rather shoots itself in the foot by this
insistence that its specs must not refer to or even acknowledge the
existence of resources that are other than network data or services,
since it has defined out of existence the very case that it should be
able to refer to, if only to explicitly say that it's not going to
specify what happens in it. This is rather an ostrich way of writing
specs, to pretend that all of the world that you don't like doesn't
exist, so that you aren't obliged to say anything about it.)
I don't agree here. HTTP specifications place a technical limit on what
the word "resource" means within the HTTP specifications, which is
purely a technical definition.
And says nothing about the cases when HTTP URIs are used to refer to
other kinds of resource. Which is an ostrich way of writing
specifications.
My response is that
it's the servers role to select a suitable representation of the
resource based on the meaning of the URI.
Does that mean, of the resource that the URI identifies? And does
"identify" mean, denote?
Sorry if I am unclear some times. English is not at all my native
language, and the word "denote" is not really part of my limited English
vocabulary.
Sorry. 'denotes' AKA 'refers to', 'identifies', 'is a name for', 'is
used as a name for'. I will try to remember to say 'refers to' or
'identifies'.
From my understanding of "denote" it's:
Of the HTTP resource the HTTP-URI identifies.
Where "identifies" is meant in the sense of how a Uniform Resource
Identifier identifies a network-accessible resource, ignoring completely
what that resource denotes in the broader sense.
But you cannot ignore this completely when the URI does *in fact*
identify something other than a network-accessible resource.
??!!? Of course two different URIs can refer to the same resource. If
HTTP is built on a different supposition, then HTTP is simply wrong.
Sure they can. The points here are:
* that HTTP does not care if they do
OK, but...
* and that HTTP has the view that if the semantics of those URIs is
different then they do in fact NOT refer to the same resource
That simply does not make sense. What you say here (seem to say here)
is logical nonsense. Look, if two names refer to the same thing (call
it a resource if you like) then there is only one thing that they both
refer to. So to say that 'as far as X is concerned' they refer to
different things, is simply meaningless. There aren't two things there
to be referred to, in this case. So, sorry: they DO IN FACT refer to
the same resource. If HTTP thinks otherwise, then HTTP is simply
WRONG. There is no finer-grained identity than identity itself.
If you think I am technically mistaken on this topic, please refer me
to some published work which makes semantic sense of the view of
identity that you are basing this claim upon. (And as I have had this
discussion many times before, if you are going to cite LISP at me:
identity in LISP is EQ, not EQUAL.)
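For readers who do not know the Lisp distinction referred to above, a rough Python analogue may help: `is` tests whether two names refer to one and the same object (approximately Lisp's EQ), while `==` tests whether two possibly distinct objects have equal contents (approximately EQUAL). The variable names are illustrative only, and the correspondence with Lisp is an approximation.

```python
# Two separately built lists with the same contents, and a second name
# for the first list.
a = [1, 2, 3]
b = [1, 2, 3]
c = a

# Identity: a and c are two names for one object.
same_thing = a is c

# Equality without identity: a and b are distinct objects that compare equal.
equal_but_distinct = (a == b) and (a is not b)
```

In these terms, the argument in the thread is that "identifies the same resource" is an identity claim in the first sense, not a similarity claim in the second.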
They may
refer to different facets of some larger/broader resource but not the
same.
I have no idea what you mean by a facet of a resource. What 'facets'
does Richard or J.C. have?
If those URIs happen to really refer to the same resource, both URIs
will respond identically, and are further indistinguishable from two
identical copies of the same resource.
?? I am trying to make sense of this, and not sure I have it right.
Take the case in my email to Richard, where there is a URI denoting
him, Richard C., the actual person. (Note, this is not a topic that
HTTP gets to rule out or refuse to acknowledge, because this can in
fact happen. My question is about what HTTP should do in such a
case.)
HTTP handles the case by restricting its notion of resource to the
network-accessible resource used for interfacing with Richard C.
First, there is no such resource: Richard C. isn't the kind of thing
that you can 'interface' with over a network. (Well, maybe by email,
but then we would be talking about his emailbox.) Second, it's not
important what HTTP 'restricts' itself to: the fact remains that (in
the case described) the URI does **in fact** identify Richard, not
some network-accessible thingie that stands in some relationship to
him. (That thingie might have its own URI, of course, which does
identify it.) So if what you say here is correct, I presume that HTTP
simply treats the URI as not having a corresponding http:resource at
all. Right? Because it is a basic assumption of the whole Web
architecture that the resource identified by a URI is unique. So if
the URI identifies Richard, it can't also identify the thingie.
That resource MAY or MAY NOT have an actual interface with Richard C.; HTTP
does not care and need not care for its operations.
In this case, according to Richard, he is the 'requested resource'.
The GET request is directed to a server which has some other resource
inside it, call this resource R. R is a resource in your narrower
sense (a network data object or service), but this is *not* the
requested resource in this case, even though the URI resolves to (the
server containing) R.
In terms of HTTP R is the requested resource.
I thought you might say that. So what then is the relationship between
a requested resource and the resource identified by a URI? Apparently
they can be different, so we have at least two resources somehow
connected with a URI. Are there any more?
(Do you agree?) In this case, http-range-14
requires that the server emit a 303 coded response, because even
though there may well be a transmittable (awww-) representation of R,
there is none of Richard C., and he is the requested resource.
That's up to R (or whoever/whatever defines R) to decide.
No, it is not. It is simply a fact that there is no transmittable
awww:representation of Richard. He isn't the kind of thing that has
such representations.
But in any case, it appears that, on your account, the whole action of
HTTP need have **absolutely nothing** to do with the resource that the
URI identifies (in this case, Richard.) So tell me: here I am with a
URI, and in order to find out more about what it identifies, I use it
in an HTTP GET, and something happens. What, if anything, can I
conclude about the resource that my URI identifies? AFAIK, the only
possible answer is, on your account: nothing at all. It's all going to
be mediated by the resource that the URI requests, and that need have
nothing to do with what it identifies. Nor need the response codes
have any connection with the resource identified by the URI: indeed,
if the requested (not identified) resource has a 200-level-suitable
awww:representation, then that is what the server must send me back,
even though neither it nor its source (that is, in the above example,
neither the awww:representation of R nor R itself) need have anything
whatever to do with the identified resource (Richard). Right?
I agree this picture has a certain elegance and simplicity, but it
makes complete nonsense of almost everything that has been said and
written about URIs and resources for the past decade. It means that
the picture of Web architecture promoted by the TAG is sharply and
fatally different from that supported by HTTP.
Anyone else like to comment on this?
Pat
------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 mobile
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Friday, 31 July 2009 20:15:40 UTC