Re: Push back on the resource/representation introduction in HTTPbis? from Pat Hayes on 2013-11-05 (www-tag@w3.org from November 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 5 Nov 2013 15:51:05 -0600
To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
Cc: "www-tag@w3.org List" <www-tag@w3.org>
Message-Id: <EAA52217-8D4C-4030-9557-070D68076019@ihmc.us>
On Nov 5, 2013, at 9:41 AM, Henry S. Thompson <ht@inf.ed.ac.uk> wrote:

> On several occasions the TAG has discussed our concerns about the way
> in which the HTTP specs (both 2616 and now HTTPbis) introduce the
> basic semantics of URIs and URI usage in HTTP, see most recently our
> recent f2f minutes [1].  We have also discussed revising the way in
> which we use that language ourselves in AWWW, and I floated a trial
> balloon in that regard late last spring [2].
> 
> Now that it's nearly too late (HTTPbis is in WG Last Call at the
> IETF), I've finally taken a stab at framing a possible TAG request
> about this in a positive form.  What do people think about the prose
> below as a replacement for Section 2 and the start of Section 3 in
> httpbis-p2-semantics [3]?

Well, since you ask...

I am still having a problem (the same problem I have had for over a decade) reconciling two central assertions made here, viz:

"The target of an HTTP request is called a resource.  HTTP does not
   limit the nature of a resource; it merely defines an interface that
   might be used to interact with resources."

and

"If we consider that a resource could be anything, and that the
   uniform interface provided by HTTP is similar to a window through
   which one can observe and act upon such a thing only through the
   communication of messages to some independent actor on the other
   side, then we need an abstraction to represent ("take the place of")
   the current or desired state of that thing in our communications.  We
   call that abstraction a representation [REST].

   For the purposes of HTTP, a "representation" is information that is
   intended to reflect a past, current, or desired state of a given
   resource, ... "

Together, these two seem to me to be contradictory. The central point is what is meant by the phrase "a resource could be anything". The only coherent meanings I can attach to this claim are (1) that everything *is* a resource, or (2) that everything has the *potential to become* a resource. Reading (1) makes the idea of 'resource' vacuous, and most of the language ever produced by the TAG similarly vapid, so I presume that cannot be the intended reading. So (2) must be what is meant. That however supposes that there is some process by means of which a random thing, that was previously a mere thing, becomes promoted into the new Web-sanctioned category of 'resource'. But what is this process, that makes a thing into a resource? What is it that happens when some thing becomes a Web resource? I have never seen any exposition of what this process might be, or indeed what *sort* of process it might be.

Perhaps calling something a 'resource' is rather like calling it a 'subject' when a sentence is uttered which mentions it (the thing) in subject position. It is not a change of what might be called ontological status, not a change to the thing itself. The thing itself is unaltered, but something else casts it, as it were, in a new light: the new sentence says something *about* it. Similarly, the creation of some HTTP endpoint which emits representations which represent the thing, means that this thing has become a Web resource: that is how a thing becomes a resource. HTTP can access something that emits representations of it; and the 'it' here can be anything that can have representations, which with a sufficiently broad interpretation of "representation" can indeed be anything. This picture does make sense; in fact, it is the only picture that makes sense of this claimed universality of resources. So to take an example, Julius Caesar can be a resource: all we need is an HTTP endpoint which emits representations of the old Roman (such as http://en.wikipedia.org/wiki/Julius_Caesar, say) and J.C. has been made into a resource. 

But that tidy picture (which has some flaws of its own) clashes with what these statements say, because they imply things about resources which immediately rule out some things (such as J.C.) from consideration as potential resources. They say, in particular, that resources:

(1) are targets of HTTP requests
(2) can be interacted with (in the present) via an interface 
(3) have states
(4) can be observed and acted upon 

none of which are true of Julius Caesar (who does not exist in the present so cannot be interacted with), nor of the element with atomic weight 23 (which cannot be the target of an HTTP request), nor of the Klein group (which is not a physical object), nor of the Crab Nebula (which is too far away to be interacted with), nor...  But I am sure you get the point. (Another example, by the way, is the weather in Oaxaca, an example used in a very influential document on this topic.)

Of course, one might say that there is a distinction between what HTTP connects you to when you tell it to GET the IRI - call that a Web thingie -  and what the representations, which are emitted by this Web thingie, are representations of. And when we say "resource", what we mean is the latter, not the former. Which would be fine, except that this is not what the text actually says; and it would mean that this notion of "resource" had no bearing on HTTP architecture at all. HTTP is solely concerned with the business of accessing and wrangling Web thingies and transmitting the representations they emit. It has not the slightest bearing on what those representations are *about*, or *how* they represent what they do, or indeed in what sense they can be said to "represent" at all. HTTP treats them simply as byte strings attached to codes. What you get (or don't get) is the byte stream emitted by that thingie you poked, at the time you poked it: end of HTTP story. Which, if this spin on "representation" (which is *not* the spin from [REST], of course) is correct, is just as well, because if the HTTP specs had to be concerned with ways in which representations of Roman emperors can be said to represent Roman emperors, they might take rather a long time to write (and even longer to read).

> What I'm trying to do (with some welcome help from Jonathan Rees) in
> this rewrite is to avoid the problems we have noted with some aspects
> of the existing language, while keeping the goal of providing a
> helpful, but not overly complicated, introduction to the motivating
> background for the HTTP protocol, while nevertheless providing the
> necessary technically precise definitions of key terms used throughout
> the specs.

This passage suffers from a familiar kind of schizophrenia about what 'resource' means. Y'all want IRIs to be able to *refer* to anything - literally, anything - but you also want them to refe... well, maybe to identify, things that HTTP can take them to. And these two classes - everything, on the one hand, and things that HTTP can interact with on the other - are NOT THE SAME. No amount of verbal cunning is going to make them be the same. You cannot have it both ways. Either 'resource' is just a stupidly misleading usage synonymous with 'anything'; or resources are *potentially* anything at all that can be 'represented'; or resources are things that have states, can be observed and acted upon and are targets of HTTP requests (so must be non-abstract things functionally connected to the Internet and existing in the present). It would help enormously if y'all could get your act together and decide which meaning you intend, and then carefully avoid wordings which imply something else. And as a minimum, avoid making grand claims which are inconsistent with perfectly reasonable architectural discussions, whose meaning they destroy by introducing them wrongly. If a resource "can be anything", then this document cannot *possibly* define "an interface that might be used to interact with resources", because such an interface would have to be able to interact with anything, and no such universal interface is possible.

FWIW, the history of things on the Web being called "resources" (rather than "things", say) is directly traceable back to the pre-Web writings of Doug Engelbart, who first used the term in this kind of a context. And what Doug meant the word to mean, quite obviously from his writings, was the limited sense of things that you can actually observe and interact with using computable protocols. He called them "resources" precisely because that is what they are: you can access them and use them on a network; they provide functionality. He did not mean the term to include anything and everything that can be referred to or thought about. This particular metaphysical train wreck was introduced later, as far as I can tell by Tim Berners-Lee and Larry Masinter when they decided to conflate URL with URN, and thereby fatally muddled reference (done by names) with access (done by locators). It is now too late to undo all this disastrous harm, but at least we can strive to limit the damage by speaking more carefully.

Pat



> 
> I'm at least as interested in strategic feedback at this point as
> editorial:  the pressing issue for the TAG right now is whether to
> feed _something_ like this into the IETF HTTP WG, or to accept that
> we've missed our chance this time around.
> 
> ht
> 
> [1] http://www.w3.org/2001/tag/2013/09/30-minutes.html#item02 [at the end]
> [2] http://lists.w3.org/Archives/Public/www-tag/2013Jun/0023.html
> [3] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#section-2
> -----------------
> 2. Targets
> 
> Origin servers make content and services available to clients via
> Uniform Resource Identifiers (URIs---see section 2.7 of [Part1]).
> Clients use appropriate URIs in requests in order to access (or
> update) a server's content or interact with its services.
> 
> We use the word "resource" in these specifications to refer to the
> whole range of content and services which a server may provide.
> 
> In the case of a given request, we call the resource that is accessed
> the "target resource" of the request (or "target" for short), and the
> URI as contained in the request the "target URI".  How those
> responsible for configuring an origin server determine what URIs can
> be used to give access to what resources, or how clients might learn
> or check this independently of HTTP requests, is out of scope for
> these specifications.
> 
> We call the relationship between a request and the actions to be
> performed and/or the response to be generated when processing it the
> "request semantics", the primary determinants of which are the request
> method, the target resource and the state of the server at the time
> the request is handled.  Sections 4 through 7 below specify the
> details of this relationship.
> 
> The utility of individual servers and the resources they give access
> to, to say nothing of the value of the HTTP-mediated Web as a whole,
> depend in large part on the consistency of the relationship between
> URIs, request methods and their joint semantics, both across
> similar-but-different URIs available from the same server, and over
> time for the same URI.  Such matters are, however, out of scope for
> this specification.
> 
> [Does this para. really belong at this introductory level????]
> 
>  When a client constructs an HTTP/1.1 request message, it sends the
>  target URI in one of various forms, as defined in (Section 5.3 of
>  [Part1]).  When a request is received, the server reconstructs an
>  effective request URI for the target resource (Section 5.5 of
>  [Part1]).
> 
> One design goal of HTTP is to separate resource access from request
> semantics, which is made possible by vesting the request semantics in
> the request method (Section 4) and a few request-modifying header
> fields (Section 5).  Resource owners SHOULD NOT include request
> semantics within the URI that accesses it, such as by specifying an
> action to invoke within the path or query components of the effective
> request URI, unless those semantics are disabled when they are
> inconsistent with the request method.
> 
> 3. Representations
> 
> HTTP allows an origin server to give access to a broad range of types
> of resources.  It does so via a uniform interface which supports
> interaction with resources only through the exchange of messages.
> 
> In the exchange of request and response involved in accessing a
> given resource, the relationship between message contents and the
> accessed resource can be simple, as for example in the case of the
> response message to a GET request whose target resource is a piece
> of static clip art, or it may be very complex, as in the case of a
> POST request to a ticket booking service.  In what follows below we
> will use the word "representation" to generalize over all these
> cases: a "representation" is the encoding within a message of
> information intended to reflect the current state or output of a
> resource, and/or (at least in part) affect its future state or
> actions.  We further articulate message contents into a set of
> "representation metadata" and a potentially unbounded stream of
> "representation data".
> 
> Thus in the simple and common case of a GET request whose target URI
> gives access to web content such as a picture, video or web page, a
> cooperative server will ensure that the response message, using a
> format that can be readily communicated via the protocol, enables
> the presentation to the user of that content, whereas in the case of
> a booking service a POST request will contain the information
> necessary to enable a ticket to be issued.
> 
> -- 
>       Henry S. Thompson, School of Informatics, University of Edinburgh
>      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
>                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
>                       URL: http://www.ltg.ed.ac.uk/~ht/
> [mail from me _always_ has a .sig like this -- mail without it is forged spam]
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 5 November 2013 21:51:41 UTC