Re: Review of new HTTPbis text for 303 See Other from Dan Brickley on 2009-07-21 (ietf-http-wg@w3.org from July to September 2009)

From: Dan Brickley <danbri@danbri.org>
Date: Tue, 21 Jul 2009 09:52:04 +0200
To: Henrik Nordstrom <henrik@henriknordstrom.net>
CC: Pat Hayes <phayes@ihmc.us>, "www-tag@w3.org WG" <www-tag@w3.org>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4A6573A4.8020501@danbri.org>
On 21/7/09 03:37, Henrik Nordstrom wrote:
> On Mon 2009-07-20 at 13:16 -0500, Pat Hayes wrote:
>
>> Apparently you have not understood my point, above. There are cases
>> where NO implementation of ANY KIND can POSSIBLY map a URI to the
>> resource it identifies. So one cannot simply toss this issue over the
>> wall to some other, unspecified, "implementer". It's nothing to do with
>> implementation.
>
> For the kinds of URIs that HTTP deals with it can, as far as HTTP is
> concerned with the definition of "resource" as used by http which for
> technical specification writing reasons is slightly narrower than the
> general URI definition of resource.
>
>> I understand, but I am not talking about 'effects', but about semantics.
>
> And HTTP is completely ignorant of any semantics that the URIs accessed
> via HTTP may have.
>
> What HTTP cares about is if there may be effects on the resource state
> by actions requested by HTTP. (i.e. DELETE is assumed to have certain
> effect when executed on the http resource)

Perhaps we should all be talking about DELETE a bit more than about GET. 
Some of the issues are starker...

if http://www.ihmc.us/users/phayes/PatHayes is directly Pat Hayes the 
person (not a document about him), should HTTP DELETE remove Pat from 
the world, or just remove that URI from the universe of information?

What's a compliant implementation to do?

>> The semantics of URIs has nothing at all to do with layering. It is
>> part of the specification **of URIs themselves**. When anyone talks
>> about the relationship between a URI and the resource it identifies,
>> or denotes, or refers to, or is used to request, or indeed pretty much
>> any relationship between a URI and a resource, they are talking about
>> semantics.
>
> Ok. My point here is that HTTP does not care about those semantics. All
> it possibly cares about is that the server is ultimately responsible
> for executing that semantic mapping of URI to resource (in URI terms),
> and that this mapping results in HTTP network accessible resources
> (which you seem to sometimes call a representation where HTTP calls it a
> resource) and their possible representations as defined by HTTP.

So would it be perfectly reasonable to have two Web sites / services / 
installations, call them www-A and www-B, run according to similar 
readings of HTTP, as follows?

  * Both of them agree that 
http://www-a.ihmc.us/users/phayes/PatHayesAbout and 
http://www-b.ihmc.us/users/phayes/PatHayesAbout are names/identifiers 
for Pat Hayes (ie. the person whose mailbox is phayes@ihmc.us), rather 
than for a page/document
  * Both of them implement HTTP verbs that proxy Pat into the Web, by 
allowing (through GETs, 303 redirections etc) representations of him to 
be exposed and accessed via the HTTP protocol
  * One of them reads an HTTP DELETE on the Pat URIs as a request to 
adjust the world such that www-a no longer shares any information about 
Pat via http://www-a.ihmc.us/users/phayes/PatHayesAbout (ie. "forget 
this resource mapping").
  * The other reads a DELETE on 
http://www-b.ihmc.us/users/phayes/PatHayesAbout as a request to adjust 
the world such that Pat is no longer in it: an altogether more serious, 
expensive and irreversible action.

Is there a fact-of-the-matter about whether website / webmaster www-A or 
www-B has the correct reading of HTTP? Or are the two perfectly free to 
diverge in their readings? If I am considering sending an HTTP DELETE 
message to www-a and/or www-b, what information should I take into 
account, while trying to determine how my messages will be understood? 
Is there any way to find out?

If all HTTP verbs (including extensions) are always with regard to the 
information-wrapping things, even if the URI is taken to name a "thing 
in the non-digital world", this is important to know and to agree on. If 
it's up to the Web server, that's important to know. If nobody knows and 
it's all a bit vague still, that's also important to know. I don't think 
we collectively have a clear account of these issues yet.

>> No, it is quite on the point. If the server can respond differently to
>> different URIs which both identify the same resource, that changes the
>> game.
>
> If the defined semantics of the URIs says the server should respond
> differently then they in the world as defined by HTTP refer to different
> resources, but possibly very closely related such.

So HTTP always interposes a wrapper / proxy entity-thing-object (sheesh, 
we're running out of neutral words :) ...

ie. even if we all agree that
http://www-a.ihmc.us:8080/users/phayes/PatHayesAbout
http://www-a.ihmc.us:80/users/phayes/PatHayesAbout
...are two names for the self-same thing (namely Pat), they are in 
HTTP-speak inevitably going to be different (http:)"resources"? That's 
my reading of your last post, at least.

(irrespective even of whether the same Apache webserver exposes 
either/both of these, or different servers, or whatever - that's all 
internal and not directly relevant)
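
A minimal sketch of that distinction, assuming we model "what a URI names" as a plain lookup table (the NAMES table and the http_resource_key helper are hypothetical, made up for illustration; only the two URIs come from the example above). The two URIs agree on the thing named, yet differ in the components an HTTP server distinguishes on:

```python
from urllib.parse import urlsplit

# Hypothetical registry of what each URI is agreed to name (an assumption).
NAMES = {
    "http://www-a.ihmc.us:8080/users/phayes/PatHayesAbout": "Pat Hayes",
    "http://www-a.ihmc.us:80/users/phayes/PatHayesAbout": "Pat Hayes",
}

def http_resource_key(uri):
    """Return the (scheme, host, port, path) tuple that separates HTTP resources."""
    parts = urlsplit(uri)
    return (parts.scheme, parts.hostname, parts.port, parts.path)

# Unpacking a dict yields its keys in insertion order.
uri_8080, uri_80 = NAMES

# Same thing named, by agreement...
same_thing = NAMES[uri_8080] == NAMES[uri_80]
# ...but the port differs, so these are two distinct HTTP resources.
same_http_resource = http_resource_key(uri_8080) == http_resource_key(uri_80)

print(same_thing, same_http_resource)  # True False
```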

> It all boils down to the definition of what a resource is, and the HTTP
> resource is as I already explained NOT as general as the URI resource.

(following through my example)

So, still in the world where we all "agree" that both 
http://www-a.ihmc.us:8080/users/phayes/PatHayesAbout AND
http://www-a.ihmc.us:80/users/phayes/PatHayesAbout "are Pat"...

1. we have two different URIs
2. we have one worldly thing ("a URI resource?") that they name/identify; 
a human person in this case.
3. we have two other kinds of thing (HTTP resources) that proxy/wrap 
that person into the Web; each such "wrapping" is implemented by some 
"HTTP server" thing that speaks HTTP to the digital world, and typically 
has some private link to the single underlying thing named.
4. HTTP Verbs such as DELETE are understood in the context of one of 
these (possibly many) bindings, rather than "in the abstract": the 
server isn't getting a message saying "DELETE Pat Hayes" it is getting a 
message saying "DELETE the HTTP resource /phayes/PatHayesAbout"
5. It isn't clear how much DELETEing the server is expected to do; in OO 
or SQL-backed sites, a DELETE might also cause the bound thing to be 
removed, ie. information removed from some external backend system.
6. Whether the HTTP client who sent it can be considered to have 
*requested* anything more than the resource-to-thing mapping to be 
DELETEd isn't clear.
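
To make point 4 concrete, here is a rough sketch of what a client actually puts on the wire for a DELETE. The delete_request_head helper is hypothetical and only formats the request line and Host header without sending anything. Nothing in the message mentions the person; the server sees a path and a host:

```python
from urllib.parse import urlsplit

def delete_request_head(uri):
    """Format the request line and Host header of an HTTP/1.1 DELETE for uri."""
    parts = urlsplit(uri)
    target = parts.path or "/"          # an empty path is sent as "/"
    if parts.query:
        target += "?" + parts.query
    return f"DELETE {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"

head = delete_request_head("http://www-a.ihmc.us:8080/users/phayes/PatHayesAbout")
print(head)
```

How far the effect of that message reaches beyond the named binding is exactly what the message itself leaves unsaid.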


>>> In the terminology defined by HTTP the difference between an
>>> (HTTP-)URI
>>> and resource is more of a special case, and not related to any of what
>>> you talk about.
>> It is related. In fact it is critical.
>
> To me when talking about HTTP it's not.

At this point I picture someone standing up in a court of law, flapping a 
printout of the HTTP/1.1 spec, saying "but but ... you DELETEd my 
*car*... Sure, we agreed the HTTP resource identified my car, but all I 
wanted to do was remove that *mapping* when I sent an HTTP DELETE to 
/car/32".

Do HTTP/1.1 experts have a role in adjudicating in such disputes?

How much deleting is http-justifiable?

>> The operation of HTTP, according to http-range-14, is ALREADY
>> concerned with how URIs denote real-world entities beyond the
>> operation of http.
>
> And my viewpoint is that that's completely outside of what the HTTP
> specifications or operations are concerned with.

HTTP DELETE is a destructive act; we can agree on that, I hope. To request 
or to honour an HTTP DELETE request is to do something potentially 
damaging (or potentially life-saving, even). If we don't know quite 
clearly what an "HTTP DELETE" message is asking, how can anyone ever 
risk sending one? Especially to complex Web services, that connect to 
backend databases, sensors and to a rich world of ecommerce, users etc.

Yet these messages are regularly sent and handled. I can only assume 
they are typically interpreted conservatively, or by a tighter 
client/server implicit understanding than is mandated by HTTP alone.

> In fact it intentionally does not care about any such concerns and leaves that to
> the application of HTTP to any such entities. Anyone is free to define
> HTTP applications for such entities, by defining HTTP resources mapping
> to such entities as they please. HTTP only defines how one may interface
> with those once defined in terms of HTTP resources. What relations those
> HTTP resources have to any real-world entities is defined by that
> application, not by HTTP.

Can the nature of those mappings be hinted at or otherwise revealed 
*through* HTTP?

>>   (Not, by the way, with how *resources* map to real-
>> world resources. In the cases in question, the relationship between
>> the URI and the real-world entity is direct, not mediated through some
>> other resource inside a server.)
>
> And in my world that's an impossible condition, as those real-world
> resources do not exist in HTTP terms and need to be mediated via some
> server defined HTTP resource to be accessible via HTTP, or requests for
> that HTTP-URI would simply result in a 404 until such an HTTP resource is
> implemented for mapping to the real-world resource.


>> URIs can identify resources in the broader RFC3986 sense; and for
>> those URIs, there may well not be any resource in this narrow sense
>> identified by the URI at all. And yet, still, a GET on them might
>> resolve to an http endpoint. What does the http spec say about such a
>> case? What is the endpoint to do?
>
> Yes it's correct that HTTP URIs can identify resources in the broader
> sense, but not something the HTTP specifications as such concerns itself
> about. HTTP specifications end at the http endpoint and its http mapped
> resource.

So - just as any server that changes the *underlying* resource in 
response to an HTTP GET is exceeding its http-requested mandate, any 
server that removes the *underlying* real world (rather than 
http-wrapping) resource on an HTTP DELETE is also exceeding what was asked?

Concrete example (using SQL information objects as the real world object):

A car database, information about specific cars, stored in MySQL, and 
exposed via Apache+PHP.

The SQL has a table, known_cars. That table has fields reg_number, 
owner_email, price, for_sale, description_text, photo_url, car_id.

The Web site exposes each as /car/001 /car/002 using the car_id field.

Forget everything for now about whether /car/001 is a document or a car. 
Here I am interested only in the question of where the "http resource" 
stops, and the thing it maps to starts. And the mapping is firstly 
through a set of descriptions that happen to live in a mysql db.

* do we agree that HTTP servers who change the underlying database 
record after a GET are doing something that wasn't asked of them?
* do we agree that HTTP servers who change the underlying database 
record after a DELETE are doing something that wasn't asked of them?
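
The two readings of DELETE can be sketched side by side. This is only an illustration under loud assumptions: sqlite3 stands in for MySQL, the known_cars table is cut down to three of its fields, and the published table (the URI-path-to-record mapping) and all function names are hypothetical additions, not part of the setup described above.

```python
import sqlite3

def make_db():
    """Build a tiny in-memory stand-in for the car database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE known_cars "
               "(car_id TEXT PRIMARY KEY, reg_number TEXT, owner_email TEXT)")
    # Hypothetical table: which URI path exposes which record.
    db.execute("CREATE TABLE published (path TEXT PRIMARY KEY, car_id TEXT)")
    db.execute("INSERT INTO known_cars VALUES ('001', 'ABC123', 'owner@example.org')")
    db.execute("INSERT INTO published VALUES ('/car/001', '001')")
    return db

def delete_shallow(db, path):
    """Reading 1: DELETE removes only the mapping from path to record."""
    db.execute("DELETE FROM published WHERE path = ?", (path,))

def delete_deep(db, path):
    """Reading 2: DELETE also removes the underlying record."""
    row = db.execute("SELECT car_id FROM published WHERE path = ?",
                     (path,)).fetchone()
    if row is not None:
        db.execute("DELETE FROM known_cars WHERE car_id = ?", (row[0],))
    db.execute("DELETE FROM published WHERE path = ?", (path,))

def count(db, table):
    """Row count; table names are fixed in this script, never user input."""
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

shallow_db = make_db()
delete_shallow(shallow_db, "/car/001")
deep_db = make_db()
delete_deep(deep_db, "/car/001")

# After either call /car/001 is no longer exposed; only the backend differs.
print(count(shallow_db, "known_cars"), count(deep_db, "known_cars"))  # 1 0
```

From the client's side both servers would stop serving /car/001, so the difference between the two policies is not observable through that URI alone.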


[...]

> The rest of the world is obviously free (and in many cases should)
> ignore the HTTP definition of resource as it's of no relevance to them
> just as the possible existence of real-world resources has no relevance
> to the HTTP specifications.

To be explicit ... The Car example here is supposed to emphasise that we 
run into these same ambiguities, even when the concern is with purely 
digital stuff. An HTTP server wrapped around a database server, for 
example. Do we consider the records in the SQL server to be intimate 
pieces of the "http resources" (and hence they live and die by the same 
HTTP verbs), or are they somehow merely mapped/linked?

I can imagine sysadmins and Web developers who quite reasonably answer 
such questions differently, and structure their data integrity policies 
accordingly. This matter of understanding how deep a DELETE request 
should go isn't one which only arises when we are talking about cars and 
people... but also when dealing with backend car SQL records or people 
directory entries.

cheers,

Dan
Received on Tuesday, 21 July 2009 07:52:50 UTC