Re: Comments on draft-ietf-httpbis-p2-semantics-07 from Hugh Winkler on 2009-07-22 (ietf-http-wg@w3.org from July to September 2009)

From: Hugh Winkler <hughw@wellstorm.com>
Date: Tue, 21 Jul 2009 19:11:12 -0500
To: Mark Nottingham <mnot@mnot.net>
Cc: ietf-http-wg@w3.org
Message-ID: <927441b30907211711oa2e44bel2a94d5cca9b8c1b8@mail.gmail.com>
Thanks, Mark. Let me rephrase my request. HTTP WG, please fix
asynchronous POST. It is broken in the way I described. I once saw 202
in an ebXML (or was it RosettaNet?) environment; I've never seen
another one used in anger. We encounter asynchronous HTTP POST all the
time in web browsers. Not one of them uses 202.

The WG ought to do something so that all kinds of user agents -- web
browsers as well as B2B XML systems -- can do asynchronous POST
using a uniform interface. A server ought not care what kind of client
has POSTed to it, web browser or B2B agent software; in either case
the protocol interaction ought to be the same.

Inline...

On Sun, Jul 19, 2009 at 10:41 PM, Mark Nottingham<mnot@mnot.net> wrote:
> Hi Hugh,
>
> The sections you mention don't require the use of 201/202 in certain
> situations (such as uploading a file); rather they're just made available
> for use.

Right -- I don't think I said anything required sending 201 or 202.
Section 8.2.2 addresses the case where the server has created a
resource and intends to return 201.

   The origin server MUST create the resource before returning the 201
   status code. If the action cannot be carried out immediately, the
   server SHOULD respond with 202 (Accepted) response instead.

So the spec says you SHOULD return 202 if you must create a resource
asynchronously. Can't we permit redirects too? SHOULD is pretty strong
language.
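
A minimal sketch of the server-side choice under discussion, for
illustration only. The function name, its parameters, and the
`prefer_redirect` switch are hypothetical; the 303 branch reflects the
suggestion made in this thread, not the current text of 8.2.2.

```python
from typing import Dict, Optional, Tuple


def choose_post_response(
    created_uri: Optional[str],
    status_uri: str,
    prefer_redirect: bool = False,
) -> Tuple[int, Dict[str, str]]:
    """Pick a status code and headers for a POST that creates a resource.

    created_uri     -- URI of the new resource, or None if it is not ready yet
    status_uri      -- URI of a status monitor for the pending work (assumed)
    prefer_redirect -- hypothetical switch: answer 303 instead of 202
    """
    if created_uri is not None:
        # The resource already exists, so 201 plus Location is allowed.
        return 201, {"Location": created_uri}
    if prefer_redirect:
        # What many web apps do today: send the UA to the status monitor.
        return 303, {"Location": status_uri}
    # What the draft says the server SHOULD do. No Location header here;
    # the entity body would carry the status and a link to the monitor.
    return 202, {}
```

For example, `choose_post_response(None, "/status/42")` yields 202 with
no headers, while passing `prefer_redirect=True` yields 303 with
Location set to the status URI.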

>
> Furthermore, we can't really add new semantics (you say "clarify", but I
> think that's what you're asking us to do) to these status codes, or
> constrain them further, as that may make existing implementations
> non-conformant.

The spec has never said anything about what to do in case you receive
a 202 in response to a GET. Please just tell us what it means to get
202 in response to GET. Is it illegal? Does it mean exactly the same
thing as 200? If there are HTTP clients out there doing something
besides treating it as general 2xx, they can't count on consistent
server semantics. How could we make them non-conformant?

I suggest that saying it means, "the resource behind this URI has
been accepted for processing, and its state may change later", is a
simple clarification that would be enough so that UAs could safely do
something useful with that response (add the URL to a list of ongoing
uploads/airline ticket searches/etc., and GET it again periodically).
If a particular server does not honor that semantic, no harm has been
done -- multiple GETs will be just as harmless as before. Those
servers would not become non-conformant.
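
A rough sketch of that client behaviour, assuming the proposed reading
of 202 on GET. Everything here is hypothetical: `poll_until_done`, the
`fetch` callable (which stands in for a real HTTP GET and returns a
status code plus headers), and the retry limits are illustrative
choices, not anything the spec defines.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, response headers)


def poll_until_done(
    url: str,
    fetch: Callable[[str], Response],
    interval: float = 5.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """GET `url` repeatedly while the server keeps answering 202.

    Returns the URI of the finished resource, or None if the server
    was still answering 202 after `max_attempts` tries.
    """
    for _ in range(max_attempts):
        status, headers = fetch(url)
        if status == 202:
            # Proposed meaning: still processing, state may change later.
            sleep(interval)
            continue
        if status in (201, 302, 303):
            # Finished: Location points at the newly created resource.
            return headers.get("Location")
        if 200 <= status < 300:
            # A server that does not use 202 this way: treat the polled
            # URL itself as the final answer, exactly as before.
            return url
        raise RuntimeError(f"unexpected status {status} for {url}")
    return None
```

A server that never sends 202 on GET falls through to the plain 2xx
branch on the first request, so this loop costs it nothing extra.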

>
> Cheers,
>
>
> On 19/07/2009, at 8:36 AM, Hugh Winkler wrote:
>
>> Hi all,
>>
>> There's a common use case HTTP has not addressed well: asynchronous
>> file uploads. Many web browsers have "download managers", but none
>> have "upload managers". Large file uploads are left to application
>> developers, because browsers have poor guidance from the spec on how
>> they should interact with servers and intermediaries during an
>> asynchronous upload.
>>
>> Example problem: My web form allows the user to upload a file. My
>> application will create a resource -- say, a PDF document -- from it.
>> It might take my server 2 seconds or 20 minutes to do so. If my server
>> can process the request quickly, it returns 201 Created, and sets the
>> Location header to the URI of the new PDF. (And lots of web apps
>> return 303 in this case.) It might also return the PDF as the entity
>> response. If my server must process the request asynchronously,
>> HTTPbis semantics say I should return 202 Accepted. In this case the
>> server can return an entity that gives the current status, and links
>> to a status monitor.
>>
>> First, notice that in the asynchronous case HTTP suggests the
>> entity should both give the current status and link to a status
>> monitor. That's duplicative, and makes the user click again to go to a
>> real status monitor. But it's the best we can do -- the UA POSTed to
>> this URL. It cannot POST again just to get an updated status.
>>
>> Second, HTTP has not suggested sending a Location header as part of
>> the 202 response. Although that might have been a good idea, it's now
>> moot. Browsers have never followed Location headers in 2xx responses
>> -- they just display the returned entity. Changing that semantic now
>> could be harmful -- that understanding is baked into millions of web
>> applications and browsers.
>>
>> In fact, the spec (HTTPbis 8.2.2 and 8.2.3) is out of sync with what
>> web apps actually do. They do not return 202 and display an entity
>> showing a status that requires the user to click again. A typical
>> webapp will return 303, redirecting the user to a page having the
>> status monitor, saving the user a useless click.
>>
>> So, my first suggestion is that the first paragraph of 8.2.2 (201
>> Created) ought to be amended to suggest both 202 and 303 as
>> appropriate responses.
>>
>> Now, in the case the server responds 303 to the POST, no browser or
>> other agent is harmed. 303 is a common response to POST. The browser
>> now does GET on the redirect URI. Ordinary web applications might
>> return 200 and a status monitor entity. Other applications might
>> return 202. After all, 202 is the "real" response to the original
>> request -- and that is the meaning of the 303. Furthermore, returning
>> 202 in response to a GET suggests that if the UA does a GET on this
>> URL later, the status could change to indicate the outcome of the
>> original POST. In the example problem, a UA redirected in the response
>> to the POST, to a status page that returns 202, could GET that URL
>> until the status changes -- in the case of a successful PDF
>> generation, either a 201 Created response, or another 303 or 302; in
>> any case, the Location URI finally will contain the URL of the newly
>> created resource.
>>
>> My second suggestion is that sec 8.2.3 (202 Accepted) should
>> elaborate on the meaning of 202 in response to GET. Up to now, 202 has
>> seen limited use, and it's generally thought of as a response to POST,
>> PUT, or DELETE. But as a response to GET, it should have this added
>> meaning that the returned status code is likely to change in the
>> future, so please try again.
>>
>> Clarifying these semantics could allow agents to take over a lot of
>> the automation of asynchronous file uploads. Web browsers could have
>> an "Uploads" window where you check the status of your uploads. Web
>> client libraries could also take over automating this function from
>> application code.
>>
>>
>> Hugh
>>
>
>
> --
> Mark Nottingham     http://www.mnot.net/
>
>



-- 
Hugh Winkler, CEO
Wellstorm Development
31900 Ranch Road 12
Suite 206
Dripping Springs, TX 78620
USA
http://www.wellstorm.com/
+1 512 264 3998 x801
Received on Wednesday, 22 July 2009 00:11:48 UTC