- From: Hugh Winkler <hughw@wellstorm.com>
- Date: Sat, 18 Jul 2009 17:36:44 -0500
- To: ietf-http-wg@w3.org
Hi all,

There's a common use case HTTP has not addressed well: asynchronous file uploads. Many web browsers have "download managers", but none have "upload managers". Large file uploads are left to application developers, because the spec gives browsers poor guidance on how they should interact with servers and intermediaries during an asynchronous upload.

Example problem: My web form allows the user to upload a file. My application will create a resource -- say, a PDF document -- from it. It might take my server 2 seconds or 20 minutes to do so.

If my server can process the request quickly, it returns 201 Created and sets the Location header to the URI of the new PDF. (Lots of web apps return 303 in this case.) It might also return the PDF as the response entity.

If my server must process the request asynchronously, HTTPbis semantics say I should return 202 Accepted. In this case the server can return an entity that gives the current status and links to a status monitor.

First, notice that in the asynchronous case HTTP suggests the entity should both give the current status and link to a status monitor. That's duplicative, and it makes the user click again to reach a real status monitor. But it's the best we can do: the UA POSTed to this URL, and it cannot POST again just to get an updated status.

Second, HTTP has not suggested sending a Location header as part of the 202 response. Although that might have been a good idea, it's now moot. Browsers have never followed Location headers in 2xx responses -- they just display the returned entity. Changing that behavior now could be harmful, because it is baked into millions of web applications and browsers.

In fact, the spec (HTTPbis 8.2.2 and 8.2.3) is out of sync with what web apps actually do. They do not return 202 and display a status entity that requires the user to click again. A typical web app returns 303, redirecting the user to a page containing the status monitor and saving the user a useless click.

So, my first suggestion is that the first paragraph of 8.2.2 (201 Created) ought to be amended to suggest both 202 and 303 as appropriate responses.

Now, in the case where the server responds 303 to the POST, no browser or other agent is harmed. 303 is a common response to POST. The browser now does a GET on the redirect URI. Ordinary web applications might return 200 and a status monitor entity. Other applications might return 202. After all, 202 is the "real" response to the original request -- and that is the meaning of the 303.

Furthermore, returning 202 in response to a GET suggests that if the UA does a GET on this URL later, the status could change to indicate the outcome of the original POST. In the example problem, a UA that was redirected by the POST response to a status page returning 202 could GET that URL until the status changes. In the case of a successful PDF generation, the new status would be either 201 Created or another 303 or 302; in any case, the Location header will finally contain the URL of the newly created resource.

My second suggestion is that sec 8.2.3 (202 Accepted) should elaborate on the meaning of 202 in response to GET. Up to now, 202 has seen limited use, and it's generally thought of as a response to POST, PUT, or DELETE. But as a response to GET, it should carry the added meaning that the returned status code is likely to change in the future, so please try again.

Clarifying these semantics could allow agents to take over much of the automation of asynchronous file uploads. Web browsers could have an "Uploads" window where you check the status of your uploads. Web client libraries could also take over this automation from application code.

Hugh
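
To make the proposed agent behavior concrete, here is a minimal Python sketch of the polling loop a client library might run under these semantics. It is only an illustration: the names `Response` and `await_upload_result` are hypothetical, the `get` callable is assumed to perform a single GET without following redirects, and the fixed poll interval is an arbitrary choice that the proposal does not specify.

```python
import time
from typing import Callable, Dict, NamedTuple, Optional


class Response(NamedTuple):
    """Hypothetical minimal view of an HTTP response."""
    status: int
    headers: Dict[str, str]


def await_upload_result(
    post_response: Response,
    get: Callable[[str], Response],
    poll_interval: float = 1.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the URL of the created resource, or None if it cannot be determined."""
    # Synchronous case: the POST itself was answered with 201 Created.
    if post_response.status == 201:
        return post_response.headers.get("Location")

    # Asynchronous case: the POST was answered with 303 to a status monitor.
    if post_response.status != 303:
        return None
    monitor_url = post_response.headers["Location"]

    for _ in range(max_polls):
        resp = get(monitor_url)
        if resp.status == 202:
            # Proposed meaning of 202 on GET: not finished yet, try again later.
            sleep(poll_interval)
            continue
        if resp.status in (201, 302, 303):
            # Finished: Location names the newly created resource.
            return resp.headers.get("Location")
        # Anything else (including a plain 200 status page) is not automatable here.
        return None
    return None
```

A caller would pass in the response to the original POST together with a function that issues a GET on the monitor URL. A monitor that answers 200 with an HTML status page gives the agent nothing to act on, which is the gap the 202-on-GET clarification is meant to close.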
Received on Saturday, 18 July 2009 22:37:21 UTC