- From: Bill de hOra <bill@dehora.net>
- Date: Sat, 05 Apr 2008 13:33:15 +0100
- To: "Roy T. Fielding" <fielding@gbiv.com>
- CC: Brian McBarron <bpm@google.com>, google-gears-eng@googlegroups.com, Charles Fry <fry@google.com>, Mark Nottingham <mnot@yahoo-inc.com>, Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>
Roy T. Fielding wrote:
>
> On Apr 4, 2008, at 6:50 AM, Brian McBarron wrote:
>> What we need to implement
>> http://code.google.com/p/google-gears/wiki/ResumableHttpRequestsProposal
>> is a mechanism which:
>
> Don't reimplement TCP on top of HTTP.  The result will not work
> as well as TCP.  Among other reasons, the connection will be reset
> at the TCP level and the RST will roll back the TCP windows to the
> point where your little client will never see the ack from the server.
>
> Re-think your solution instead.
>
> Consider this.  Any server that is capable of supporting such a
> mechanism will have to maintain temporary files that are uniquely
> accessible to that client for the amount of time that the server
> is willing to allow resumption.  So, let's give that resource a URI.
>
> The first request is sent to the origin server without needing
> to know if it (or anything along the path) understands the extension.
>
>    POST /my/favorites HTTP/1.1
>    Host: example.com
>    ....
>
> If the server receives that message with no Via field or with a
> Via field that indicates no HTTP/1.0 intermediaries, then the server
> can respond immediately with
>
>    103 Resumable
>    Location: http://example.com/my/favorites;r123
>
> Note that the URI above is created by the server, so it could be
> anything, though a sensible security policy would limit it to a
> suffix of the requested POST URI.  You'll have to try it to see
> if this works in practice with HTTP/1.1 user agents -- you may
> need an additional request header to indicate it if not.
>
> If the connection drops, the client can then do
>
>    HEAD /my/favorites;r123 HTTP/1.1
>    Host: example.com
>
> to determine how much of the content was successfully transferred
> and stored on the origin server, and append to that content with
> a simple
>
>    POST /my/favorites;r123 HTTP/1.1
>    Host: example.com
>
> and repeat as necessary.  Auth/access control can be added as normal.
> I suggest, however, that anything as fragile as a resumed request
> be accompanied with some sort of integrity check, like content-md5.
>
> The extension is therefore achieved with the addition of one status
> code and should work with any HTTP/1.1 compliant server chain.
> If it doesn't, you are no worse off than before (and HTTP is not
> complicated and constipated by unnecessary Expect fields).

So this is interesting. I'm working on exactly the same thing for
uploads from mobile networks, where connections get dropped all the
time. Typically these are photo/video and as the cameras get to 5mp,
the chances of a failed upload go up. I've split the problem into
two parts:

- upload resumption
- upload packets

Upload resumption means continuing from a known offset; I'm not using
new response codes or expectations for that. Upload packets means
splitting the image in advance, and uploading each part separately.

== Upload resumption

For upload resumption, it goes like this:

   POST /my/favorites HTTP/1.1
   Host: example.com

   ...binary...

   201 Created
   Location: http://example.com/my/favorites/image5
   ETag: "c180de84f991g8"

That sequence integrates with Atom Protocol, which is important - a new
response code means upgrades for the successful AtomPub case. If the
client never gets the 201, the client can retry the POST (the server
will have what I used to call a 'phantom' in HTTPLR, but that can be
cleaned up). The ETag can help with requests arriving out of order (my
understanding is that such heisenbugs are possible over mobile networks;
but I'm happy to be corrected on that).

If the connection drops, the client might be confused, but it can check
Content-Length:

   HEAD /my/favorites/image5 HTTP/1.1
   Host: example.com

   200 OK
   Location: http://example.com/my/favorites/image5
   Content-Length: nnn
   ETag: "c180de84f991g8"

Then the client can send the rest of the data offset from 'nnn'.
But to make the semantics clear I use PATCH (or overloaded POST with
X-HTTP-Method-Override). Definitely not PUT.

   PATCH /my/favorites/image5 HTTP/1.1
   Host: example.com
   If-None-Match: "e180ee84f0671b1"

   ...nnn+...

== Upload packets

This is where the image is split in advance and the parts are uploaded
to the server, perhaps in parallel. It *is* different and is I think a
reinvention of TCP a few layers up, largely because unlike the
resumption case, the packets can generally arrive out of order
(seriously, we're talking photomosaics). To deal with that means being
able to say this packet is "m of n". It gets complicated quite quickly
and the fundamental tradeoff seems to be this - do you tunnel the
protocol through http, or do you expose each part as a resource, a
surrogate of some overall resource that denotes the uploaded content? I
have some ideas on how to do this but nothing concrete.

To be honest, part of me isn't sure whether it's a good idea to
overload http and resource semantics in this way so we can play at being
BitTorrent. It seems to be valuable (or at least wished for) in the
mobile space; repeated failed uploads are a real problem there. I doubt
it can be done without extensions, but would be delighted to be wrong
about that.

Anyway, I think the resumption case can be dealt with straightforwardly
without any extensions to http.

Bill
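
For illustration only, the client side of the resumption case above could be sketched roughly like this in Python. It assumes the client has already done the HEAD and read 'nnn' from Content-Length; the function name build_resume_request and its parameters are hypothetical, nothing here touches the network, and conditional headers such as the ETag check are deliberately left out.

```python
from typing import Dict, Tuple


def build_resume_request(
    path: str,
    host: str,
    payload: bytes,
    stored_length: int,
    use_override: bool = False,
) -> Tuple[str, str, Dict[str, str], bytes]:
    """Describe the request that sends the bytes the server does not yet have.

    stored_length is the 'nnn' reported by Content-Length on the HEAD response.
    Returns (method, path, headers, body); sending it is left to the caller.
    """
    if not 0 <= stored_length <= len(payload):
        # The server claims more (or fewer) bytes than make sense; the safe
        # move is to stop and start a fresh upload rather than guess.
        raise ValueError("stored length does not fit the local payload")

    # Everything from offset nnn onwards is what still has to be sent.
    body = payload[stored_length:]
    headers = {"Host": host, "Content-Length": str(len(body))}

    if use_override:
        # Overloaded POST for stacks that cannot issue PATCH directly.
        method = "POST"
        headers["X-HTTP-Method-Override"] = "PATCH"
    else:
        method = "PATCH"

    return method, path, headers, body
```

With a 10-byte payload and a HEAD response reporting 4 bytes stored, the sketch yields a PATCH carrying the last 6 bytes. Raising on an impossible offset is a design choice of the sketch, not something the flow above prescribes.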
Received on Saturday, 5 April 2008 12:33:55 UTC