Re: Deploying new expectation-extensions from Bill de hOra on 2008-04-05 (ietf-http-wg@w3.org from April to June 2008)

From: Bill de hOra <bill@dehora.net>
Date: Sat, 05 Apr 2008 13:33:15 +0100
To: "Roy T. Fielding" <fielding@gbiv.com>
CC: Brian McBarron <bpm@google.com>, google-gears-eng@googlegroups.com, Charles Fry <fry@google.com>, Mark Nottingham <mnot@yahoo-inc.com>, Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <47F7718B.1060909@dehora.net>
Roy T. Fielding wrote:
> 
> On Apr 4, 2008, at 6:50 AM, Brian McBarron wrote:
>> What we need to implement 
>> http://code.google.com/p/google-gears/wiki/ResumableHttpRequestsProposal 
>> is a mechanism which:
> 
> Don't reimplement TCP on top of HTTP.  The result will not work
> as well as TCP.  Among other reasons, the connection will be reset
> at the TCP level and the RST will rollback the TCP windows to the
> point where your little client will never see the ack from the server.
> 
> Re-think your solution instead.
> 
> Consider this.  Any server that is capable of supporting such a
> mechanism will have to maintain temporary files that are uniquely
> accessible to that client for the amount of time that the server
> is willing to allow resumption.  So, let's give that resource a URI.
> 
> The first request is sent to the origin server without needing
> to know if it (or anything along the path) understands the extension.
> 
>    POST /my/favorites HTTP/1.1
>    Host: example.com
>    ....
> 
> If the server receives that message with no Via field or with a
> Via field that indicates no HTTP/1.0 intermediaries, then the server
> can respond immediately with
> 
>    103 Resumable
>    Location: http://example.com/my/favorites;r123
> 
> Note that the URI above is created by the server, so it could be
> anything, though a sensible security policy would limit it to a
> suffix of the requested POST URI.  You'll have to try it to see
> if this works in practice with HTTP/1.1 user agents -- you may
> need an additional request header to indicate it if not.
> 
> If the connection drops, the client can then do
> 
>    HEAD /my/favorites;r123 HTTP/1.1
>    Host: example.com
> 
> to determine how much of the content was successfully transferred
> and stored on the origin server, and append to that content with
> a simple
> 
>    POST /my/favorites;r123 HTTP/1.1
>    Host: example.com
> 
> and repeat as necessary.  Auth/access control can be added as normal.
> I suggest, however, that anything as fragile as a resumed request
> be accompanied with some sort of integrity check, like content-md5.
> 
> The extension is therefore achieved with the addition of one status
> code and should work with any HTTP/1.1 compliant server chain.
> If it doesn't, you are no worse off than before (and HTTP is not
> complicated and constipated by unnecessary Expect fields).

So this is interesting. I'm working on exactly the same thing for
uploads from mobile networks, where connections get dropped all the
time. Typically these are photo/video uploads, and as cameras get to
5MP the chances of an upload failing partway through go up.

I've split the problem into two parts:

  - upload resumption
  - upload packets

Upload resumption means continuing from a known offset; I'm not using
new response codes or expectations for that.  Upload packets means
splitting the image in advance and uploading each part separately.

== Upload resumption

For upload resumption, it goes like this:

    POST /my/favorites HTTP/1.1
    Host: example.com

    ...binary...


    201 Created
    Location: http://example.com/my/favorites/image5
    ETag: "c180de84f991g8"

That sequence integrates with the Atom Protocol, which is important: a
new response code would mean upgrades even for the successful AtomPub
case.

If the client never gets the 201, the client can retry the POST (the
server will have what I used to call a 'phantom' in HTTPLR, but that
can be cleaned up).

The ETag can help with requests arriving out of order (my understanding
is that such heisenbugs are possible over mobile networks; but I'm happy 
to be corrected on that).

If the connection drops, the client might be confused, but it can check 
Content-Length:

    HEAD /my/favorites/image5 HTTP/1.1
    Host: example.com

	
    200 OK
    Location: http://example.com/my/favorites/image5
    Content-Length: nnn
    ETag: "c180de84f991g8"

Then the client can send the rest of the data offset from 'nnn'. But to 
make the semantics clear I use PATCH (or overloaded POST with 
X-HTTP-Method-Override). Definitely not PUT.

    PATCH /my/favorites/image5 HTTP/1.1
    Host: example.com
    If-None-Match: "e180ee84f0671b1"

    ...nnn+...
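
A minimal Python sketch of the client side of this step, assuming the
client still holds the full image bytes and has read 'nnn' from the
Content-Length of the HEAD response. The names remaining_body and
build_resume_request are hypothetical, and the conditional header from
the example above is deliberately left out; it only shows how the body
is cut at the stored offset.

```python
def remaining_body(data: bytes, stored_length: int) -> bytes:
    """Return the bytes the server does not have yet.

    stored_length is the Content-Length reported by the HEAD response.
    """
    if not 0 <= stored_length <= len(data):
        raise ValueError("server reports an offset outside the local data")
    return data[stored_length:]


def build_resume_request(path: str, host: str, data: bytes,
                         stored_length: int, method: str = "PATCH") -> bytes:
    """Build the raw request that sends the rest of the data from the offset."""
    body = remaining_body(data, stored_length)
    head = (
        f"{method} {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body
```

If the server reports a length larger than the local data, something
other than a dropped connection has happened, so the sketch raises
rather than guessing.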


== Upload packets

This is where the image is split in advance and the parts are uploaded
to the server, perhaps in parallel.  It *is* different and is, I think,
a reinvention of TCP a few layers up, largely because unlike the
resumption case, the packets can generally arrive out of order
(seriously, we're talking photomosaics).  To deal with that means being
able to say this packet is "m of n".
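
To make the "m of n" bookkeeping concrete, here is a rough Python
sketch, not a protocol proposal. It assumes each part is labelled with
a 1-based index m and the total count n; the names split_into_parts and
reassemble are hypothetical. How the labels would travel over HTTP is
left open.

```python
from typing import Iterable, List, Optional, Tuple

Part = Tuple[int, int, bytes]  # (m, n, payload): part m of n, 1-based


def split_into_parts(data: bytes, part_size: int) -> List[Part]:
    """Split data in advance into parts labelled "m of n"."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    chunks = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    if not chunks:
        chunks = [b""]  # an empty upload is still one part
    n = len(chunks)
    return [(m, n, chunk) for m, chunk in enumerate(chunks, start=1)]


def reassemble(parts: Iterable[Part]) -> Optional[bytes]:
    """Rebuild the content from parts received in any order.

    Returns None while parts are still missing.
    """
    received = {}
    total = None
    for m, n, chunk in parts:
        if total is None:
            total = n
        elif n != total:
            raise ValueError("parts disagree about n")
        if not 1 <= m <= n:
            raise ValueError("part index out of range")
        received[m] = chunk  # a repeated part simply overwrites itself
    if total is None or len(received) < total:
        return None
    return b"".join(received[m] for m in range(1, total + 1))
```

The server side has to hold every part until the last one arrives,
which is the same temporary-state cost as in the resumption case.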

It gets complicated quite quickly and the fundamental tradeoff seems to 
be this - do you tunnel the protocol through http, or do you expose each 
part as a resource, a surrogate of some overall resource that denotes 
the uploaded content?

I have some ideas on how to do this but nothing concrete. To be honest,
part of me isn't sure whether it's a good idea to overload http and
resource semantics in this way so we can play at being BitTorrent. It
seems to be valuable (or at least wished for) in the mobile space;
repeatedly failing uploads are a real problem there. I doubt it can be
done without extensions, but would be delighted to be wrong about that.


Anyway, I think the resumption case can be dealt with straightforwardly
without any extensions to http.

Bill
Received on Saturday, 5 April 2008 12:33:55 UTC