Partial uploads, resumable requests, and progress of long-running operations

Hello HTTP WG,

I would like to propose some new HTTP features that help with large requests. They allow user agents to resume an unsafe request if it is interrupted, upload large documents in multiple parts, receive real-time updates on processing, and advertise these client capabilities to the server. While surveying HTTP server applications, I noted a proliferation of ad hoc mechanisms for performing large uploads and monitoring the progress of operations, all of them server-specific and non-interoperable. This seems like an area of HTTP that is begging for standardization, so I wrote a specification and a proof-of-concept implementation. [1]

I've split these features into three documents, each of which may be implemented separately, as desired by origin servers and user agents: 

- "partial-upload" specifies a PATCH media type that writes to a specific byte range of a server resource, allowing a file to be uploaded in many smaller requests. Previously, applications would have to define a service-specific mechanism for accepting multiple requests, representing multiple segments of the single upload (e.g. [2]).

- "resume-request" specifies a way to address the current request body and/or response message. Then, if the request is interrupted, a client may resume the upload by appending data in a PATCH request. Likewise, by making a GET request to the response-message or response Content-Location, clients may resume an interrupted response. Previously, an upload would have to be retried from the start, and responses might be lost entirely.

- "progress" specifies a header allowing the server to update the client on progress it is making while generating a response. Previously, applications would have to define a custom mechanism for reading the progress of an operation, typically a read via a separate HTTP request, or WebSockets.
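
As a rough illustration of the arithmetic behind the first two items (this is not taken from the drafts), the sketch below plans the byte ranges a client might send when splitting an upload into segments, or when resuming from an offset the server has already stored. The function name plan_segments and its parameters are hypothetical. The only borrowed syntax is the "bytes first-last/total" form of Content-Range from RFC 7233; how the drafts actually carry the range inside a PATCH request is not shown here.

```python
from typing import Iterator, Tuple


def plan_segments(
    total_size: int, segment_size: int, start: int = 0
) -> Iterator[Tuple[int, int, str]]:
    """Yield (first_byte, last_byte, content_range) for each segment.

    Hypothetical helper: `start` is the number of bytes the server
    already holds, so passing a non-zero value models a resumed upload.
    Byte positions are inclusive, as in RFC 7233 Content-Range.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if not 0 <= start <= total_size:
        raise ValueError("start must be between 0 and total_size")

    offset = start
    while offset < total_size:
        # The last byte of this segment, clamped to the end of the document.
        last = min(offset + segment_size, total_size) - 1
        yield offset, last, f"bytes {offset}-{last}/{total_size}"
        offset = last + 1


if __name__ == "__main__":
    # A 10-byte document uploaded in segments of at most 4 bytes.
    for first, last, content_range in plan_segments(10, 4):
        print(first, last, content_range)
```

Resuming is then just a matter of calling the same function with start set to the number of bytes the server reports having received, so the client sends only what is missing rather than retrying from the beginning.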

Each of these features is implemented with new headers over 1xx interim responses; no additional HTTP requests are needed, except as necessary after an interrupted connection (e.g. because of a TCP reset). These features may be implemented by HTTP client libraries or directly in the user agent, without any special support required from application developers or end users. Further, they are intended to be feature-compatible with all similar patterns seen in the wild today.

The repository includes a working proof-of-concept; at present, however, it requires this patch to Node.js [3] (with any luck, this will be merged and available in the next release of Node.js).

I intend to submit these for consideration as Internet standards; I suppose that would be through this working group. The documents are split up the way they are so that they can also be considered separately; I predict the new media type for patching byte ranges would move much faster than the others.

Please review the documents and provide any feedback you may have!

Thank you,

Austin Wright.

[1] <https://github.com/awwright/http-progress> A few months ago I considered using the 102 (Processing) status code to convey progress of an operation, but realized it had no mechanism to convey additional information, so I wrote up a document describing the “Progress” header. However, I had some difficulty figuring out how the client might receive the final status if a long response was interrupted: while 202 Accepted seems to be the standard solution, it offers no guidance to user agents to get an actual status code. Separately, I wrote that you might be able to use multipart/byteranges in a PATCH request body to patch part of a resource. Shortly thereafter I realized these two problems are related, and wrote resume-request, then a proof-of-concept.

[2] <https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingRESTAPImpUpload.html> which, despite Amazon’s description, I do not believe to be RESTful, because its use requires prior knowledge of how that service works; a standard user agent would not be able to discover or make use of this functionality.

[3] <https://github.com/nodejs/node/pull/28459> This patch is required to receive headers in 1xx interim responses; presently Node.js v12.4.0 only exposes the status code, discarding the headers (even though they are fully parsed by the internal parser).

Received on Sunday, 14 July 2019 09:18:25 UTC