Re: Idempotency-Key for resuming requests (TUS Resumable Uploads Protocol) from Austin William Wright on 2022-10-07 (ietf-http-wg@w3.org from October to December 2022)

From: Austin William Wright <aaa@bzfx.net>
Date: Fri, 7 Oct 2022 12:31:55 -0700
To: Marius Kleidl <marius@transloadit.com>
Cc: ietf-http-wg <ietf-http-wg@w3.org>
Message-Id: <09E9720A-9E44-4EF4-A795-7973659A8D70@bzfx.net>
On Oct 7, 2022, at 08:02, Marius Kleidl <marius@transloadit.com> wrote:
> 
> Dear Austin,
> 
> you brought up a good point by throwing Idempotency-Key into the recent Interim meeting. Idempotency and resumability are related but not the same, because when resuming an upload you want to start the data transfer at a different offset to avoid transmitting the same data again. I am missing this part from your proposal. How does the client obtain the information which data the server has already received and which part it must still transmit? Or did I just miss that piece?

Indeed, I hand-waved away the details about how exactly the unsent data is resumed, and how you resynchronize the client state with the server (determining how many bytes the server received or lost). I assumed the current technique, using an Upload-Offset header and making a HEAD request, will suffice.

But since GET/HEAD is supposed to be cacheable (and so returning a client-specific Upload-Offset would be dubious), let me propose an alternative way of resynchronizing the client state. There are three parts to this:

(1) While you are making the first upload, the server can (optionally) periodically send “1xx Received” responses indicating how many bytes have been saved to durable storage. This allows the client to forget about these bytes, saving memory. When resuming an upload, the client can start from the last sent “1xx Received” offset.

(2) If you make a RESUME request with no upload body, then the server simply returns how many bytes it has saved. This just swaps out the HEAD method <https://www.ietf.org/archive/id/draft-ietf-httpbis-resumable-upload-00.html#name-offset-retrieving-procedure-2> for the new method.

(3) You make a RESUME request with an upload body, but you don’t know the offset. So you wait for the server to send the “1xx Received” offset indicating the bytes received from the previous request; the client then resumes the upload at this point (and the server sends additional “1xx Received” responses as it processes the incoming data). (Note how this is quite similar to the 100 Continue workflow, where the client waits for acknowledgement from the server before proceeding with the upload.)
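
To make the three parts concrete, here is a rough in-memory sketch of the flow. It is only an illustration: RESUME and “1xx Received” are proposals, not existing HTTP features, and the names UploadServer, receive, resume_offset, and upload are hypothetical. No real network traffic is involved; the return value of receive stands in for a “1xx Received” offset.

```python
from dataclasses import dataclass, field


@dataclass
class UploadServer:
    """Toy stand-in for a server supporting the proposed RESUME flow."""

    # Bytes saved to "durable storage", keyed by a hypothetical upload key.
    saved: dict[str, bytearray] = field(default_factory=dict)

    def receive(self, key: str, chunk: bytes) -> int:
        """Store a chunk; the returned offset models a "1xx Received"."""
        buf = self.saved.setdefault(key, bytearray())
        buf.extend(chunk)
        return len(buf)

    def resume_offset(self, key: str) -> int:
        """Model a RESUME request with no body: report the bytes saved."""
        return len(self.saved.get(key, b""))


def upload(server, key, data, chunk_size=4, fail_after=None):
    """Send data starting at the server-reported offset.

    fail_after simulates a dropped connection after that many chunks.
    Returns the last offset the server acknowledged.
    """
    # Parts (2)/(3): learn the offset from the server before sending.
    offset = server.resume_offset(key)
    sent_chunks = 0
    while offset < len(data):
        if fail_after is not None and sent_chunks >= fail_after:
            break  # simulated connection loss
        # Part (1): each acknowledgement lets the client forget sent bytes.
        offset = server.receive(key, data[offset:offset + chunk_size])
        sent_chunks += 1
    return offset
```

Calling upload once with fail_after set, and then again without it, resumes from the acknowledged offset instead of retransmitting from byte zero.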

Finally, yes, it also seems to me that idempotency and resumability aren't exactly the same in general. It occurred to me, a request can be idempotent because it overwrites the server state entirely (PUT), or because the request is conditional (If-Match).

What’s important here is that both Idempotency-Key and resumable uploads are the same type of conditional request: the request should only be executed if it hasn’t previously been completed.
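
As a minimal sketch of that "execute only if not previously completed" condition, assuming the server keeps a table of finished requests keyed by Idempotency-Key (the class and attribute names here are hypothetical, and a real server would also need expiry and persistence):

```python
class IdempotentHandler:
    """Runs an action at most once per idempotency key."""

    def __init__(self):
        self.completed = {}   # key -> stored result of the finished request
        self.executions = 0   # how many times an action actually ran

    def handle(self, key, action):
        # The condition: skip execution if this key has already completed.
        if key in self.completed:
            return self.completed[key]
        self.executions += 1
        result = action()
        self.completed[key] = result
        return result
```

A retry with the same key gets the stored result back and the action does not run a second time.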

Thanks,

Austin.


> 
> That being said, I think that an idempotency key could play an important role in resumable uploads, especially considering the creation of uploads resources.
> 
> Best regards,
> Marius
> 
> 
> On Thu, Oct 6, 2022, 1:11, Austin William Wright <aaa@bzfx.net> wrote:
> Hello HTTP WG,
> 
> It occurred to me that, in general, where a client wants resumable requests, it also wants those requests to be idempotent. I noted the HTTP APIs WG is considering an Idempotency-Key header, and it seems to me these should be part of the same mechanism. Let me explain how that might work:
> 
> If the client sends an Idempotency-Key, then the server can respond with a 1xx (Resumption Supported) confirming that the key will be honored for resuming the request.
> 
> Clients that want resumption SHOULD send Idempotency-Key, but they don’t have to (for example, to minimize UA fingerprinting). If the client does not send Idempotency-Key, the server should still send 1xx (Resumption Supported) to assign the request an Idempotency-Key. But it doesn’t have to (for example, if 1xx is filtered by gateways).
> 
> Then when the request is interrupted, the client may resume the request with the same URI and Idempotency-Key of the request being resumed. By defining a RESUME method, a client could attempt to send a RESUME request in the hopes the server supports resumption, even if it didn’t receive a 1xx response. (Using the same URI allows the server to partition incomplete requests/uploads by the resource URI. There should be little need to re-send the method, since the server must store the details of the original request anyways, but a header could be defined if such a need is found.)
> 
> I see few downsides with this technique. There’s some built-in redundancy so resumption will work even if one party is buggy, and if a new method is used, there’s no possibility that a server could misinterpret the request. Idempotency-Key would replace “Upload-Token” entirely.
> 
> Thoughts?
> 
> Thanks,
> 
> Austin.
Received on Friday, 7 October 2022 19:32:12 UTC