Re: draft-ietf-httpbis-resumable-upload-01

Hi Marius,

Responding inline.

On Wed, Aug 9, 2023 at 6:16 PM Marius Kleidl <marius@transloadit.com> wrote:

> Hi Lucas,
>
> I had a brief read through the digest draft today while thinking how this
> could be applied to resumable uploads. The new integrity fields, with their
> separation between content and representation, should fit well with the
> approach of resumable uploads. I thought I would name a few concrete
> examples to see if my understanding of the digest draft is correct:
>
> 1) Assume that the client knows the digest of the entire file that it
> wants to upload. It can then set Repr-Digest in the Upload Creation
> Procedure and upload the data. If the transfer is interrupted it is just
> resumed as normal with HEAD and PATCH. Once the upload is complete, the
> server can verify the integrity of the uploaded data by comparing it to the
> Repr-Digest from the Upload Creation Procedure. If they don't match, the
> server can reject the upload and return an error to the client or similar.
>

Yeah, I think that mostly holds true. Note, however, that the representation
depends on factors like Content-Encoding, which makes me think that a
resumable upload is also tied to a representation. I.e., it isn't possible
to start an upload using gzip, have it fail partway through, and then resume
the same upload with brotli. If that's true, we probably want some text in
the document to make it super clear.
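
To illustrate why the digest is tied to the content coding, here is a rough
sketch, assuming sha-256 and a hypothetical repr_digest() helper (the name
and the sample data are mine, not from either draft). The same file yields
different Repr-Digest values depending on whether gzip is applied:

```python
import base64
import gzip
import hashlib


def repr_digest(representation: bytes) -> str:
    """Format a sha-256 Repr-Digest value as 'sha-256=:<base64>:'."""
    digest = hashlib.sha256(representation).digest()
    return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"


# Hypothetical file contents, used only for illustration.
file_bytes = b"hello resumable world\n" * 100

# Representation data with no content coding applied.
identity_digest = repr_digest(file_bytes)

# Representation data as sent with Content-Encoding: gzip.
# mtime=0 keeps the gzip output deterministic for this example.
gzip_digest = repr_digest(gzip.compress(file_bytes, mtime=0))

print("identity:", identity_digest)
print("gzip:    ", gzip_digest)
```

A Repr-Digest sent in the Upload Creation Procedure would only verify if
every later request used the same coding.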

>
> 2) If the client does not know the digest of the entire upload at the
> beginning (because it is streamed from another resource), it can compute
> the digest while the upload is running and then include Repr-Digest as a
> trailer on the final Upload Creation Procedure or Upload Appending
> Procedure. The server can then verify as in 1).
>

Yep
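
A minimal sketch of how a client might keep that running digest, assuming
sha-256 and a hypothetical RunningReprDigest helper (the class and its
method names are made up for illustration, not taken from the drafts):

```python
import base64
import hashlib


class RunningReprDigest:
    """Accumulates a sha-256 digest over upload data as it is streamed."""

    def __init__(self) -> None:
        self._hasher = hashlib.sha256()

    def update(self, chunk: bytes) -> None:
        # Call exactly once for each chunk of representation data,
        # even if that chunk is later retransmitted after a resume.
        self._hasher.update(chunk)

    def field_value(self) -> str:
        # Value suitable for a Repr-Digest trailer on the final request.
        encoded = base64.b64encode(self._hasher.digest()).decode("ascii")
        return "sha-256=:" + encoded + ":"


running = RunningReprDigest()
for chunk in (b"first part, ", b"second part, ", b"last part"):
    running.update(chunk)

print("Repr-Digest:", running.field_value())
```

The hasher state has to survive across the Upload Appending Procedures so
that the final value covers the whole upload and not just the last request.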


>
> 3) If the client does not compute the digest, but wants to query the
> server's digest, it can include the Want-Repr-Digest in the Upload Creation
> Procedure. On the final response, the server should then include the
> Repr-Digest whose value corresponds to the entire upload that it has
> received. The client then may verify the digest or provide it to the
> client's users.
>

The Want- headers are a bit nebulous in reality. This might work or it
might not. It would be equally fine for a server to just send the digest
without observing a Want- header. But maybe you're thinking about trying to
avoid unnecessary server overhead?
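
For what it's worth, a server that does want to honour the preference could
do something like the sketch below. It assumes the field value is a simple
list of algorithm=integer pairs with preferences from 0 to 10, and it uses a
naive split rather than a real Structured Fields parser. pick_algorithm() is
a hypothetical name:

```python
from typing import AbstractSet, Optional


def pick_algorithm(want_value: str, supported: AbstractSet[str]) -> Optional[str]:
    """Return the supported algorithm with the highest preference, if any."""
    best_name: Optional[str] = None
    best_pref = 0
    for member in want_value.split(","):
        name, _, pref = member.strip().partition("=")
        try:
            weight = int(pref)
        except ValueError:
            continue  # Skip members this naive parser cannot read.
        # A preference of 0 means "not acceptable", so it never wins.
        if name in supported and 1 <= weight <= 10 and weight > best_pref:
            best_name, best_pref = name, weight
    return best_name


choice = pick_algorithm("sha-512=3, sha-256=10, unixsum=0", {"sha-256", "sha-512"})
print(choice)
```

A server that gets None back here can simply skip computing a digest, which
is where the overhead saving would come from.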


> 4) If the client sets Content-Digest on an Upload Creation Procedure or
> Upload Appending Procedure, it only applies to the specific request body.
> So if the request transmission gets interrupted, the server is not able to
> verify the integrity and must reject the entire request without appending
> its content to the upload. This makes the request effectively
> transactional, where either the entire body or nothing is appended to the
> upload. This is a bit contrary to the intention of uploads that can be
> resumed from the point where they failed, but if you split an upload into
> multiple requests, this might still be interesting for some applications.
>

How endpoints deal with digest validation failures is not defined by the
digest spec itself, meaning there are many possible options. I agree it
seems wrong for a server to reject a partial upload because the
Content-Digest didn't match. So perhaps the caveat would be that the
Content-Digest is only validated when the full content is received.
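
As a sketch of that one option (not something either draft mandates), a
server could skip validation for interrupted requests and check the digest
only once the declared length has arrived. handle_body() and its return
strings are hypothetical, and the comparison assumes a single sha-256 member
rather than doing real Structured Fields parsing:

```python
import base64
import hashlib
from typing import Optional


def handle_body(received: bytes, declared_length: int,
                content_digest: Optional[str]) -> str:
    """Decide what to do with a request body under the 'validate only
    when complete' option."""
    if len(received) < declared_length:
        # Transfer was interrupted: keep the bytes, skip validation.
        return "appended-partial"
    if content_digest is not None:
        actual = hashlib.sha256(received).digest()
        expected = "sha-256=:" + base64.b64encode(actual).decode("ascii") + ":"
        if content_digest != expected:
            return "rejected"
    return "appended"


body = b"some request content"
good = "sha-256=:" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii") + ":"

print(handle_body(body, len(body), good))
print(handle_body(body[:5], len(body), good))
```

The trade-off is that a partial body is appended unverified, so corruption
would only surface later through something like Repr-Digest.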


> Do these examples align with the intentions behind the digest draft or did
> I get something wrong? All in all, this appears like a great fit!
>

They seem to align to me, modulo some details or options. Initially I was
hesitant to add too much detail or prescribed behaviour but, given your
list of use cases, perhaps we should consider what should be in scope for
resumable clients or servers to do, and state clearly what the interop
expectations would be. Thoughts from the WG?

Cheers
Lucas



Received on Wednesday, 9 August 2023 17:39:27 UTC