- From: Lucas Pardue <lucaspardue.24.7@gmail.com>
- Date: Wed, 9 Aug 2023 18:39:10 +0100
- To: Marius Kleidl <marius@transloadit.com>
- Cc: Rob Sayre <sayrer@gmail.com>, Guoye Zhang <guoye_zhang@apple.com>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
- Message-ID: <CALGR9oYFC8m4f2DxodXoCw0=RD-MoCeB3i-p0KYEWXhooz-kRA@mail.gmail.com>
Hi Marius,

Responding in-line

On Wed, Aug 9, 2023 at 6:16 PM Marius Kleidl <marius@transloadit.com> wrote:

> Hi Lucas,
>
> I had a brief read through the digest draft today while thinking how this
> could be applied to resumable uploads. The new integrity fields with their
> separation between content and representation should fit perfectly to the
> approach of resumable uploads. I thought I would name a few concrete
> examples to see if my understanding of the digest draft is correct:
>
> 1) Assume that the client knows the digest of the entire file that it
> wants to upload. It can then set Repr-Digest in the Upload Creation
> Procedure and upload the data. If the transfer is interrupted it is just
> resumed as normal with HEAD and PATCH. Once the upload is complete, the
> server can verify the integrity of the uploaded data by comparing it to the
> Repr-Digest from the Upload Creation Procedure. If they don't match, the
> server can reject the upload and return an error to the client or similar.
>

Yeah, I think that mostly holds true. Note, however, that the representation
depends on factors like Content-Encoding, which makes me think that a
resumable upload is also tied to a representation. I.e., it isn't possible to
start an upload using gzip, have it fail part way through, and then resume the
upload with brotli. If that's true, we probably want some text in the document
to make it super clear.

>
> 2) If the client does not know the digest of the entire upload at the
> beginning (because it is streamed from another resource), it can compute
> the digest while the upload is running and then include Repr-Digest as a
> trailer on the final Upload Creation Procedure or Upload Appending
> Procedure. The server can then verify as in 1).
>

Yep

>
> 3) If the client does not compute the digest, but wants to query the
> server's digest, it can include the Want-Repr-Digest in the Upload Creation
> Procedure. On the final response, the server should then include the
> Repr-Digest whose value corresponds to the entire upload that it has
> received. The client then may verify the digest or provide it to the
> client's users.
>

The Want- headers are a bit nebulous in reality. This might work or it might
not. It would be equally fine for a server to just send the digest without
observing a Want- header. But maybe you're thinking about trying to avoid
unnecessary server overhead?

> 4) If the client sets Content-Digest on an Upload Creation Procedure or
> Upload Appending Procedure, it only applies to the specific request body.
> So if the request transmission gets interrupted, the server is not able to
> verify the integrity and must reject the entire request without appending
> its content to the upload. This makes the request effectively
> transactional, where either the entire body or nothing is appended to the
> upload. This is a bit contrary to the intention of uploads that can be
> resumed from the point where they failed, but if you split an upload into
> multiple requests, this might still be interesting for some applications.
>

How endpoints deal with digest validation failures is not defined by the
digest spec itself, meaning there are many possible options. I agree it seems
wrong that a server would reject a partial upload if the Content-Digest didn't
match. So perhaps the caveat would be that the Content-Digest is only
validated when the full content is received.

> Do these examples align with the intentions behind the digest draft or did
> I get something wrong? All in all, this appears like a great fit!
>

They seem to align to me, modulo some details or options. Initially I was
hesitant to add too much detail or prescribed behaviour but, given your list
of use cases, perhaps we should consider what should be in scope for resumable
clients or servers to do, and state clearly what the interop expectations
would be. Thoughts from the WG?

Cheers
Lucas
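
For illustration, the difference between the whole-upload Repr-Digest in examples 1) to 3) and the per-request Content-Digest in example 4) can be sketched in Python. This is only a rough sketch and is not taken from either draft: the helper names `digest_field`, `repr_digest`, and `content_digests` are hypothetical, the chunks stand in for the bodies of successive upload requests, and it assumes sha-256, the `sha-256=:<base64>:` field serialization, and no content coding applied to the data.

```python
import base64
import hashlib
from typing import Iterable, List


def digest_field(data: bytes) -> str:
    """Serialize a SHA-256 digest of `data` as `sha-256=:<base64>:`."""
    encoded = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
    return f"sha-256=:{encoded}:"


def repr_digest(chunks: Iterable[bytes]) -> str:
    """Digest over the entire upload, computed incrementally.

    This mirrors example 2): the hash is updated as each chunk is sent,
    so the final value could be supplied once the last chunk is known.
    """
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)
    encoded = base64.b64encode(hasher.digest()).decode("ascii")
    return f"sha-256=:{encoded}:"


def content_digests(chunks: Iterable[bytes]) -> List[str]:
    """One digest per request body, as in example 4).

    Each value covers only the bytes carried by that single request.
    """
    return [digest_field(chunk) for chunk in chunks]


if __name__ == "__main__":
    # Hypothetical upload split across three requests.
    parts = [b"first part, ", b"second part, ", b"third part"]
    print("Repr-Digest:", repr_digest(parts))
    for index, value in enumerate(content_digests(parts)):
        print(f"Content-Digest for request {index}:", value)
```

The Repr-Digest value is the same however the upload is split into requests, whereas each Content-Digest value changes with the chunk boundaries. A server could therefore only check a Content-Digest after receiving that request body in full.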
Received on Wednesday, 9 August 2023 17:39:27 UTC