Re: draft-ietf-httpbis-resumable-upload-01

Hi Lucas,

Thanks for the feedback! I agree that it is probably best if the uploads
draft briefly mentions the interaction with digest to ensure that
implementations are compatible.

> Note however that the representation depends on factors like
> content-encoding. Which makes me consider that a resumable upload is also
> tied to representation. I.e. it isn't possible to start an upload using
> gzip, have it fail part way through, and then resume an upload with brotli.
> If that's true, we probably want some text in the document to make it super
> clear.

That's a good point. I agree that the encoding should not change throughout
the upload.
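
As a minimal sketch of such a check on the server side, assuming a
hypothetical UploadState record that remembers the Content-Encoding seen in
the Upload Creation Procedure (the names below are illustrative and not
taken from either draft):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadState:
    """Hypothetical server-side record for one resumable upload."""
    content_encoding: Optional[str]  # value seen when the upload was created
    offset: int = 0


def normalize(encoding: Optional[str]) -> str:
    # Content codings are case-insensitive; a missing header becomes "".
    return (encoding or "").strip().lower()


def can_append(state: UploadState, request_encoding: Optional[str]) -> bool:
    """Return True only if the appending request keeps the original coding."""
    return normalize(state.content_encoding) == normalize(request_encoding)
```

A server could call can_append() at the start of each Upload Appending
Procedure and refuse the request when it returns False. Which status code
to use for that refusal is left open here.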

> It would be equally fine for a server to just send the digest without
> observing a Want- header. But maybe you're thinking about trying to avoid
> unnecessary server overhead?

True, the server might include an unsolicited digest. Avoiding additional
overhead is a good use for Want-, but maybe I was also just looking for a
nice use for this header :)
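
To illustrate the overhead argument, here is a rough sketch of how a server
might decide whether to compute a digest at all. It assumes the header value
is a simple list of algorithm=preference pairs with integer preferences,
where 0 means "not acceptable". The SUPPORTED table and pick_algorithm() are
hypothetical helpers, and this is not a full Structured Fields parser:

```python
import hashlib
from typing import Optional

# Algorithms this hypothetical server is willing to compute.
SUPPORTED = {"sha-256": hashlib.sha256, "sha-512": hashlib.sha512}


def pick_algorithm(want_header: Optional[str]) -> Optional[str]:
    """Pick the most preferred supported algorithm, or None to skip hashing."""
    if not want_header:
        return None  # no Want- header: the server may skip the extra work
    best_name, best_pref = None, 0
    for item in want_header.split(","):
        name, _, pref = item.strip().partition("=")
        name = name.strip().lower()
        try:
            weight = int(pref)
        except ValueError:
            continue  # ignore members this simple parser cannot read
        # A preference of 0 never beats the initial best_pref of 0.
        if name in SUPPORTED and weight > best_pref:
            best_name, best_pref = name, weight
    return best_name
```

With this, a request carrying "sha-512=3, sha-256=10" would make the server
hash with sha-256, while a request without the header would let it skip
hashing entirely. A server that prefers to send an unsolicited digest would
simply not consult this function.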

> I agree it seems wrong that a server would reject a partial upload if the
> content-digest didn't match. So perhaps the caveat would be the
> content-digest is only validated when the full content is received.

That's a tough spot. I think that dropping the Content-Digest headers is
probably equally wrong, because then integrity issues might slip in while
the client is relying on the digest headers to verify the integrity. Let's
see what others think about this.
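
For discussion, here is a minimal sketch of the option where Content-Digest
is only validated when the request body arrived completely, while
Repr-Digest is checked once at the end of the upload. It assumes sha-256
only, the "sha-256=:<base64>:" field value syntax from the digest draft, and
an in-memory running hash. The class and method names are hypothetical:

```python
import base64
import hashlib
from typing import Optional


def format_digest(raw: bytes) -> str:
    # Byte sequences in Structured Fields are base64 wrapped in colons.
    return "sha-256=:" + base64.b64encode(raw).decode("ascii") + ":"


class ResumableUpload:
    """Hypothetical server-side state for one upload."""

    def __init__(self, expected_repr_digest: Optional[str] = None):
        self.expected_repr_digest = expected_repr_digest
        self._running = hashlib.sha256()  # hash over all accepted bytes
        self.offset = 0

    def append(self, body: bytes, content_digest: Optional[str] = None,
               body_complete: bool = True) -> bool:
        # Content-Digest covers one request body, so it can only be
        # checked when that body was received in full.
        if content_digest is not None and body_complete:
            if format_digest(hashlib.sha256(body).digest()) != content_digest:
                return False  # reject without appending anything
        self._running.update(body)
        self.offset += len(body)
        return True

    def finish(self) -> bool:
        # Repr-Digest covers the whole upload and is checked at the end.
        if self.expected_repr_digest is None:
            return True
        return format_digest(self._running.digest()) == self.expected_repr_digest
```

In this sketch an interrupted request still contributes its partial bytes,
and any corruption in them would only surface in finish(), provided the
client supplied a Repr-Digest. Whether that is acceptable is exactly the
open question.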

Best regards
Marius

On Wed, Aug 9, 2023 at 7:39 PM Lucas Pardue <lucaspardue.24.7@gmail.com>
wrote:

> Hi Marius,
>
> Responding in-line
>
>
>
> On Wed, Aug 9, 2023 at 6:16 PM Marius Kleidl <marius@transloadit.com>
> wrote:
>
>> Hi Lucas,
>>
>> I had a brief read through the digest draft today while thinking about how
>> this could be applied to resumable uploads. The new integrity fields with
>> their separation between content and representation should fit perfectly
>> with the approach of resumable uploads. I thought I would name a few
>> concrete examples to see if my understanding of the digest draft is correct:
>>
>> 1) Assume that the client knows the digest of the entire file that it
>> wants to upload. It can then set Repr-Digest in the Upload Creation
>> Procedure and upload the data. If the transfer is interrupted it is just
>> resumed as normal with HEAD and PATCH. Once the upload is complete, the
>> server can verify the integrity of the uploaded data by comparing it to the
>> Repr-Digest from the Upload Creation Procedure. If they don't match, the
>> server can reject the upload and return an error to the client or similar.
>>
>
> Yeah I think that mostly holds true. Note however that the representation
> depends on factors like content-encoding. Which makes me consider that a
> resumable upload is also tied to representation. I.e. it isn't possible to
> start an upload using gzip, have it fail part way through, and then resume
> an upload with brotli. If that's true, we probably want some text in the
> document to make it super clear.
>
>>
>> 2) If the client does not know the digest of the entire upload at the
>> beginning (because it is streamed from another resource), it can compute
>> the digest while the upload is running and then include Repr-Digest as a
>> trailer on the final Upload Creation Procedure or Upload Appending
>> Procedure. The server can then verify as in 1).
>>
>
> Yep
>
>
>>
>> 3) If the client does not compute the digest, but wants to query the
>> server's digest, it can include the Want-Repr-Digest in the Upload Creation
>> Procedure. On the final response, the server should then include the
>> Repr-Digest whose value corresponds to the entire upload that it has
>> received. The client then may verify the digest or provide it to the
>> client's users.
>>
>
> The Want- headers are a bit nebulous in reality. This might work or it
> might not. It would be equally fine for a server to just send the digest
> without observing a Want- header. But maybe you're thinking about trying to
> avoid unnecessary server overhead?
>
>
>> 4) If the client sets Content-Digest on an Upload Creation Procedure or
>> Upload Appending Procedure, it only applies to the specific request body.
>> So if the request transmission gets interrupted, the server is not able to
>> verify the integrity and must reject the entire request without appending
>> its content to the upload. This makes the request effectively
>> transactional, where either the entire body or nothing is appended to the
>> upload. This is a bit contrary to the intention of uploads that can be
>> resumed from the point where they failed, but if you split an upload into
>> multiple requests, this might still be interesting for some applications.
>>
>
> How endpoints deal with digest validation failures is not defined by the
> digest spec itself, meaning there are many possible options. I agree it seems
> wrong that a server would reject a partial upload if the content-digest
> didn't match. So perhaps the caveat would be the content-digest is only
> validated when the full content is received.
>
>
>> Do these examples align with the intentions behind the digest draft or
>> did I get something wrong? All in all, this appears like a great fit!
>>
>
> They seem to align to me, modulo some details or options. Initially I was
> hesitant to add too much detail or prescribed behaviour but, given your
> list of use cases, perhaps we should consider what should be in scope for
> resumable clients or servers to do and state clearly what the interop
> expectations would be. Thoughts from the WG?
>
> Cheers
> Lucas
>
>
>
>>

Received on Wednesday, 9 August 2023 19:02:15 UTC