Re: Resumable Upload draft updates

(top-quoting myself)

I looked at the minutes and the open GitHub issues.* No strong objections
at all, especially if the authors have deployed something similar. The
concerns about up-front digests rarely matter in practice, because the
network is usually much slower than hashing a file on local disk. Someone
will always pipe up about their "resource constrained device" here, but
that case might call for a different protocol. Accommodating every kind
of device isn't a requirement.

Sometimes you can't know the whole file up front, as with streaming
uploads of video. Then you just do per-chunk integrity checks, since the
end of the file is indeterminate anyway.
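
For other readers, a minimal sketch of what per-chunk checks could look
like on the client side. It is Python, not anything from the draft: the
function name and the 8 MiB chunk size are made up for illustration. The
only part taken from RFC 9530 is the shape of the Content-Digest value
(sha-256=:<base64>:).

```python
import base64
import hashlib
import io

CHUNK_SIZE = 8 * 1024 * 1024  # hypothetical chunk size, not from the draft


def chunk_digests(stream, chunk_size=CHUNK_SIZE):
    """Yield (chunk, content_digest_value) pairs until the stream ends.

    The total length never has to be known in advance, which is the
    streaming case: each chunk carries its own SHA-256.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        digest = hashlib.sha256(chunk).digest()
        value = "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"
        yield chunk, value


if __name__ == "__main__":
    # Stand-in for a live video stream; a real one would be a file or pipe.
    fake_stream = io.BytesIO(b"x" * 20)
    for chunk, value in chunk_digests(fake_stream, chunk_size=8):
        print(len(chunk), "Content-Digest:", value)
```

Each value would go out as the Content-Digest of the request carrying
that chunk. Sockets and pipes can return short reads, so the chunks may
be smaller than chunk_size there; the digests are still per-chunk.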

I tried it on a ~170MB MP4 file I had:

% ls -l Downloads/dnb.MP4
-rw-r--r--@ 169842846 Nov 27  2021 Downloads/dnb.MP4
% shasum -a 256 Downloads/dnb.MP4
4badd94953d0364e8d44d63d5f252a08bf9e453cf8717c67fbf236f9f44c1986  Downloads/dnb.MP4
% time shasum -a 256 Downloads/dnb.MP4
4badd94953d0364e8d44d63d5f252a08bf9e453cf8717c67fbf236f9f44c1986  Downloads/dnb.MP4
shasum -a 256 Downloads/dnb.MP4  0.54s user 0.03s system 99% cpu 0.576 total

Seems accurate. The SHA-256 took about half a second for ~170MB.
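
To put that in network terms: 169842846 bytes in 0.576 s is roughly
295 MB/s, or about 2.4 Gbit/s. If anyone wants to repeat the measurement
without shasum, here is a rough Python sketch. The function names and the
1 MiB read size are my own choices, and the timing includes disk reads,
so a cold cache will give lower numbers.

```python
import hashlib
import time


def sha256_timed(path, block_size=1024 * 1024):
    """Hash a file and return (hexdigest, bytes_read, elapsed_seconds)."""
    h = hashlib.sha256()
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            h.update(block)
            total += len(block)
    elapsed = time.perf_counter() - start
    return h.hexdigest(), total, elapsed


def mb_per_s(nbytes, seconds):
    """Throughput in decimal megabytes per second."""
    return nbytes / seconds / 1e6


if __name__ == "__main__":
    import sys

    digest, nbytes, elapsed = sha256_timed(sys.argv[1])
    print(digest)
    print(f"{nbytes} bytes in {elapsed:.3f}s = {mb_per_s(nbytes, elapsed):.0f} MB/s")
```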

thanks,
Rob

* https://github.com/httpwg/http-extensions/labels/resumable-upload

On Thu, Jul 25, 2024 at 1:19 PM Rob Sayre <sayrer@gmail.com> wrote:

> On Thu, Jul 25, 2024 at 12:58 PM Lucas Pardue <lucas@lucaspardue.com>
> wrote:
>
>> Integrity using standardized HTTP digests is described in
>> https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-resumable-upload-04#name-integrity-digests.
>> Integrity for parts, or whole is covered by the Content-Digest or
>> Repr-Digest.
>>
>> During the standardisation of RFC 9530, we did a survey and found many of
>> these upload services tend to use the Content-MD5 field to some extent,
>> which is sad because it was obsoleted by RFC 7231 due to implementation
>> inconsistencies.
>>
>
> Ah, ok. I didn't know that one got finished. This sounds like a fine
> approach. Here's the Amazon way (for other readers, sounds like the authors
> have made an informed decision):
>
>
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html
>
> Not necessarily MD5, but it can be. I think a lot of those applications
> don't really care about standardization, because they expect people to use
> their client SDKs anyway.
>
> thanks,
> Rob
>
>
>

Received on Friday, 26 July 2024 20:05:16 UTC