Re: Feedback for Resumable Uploads from Marius Kleidl on 2025-07-19 (ietf-http-wg@w3.org from July to September 2025)

From: Marius Kleidl <marius@transloadit.com>
Date: Sat, 19 Jul 2025 16:50:56 +0200
To: Austin Wright <aaa@bzfx.net>
Cc: ietf-http-wg@w3.org
Message-ID: <CANY19Nt6ZWHtvxFpBAJbYUsk=2uPj7vYwLnXhk7p4qFk_19tMA@mail.gmail.com>

Hello Austin,

Thank you very much for the feedback! It's greatly appreciated. Please find
my comments inline.

On Wed, Jul 16, 2025 at 10:16 PM Austin Wright <aaa@bzfx.net> wrote:

> Hello HTTP WG and Resumable Uploads authors,
>
>
> This document moved in a very good direction since I last reviewed it! The
> fundamental design is on point, and my comments mostly relate to expanding
> the interoperability with other applications of HTTP:
>
>
>
>
> *4.2.1. Client Behavior*
>
>
> The section says “All request methods that allow content are possible.” Do
> you anticipate compatibility with QUERY?
>

Yes, QUERY would also be covered. If the content size is rather small, it
might be easier and more efficient for the client and server to just retry
the entire request without using resumable uploads. It's up to the client
to ask for resumable uploads if it deems them useful, but the server can
also choose to not offer resumability if the overhead is not worth it.
However, that's the case for any request and not just QUERY.

> *5. Status Code 104 (Upload Resumption Supported)*
>
>
> Is the “Upload-Offset” field found in an Upload Resumption Supported
> response usable by the client in place of making the Offset Retrieval
> request?
>

The Upload-Offset header field in 104 responses is not usable to resume an
upload after interruptions as the resource might have received additional
representation data since it sent the last 104 response. Since upload
resources don't support overwriting representation data, append requests
cannot start before the current offset. In addition, if the client chunked
the upload into multiple requests and received a 2xx response, it can just
use the Upload-Offset header field from the final response.

In general, the offsets indicated in the interim responses are intended to
show progress to the client and let it know that the offset will never fall
below the announced value, thus allowing the client to free the associated
buffers.
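
As a rough illustration of the buffer handling described above, here is a
minimal client-side sketch. The class and method names are hypothetical and
not taken from the draft; it only assumes that the client keeps unacknowledged
bytes in memory and learns an offset from each interim response.

```python
class UploadBuffer:
    """Holds bytes that were sent but not yet acknowledged (hypothetical helper)."""

    def __init__(self, start_offset: int = 0) -> None:
        self.base = start_offset   # offset of the first byte still buffered
        self.data = bytearray()    # bytes sent but not yet acknowledged

    def append(self, chunk: bytes) -> None:
        """Remember a chunk that has just been sent to the server."""
        self.data += chunk

    def acknowledge(self, upload_offset: int) -> None:
        """Free all bytes below the offset announced in an interim response."""
        if upload_offset <= self.base:
            return  # repeated or older announcement, nothing to free
        if upload_offset > self.base + len(self.data):
            raise ValueError("announced offset exceeds the data sent so far")
        del self.data[: upload_offset - self.base]
        self.base = upload_offset

    def from_offset(self, offset: int) -> bytes:
        """Return the buffered bytes starting at offset, for resending."""
        if not self.base <= offset <= self.base + len(self.data):
            raise ValueError("offset is outside the buffered range")
        return bytes(self.data[offset - self.base:])
```

Because the announced offset never decreases, everything below it can be
dropped safely, while bytes above it stay available in case the upload has to
be resumed from an offset obtained later.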


> Considering that the server does not acknowledge every single byte it
> receives, how does this work if Upload-Offset in an Upload Append operation
> is expected to exactly match the current upload offset?
>

The client should obtain the current offset either from a HEAD request (if
the previous upload request was interrupted) or from the final response to
the previous upload request (if it received a 2xx response, but didn't
finish the upload). In the latter case, the client already knows by how
much the offset advanced, since the entire request content was appended.
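
The two cases can be sketched as a small decision function. This is only an
illustration: the function and parameter names are hypothetical, and
`fetch_offset` stands in for whatever performs the HEAD request and reads the
Upload-Offset header field from its response.

```python
from typing import Callable, Optional


def next_append_offset(
    sent_from: int,
    sent_length: int,
    final_status: Optional[int],
    fetch_offset: Callable[[], int],
) -> int:
    """Pick the offset for the next append request.

    final_status is None when the previous request was interrupted before a
    final response arrived.
    """
    if final_status is not None and 200 <= final_status < 300:
        # The entire request content was appended, so the offset advanced by
        # exactly the number of bytes that were sent.
        return sent_from + sent_length
    # Interrupted request: the client cannot know how much arrived, so it
    # asks the upload resource for the current offset.
    return fetch_offset()
```

For example, `next_append_offset(0, 1000, None, head)` defers to `head()`,
whereas a 2xx final response yields 1000 without an extra round trip.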


>
> Now even if the client can’t reliably use Upload Resumption Supported
> alone to determine where to start an upload append operation, section
> 4.3.1. Client Behavior says “The client is expected to handle backtracking
> of a reasonable length”. I think this would be a good place to explain the
> Upload-Offset field in a 104 (Resumption Supported) response indicates data
> that has been committed to the upload resource, and need not be stored by
> the client for future use by this upload resource, but can be freed.
>

This is already covered in
https://www.ietf.org/archive/id/draft-ietf-httpbis-resumable-upload-09.html#section-4.1.1-4.
Is this paragraph sufficient or do you think it needs expansion?


> In general, the ability for a server to signal that a certain amount of
> the input has been committed and processed would be very useful to have
> more generally. The only similar functionality I know of is the 102
> (Processing) status, and it doesn’t actually have a way of indicating
> progress, just activity. Could this use of Upload-Offset be defined more
> generally?
>

I see how this could be useful for other applications as well, but I wonder
what the exact semantics of such progress updates would be. For some
applications, these progress reports mean that the data has been fully
committed, while in other cases it could mean that the resource just
received a certain amount of data but might lose it if an error occurs at
a later stage in the request. It might be worth talking about this in the
httpapi WG, though.


> *6. Media Type application/partial-upload*
>
>
> After our early discussions I floated a substantially similar media type
> for draft-httpapi-patch-byterange, essentially being “apply the enclosed
> bytes to the range of bytes in the Content-Range header”, but I recall
> there was some push-back that a media type would rely on out-of-band
> information in HTTP header fields.
>

Yes, I'm concerned about the fact that message/byterange places the offset
inside the request content instead of transmitting it inside the header. I
don't see a reason why the offset shouldn't be put in the request header
for such requests as it avoids the need to parse the request content.


>
> It appears this document satisfies that concern by instead reading the
> `Upload-Offset` header first, and evaluating it as a condition, before
> proceeding with evaluating the contents of the body (at which point, the
> specific value of the header does not come into play).
>
>
> Such a header, a new conditional request header that tests the length of
> the existing resource, could be useful in other applications, especially
> log writing and journal synchronization applications. What if
> “Upload-Offset” for making requests was called “If-Length”?
>

Yes, the Upload-Offset request header acts as a precondition for
conditional requests. It would also be possible to name the request header
If-Upload-Offset, I presume.
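
A minimal server-side sketch of that precondition follows. The names are
hypothetical, and the boolean result stands in for whichever success or error
response a real server would send; the point is only that the offset is
checked before the content is applied.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class UploadResource:
    """In-memory stand-in for an upload resource (hypothetical)."""

    data: bytearray = field(default_factory=bytearray)

    def append(self, request_offset: int, content: bytes) -> Tuple[bool, int]:
        """Append content only if request_offset equals the current offset.

        Returns (accepted, current_offset). The request content is never
        inspected when the precondition fails.
        """
        current = len(self.data)
        if request_offset != current:
            return False, current  # precondition failed, nothing is written
        self.data += content
        return True, len(self.data)
```

Since the check uses only a header value and the resource's current length,
the server can reject a mismatching request without reading its content.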


>
> And likewise, the media type itself could be a generic media type that
> represents “content to append to the target”, so it could be re-used more
> generally outside the resumable uploads protocol?
>
>
>
Such a more generic media type was briefly discussed in
https://github.com/httpwg/http-extensions/issues/2962#issuecomment-2500455810,
but it didn't seem to attract much interest so far.


>
> *11. Responses to Uploads*
>
>
> Content-Location specifies the URI of the attached response body.
> Normally, it’s for Content-Type negotiation, but as a consequence of its
> definition, if it were found in a 4xx response, I would expect it to
> identify a permanent URI that I can use to reference that error later.
>
>
> Likewise, if included in 1xx, this would suggest that the client has the
> URI where it can re-request the result of the operation, whether it’s an
> error or a success. This doesn’t let you retrieve the status code, but
> API applications can often infer success or error from the response body,
> so this may still be an improvement.
>
>
> So, to assist in recovering the response in the case the response is lost,
> would there be a benefit to mentioning Content-Location field for 104
> (Upload Resumption Supported)?
>

The latest iteration of the draft (-09) doesn't mention Content-Location
anymore as there were concerns raised that the previously recommended use
of Content-Location didn't align with the header field's semantics. Hence,
we reworded the section to be more generic: it recommends that servers
either put information that should be recoverable in header fields or use
header fields to point to resources from which it can be fetched.


> *Are resumable upload resources resumable?*
>
>
> The answer seems to be “no,” as there is a separate, well-defined process
> for recovering from a failed upload append. However, it may still be worth
> noting something like:
>
>
> An upload resource MUST NOT itself be resumable. An interrupted request to
> an upload resource is simply retried from the “Offset Retrieval” step.
>

Good point, it would be helpful to point out that it doesn't make much
sense to make PATCH requests to upload resources resumable as well. In
theory, it should work, but in practice there is no use for this and it
might just enable new abuse patterns.


>
> *Preserving Incomplete Uploads with `Incremental: ?1` and/or `Prefer:
> transaction=persist`*
>
>
> Resumable uploads are necessarily incremental messages, in that every byte
> uploaded necessarily changes the state of the server. So, would it be worth
> suggesting the use of the `Incremental: ?1` field from
> draft-ietf-httpbis-incremental-latest
> <https://httpwg.org/http-extensions/draft-ietf-httpbis-incremental.html>,
> and/or the `Prefer: transaction=persist` preference in
> draft-ietf-httpapi-patch-byterange
> <https://www.ietf.org/archive/id/draft-ietf-httpapi-patch-byterange-03.html#name-preserving-incomplete-uploa>,
> in the request headers?
>
>
>
Recommending the use of such headers would be a good idea, but I'm hesitant
about adding a dependency on another draft. We hope to wrap up work on
resumable uploads sooner rather than later, and such a dependency might
cause more delays.


> Thanks,
>
>
> Austin Wright.
>
>
>
>
Thanks, Austin! Overall, your feedback reads as if you would prefer the
draft to use less upload-specific approaches (e.g. to communicate offset,
progress information and data to append). I can understand that desire but
I worry that we would end up in a rabbit hole trying to find the correct
semantics. I wonder how the working group thinks about such situations in
general: when should more generic solutions be preferred over specific
ones that are tailored to concrete use cases?

Best regards,
Marius
Received on Saturday, 19 July 2025 14:51:13 UTC