draft-ietf-httpbis-resumable-upload-09: not ready: impl details

Feedback on draft-ietf-httpbis-resumable-upload-09
https://datatracker.ietf.org/doc/draft-ietf-httpbis-resumable-upload/

Sections in doc marked with lines beginning '*'


* Introduction

"HTTP range requests ... support this concept of resumable data transfers for downloads from server to client"
Yes, some servers support HTTP Range request header for some resources.
Some servers also support partial PUT for resource uploads, but the draft does not mention the existing solution.

"backwards-compatible with conventional HTTP uploads"
What does that mean?  Is that a relic of tus-v2?
Resumable uploads purportedly aim to extend the accumulation of
a single request body across multiple requests, but the draft does
not say that concisely.

"Unlike ranged downloads, this protocol does not support transferring an upload as multiple requests in parallel."
That is an application limitation due to over-specified implementation details
and flaws in the resumable upload protocol design.  The client should manage the state of the resource through successful upload of chunks.  If a chunk fails for any reason, the chunk should be resent, perhaps in multiple smaller chunks.

* 3.1 Example 1

"The server reserves the required resources ..."
No, the server might check for resource availability, but requiring a reservation is an application implementation detail which can lead to denial-of-service.

authn/authz are mentioned only briefly in section 13. Security Considerations
but DELETE (and PATCH) should also mention authn/authz.

* 3.2 Example 2

"One use case is to overcome server limits on HTTP message content size"
That is a security concern, not a feature, and should be listed in 13. Security Considerations

"Subsequent, ..." -> "Subsequently, ..."

* 4 Upload Resource

"This upload resources is responsible for handling ..."

The server is required to maintain server-side state for the upload resource.
The upload resource does not "handle" anything.  It is a resource.

Related, the server is required to maintain server-side state including the original target.  The target then cannot be changed later, e.g. a date in the title of the doc or blog post.  That does not matter in all cases, but might matter in some.  In any case, it would be better if the target was required with Upload-Complete: ?1.  The target could still be provided in the initial request, in order for the server to do some early checks, but those checks should be repeated immediately before processing after Upload-Complete: ?1.

"The server SHOULD keep the upload resource available for a reasonable amount of time after the upload is complete."
I wonder if the authors of this have written and maintained any server-side code.  This is over-specified.  If you need a transaction, then implement a transaction and job queue and the ability to check the status of completed jobs.  To increase certainty that the resource has uploaded properly, upload the resource fully with Upload-Complete: ?0.  Once complete, send Upload-Complete: ?1 as a separate request containing an empty body.  If for some reason you do not get a response, resend Upload-Complete: ?1 with an empty body until you do.  If you receive a 2xx response, you're done.  If you get an error that the upload is already complete, or that the upload resource is no longer available, then the server likely processed the result and you might verify by accessing the target resource.
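
To illustrate the completion step I am suggesting, here is a rough client-side sketch.  It is not from the draft.  `send_complete` is a hypothetical callable that sends the empty-body request with Upload-Complete: ?1 and returns the HTTP status code, or None if no response arrived.  The status codes treated as "already complete or no longer available" (404, 409, 410) are my assumption only.

```python
from typing import Callable, Optional


def finalize_upload(send_complete: Callable[[], Optional[int]],
                    max_attempts: int = 5) -> str:
    """Resend the empty-body Upload-Complete: ?1 request until a response arrives.

    send_complete is a hypothetical transport hook: it returns the HTTP status
    code, or None when no response was received (timeout, dropped connection).
    """
    for _ in range(max_attempts):
        status = send_complete()
        if status is None:
            continue  # no response: the request carries no body, so just resend it
        if 200 <= status < 300:
            return "done"
        if status in (404, 409, 410):  # assumed codes, not defined by the draft
            # Upload already complete or upload resource gone: the server likely
            # processed the result, so verify by accessing the target resource.
            return "verify-target"
        return "failed"
    return "no-response"
```

Since the final request has an empty body, repeating it costs one round trip and never re-sends representation data.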

"An upload resource SHOULD NOT reuse the URI from a previous upload resource ..."
It would be better to specify a positive, not a negative.  "An upload resource SHOULD be unique.  Reuse of a URI for a different upload resource SHOULD be avoided to reduce the chance of misdirected or corrupted upload resources."  This is even more important in light of the recommendation earlier in the doc to keep upload resources around "for a reasonable amount of time after the upload is complete."

* 4.1 State

These headings are misnamed
* 4.1.1 Offset        ->  Upload-Offset
* 4.1.2 Completeness  ->  Upload-Complete
* 4.1.3 Length        ->  Upload-Length
* 4.1.4 Limits        ->  Upload-Limit

Considering the doc states that the server is required to keep state, such as the target resource, it appears that important state items are not all clearly defined under the topic "State".

* 4.1.3 Length

"The request can include the Upload-Length request and response header field."  What is "and response" header field supposed to mean in the context of the request?

With HTTP/1.1 Transfer-Encoding: chunked and HTTP/2 and HTTP/3 framing, the length might never be specified.  That is okay, though its absence means that it cannot be cross-checked with the upload resource size.

* 4.1.4 Limits

"The upload resource can stop the upload ..."
Do you mean the server?  The upload resource is a resource.
What do you mean by "stop"?  Abort?  Discard?  Invalidate?  Cancel?

"... the existence of a limit or its value MUST NOT change ..."
Over-specified.  Maybe SHOULD NOT.  A more intelligent server might want to supply hints or make adjustments to adapt to changing conditions.  Many of these "limits" are over-specified and might be better as hints.

"Keys with values other than defined here MUST be ignored."  Do you mean only for the keys defined here?  Or is this implying there will never be other keys?  Suggested: "Unrecognized keys MUST be ignored.  Keys defined here but with values other than defined here MUST be ignored."

"A server that supports the creation of resumable upload resource ... for a target URI MUST include the Upload-Limit header field with the corresponding limits in a response to an OPTIONS request sent to this target URI."  Why "MUST"?  Why is Accept-Patch not required?

Related, why create Content-Type: application/partial-upload?  That does not match the Content-Type of the temporary resource.  Is Content-Type from the original upload also required to be kept in server-side state alongside the upload resource?  (Yes.)  Accept-Patch could theoretically be extended with a new patch type, "foo" which could be defined to require and use Upload-Offset request header.

More broadly, I think that there could be many better ways to do "discovery" of the resumable-upload application rather than what is defined in the doc, and I hope more isolated, too.  An independent request to create a temporary resource would be my starting point, maybe to a target under /.well-known/.

* 4.2.1 Client Behavior

Remove first "is" in "An Upload-Complete header field is set to false is also valid."

"the upload is complete and the response belongs to the targeted resource processing the representation"
What?  "the response belongs"?  What?  Do you mean the uploaded resource?  What do you mean by "belongs to"?

"2xx (Successful) status code and not the entire representation data was transferred ..."  This occurs in multiple places in the doc.  The "not" is misplaced.  Better: "and the entire representation data was not transferred ..."

* 4.2.2 Server Behavior

"The resource targeted by this initial request is responsible for processing the representation data transferred in the resumable upload according to the method and header fields in the initial request..."

This wording is confusing alongside wording earlier in the doc that the upload resource is responsible for keeping state.  The upload resource must keep state, including the original target resource.  Once the upload resource is complete, only then does the server pass the complete upload resource to the original target resource for processing.

That is quite a requirement to be so subtle!  This suggests that the server MUST save the *ENTIRE* set of request headers of the initial request alongside the upload resource, and then manipulate the request headers with final length, possibly remove Upload-*, or more, before finally sending a complete upload resource to the initial resource target for processing.

"If the request content was fully received, no resumable upload is needed and the resource proceeds to process the request and generate a response."  What do you mean by "the resource"?  I presume you mean that the complete request is sent to the resource specified by the URI.  "the resource" is underspecified for a lay-person reading this.

"The server MUST record the length according to Section 4.1.3 if the necessary header fields are included in the request."  This mistake is made multiple times in the document: "if the necessary fields" should be more precise: "Upload-Length" is the necessary field.

"The interim responses MUST NOT include the Location header field"
Over-specified.  MUST NOT should be SHOULD NOT, and/or the doc should say that Location should not be included in any 104 response other than in the initial 104 response when the upload resource is created, and that a client SHOULD ignore Location in any 104 response besides the initial 104 response for the upload resource.  Icky.  It would also be okay to say the Location of the upload resource MUST NOT change if Location is sent more than once.  A more robust approach would be a request for creation of a temporary resource with a 0-length request body, which is separate from requests sending the request body to the upload resource.

"If the server does not receive the entire request content, for example because of canceled requests or dropped connections, it SHOULD append as much of the request content as possible to the upload resource."

This is confusing unless the doc has been read carefully multiple times.  This also conflicts with wording elsewhere in the doc which suggests that the request should be aborted immediately when there is a new request for state of the upload resource (4.6 Concurrency).  Perhaps you meant that pending data should be read without blocking and appended to the upload resource, and that the server should then abort the request instead of waiting for additional data.

Not only is this fragile, but it imposes a requirement that the server maintain state for exclusive access to the upload resource, and have a new request affect a different request -- which are big implementation requirements -- tying together multiple HTTP requests which are supposed to be independent in HTTP.
** This is a huge failure and should immediately disqualify this draft from proceeding past last call. **

Related is the fragile behavior specified in 4.3 Offset Retrieval:
"The client MUST NOT perform offset retrieval while creation (Section 4.2) or appending (Section 4.4) is in progress as this can cause the previous request to be terminated by the server as described in Section 4.6."

This fragility could potentially be avoided if the client managed the overall upload resource state rather than the server.  The client could send chunks of representation data, and then resend the whole chunk on failure.  (Maybe try to recover by sending smaller chunks.)  The server would manage the upload resource per-request and apply policies per-request.
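
A minimal sketch of what I mean by client-managed state, under stated assumptions.  `send_chunk` is a hypothetical callable that uploads one chunk at a given offset as an independent request and returns True only if the server acknowledged the whole chunk.  The halving policy and the default sizes are arbitrary choices for illustration and are not from the draft.

```python
from typing import Callable


def upload_in_chunks(data: bytes,
                     send_chunk: Callable[[int, bytes], bool],
                     chunk_size: int = 1 << 20,
                     min_chunk: int = 1) -> bool:
    """Upload data chunk by chunk; the client alone tracks the offset.

    send_chunk(offset, chunk) is a hypothetical transport hook that returns
    True only when the server acknowledged the entire chunk.
    """
    offset = 0
    size = chunk_size
    while offset < len(data):
        chunk = data[offset:offset + size]
        if send_chunk(offset, chunk):
            offset += len(chunk)  # advance only after a confirmed chunk
        elif size > min_chunk:
            size = max(min_chunk, size // 2)  # resend the same offset, smaller chunk
        else:
            return False  # failed even at the smallest chunk size: give up
    return True
```

Each request is independent.  The server never needs to terminate one request because another arrived, and the client never needs to ask the server for an offset.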

* 4.3.2 Server Behavior

"MUST include the length in the Upload-Length header field if known"
Over-specified.  If it is okay to exclude it if not known, then this should be SHOULD, not MUST.

* 4.4.1 Client Behavior

"A client can continue the upload and append representation data by sending a PATCH request with the application/partial-upload media type ..."

Why Content-Type: application/partial-upload ?
Could instead use Range request header and a proper Content-Type (or application/octet-stream) since the PATCH request is a byte range, already defined by the RFC.
With OPTIONS, could use Accept-Patch.  See RFC 5789 for PATCH.

Same awkward wordings mentioned earlier:
"the corresponding response belongs to the resource processing the representation according to the initial request".  The response is produced by the initial target resource to which the upload resource is sent for processing.
"2xx (Successful) status code and not the entire remaining representation data was transferred" -> move the 'not' to "was not transferred"

* 4.4.2 Server Behavior

Only one request at a time is given exclusive access to modify the upload resource, so modifications are serial and sequential.  This is a design limitation of draft-ietf-httpbis-resumable-upload.

"If the request content was fully received, the upload is marked as complete and the upload resource SHOULD generate the response that matches what the resource, that was targeted by the initial upload creation (Section 4.2), would have generated if it had received the entire representation in the initial request.  However, the response MUST include the Upload-Complete header field with a true value, allowing clients to identify whether a response, in particular error responses, is related to the resumable upload itself or the processing of the upload representation."
This is fragile.  There should be separation of concerns.  A more robust approach is to upload data to upload resource, and then separately send Upload-Complete: ?1 with a 0-length body.
The wording of "the upload resource SHOULD generate" is awkward and wrong.  The upload resource is a temporary file that continues (in wording) to be treated by draft-ietf-httpbis-resumable-upload as a magic application object containing the original request headers and which, once complete, acts as an agent executing a lambda function to send the complete request to the target resource.  (Please do not reuse any of the jargon that I intentionally misused.)  The upload resource should be a temporary file, and for draft-ietf-httpbis-resumable-upload, maintains state.  A better design might be only a temporary file and nothing more.

"If the request didn't complete the upload, successful or not MUST include the Upload-Complete header field with a false value, ..."
There are many reasons a server can send a response that is not 2xx.  The requirement to send Upload-Complete: ?0 might involve a special check after all error paths.  Instead, a 4xx or 5xx should assume Upload-Complete: ?0 if the header is not present.  The protocol might require Upload-Complete: ?0 for 2xx responses if the upload is not complete, but why 3xx?  A 307 or 308 should not have to add Upload-Complete: ?0.
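
The defaulting rule I am proposing could be applied by a client roughly as follows.  This is a sketch of my suggestion, not of the draft's current requirements, and the function name is hypothetical.  "?1" and "?0" are the Structured Fields boolean encodings used by Upload-Complete.

```python
from typing import Optional


def upload_is_complete(status: int,
                       upload_complete: Optional[str]) -> Optional[bool]:
    """Interpret a response under the proposed defaulting rule.

    Returns True or False when completeness is known, None when it is not.
    upload_complete is the raw Upload-Complete field value, or None if absent.
    """
    if upload_complete is not None:
        value = upload_complete.strip()
        if value == "?1":
            return True
        if value == "?0":
            return False
        return None  # malformed value: treat completeness as unknown
    if 400 <= status < 600:
        return False  # proposed default: an error without the header means not complete
    return None  # 2xx/3xx without the header: no assumption is made
```

With this rule, server error paths do not each need to remember to add the header.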

"The upload resource MUST record the length according to Section 4.1.3 if the necessary header fields are included in the request."
Again "Upload-Length" instead of "necessary header fields"

"These interim responses MUST NOT include the Location header field."
Over-specified again?  Why MUST NOT?  This is in section 4.4 Upload Append.  Would it be clearer to suggest that 104 responses to PATCH should not include Location?

* 11 Responses to Uploads

If the upload resource is simply a file and the client manages overall state and the client finishes the upload before sending Upload-Complete: ?1 with an empty body, then a client can send a HEAD request for size details.  Upload-Offset would be redundant.  There would be little need to keep upload resources around after Upload-Complete: ?1.  There would be little need to make queries to try to approximate "transaction status" or "history" of the upload resource state.

In the event that the client did not receive a response to the request with Upload-Complete: ?1 and an empty body, the client could repeat that request.

This document is overly complex, in part due to unnecessary premature optimizations like this one.  If an upload is small, the upload can be repeated, if safe.  If an upload is not small, then an extra round trip to mark the upload as complete should almost never be a performance problem.


Assessment:

draft-ietf-httpbis-resumable-upload-09 is not ready for publication.


Cheers, Glenn

Received on Sunday, 10 August 2025 00:17:46 UTC