- From: Glenn Strauss <gs-lists-ietf-http-wg@gluelogic.com>
- Date: Mon, 11 Apr 2022 06:41:09 -0400
- To: Guoye Zhang <guoye_zhang@apple.com>
- Cc: Eric J Bowman <mellowmutt@zoho.com>, Julian Reschke <julian.reschke@gmx.de>, ietf-http-wg <ietf-http-wg@w3.org>
The tus-v2 specification reads to me as an application. Much thought and effort has been put into this application, but it is an application with a protocol for client and server.

Guoye Zhang seems to be proposing tus-v2 to implement "upload (resumable) *transactions*" and define new behavior that the server must track and handle partial uploads and be able to report the sparse areas back to the client. For that, the rsync protocol, among others, already exists, but is this necessary? (For parallel uploads, it might be, though other protocols like zchunk may help client discovery of server-side state.)

Large incremental uploads are already achievable using servers which support partial-PUT, including SabreDAV and lighttpd mod_webdav: upload file serially in chunks, extending the file with each chunk. Cancellation is achievable with WebDAV DELETE. The client can choose an alternate filename while uploading, and then use WebDAV MOVE to rename the file into place once the upload is complete.

Support for partial-PUT is a non-standard extension to WebDAV, at least in part due to implementation tradeoffs (resource usage, lock timeouts, impacting UX). tus-v2 may be trying to address this.

A robust PUT implementation (whole file replacement) allows downloads to proceed while a new version is uploaded. This may be mapped onto typical filesystems by uploading to a temporary file and atomically renaming into place when the upload is complete. WebDAV LOCK can be used to ensure only one uploader at a time.

When making changes to a file, whole file replacement works well for small files and often well-enough for medium-sized files. It is the case of large files that network bandwidth, reconnection and re-upload costs, server disk space, and other resource usage might have a larger impact.

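
As a rough filesystem-level illustration of the pattern described above (extend a temporary file chunk by chunk, then atomically rename it into place), here is a minimal Python sketch. It is not how SabreDAV or lighttpd mod_webdav implement partial-PUT; the function names `append_chunk` and `upload_in_chunks` are hypothetical, and the sketch assumes a single uploader (the role WebDAV LOCK would play) and a POSIX-like filesystem.

```python
import os
import tempfile


def append_chunk(tmp_path, offset, chunk):
    """Append one chunk, refusing it unless it starts at the current end of file."""
    size = os.path.getsize(tmp_path)
    if offset != size:
        raise ValueError(f"expected offset {size}, got {offset}")
    with open(tmp_path, "ab") as f:
        f.write(chunk)
    return size + len(chunk)


def upload_in_chunks(target_path, data, chunk_size):
    """Write data to a temporary file in chunks, then rename it over target_path."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Keep the temporary file in the target's directory so the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".upload-")
    os.close(fd)
    try:
        offset = 0
        while offset < len(data):
            offset = append_chunk(tmp_path, offset, data[offset:offset + chunk_size])
        # Readers see either the old file or the complete new one.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Failure or cancellation: discard the partial upload
        # (loosely analogous to a WebDAV DELETE of the alternate filename).
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers of `target_path` keep seeing the previous version until the rename, which is the property that lets downloads proceed while a new version is uploaded.
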
On modern filesystems with support for cloning, a server might be able to clone the file extents into a temporary file, PATCH a portion of the file, and then atomically rename the new file into place. This could be done with PATCH or with partial-PUT. For large files, enforcing use of WebDAV LOCK is recommended to avoid excessive numbers of large temporary files, especially if copying a large file to a temporary file instead of cloning.

Another solution for PATCH-like behavior is to use DVCS protocols, e.g. git, and serve the completed files from a repository working copy.

From my reading of the tus-v2 spec, only parallel upload is not addressed by the solutions above. Parallel upload and file reconstruction is something that is currently achievable by application-specific implementations, including tus-v2, and potentially by some DVCS.

Eric Bowman makes numerous excellent points in prior messages, and I would like to repeat one:

Eric J Bowman wrote:
> Unless you're coding an endpoint instead of a resource, in which
> case the only help I can offer you, is to think in terms of resources
> not endpoints.
> Indeed! In your example /upload is a tightly-coupled RPC endpoint;
> if the request body is the file content why are you using POST instead
> of PUT? 'HEAD/resource' lets the server know exactly which "upload" is
> referenced: if it was interrupted, the server knows it, and responds
> 206. Because REST.

If tus is aimed at /upload, an endpoint, then in my mind tus is an application handling that endpoint. Given the alternatives mentioned above for handling resources, I do not see why a web server would implement tus as an HTTP standard when end-users can configure tus as an application running behind a web server to handle configured endpoints such as /upload.

As others have pointed out, there is room for improvement in PATCH, e.g. defining a new media type and associated behavior for PATCH.

Mark Nottingham wrote:
> PATCH intentionally leaves everything up to the media type of the
> PATCH request, not the implementation. With hindsight, at least one
> or two well-defined PATCH media types should have been defined at the
> same time as 5789 - their absence (especially JSON's) created a lot
> of confusion.

Eric J Bowman wrote:
> I think you and mnot are correct that we need better-defined PATCH
> media types, I believe that's where to solve this problem, but how
> any media type is rendered has traditionally and properly been a
> client-side concern in HTTP.
> I don't think anyone has even meant to imply that this isn't a
> problem worth solving. That being said... 20 years ago we figured
> _every_ upload was *replaceable* on failure, while realizing PATCH
> would increase in value hand-in-hand with filesize over time.
> Reckoning day has arrived! ;) Thanks for your contribution, and I
> mean that, otherwise I wouldn't bother.

Cheers, Glenn

Received on Monday, 11 April 2022 10:41:31 UTC