Re: Draft for Resumable Uploads from Austin William Wright on 2022-04-06 (ietf-http-wg@w3.org from April to June 2022)

From: Austin William Wright <aaa@bzfx.net>
Date: Wed, 6 Apr 2022 13:08:16 -0700
To: Eric J Bowman <mellowmutt@zoho.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, Guoye Zhang <guoye_zhang@apple.com>, ietf-http-wg <ietf-http-wg@w3.org>
Message-Id: <54BB33F9-DF98-48DB-BA2B-C8A63208BA21@bzfx.net>
On Apr 6, 2022, at 04:06, Eric J Bowman <mellowmutt@zoho.com> wrote:
> 
> >
> > So then I think a simple modification of my “Partial Uploads”
> > document (draft-wright-http-partial-upload-01 <https://tools.ietf.org/id/draft-wright-http-partial-upload-01.html>) would work 
> > well.
> >
> 
> Kudos for how well you've written your document. But please don't suggest using PATCH to create a resource? Best decision I ever made regarding httpd coding was only 3 years or so ago, I eschewed Windows compatibility in favor of tight coupling to UNIX filesystem attributes. Speaking in terms of static files for simplicity...

Can you explain how a Unix filesystem makes PATCH impractical?

> DELETE on my httpd doesn't remove the existing resource, it sets it to zero bytes, such that subsequent GET/HEAD requests respond 410, and may or may not Allow: PATCH. No file? 404, which may Allow: PUT. I can't help but think that a PATCH request to a nonexistent resource should return 404.

I think it will make more sense if you think of PATCH as a superset of PUT.

PATCH, just like PUT, creates the resource if it doesn’t exist. Consider how these two requests are similar:

PUT /upload-target HTTP/1.1
Content-Length: 400
Content-Type: text/plain

[400 bytes...]

And

PATCH /upload-target HTTP/1.1
Content-Type: message/byterange

Content-Range: bytes 0-399/400
Content-Type: text/plain

[400 bytes...]

The effects will be identical.

The advantage of the PATCH form is that you can start writing at a non-zero offset, or create the file while indicating it’s larger than the enclosed body. PUT can do neither.
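
To make the comparison concrete, here is a rough sketch in Python of how a server might apply a message/byterange body. The function name apply_byterange, the in-memory bytes model of a resource, and the zero-filling of unwritten regions are all assumptions for illustration; none of them is taken from the draft.

```python
import re

def apply_byterange(resource, patch):
    """Apply a message/byterange body to a resource held in memory.

    `resource` is the current content as bytes, or None if the resource
    does not exist yet (hypothetical model, not the draft's definition).
    """
    head, _, data = patch.partition(b"\r\n\r\n")
    fields = {}
    for line in head.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()

    # Accept "bytes 0-399/400"; the unit is treated as optional here.
    m = re.fullmatch(r"(?:bytes )?(\d+)-(\d+)/(\d+|\*)", fields["content-range"])
    if m is None:
        raise ValueError("unsupported Content-Range")
    first, last = int(m.group(1)), int(m.group(2))
    if last - first + 1 != len(data):
        raise ValueError("body length does not match the range")

    buf = bytearray(resource or b"")
    declared = 0 if m.group(3) == "*" else int(m.group(3))
    size = max(len(buf), declared, last + 1)
    # Assumption: regions never written are filled with zero bytes.
    buf.extend(bytes(size - len(buf)))
    buf[first:last + 1] = data
    return bytes(buf)
```

With a range of 0-399/400 on a nonexistent resource, the result is the same 400 bytes a PUT would have stored. With a range such as 4-7/10, the resource is created at its full declared length and only bytes 4 through 7 are filled in.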

> >
> > First, it defines the message/byterange media type, 
> > for making changes to a specific byte range. This is the 
> > bulk of the desired functionality, I think.
> >
> 
> I would encourage you to document that media type independently from the rest of your draft.
> 
> >
> > Second, 2__ Sparse Resource would indicate that the resource has some regions filled in by the server, and might not be valid according to the media type definition. But I’m not confident that all user-agents would safely handle a 2__ Sparse Resource. If the resource represents executable code, the result could be very bad. Maybe I remove this? It seems wrong to me that a server could send back a document it knows is invalid according to the media type. Maybe there should be an error for clients that request a sparse resource without indicating they can support the response.
> >
> 
> Hmmm... I'm a bit rusty at explaining this, but here goes. Media-Type in HTTP has nothing to do with file format. My server intent may be to "view source" of an HTML file, so I use text/plain to prevent it from being parsed and rendered as HTML. My server intent may be to present a broken PNG, so I use image/png to have the client attempt to render it as such. Being a valid representation of said media type is not implied, and I assure you there's nothing wrong with that. If a UA fails to render, it reports the error to the User i.e. "broken image icon". Application-layer error, not protocol-layer error.

Well, the media type tells you how to decode the response. Yes, if you receive an image/png that is half zeros, it might render (but appear corrupt to a human). But how would the user-agent know this isn’t intentional? How would it know the file is actually incomplete?

It sounded like we would avoid this by responding with a 206 Partial Content response that excludes undefined bytes. I like this solution.
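
As a rough sketch of that idea in Python: assume the server keeps a list of the inclusive byte ranges written so far. The names merge_ranges and describe_response, and that bookkeeping itself, are hypothetical and not part of any draft. The server could then pick between 200 and 206 like this:

```python
def merge_ranges(written):
    """Merge inclusive (first, last) byte ranges into a sorted, non-overlapping list."""
    merged = []
    for first, last in sorted(written):
        if merged and first <= merged[-1][1] + 1:
            # Overlapping or adjacent: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged

def describe_response(written, total):
    """Return (status, Content-Range values) for a GET on a possibly incomplete upload."""
    ranges = merge_ranges(written)
    if ranges == [(0, total - 1)]:
        return 200, []  # every byte is defined, so send the full representation
    # Otherwise advertise only the bytes that were actually written.
    return 206, [f"bytes {a}-{b}/{total}" for a, b in ranges]
```

A response covering more than one range would need a multipart/byteranges body; this sketch only works out which ranges to report.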

> 
> I think you've gone astray by essentially modeling "sparse" as a subtype of any given media type. Or qualifying a resource ('s representation) as being the result of executable code -- excluding static files of expected types, 99% of my resources map to server-side processes. Receiving a borked representation has no effect on the executable, because the server-side resource coding is decoupled from its representations, and hidden behind a Uniform Interface.

I didn’t intend to suggest this.

Suppose I am uploading an executable binary. How would I ensure that users cannot download it while it’s still incomplete? (Verifying a checksum would prevent major problems, but how would a Web browser know that the resource is incomplete?)

Thanks,

Austin.

> 
> -Eric
> 
>
Received on Wednesday, 6 April 2022 20:08:32 UTC