Re: [google-gears-eng] Re: Deploying new expectation-extensions from Julian Reschke on 2008-09-15 (ietf-http-wg@w3.org from July to September 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 15 Sep 2008 15:13:50 +0200
To: Mark Nottingham <mnot@yahoo-inc.com>
CC: Charles Fry <fry@google.com>, gears-eng@googlegroups.com, Alex Rousskov <rousskov@measurement-factory.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <48CE5F8E.5070803@gmx.de>

Mark Nottingham wrote:
> On 12/09/2008, at 5:25 PM, Julian Reschke wrote:
> 
>>>> As far as I can tell, if you use the ETag based approach, and 
>>>> multiple clients try to post to the same collection (POST URI), then 
>>>> you'll have to disambiguate the requests. That problem would go away 
>>>> if each of them would use a different URL.
>>> I read the doc as saying that the server would provide unique ETags 
>>> somehow...
>>
>> Disambiguating by ETag probably would work, but that doesn't feel 
>> right to me. If multiple resumable transfers can be in progress at the 
>> same point of time, then this really sounds like multiple resources 
>> (thus multiple URIs), not multiple variants of the same resource to me.
> 
> 
> Huh. That's very revealing, I think (if unintentional :) POST can 
> already create a new resource with a new, server-selected URI, and the 
> pattern for doing so is already described with POST, 201 and Location.

Of course POST can do that. Did anybody argue something else?

> Question: If I want to make this sort of request resumeable, do I do this?
> 
> REQ: POST /a
> REQ: Content-Range: bytes */100
> RES: 308 Resume Incomplete
> RES: Location: /b
> 
> REQ: POST /b
> REQ: Content-Range: 0-100/100
> REQ: [bytes]
> RES: 200 OK
> 
> or this?
> 
> REQ: POST /a
> REQ: Content-Range: bytes */100
> RES: 308 Resume Incomplete
> RES: Location: /z
> 
> REQ: POST /z
> REQ: Content-Range: 0-100/100
> REQ: [bytes]
> RES: 201 Created
> RES: Location: /b
> 
> ?
> 
> The important part here is: is this protocol defining a "temporary" 
> resource (with a very specific interface) for the Location in a 308 
> refers to, or is the Location in a 308 referring to a "regular" resource 
> that's used for more than that?

If I understand correctly, in the first example the server immediately 
assigns the URI for the resource-to-be-created, lets the client know 
it, and lets it transfer the remaining bytes to that resource. In this 
case, the server would need to make sure that this resource is only 
available to that client until the transfer is completed.

In the second case, the server assigns a temporary URI which is just 
used to complete the transfer. Once that's done, the "final" resource is 
being created. This looks similar to what Roy proposed in 
<http://lists.w3.org/Archives/Public/ietf-http-wg/2008AprJun/0082.html>.

I think the second approach is more versatile, because it also covers 
cases where the server wouldn't create a new URI upon POST.
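
To make the second exchange concrete, here is a minimal in-memory sketch 
in Python. It is an illustration only, not a proposal: the class name 
UploadServer, the /upload/N and /items/N URI patterns, and the 409 
returned for an out-of-order chunk are assumptions made up for this 
example, and 308 is used the way the draft under discussion uses it.

```python
import itertools


class UploadServer:
    """Toy model of the temporary-URI approach (all names hypothetical)."""

    def __init__(self):
        self._upload_ids = itertools.count(1)
        self._item_ids = itertools.count(1)
        self.pending = {}    # temporary URI -> (expected total, bytearray)
        self.resources = {}  # final URI -> bytes

    def start(self, total):
        """POST /a with 'Content-Range: bytes */total': mint a temporary URI."""
        temp_uri = "/upload/%d" % next(self._upload_ids)
        self.pending[temp_uri] = (total, bytearray())
        return 308, {"Location": temp_uri}

    def append(self, temp_uri, first, chunk):
        """POST to the temporary URI with a chunk starting at offset 'first'."""
        if temp_uri not in self.pending:
            return 404, {}
        total, buf = self.pending[temp_uri]
        if first != len(buf) or first + len(chunk) > total:
            # Assumption: reject chunks that do not continue where we stopped.
            return 409, {}
        buf.extend(chunk)
        if len(buf) < total:
            return 308, {"Location": temp_uri}
        # Transfer complete: only now is the "final" resource created,
        # and the temporary URI goes away.
        del self.pending[temp_uri]
        final_uri = "/items/%d" % next(self._item_ids)
        self.resources[final_uri] = bytes(buf)
        return 201, {"Location": final_uri}
```

Note that the 201 and its Location survive in this model, and that a 
server which never assigns a URI to POSTed content could simply answer 
200 in the last step instead.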

> It's interesting to note that the second approach (with the temp 
> resource) preserves the 201 status code in the interchange, while in the 
> former approach, it's not there (308 usurps it).
> 
> Now look at it with PUT (to a not-yet-existent resource);
> 
> REQ: PUT /a
> REQ: Content-Range: bytes */100
> RES: 308 Resume Incomplete
> RES: Location: /b
> 
> REQ: PUT /b
> REQ: Content-Range: 0-100/100
> REQ: [bytes]
> RES: 201 Created
> RES: Location: /a
> 
> Here, if we use URIs, /b *has* to be a "temporary" resource with a very 
> specifically defined behaviour; it accepts PUTs and has a side effect of 
> having its bytes copied to /a (presumably when the final 201 is sent).

Yes.

BTW: it wouldn't necessarily be PUT; it could be anything that allows 
"appending", such as POST or PATCH.

> My point here is that there are actually some pretty deep differences 
> between the URI approach and the ETag approach; the URI approach is much 
> more intrusive and needs to be specified in a different way (e.g., 
> talking about what methods to use, the nature of the resource created, 
> etc.).

Well, not entirely.

Let's say you've got three concurrent resumable transfers started to /a, 
and the server has assigned the etags "C1-1", "C2-1" and "C3-1" to the 
three clients.

Client 1 starts its upload:

REQ: POST /a
REQ: If-Match: "C1-1"
REQ: Content-Range: bytes 0-50/100
REQ: [bytes]
RES: 200 OK
RES: ETag: "C1-2"

...what I'm concerned about is that we're essentially introducing 
ETag-based variant selection here.

So what do these ETags represent in requests *other* than resumable 
uploads, such as in:

REQ: GET /a
REQ: If-Match: "C1-1"

?

Will it have an effect?

So just avoiding new URIs may look simpler at first, but it also requires 
additional specification work.
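
A rough Python sketch of the bookkeeping this implies on the server, 
again purely illustrative: the class EtagUploads, the "R-1" ETag for the 
existing representation of /a, and in particular the choice to answer 
412 when an upload ETag shows up in a GET are my assumptions, not 
something the draft defines.

```python
class EtagUploads:
    """Toy model of ETag-based disambiguation on one URI (hypothetical)."""

    def __init__(self):
        # current upload ETag -> (client id, sequence number, bytearray)
        self.transfers = {}
        # ETag of whatever representation GET /a returns today (assumed)
        self.current_etag = '"R-1"'

    def begin(self, client_id):
        """Hand out the first per-client ETag, e.g. "C1-1"."""
        etag = '"%s-1"' % client_id
        self.transfers[etag] = (client_id, 1, bytearray())
        return etag

    def post(self, if_match, chunk):
        """POST /a with If-Match selecting one of the pending transfers."""
        if if_match not in self.transfers:
            return 412, {}
        client_id, seq, buf = self.transfers.pop(if_match)
        buf.extend(chunk)
        new_etag = '"%s-%d"' % (client_id, seq + 1)
        self.transfers[new_etag] = (client_id, seq + 1, buf)
        return 200, {"ETag": new_etag}

    def get(self, if_match):
        """GET /a with If-Match: one possible answer to the open question."""
        # Assumption: upload ETags never match the representation served
        # by GET, so the precondition fails.
        if if_match != self.current_etag:
            return 412, {}
        return 200, {"ETag": self.current_etag}
```

The get() method is the part that would need to be written down 
somewhere: the sketch picks one answer, but nothing forces a server to 
pick the same one.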

> Back to your comment;
> 
>> Disambiguating by ETag probably would work, but that doesn't feel 
>> right to me. If multiple resumable transfers can be in progress at the 
>> same point of time, then this really sounds like multiple resources 
>> (thus multiple URIs), not multiple variants of the same resource to me.
> 
> I don't know that I agree; with PUT, it's very natural to use ETags (you 
> avoid creating the temporary resource, and have the option of 409'ing 
> any concurrent PUTs after the first), whereas with POST, you're just 

That assumes that PUT with Content-Range can be used today, which really 
isn't the case, unless the client can be confident that the server 
actually understands PUT with ranges.

> pushing the assignment of a final identifier for the created resource 
> until the entire request entity is received (which is the case with the 
> URI-based approach anyway, unless you're arguing that POST is a special 
> case and *doesn't* create a temporary resource, unlike PUT), and you 
> still have the option of not assigning it any identity (just as many 
> POST processors do today).
> 
> So, I'm firmly leaning in the direction of the ETags-only approach now; 
> I think the selection of a URI for created resources is separable, and 
> should be separate.

I agree with that part; the URI assigned for the upload really should be 
temporary.

With that, the approaches are almost identical; in both cases unique 
identifiers are minted (ETags or URIs), the server needs to deal with 
housekeeping, and the impact of other methods must be understood and 
specified.

BR, Julian
Received on Monday, 15 September 2008 13:30:55 UTC