Re: FYI: I-D ACTION:draft-dusseault-http-patch-02.txt from Lisa Dusseault on 2004-05-21 (ietf-http-wg@w3.org from April to June 2004)

From: Lisa Dusseault <lisa@osafoundation.org>
Date: Fri, 21 May 2004 12:46:48 -0700
To: Julian Reschke <julian.reschke@gmx.de>
Cc: ietf-http-wg@w3.org
Message-Id: <994F498A-AB5F-11D8-8F35-000A95B2BB72@osafoundation.org>
On May 21, 2004, at 3:14 AM, Julian Reschke wrote:

> Comments on the latest draft:
>
> >    Note that byte ranges are already used in HTTP to do partial
> >    downloads (GET method).  However, they are not defined for 
> uploads,
> >    and there are some missing pieces for uploads.  For example, the 
> HTTP
> >    specification has no way for the server to send errors if the byte
> >    range in a PUT is invalid.  Byte ranges (or some other kind of 
> range)
>
> The server could always send a 4xx status code. Am I missing something?

Yes, the server could, but HTTP doesn't define this as a way to respond 
to partial upload requests, so clients might find it hard to 
interoperate well.  Anyway I think it's moot, unless we decide to 
design a partial upload request standard based around byte-range 
headers -- I think sending a diff is far preferable since diff formats 
are already well understood, used in HTTP, and more powerful.

>
> >    This standard defines the new method PATCH to alter a single 
> existing
> >    resource, in place, by applying a delta or diff file.  The 
> operation
> >    is atomic.  Note that WebDAV MOVE and COPY requests, if supported 
> by
> >    the HTTP server, can be useful to rename or copy a different 
> resource
> >    before applying PATCH.
>
> I'm not sure how this is supposed to work. It may be an interesting 
> feature, but to fully define it, the spec will need to say more about 
> that topic (I personally feel that overloading COPY and/or MOVE with 
> content modification operations would be a bad idea...).

This is just pointing out that COPY and MOVE are useful -- this doesn't 
overload them or even explain how this works.  COPY and MOVE are fully 
defined in RFC2518 and are independent from PATCH.  The reason these 
are mentioned, is because somebody asked if PATCH could be used to get 
the source from one URL, but place the result in another URL -- and 
COPY or MOVE combined with PATCH can achieve nearly the same thing 
without any new specification required.  I will however try to make the 
text a little clearer in that MOVE and COPY can be used as is, 
independently of the PATCH proposal, to rename or copy an entity before 
applying PATCH to it.

>
> > 2.  Mechanisms
> >
> > 2.1  PATCH Method
> >
> >    The PATCH method requests that the request body (a patch 
> document) be
> >    applied to the resource named in the Request-URI.  The resource 
> named
> >    in the Request-URI MUST already exist (the server MUST NOT create 
> a
> >    new resource with the body of the PATCH method).  The target
>
> If one of the use cases for PATCH is creating sparse resources, it 
> would make sense to allow PATCH to create new resources.

I can't think of any benefit over using PUT.  It only makes PATCH more 
complicated to define how it acts on new resources.  Note that the 
client is perfectly capable of using PUT to create an empty resource, 
and diff formats are perfectly capable of applying a diff to an empty 
resource -- but I would think that generally clients would prefer to 
PUT at least a significant chunk of the resource content when creating 
the resource.

>
> >    The PATCH request MUST have a body.  It MUST include either the
> >    Content-Type header with a MIME type value indicating what the 
> body
> >    type is, or an IM header as defined in RFC 3229 [4], to identify 
> the
> >    delta format.  The IM header is only for gdiff and vcdiff, which 
> are
> >    instance manipulations, and the Content-Type header is for delta
> >    formats which have registered MIME types.  Either way, the 
> identified
> >    format MUST have the semantics of defining a change to an existing
> >    document (such as gdiff).
>
> What if it has both?

I'll address this, thanks.


>
> > 2.2  PATCH Response
> >
> > 2.2.1  Success Response
> >
> >    The basic success response code for PATCH is 204 No Content.  For
> >    this new method, 200 OK is not used because 200 OK implies a body 
> in
> >    the response, and 201 Created is not used because the resource 
> must
> >    already exist.
> >
> >    The server SHOULD provide a MD5 hash of the content after the 
> delta
> >    was applied.  This allows the client to verify the success of the
> >    operation.  The PATCH method obviously MUST cause the ETag to 
> change.
> >    So, if the server supports ETags, the server MUST return a strong
> >    ETag for use in future client operations.  If the server does not
> >    support strong ETags, then the server MUST return the 
> Last-Modified
> >    header instead.
>
> 1. "if the server supports ETags" -> "if the server supports strong 
> ETags".

OK

>
> 2. I'd say that Last-Modified should be returned always.

OK -- there are a few Web servers that don't keep a last-modified date 
at all, but I can at least make this a SHOULD.

>
> 3. Thought: there may be servers without good ETag support (for 
> instance when using filesystem-based backends). If we require the 
> server to compute MD5 checksums, we can also require it to check the 
> MD5 of the given entity *before* the modification. In WebDAV speak, 
> that would be a new state token for the "If" header; however for a 
> spec that doesn't require WebDAV as base protocol, we may have to 
> define a new request header, such as "If-Content-MD5-Match". This 
> would make PATCH extra-safe, and we wouldn't need to worry ETag 
> support.

This would be a great extension in general, completely independent of 
PATCH, and I would support such an extension.

>
> > 2.2.2  Error handling
> >
> >    This proposal uses the same mechanism as DeltaV to add much-needed
> >    info to base HTTP error responses.  Existing HTTP status codes are
>
> (reference missing)

HTTP reference appears in the Introduction.  I added a DeltaV reference 
to the introduction as well, where DeltaV is first mentioned.  I don't 
know how I overlooked that!

>
> >    not infinitely extensible but XML elements and namespaces are more
> >    so, and it's simple to treat the HTTP error code as a rough 
> category
> >    and put detailed error codes in the body.
> >
> >    The PATCH method can return the following errors.  Please note 
> that
> >    the notation "DAV:foobar" is merely short form for expressing "the
> >    'foobar' element in the 'DAV:' namespace".  It has meaning only in
> >    English, not on the wire.  Also note that the string error codes 
> are
> >    not meant to be displayed but instead as machine parsable known 
> error
> >    codes (thus there is no language code).
>
> I think this spec shouldn't use the "DAV:" namespace. Also, if this 
> spec adopts RFC3253-style error handling, it should stay consistent 
> with RFC3253 (in RFC3253, the names identify *conditions*, not 
> *errors*; thus the individual codes need to be reversed, such as 
> "delta-format-supported" instead of "delta-format-unsupported").

I suppose we could define a new namespace just for PATCH; but you're 
the only one who's mentioned this so I don't think it's likely to be 
much of a problem.  Since error codes are simply identifying strings, I 
see no problem with using identifying strings which make sense to the 
reader/developer/debugger.

>
> >    DAV:delta-format-unsupported: Used with 501 Not Supported status
> >       code.  Returned by the server when it doesn't support the delta
> >       format chosen by the client.
>
> I don't think we want 501 here:
>
> "The server does not support the functionality required to fulfill the 
> request. This is the appropriate response when the server does not 
> recognize the request method and is not capable of supporting it for 
> any resource." 
> <http://greenbytes.de/tech/webdav/rfc2616.html#status.501>

OK, that's a fine point, we can use something else.  I wish 501 
Unsupported were less specific though!

>
> >    DAV:delta-format-badly-formatted: Used with 400 Bad Request when 
> the
> >       server finds that the delta document provided by the client was
> >       badly formatted and non-compliant.
>
> Suggestion: be more precise about what "badly formatted" means.

OK.

>
> > 2.3  Delta Formats
> >
> >    A set of changes for a resource is itself a document, called a 
> change
> >    document or delta.  The delta format is uniquely identified 
> through
>
> Make it a definition; and use the same term (I'd prefer "delta 
> document") everywhere.

A matter of readability; I think the draft is already pretty readable 
here, because we're not defining any special meaning for the word, it's 
already well used in other context.

>
> a
> >    MIME type.  Servers advertise supported delta mechanisms by
> >    advertising these MIME types, and clients specify which one 
> they're
> >    using by including the MIME type in the Content-Type header of the
> >    PATCH request.
>
> What about the "IM" header?

Good catch.  I'll fix that.

>
> >
> >    The VCDIFF [5] format identifier is "vcdiff".  The GDIFF [6] 
> format
> >    identifier is "gdiff".  Servers SHOULD support VCDIFF for all 
> static
> >    resources.
>
> What's the relation of the aforementioned identifiers and the MIME 
> type? And what is a "static" resource?

Clarified.  I'll change "static" to "resources with static content, 
that is, content that is not generated dynamically."  This brings the 
text in line with the wording used in RFC2616.

>
>
> > 2.4  Advertising Support in OPTIONS
> >
> >    The server advertises its support for the features described here
> >    with OPTIONS response headers.  The "Allow" OPTIONS header is 
> already
> >    defined in HTTP 1.1  to contain all the allowable methods on the
> >    addressed resource, so it's natural to add PATCH.
>
> Indeed.  In fact this is so obvious that I think it shouldn't be said 
> at all.

Yeah -- a lot of the things you've asked for clarifications in this 
email are "obvious" to other people!  Obvious is *definitely* in the 
eye of the beholder.  This clarification/explanation is in there 
because of previous discussions where this obviously wasn't "obvious" 
to everybody (including, I think, me.)

>
> >    OPTIONS * presents a bit of a special case, as it does not 
> address a
> >    resource, and does not always show all the server's features.  In
> >    responses to OPTIONS *, a server supporting this specification 
> SHOULD
> >    include the PATCH method in the "Allow" header and SHOULD show the
> >    union of all supported diff methods in the "Accept-Patch" header.
>
> This SHOULD level requirement in general can not be implemented using 
> the Java servlet API; it's also very unclear how it actually helps the 
> client in any way, as the supported delta formats may vary across 
> resources. Please explain the rational.

The rationale for putting this in is that a lot of servers *do* 
indicate server features in the responses to OPTIONS *.  This helps the 
client because it indicates they can probably cache the feature support 
information and may not need to do OPTIONS on every resource.  Some 
clients do make OPTIONS * requests.

It's not a MUST because as you point out, some servers cannot do this.  
This tension between the desirability of OPTIONS * and the 
infeasibility of it is a tension that I do not intend to resolve in 
this specification.

>
> >    Example: OPTIONS request and response for specific resource
> >
> >
> >        [request]
> >
> >        OPTIONS /example/buddies.xml HTTP/1.1
> >        Host: www.example.com
> >
> >        [response]
> >
> >        HTTP/1.1 200 OK
> >        Allow: GET, PUT, POST, OPTIONS, HEAD, TRACE, DELETE, PATCH
> >        Accept-Patch: example/xcap+xml
>
> Shouldn't "vcdiff" appear here?

I'm of two minds about this.  It might be undesirable for servers to 
support 'vcdiff' if they support something that's better in this 
particular circumstance.  I guess it's better for the example to show 
what you'd see if the server followed every SHOULD.

>
> >    [6]  van Hoff, A. and J. Payne, "Generic Diff Format 
> Specification",
> >         August 1997.
>
> URL?

It's funny -- it's in the XML source, but the generator didn't see fit 
to put the URL in the text output.  I guess that URLs are just not 
required in the text version.

>
>
> Best regards, Julian


Thanks for the review.

Lisa
Received on Friday, 21 May 2004 15:47:14 UTC