RE: Some problems with the WebDAV protocol (part 2: DELETE) from Jim Whitehead on 1999-04-25 (w3c-dist-auth@w3.org from April to June 1999)

From: Jim Whitehead <ejw@ics.uci.edu>
Date: Sun, 25 Apr 1999 16:32:21 -0700
To: Yoram Last <ylast@mindless.com>
Cc: WEBDAV WG <w3c-dist-auth@w3.org>
Message-ID: <001101be8f73$ddbb4040$d115c380@ics.uci.edu>
Yarom,

Here's part 2, discussion on your rebuttal to my specific points about
DELETE.

> > Just to set some groundwork here, if a WebDAV server executes a
> DELETE on a
> > collection, and the delete is completely successful, then the
> response code
> > should be a 204 (No Content), although 202 (Accepted) and 207
> (Multi-Status)
> > are also acceptable.  I think we agree there are no interoperability
> > problems in this case.
>
> In principle, there shouldn't be interoperability problems, but
> in practice, a poorly designed HTTP/1.1 client may very well
> encounter problems if it gets a 207.

Of course, if the existence of a single non-compliant client were to veto
the definition of new status codes, this would mean that HTTP is no longer
extensible along a pre-defined axis of extensibility.

> > On the other side, if a WebDAV server executes a DELETE on a
> collection, and
> > the delete completely fails, then the response code should be a 404 (Not
> > Found) or a 403 (Forbidden), depending on what caused the problem (404 -
> > nothing was there to delete, while the 403 could handle access control
> > problems).  For a complete failure, the 207 (Multi-Status) should not be
> > returned (although I will admit that, upon reading RFC 2518, this latter
> > point is probably not clear, and should be clarified in future drafts).
>
> I agree with the 404 in the event that the destination does not exist.
> However, RFC 2518 clearly states: "If an error occurs with a
> resource other than the resource identified in the Request-URI then the
> response MUST be a 207 (Multi-Status)." Now clearly, if there are any
member
> resources (let alone all of them) that could not be deleted, then an error
> occurred with a "resource other than the resource identified in the
> Request-URI", and so I should return a 207.

Good point.  This requirement is contrary to our intent, which was that if
all resources were affected by the same error, just a single HTTP status
code for that error need be reported.  I'll add an item to the WebDAV
Distributed Authoring Protocol issues list for this, since this should be
cleaned up when we go from Proposed to Draft Standard.

(The issues list is at:
http://www.ics.uci.edu/pub/ietf/webdav/protocol/issues.html)

> The only case I can return a 403 is if everything was
> deleted except for "the resource identified in the Request-URI."
> Now I don't see how you can say that something else is supposedly written
> here, and particularly, that the spec provides for returning a 403 in case
> that there where any internal members that where not deleted.

This is another good point, since if all other resources were deleted, but
the Request-URI was not, then I would expect a Multi-Status response.  The
current language does not state this (in fact, it indicates a Multi-Status
should not be returned), thus I will also add this to the issues list for
cleanup between Proposed and Draft Standard.

>
> > Assuming a DAV server does not return a 207 for this case, then complete
> > failure should also not generate any interoperability problems
> with HTTP/1.1
> > delete.
> >
> > So, any potential problems would be with a partial completion of the
delete
> > operation.
>
> Does the 'rm' command on any UNIX system has a switch that would enable
> it to run in a mode where it would issue a warning if and only if *all*
> target files failed to get deleted, but would remain totally silent if
> *some* failed to get deleted? How about its DOS counterpart? Or have you
> ever seen a single file manager that would even have an option to respond
> to directory deletion in this way? Or in fact a file manager that would
> not give an error message in case that the deletion of even a single file
> in a directory failed? Or maybe it is valid for a web server to respond to
> a PUT request with a 204 in case that it only got %60 of the
> 'Content-length' of the file? (It mostly succeeded, didn't it?)
> If some of the request failed, then it is an error that should be noted
> as such. It is a fundamental principle of virtually any program in
> existence that provides similar functionality. Accordingly, getting
> appropriately corresponding status codes is also a design assumption of
> any existing HTTP/1.1 application. The philosophical question of whether
> "partial success" is "success" or "failure" is irrelevant. The fact is
> that your decision to use a single multistatus code was made under the
> explicit assumption that this code would not be sent to clients that are
> not supposed to understand it, and the fact that your protocol now
> specifies otherwise, means that it has an interoperability problem with
> HTTP/1.1 applications.

You are stating that sending a 207 to a client that doesn't understand this
status code would cause an interoperability problem for that client.  You
seem to think I disagree with this point.  I do not.

> > For this case, let me note:
> > 1) The behavior of a DELETE on a collection in HTTP/1.1 is
> problematic for
> > file-based servers.  If a DELETE is issued to a resource which has a URL
> > which ends in a slash "/" (e.g., "testdir/"), and there are
> other resources
> > which have URLs which add a path onto this slash (e.g,
> "testdir/one.html",
> > "testdir/two.html"), HTTP/1.1 doesn't give any guidance as to
> what should be
> > done with the resources which have URLs which come after the slash.  It
> > seems to me that, for the same reason that filesystem-based
> servers create
> > intermediate paths, these same servers would want to delete the
> resources
> > which have "slash plus path" URLs. This leaves servers with the
> choice of
> > either a) deleting the collection, plus the "slash plus path"
> resources, b)
> > doing nothing (reporting an error), or c) internally marking
> the collection
> > as removed, and not affecting the "slash plus path" resources.  My
> > interpretation of the HTTP spec. is that either (b) or (c) is what was
> > intended by the spec., but I wouldn't be surprised if a filesystem-based
> > server has implemented (a).
>
> As I explained in the long discussion above, it is outside the scope of
> HTTP/1.1 to specify any given behavior here. Saying that "The behavior of
> a DELETE on a collection in HTTP/1.1 is problematic" is a misunderstanding
> of HTTP and how this particular HTTP method is used (and should be used)
> in existing (or future) HTTP applications. I know implementations that
> do either (a) or (b) (and (a) seems to be more popular). I never saw (c).
> But anything that would make sense in the context of any given
> implementation
> is legitimate. Obviously, a user (or client) would need to know in advance
> how a particular server implements this method in order to make safe and
> effective use of it. This basic freedom does not contradict the possible
> existence of common practices (or even written standards) that would limit
> the actual types of implementations that one encounters to some
> finite set of options (or even a single common option), but it is
> important to understand that this would be inherently outside the
> scope of HTTP itself.

Or, alternately, that DELETE in HTTP/1.1 is just plain underspecified.

> > For reference, mod_put's implementation of DELETE does (b),
> since it will
> > attempt to perform an unlink() on the directory, which will
> fail, causing it
> > to report a 403 (Forbidden). It's hard to tell whether this is
> intentional,
> > or if the implementor didn't consider that a directory could be
> removed.  At
> > the very least it is suggestive that HTTP/1.1 delete on a
> collection is rare
> > enough not to be worth implementing.
>
> This is a minimal (maybe just trying to be as safe as possible)
> implementation of DELETE. Netscape servers and AOLserver do your
> (a) (namely, the same behavior as specified by WebDAV). To the
> extent that a common practice exists here, I believe it is your (a).

This is almost certainly counter to the design of HTTP, which envisioned the
URL to resource mapping as a giant hash table.  According to this view,
DELETE should only affect a single entry in the hash table, and perhaps also
destroy the persistent state associated with the resource, but there is
nothing in my reading of DELETE which would permit a server to do more than
this.  It is only the implementation imperatives of trying to map resources
into a hierarchical file system which leads to the desire to have a DELETE
on one URI affect other URIs as well.

>
> > 2) Though you've brought it up already, I do think it is worth
> stating again
> > that HTTP/1.1 clients, if designed correctly, should not depend on *any*
> > state change occurring on the server as the result of a delete.
>  As HTTP/1.1
> > states, "The client cannot be guaranteed that the operation has
> been carried
> > out, even if the status code returned from the origin server
> indicates that
> > the action has been completed successfully."
>
> This is a patently absurd interpretation of both HTTP/1.1 and the
> practical implications of returning faulty status codes. I don't
> even know what you mean by "should not depend on *any* state change
> occurring on the server" (do you?).

As a matter of fact, I do know what I mean when I say "state change".  A Web
resource in HTTP/1.1 has associated with it some persistent state.  In a
file system implementation, this persistent state is held in one or more
files.  For a simple, static Web page, the persistent state is a single
file, the file which contains the HTML.  For a CGI script, the persistent
state is the source code for the CGI script.  For a resource which is
subject to content negotiation on natural language, the persistent state may
several files, one for each natural language variant.

For a server which is implemented on a database, persistent state of a
resource is mapped to a database entry.  For an embedded server, the
persistent state of a resource might be a sequence of bytes in an EPROM.

Thus a "state change" is a modification to the persistent state of a
resource.

> Those Clients that I'm familiar with that use DELETE (like
> AOLpress and Netscape's Web Publisher)

AOLpress is a poor choice here, since it was essentially written to work
against a single server, AOLserver.

> would remain silent if they get
> a success code (indicating success to the user in the most commonly used
> way), and they will pop up a window with an error message if they get an
> error response. They will also adjust their display of the site's "file
> system" to reflect the change that they think (based on the server's
> response) took place. So they will simply convey to the *user* the wrong
> information regarding the outcome of his actions, and they will also keep
> displaying a faulty map of existing (or available) resources.

If these clients refresh their view of the remote site without first asking
for a listing of the remote site's contents are susceptible to trouble from
HTTP/1.1 servers as well as from WebDAV servers.  A well-written client
should query the remote server for a file listing after a DELETE if their
future actions depend on having an accurate view of the remote server.  The
WebDAV Explorer client always does a PROPFIND after a DELETE for this
reason.

> Having a
> server that returns bogus codes, is exactly the same as having your
> operating system returning bogus codes for system calls like unlink().
> Your file manager (or rm command) would indicate to you that the command
> was successful in cases where it was not. Do you "depend" in any way on
> getting the correct indication regarding the success of such commands?
> The theoretical possibility that on some HTTP/1.1 compliant servers
> success need not equal confirmed deletion of the resource is about as
> relevant to this point as to ask how final an "unlink()" really is.
> Depending on your OS and file system it may have different meanings in
> different situations, and it is usually *not* a true deletion of the
> resource. But, regardless of that, there is a clear notion of when it
> succeeds or fails, and if you return the wrong code, the program conveys
> *false information* to the user. Besides, it will be particularly
> irrelevant on WebDAV servers since they are not allowed to have
> mechanisms to delay or override deletion. A WebDAV server knows at the
> time of responding the exact status of things, and it will be informing
> HTTP/1.1 clients (which in practical terms means the users of these
> clients) that their request succeeded in cases where it failed.

Let me note that:

a) I really, really do understand what you're trying to express with your
analogy.  I actually do understand your point.  Like the PUT case, we differ
markedly in our view of how serious this problem actually is, and what its
likely impact will be on downlevel clients.

b) Your analogy does not hold because the definition of unlink() does not
state that a success code will be returned  for anything other than a
removal of a hard link to the persistent state of the file.  In HTTP, the
definition of DELETE states that this is possible.  The two definitions are
different, thus the two situations are not truly analogous.

>
> > So, since an HTTP/1.1 client cannot depend on the response code
> to a DELETE,
> > and since the existing definition of DELETE is ambiguous for
> collections,
> > and since existing implementation practise suggests that delete on a
> > collection is an infrequent (perhaps even never executed) operation for
> > HTTP/1.1 clients:
>
> You are so very very wrong. AOLpress might not be a purely HTTP/1.1
> application, but it uses HTTP/1.1 semantics of DELETE. Now if AOL would
> want to add WebDAV support to their servers (like to their PrimeHost
> hosting service) they would clearly be facing a conflict. Exactly the
> same thing holds for Netscape`s Enterprise server. It provides WebDAV-like
> functionality through implementing a whole zoo of its own HTTP methods,
> and is so designed to provide authoring capabilities using it's own Web
> Publisher client. It so happens that even though it has all those other
> methods, both file and directory removal on this server depend on the
> HTTP/1.1 DELETE method. Of course, the fact that there are only about
> 300,000 of these servers on the net does not indicate much. Or does it?

You tell me -- how many of these servers are used for authoring?
Furthermore, since both AOLserver and ES define extensions to HTTP/1.1 for
remote authoring (in fact, this fact that existing authoring clients needed
to define extensions to HTTP to create viable authoring tools was a primary
argument in favor of creating the WebDAV standard), how likely is it that
tools which work against these servers and depend on their non-standard
extensions will work at all against a WebDAV server?

For example, what do these servers report in the case where a DELETE is
issued against a collection that has two members, one that can be deleted,
and one that has write permission disabled, and cannot be deleted?  Does the
server have all or nothing semantics (implying some kind of transaction
support), so that if it cannot delete all the resources, it deletes none?
Does the server have best-effort semantics, where it would delete the one
resource, but not the other two (the collection and the resource which
cannot be deleted due to access control restrictions)?  What status codes
does it report in these cases?  This behavior is not specified, and as a
result an HTTP/1.1 client does not know what will happen when it issues a
DELETE against one of these servers.

In my personal opinion, I think the reason this behavior isn't specified is
because it was never intended to occur.  DELETE should affect only the
Request-URI, and if an implementation does more, it is non-conformant.

> To some this up in a somewhat less sarcastic tone: While I might not know
> the extent to which PUT-based creation of directories is used, it
> is largely
> because the main commercial HTTP authoring servers have their own MKDIR
> command that is equivalent to the WebDAV MKCOL. But, it so happens, that
> they do use DELETE to delete directory trees, and your convenient
> *decision*
> that HTTP/1.1 DELETE isn't being used is about as wrong as it can get.
>
> > a) this problem does not warrant a re-issue of RFC 2518
> > b) it is not clear that this problem warrants any changes to the
> > specification at all, since at worst it would cause user
> confusion for an
> > error case on an infrequently (if ever) used option of an infrequently
> > executed method.
>
> Most of your assertions in (b) above are simply plain false.

You have provided no usage data to back up your assertion (to be fair,
neither did I).

>
> > While this might work, it's a bad design to use the Depth
> header to signal
> > this information.
>
> Bad design? Maybe. But the design you have without it is flawed
> at the core.

"flawed at the core" is a pretty strong assertion, especially since we've
been talking about pretty peripheral use cases.  Why don't we stick to
arguing specific technical issues, and stay away from these emotional (and
ultimately, not very helpful) value judgements, OK?

> As I tried to explain in a previous posting, it isn't even within the
> legitimate scope of WebDAV to redefine the semantics of HTTP/1.1 methods,
> and there is clearly no real technical need for it to do it either.

Actually, the definition of "container-specific semantics" is explicitly
stated in our charter (http://www.ics.uci.edu/pub/ietf/webdav/charter.html).
Our charter was approved by the Internet Engineering Steering Group (IESG),
granting WebDAV WG the authority to work within the scope of this charter.

> As a side remark, you should note that WebDAV would not have suffered one
> bit from having two notions of PUT/DELETE within its context (which could
> be distinguished either by a header or by different method names). The
> WebDAV methods can be defined as you find fit, while the existing HTTP/1.1
> methods maintain their flexibility to be used as people find it
> appropriate for their applications.

I'm sure that someone would have attacked us for providing *no*
interoperability for downlevel HTTP/1.1 authoring clients in this case.
Although, in retrospect, this idea does have some merit.

> Now whether you like it or not, you created a situation where your
> protocol is in conflict with HTTP/1.1. While some aspects of this
> conflict are of yet unknown and hard to determine magnitude, there
> are other aspects of it that are clearly significant. The only way
> to comply with both HTTP/1.1 and WebDAV is to forbid DELETE altogether.
> There is no way of building a fully functional fully compliant WebDAV
> server that is also fully compliant with HTTP/1.1, and since HTTP/1.1
> compliance is required by WebDAV itself, it is even a self-conflict.
> The DELETE method, within its HTTP/1.1 semantics, is currently used
> on hundreds of thousands of HTTP authoring servers, and your protocol
> specifies a behavior that breaks the proper functionality of virtually
> any existing client that uses this method. If that isn't an
> interoperability problem of your protocol, then I don't know what is.

We clearly differ on the significance of this interoperability issue.

>
> Now the implications of this basic protocol error on real-world
> application
> design are at least the following (I only relate to the DELETE issue here,
> and I further *assume* that there is really some clear notion of WebDAV
> semantics for DELETE, even though you say that it should behave
> *differently*
> from what RFC 2518 specifies):
> 1) Server implementors would need to choose between:
>  a) Keep their HTTP/1.1 semantics for DELETE.
>  b) Implement the WebDAV semantics for DELETE.
>  c) Implement both behaviors along with some (ugly and inherently
> unreliable)
>    mechanism for distinguishing different clients (such as using
> a database
>    of known clients).
>  For those implementors that would want to add WebDAV support to a current
>  product framework that already relies on HTTP/1.1 DELETE (like
> AOL/Netscape),
>  (b) isn't even a viable option, and they would need to choose between (a)
>  and (c).
> 2) Client implementors that would like to maximize the interoperability of
>   their program MUST NOT assume any specific DELETE semantics, and MUST be
>   able to deal with both possible behaviors.
>
> Is it the end of the world? No, because implementing those
> workarounds isn't
> that much of an issue and they should work quite well in most cases. So it
> would only cost some extra effort/money and somewhat decrease overall
> reliability, interoperability, and performance. The biggest problem would
> be for naive implementors that would not realize this state of
> affairs from
> the beginning and would need to discover it in the hard way. But, from the
> point of view of proper protocol design, creating this situation
> is a clear failure.
>
> So I'm really very sorry this is the case, and I tried to explain it as
> well as I could, and I'm sorry if I offended anyone in the process, and
> I'm tired myself of the whole thing, but I think that your protocol is
> broken and that you should fix it. It is your protocol and you can
> obviously do what you want, but trying to dismiss a significant design
> error by using faulty arguments will not solve the problem.

I'm not sure you have this "your protocol" distinction.  The WebDAV protocol
was developed by the WebDAV working group of the IETF.  It achieved rough
consensus of the membership of the mailing list, and was subsequently
approved by the IESG. The IETF doesn't have any membership requirements, or
any notion of "members" really, but for the purpose of establishing
consensus in a working group, the working group's membership is defined to
be the subscribers to the group's discusssion list. Since the spam filter
isn't bouncing your emails, you are a member of this list, and hence a
"member" (for purposes of establishing rough consensus on an issue) of the
WebDAV working group.  Thus WebDAV is now as much your protocol as it is
"our" protocol.  *I* certainly don't own the protocol, even if I do seem to
be the one who has stepped up to respond to your posts.

In fact, the IETF is based on "rough consensus and running code", which
translates directly into the IETF requirement that every feature of the
WebDAV protocol must be implemented in two independent code bases on the
client and server side for the protocol to advance from Proposed Standard to
Draft Standard.  Since it sounds like you're implementing a WebDAV server,
in a very real sense *you*, along with other implementors, now own the
standard.

- Jim
Received on Sunday, 25 April 1999 19:38:13 UTC