- From: Yoram Last <ylast@mindless.com>
- Date: Sat, 24 Apr 1999 06:08:28 +0300
- To: ejw@ics.uci.edu
- CC: WEBDAV WG <w3c-dist-auth@w3.org>
It seems to me that some of your bogus (in my mind, at least) arguments below are the result of a misinterpretation of what the HTTP protocol is and how it works. I would thus like to start with a fairly general description. Consider the following HTTP request:

   GET /dir1/dir2/fname.ext?id=123&com=ls HTTP/1.1
   Host: ...

Now if we ignore some additional information that may be contained in the headers, this is a request which consists of two main ingredients: a method and a URI. Surely you recognize a part of the URI as referring to a resource (most probably, the name of a file) that is a member of a collection, which is itself a member of another collection, and then another portion which is a query string. And since the request method is GET, you know that it "should not have the significance of taking an action other than retrieval." But even though you know all that, the request itself doesn't tell you anything about the nature of the action that will be taken in response to it, nor about the kind of response that will be returned. If you are issuing this request, you should have some prior knowledge concerning the nature of that resource (such as the fact that it is a CGI program capable of processing certain parameters). Moreover, even those things that you did know about the URI are completely outside the scope of the HTTP protocol. HTTP itself does not specify what a collection is, nor does it specify what a query string is, nor does it specify how a request of this kind should be handled by the server. All of these things are completely outside of its scope, because HTTP is mostly a communications protocol. It specifies how to *submit* requests and how to *send* responses, not how requests should be handled or what kind of responses should be provided. This nature of HTTP is what makes it such a flexible protocol. It enables many things to be layered on top of it. The actual way requests are handled depends on the server.
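To make the point concrete, here is a minimal sketch (in Python, with a hypothetical host name) that builds the raw request above. Note that nothing in the resulting bytes tells the sender how the server will interpret the query string, or whether /dir1/dir2/ corresponds to a real directory:

```python
# Build a raw HTTP/1.1 request line plus Host header. The bytes on
# the wire carry only a method and a URI; how the server acts on
# them is entirely outside the scope of HTTP itself.
def build_request(method: str, path: str, query: str, host: str) -> str:
    target = path + ("?" + query if query else "")
    return (f"{method} {target} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"\r\n")

raw = build_request("GET", "/dir1/dir2/fname.ext", "id=123&com=ls", "example.com")
print(raw)
```

The same holds for PUT and DELETE requests built this way: the request is well-formed HTTP regardless of what the server does with it.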
It may or may not involve other standards such as CGI, but in any event, it is mostly outside the scope of HTTP. Now the fact that the HTTP protocol doesn't know what collections and query strings are does not mean that clients and servers and users don't know what they are. These objects may (even if they need not) fully exist in the context of HTTP-based communication, in pretty much the same way that URIs exist in the context of TCP/IP-based communication.

Now let's look at PUT and DELETE. These methods are not as flexible as GET and POST, but the general principles apply here just as well. HTTP specifies what it does about PUT, and beyond that it is up to the server implementation to determine what it will or will not do. A server may implement some crazy CGI-based mechanism to enable entities to be PUT to URIs that contain query strings, and it may, just as well, create new collection resources in the process of creating a new resource URI.

Similarly for DELETE: HTTP describes this method as requesting to "delete the resource identified by the Request-URI". Since HTTP doesn't say what a resource is, nor does it distinguish between different types of resources, nor does it say whether other resources should or should not be deleted (or maybe even created) in the process of deleting a given resource, all of these things (and more) are left to be decided by the server implementation. Furthermore, HTTP even allows server implementations to have extra control mechanisms, so that the actual deletion may be postponed or even canceled at a later time.

Now this flexibility in HTTP has its consequences, and one of the main consequences concerning DELETE, in particular, is that in order to use it safely and effectively, a client needs to know how it is implemented by the server. The fact that a given HTTP server supports the DELETE method does not provide enough information to determine the behavior of that method on this server.
This is quite similar to the CGI example above. You need additional information (other than the fact that HTTP is being used as the communication protocol) in order to be able to predict the possible outcomes of essentially any kind of HTTP request.

Now the issue concerning the supposed allowance of a failed DELETE to be reported as success is fully addressed within this framework, because it simply means that the success of a DELETE should be interpreted in the context of any particular server implementation. On some servers, a successful DELETE means "the file was deleted"; on others it might mean "the file will be deleted tomorrow morning if my boss approves it"; and on yet others it might mean "the file was marked for later deletion; while it is still available for GET requests, no further editing of it can be done". But in each of these cases success means success in the context of that implementation, and failure means that the request failed. In most practical cases, clients (or more precisely users) should be familiar with the servers they work with, so they would know the real (or practical) meaning of a successful DELETE, and it is different from failure. Furthermore, in practice, "off the shelf" servers that support DELETE and all common Apache/CERN/NCSA implementations that support it (except mod_dav, I suppose) try to delete the specified target and return success or failure according to what really took place.

Now you may or may not like how HTTP works, and I would fully agree with you if you said that it is not optimally tuned for content management. But that's what it is, and it has its own advantages and disadvantages. WebDAV, on the other hand, is a totally different protocol with totally different goals and a totally different design philosophy.
It deals with defining a great deal of server-side structure, and with specifying a great deal about how requests should be handled and responded to, and it leaves very few things open to interpretation or handling by other protocols. It is certainly not a communications protocol. This is nice and legitimate (except for those aspects of it that are flawed), but you can't interpret HTTP as being the same thing as WebDAV. WebDAV could have been designed as purely layered on top of HTTP as it is. The fact that you *chose* to design it through HTTP extensions does not make these two different protocols the same. (Which is all the more reason why it was a bad design error to put them in conflict.)

> Yoram Last, on April 17, 1999 wrote:
> > 2) The main problem with DELETE doesn't so much affect functionality as
> > it affects compliance with HTTP/1.1 and has the potential of confusing
> > HTTP/1.1 compliant clients. In connection with DELETE for collections,
> > RFC 2518 says: "If an error occurs with a resource other than the resource
> > identified in the Request-URI then the response MUST be a 207
> > (Multi-Status)." Since 207 is not a valid HTTP/1.1 response, HTTP/1.1
> > clients are not supposed to be able to understand it. They are likely to
> > consider it as a success code (it's a 2xx) even though in this particular
> > case it actually indicates failure.

> Yoram Last, on April 20, 1999 wrote:
> > You don't say anything about the DELETE issue. Do you think that it's a
> > kosher thing to send 207 responses that are really error responses to
> > (non-WebDAV) HTTP/1.1 clients? Some people previously suggested that since
> > HTTP/1.1 says that a 2xx response to a DELETE is not an absolute commitment
> > to having the resource deleted, then you are allowed to always send a 2xx
> > regardless of the outcome. This is a distorted interpretation of HTTP/1.1.
> > It says that a 2xx response to a DELETE indicates the acceptance of the
> > request as such, and that there might still be a chance that this request
> > will be rejected in the future due to the possible existence of further
> > control mechanisms such as human intervention. This is not the same as
> > responding with a 2xx in cases where the server fully and clearly rejects
> > the request, and nobody in his right mind would design a purely HTTP/1.1
> > server to behave in this way.

> Just to set some groundwork here, if a WebDAV server executes a DELETE on a
> collection, and the delete is completely successful, then the response code
> should be a 204 (No Content), although 202 (Accepted) and 207 (Multi-Status)
> are also acceptable. I think we agree there are no interoperability
> problems in this case.

In principle, there shouldn't be interoperability problems, but in practice, a poorly designed HTTP/1.1 client may very well encounter problems if it gets a 207.

> On the other side, if a WebDAV server executes a DELETE on a collection, and
> the delete completely fails, then the response code should be a 404 (Not
> Found) or a 403 (Forbidden), depending on what caused the problem (404 -
> nothing was there to delete, while the 403 could handle access control
> problems). For a complete failure, the 207 (Multi-Status) should not be
> returned (although I will admit that, upon reading RFC 2518, this latter
> point is probably not clear, and should be clarified in future drafts).

I agree with the 404 in the event that the destination does not exist. However, RFC 2518 clearly states: "If an error occurs with a resource other than the resource identified in the Request-URI then the response MUST be a 207 (Multi-Status)." Now clearly, if there are any member resources (let alone all of them) that could not be deleted, then an error occurred with a "resource other than the resource identified in the Request-URI", and so I should return a 207.
The only case in which I can return a 403 is if everything was deleted except for "the resource identified in the Request-URI". Now I don't see how you can say that something else is supposedly written here, and in particular, that the spec provides for returning a 403 in case there were any internal members that were not deleted.

> Assuming a DAV server does not return a 207 for this case, then complete
> failure should also not generate any interoperability problems with HTTP/1.1
> delete.
>
> So, any potential problems would be with a partial completion of the delete
> operation.

Does the 'rm' command on any UNIX system have a switch that would enable it to run in a mode where it would issue a warning if and only if *all* target files failed to get deleted, but would remain totally silent if only *some* failed to get deleted? How about its DOS counterpart? Or have you ever seen a single file manager that would even have an option to respond to directory deletion in this way? Or, in fact, a file manager that would not give an error message in case the deletion of even a single file in a directory failed? Or maybe it is valid for a web server to respond to a PUT request with a 204 in case it only got 60% of the 'Content-Length' of the file? (It mostly succeeded, didn't it?) If some of the request failed, then it is an error that should be noted as such. This is a fundamental principle of virtually any program in existence that provides similar functionality. Accordingly, getting appropriately corresponding status codes is also a design assumption of any existing HTTP/1.1 application. The philosophical question of whether "partial success" is "success" or "failure" is irrelevant.
The fact is that your decision to use a single multistatus code was made under the explicit assumption that this code would not be sent to clients that are not supposed to understand it, and the fact that your protocol now specifies otherwise means that it has an interoperability problem with HTTP/1.1 applications.

> For this case, let me note:
> 1) The behavior of a DELETE on a collection in HTTP/1.1 is problematic for
> file-based servers. If a DELETE is issued to a resource which has a URL
> which ends in a slash "/" (e.g., "testdir/"), and there are other resources
> which have URLs which add a path onto this slash (e.g., "testdir/one.html",
> "testdir/two.html"), HTTP/1.1 doesn't give any guidance as to what should be
> done with the resources which have URLs which come after the slash. It
> seems to me that, for the same reason that filesystem-based servers create
> intermediate paths, these same servers would want to delete the resources
> which have "slash plus path" URLs. This leaves servers with the choice of
> either a) deleting the collection, plus the "slash plus path" resources, b)
> doing nothing (reporting an error), or c) internally marking the collection
> as removed, and not affecting the "slash plus path" resources. My
> interpretation of the HTTP spec. is that either (b) or (c) is what was
> intended by the spec., but I wouldn't be surprised if a filesystem-based
> server has implemented (a).

As I explained in the long discussion above, it is outside the scope of HTTP/1.1 to specify any given behavior here. Saying that "The behavior of a DELETE on a collection in HTTP/1.1 is problematic" is a misunderstanding of HTTP and of how this particular HTTP method is used (and should be used) in existing (or future) HTTP applications. I know implementations that do either (a) or (b) (and (a) seems to be the more popular). I have never seen (c). But anything that would make sense in the context of any given implementation is legitimate.
Obviously, a user (or client) would need to know in advance how a particular server implements this method in order to make safe and effective use of it. This basic freedom does not contradict the possible existence of common practices (or even written standards) that would limit the actual types of implementations one encounters to some finite set of options (or even a single common option), but it is important to understand that this would be inherently outside the scope of HTTP itself.

> For reference, mod_put's implementation of DELETE does (b), since it will
> attempt to perform an unlink() on the directory, which will fail, causing it
> to report a 403 (Forbidden). It's hard to tell whether this is intentional,
> or if the implementor didn't consider that a directory could be removed. At
> the very least it is suggestive that HTTP/1.1 delete on a collection is rare
> enough not to be worth implementing.

This is a minimal (maybe just trying to be as safe as possible) implementation of DELETE. Netscape servers and AOLserver do your (a) (namely, the same behavior as specified by WebDAV). To the extent that a common practice exists here, I believe it is your (a).

> 2) Though you've brought it up already, I do think it is worth stating again
> that HTTP/1.1 clients, if designed correctly, should not depend on *any*
> state change occurring on the server as the result of a delete. As HTTP/1.1
> states, "The client cannot be guaranteed that the operation has been carried
> out, even if the status code returned from the origin server indicates that
> the action has been completed successfully."

This is a patently absurd interpretation of both HTTP/1.1 and the practical implications of returning faulty status codes. I don't even know what you mean by "should not depend on *any* state change occurring on the server" (do you?).
Those clients that I'm familiar with that use DELETE (like AOLpress and Netscape's Web Publisher) remain silent if they get a success code (indicating success to the user in the most commonly used way), and they pop up a window with an error message if they get an error response. They also adjust their display of the site's "file system" to reflect the change that they think (based on the server's response) took place. So they will simply convey to the *user* the wrong information regarding the outcome of his actions, and they will also keep displaying a faulty map of existing (or available) resources.

Having a server that returns bogus codes is exactly the same as having your operating system return bogus codes for system calls like unlink(). Your file manager (or rm command) would indicate to you that the command was successful in cases where it was not. Do you "depend" in any way on getting the correct indication regarding the success of such commands? The theoretical possibility that on some HTTP/1.1 compliant servers success need not equal confirmed deletion of the resource is about as relevant to this point as asking how final an unlink() really is. Depending on your OS and file system, it may have different meanings in different situations, and it is usually *not* a true deletion of the resource. But, regardless of that, there is a clear notion of when it succeeds or fails, and if you return the wrong code, the program conveys *false information* to the user.

Besides, this will be particularly irrelevant on WebDAV servers, since they are not allowed to have mechanisms to delay or override deletion. A WebDAV server knows, at the time of responding, the exact status of things, and it will be informing HTTP/1.1 clients (which in practical terms means the users of these clients) that their request succeeded in cases where it failed.
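The mechanics of the misreading are simple. A pre-WebDAV HTTP/1.1 client has no definition for 207, so it can only fall back on the status class. This sketch (the function name is mine, not from any particular client) shows how such a client ends up treating a 207 that reports nothing but failures as success:

```python
# How a generic HTTP/1.1 client classifies a response to DELETE.
# HTTP/1.1 only tells it that 2xx means "successfully received,
# understood, and accepted"; 207 is not defined by HTTP/1.1 at all,
# so it gets lumped in with the other 2xx codes.
def http11_client_sees_success(status: int) -> bool:
    return 200 <= status < 300

print(http11_client_sees_success(204))  # True: a real success
print(http11_client_sees_success(207))  # True: yet the DELETE may have wholly failed
print(http11_client_sees_success(403))  # False: reported as an error, as expected
```

So a client like the ones described above stays silent on the 207, updates its site map as if the collection were gone, and the user never learns that the request failed.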
> So, since an HTTP/1.1 client cannot depend on the response code to a DELETE,
> and since the existing definition of DELETE is ambiguous for collections,
> and since existing implementation practice suggests that delete on a
> collection is an infrequent (perhaps even never executed) operation for
> HTTP/1.1 clients:

You are so very, very wrong. AOLpress might not be a purely HTTP/1.1 application, but it uses the HTTP/1.1 semantics of DELETE. Now if AOL wanted to add WebDAV support to their servers (say, to their PrimeHost hosting service), they would clearly be facing a conflict. Exactly the same thing holds for Netscape's Enterprise server. It provides WebDAV-like functionality by implementing a whole zoo of its own HTTP methods, and is designed to provide authoring capabilities using its own Web Publisher client. It so happens that even though it has all those other methods, both file and directory removal on this server depend on the HTTP/1.1 DELETE method. Of course, the fact that there are only about 300,000 of these servers on the net does not indicate much. Or does it?

To sum this up in a somewhat less sarcastic tone: While I might not know the extent to which PUT-based creation of directories is used, that is largely because the main commercial HTTP authoring servers have their own MKDIR command that is equivalent to the WebDAV MKCOL. But it so happens that they do use DELETE to delete directory trees, and your convenient *decision* that HTTP/1.1 DELETE isn't being used is about as wrong as it can get.

> a) this problem does not warrant a re-issue of RFC 2518
> b) it is not clear that this problem warrants any changes to the
> specification at all, since at worst it would cause user confusion for an
> error case on an infrequently (if ever) used option of an infrequently
> executed method.

Most of your assertions in (b) above are simply plain false.

> While this might work, it's a bad design to use the Depth header to signal
> this information.
Bad design? Maybe. But the design you have without it is flawed at the core. As I tried to explain in a previous posting, it isn't even within the legitimate scope of WebDAV to redefine the semantics of HTTP/1.1 methods, and there is clearly no real technical need for it to do so either. You took for yourself a liberty that was never yours. This would have been a core design flaw even if it didn't generate the slightest real-world interoperability problem, because those HTTP/1.1 methods have a flexibility that WebDAV does not provide, and thus have the potential of being used in future applications in ways that you are not likely to be able to anticipate right now. It is really very hard to predict the full consequences of something like this. That's why you shouldn't be doing it to start with.

As a side remark, you should note that WebDAV would not have suffered one bit from having two notions of PUT/DELETE within its context (which could be distinguished either by a header or by different method names). The WebDAV methods can be defined as you see fit, while the existing HTTP/1.1 methods retain their flexibility to be used as people find appropriate for their applications.

Now, whether you like it or not, you have created a situation where your protocol is in conflict with HTTP/1.1. While some aspects of this conflict are of as yet unknown and hard-to-determine magnitude, there are other aspects of it that are clearly significant. The only way to comply with both HTTP/1.1 and WebDAV is to forbid DELETE altogether. There is no way of building a fully functional, fully compliant WebDAV server that is also fully compliant with HTTP/1.1, and since HTTP/1.1 compliance is required by WebDAV itself, it is even a self-conflict. The DELETE method, within its HTTP/1.1 semantics, is currently used on hundreds of thousands of HTTP authoring servers, and your protocol specifies a behavior that breaks the proper functionality of virtually every existing client that uses this method.
If that isn't an interoperability problem of your protocol, then I don't know what is. Now the implications of this basic protocol error for real-world application design are at least the following (I only address the DELETE issue here, and I further *assume* that there is really some clear notion of WebDAV semantics for DELETE, even though you say that it should behave *differently* from what RFC 2518 specifies):

1) Server implementors would need to choose between:
   a) Keeping their HTTP/1.1 semantics for DELETE.
   b) Implementing the WebDAV semantics for DELETE.
   c) Implementing both behaviors, along with some (ugly and inherently
      unreliable) mechanism for distinguishing different clients (such as
      using a database of known clients).

   For those implementors who want to add WebDAV support to a current product framework that already relies on HTTP/1.1 DELETE (like AOL/Netscape), (b) isn't even a viable option, and they would need to choose between (a) and (c).

2) Client implementors who would like to maximize the interoperability of their programs MUST NOT assume any specific DELETE semantics, and MUST be able to deal with both possible behaviors.

Is it the end of the world? No, because implementing those workarounds isn't that much of an issue, and they should work quite well in most cases. So it would only cost some extra effort/money and somewhat decrease overall reliability, interoperability, and performance. The biggest problem would be for naive implementors who would not realize this state of affairs from the beginning and would need to discover it the hard way. But, from the point of view of proper protocol design, creating this situation is a clear failure. So I'm really very sorry this is the case, and I tried to explain it as well as I could, and I'm sorry if I offended anyone in the process, and I'm tired myself of the whole thing, but I think that your protocol is broken and that you should fix it.
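The defensive client behavior described in point (2) above could be sketched roughly as follows. This is my own illustration, not any existing client's code; the multistatus parsing is a simplified assumption about RFC 2518 response bodies:

```python
# A client that cannot assume which DELETE semantics a server
# implements must treat 207 as "inspect the body", never as plain
# success. Sketch only; a production client would handle namespaces,
# malformed XML, and Location/href mapping more carefully.
import xml.etree.ElementTree as ET

def delete_succeeded(status: int, body: str = "") -> bool:
    """Return True only if the DELETE appears to have fully succeeded."""
    if status == 207:
        # Multi-Status: success only if no listed member reports an error.
        root = ET.fromstring(body)
        for st in root.iter("{DAV:}status"):
            code = int(st.text.split()[1])  # "HTTP/1.1 423 Locked" -> 423
            if code >= 300:
                return False
        return True
    return 200 <= status < 300

# Example 207 body in which one member of the collection failed to delete.
multistatus = (
    '<D:multistatus xmlns:D="DAV:">'
    '<D:response><D:href>/dir/a.html</D:href>'
    '<D:status>HTTP/1.1 423 Locked</D:status></D:response>'
    '</D:multistatus>'
)

print(delete_succeeded(204))               # plain HTTP/1.1 success
print(delete_succeeded(207, multistatus))  # a 2xx that actually means failure
```

Note that this is exactly the extra effort and extra failure modes referred to above: the client must carry WebDAV-specific parsing just to interpret the status of a plain DELETE correctly.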
It is your protocol and you can obviously do what you want, but trying to dismiss a significant design error by using faulty arguments will not solve the problem. Yoram
Received on Friday, 23 April 1999 23:09:22 UTC