Re: WebDAV Bindings - Issue Yaron.AtomicDelete from ccjason@us.ibm.com on 2000-01-18 (w3c-dist-auth@w3.org from January to March 2000)

From: <ccjason@us.ibm.com>
Date: Tue, 18 Jan 2000 15:32:25 -0500
To: Yaron Goland <yarong@Exchange.Microsoft.com>
cc: w3c-dist-auth@w3.org
Message-ID: <8525686A.0070E47D.00@D51MTA03.pok.ibm.com>
>>
Because BIND redefines how collections work the effect of this paragraph is
to directly amend the behavior of RFC 2518 in a manner that is not
compliant with how RFC 2518 is currently written.
>>
That is correct.  We should consider amending the follow-on document to
2518 to use this revised definition of DELETE.

>>
This brings up the first issue. Redefining DELETE to be atomic without
using the mandatory mechanism to prevent confusion on the part of clients
and servers will destroy interoperability with RFC 2518 systems. A BIND
enhanced client issuing a DELETE against a RFC 2518 server will expect
atomic behavior and probably won't get it.
>>
That depends on the server.  2518 servers were free to implement it in the
fashion defined in the binding spec.  Some servers will interoperate with
no modification.  As for clients, it is unlikely that a client would
actually code itself to depend on the 2518 behavior.


>>
There are cases where it is worth creating incompatibilities, for example,
if you find out that there is a serious flaw in the protocol. However no
one has demonstrated that the non-atomic behavior of DELETE is a flaw.
>>
It is a serious flaw in the protocol.  It causes nondeterministic behavior
and risks breaking bindings other than the one requested... and the
requester is unlikely to detect this.  But other folks using other bindings
into the tree WILL notice.  It will impact others using the same tree.  And
correcting it after the fact is difficult.  In addition, a server
implementing the 2518 behavior in the way you suggest will almost certainly
be slow.  There is no need for this behavior and it needlessly complicates
the protocol.

>>
It is a pain, that is true. It certainly makes it difficult to write
clients. But as evidenced by the number of WebDAV clients it is clear that
the requirement is one that can be met.
>>
And what do those existing clients do in this situation?  If the operation
succeeds, they are unlikely to notice, let alone consider, the damage that
was done.  Most of them interact on a single-operation basis and don't
care.  They just display the end result.  But if the DELETE is part of a
sequence, there's not much they can do about a partial failure under the
2518 behavior.  All they can do is abort and perhaps beg the user to use
some out-of-band mechanism to repair the damage that has been done.

>>
Heck, as evidenced by the number of file system clients written in the last
30 years, it is clear that the requirement to handle non-atomic behavior
can be met.
>>
In systems that support multiple bindings, they all provide a mechanism to
remove a single binding.  Even in other systems the action is often atomic.
History doesn't suggest that the iterative delete is the proper semantic
for the client.


>>
Changing DELETE to atomic is especially egregious given that WebDAV does
not prevent a server from implementing DELETE atomically. Rather it puts
the client in the unfortunate situation of having to deal with systems that
can't necessarily support atomic DELETE.
>>
????

>>
 There are, I freely admit, circumstances in which a client MUST be able to
ensure that a DELETE is issued atomically. Clients in those cases will have
to choose to not interoperate with many WebDAV servers in order to gain
atomic delete.
>>
These clients can interoperate with an old iterative server just fine if
they also include 2518 support.  That's their choice.

We have asked around and it seems that server authors appreciate and are
willing to comply with the semantics of an "atomic" removal of a single
binding.


>>
 The proper way for them to express their requirement that a delete be
atomic is either to issue DELETE with a MAN extension or to create a new
method.
>>
The only reason this would be needed is if there actually were a client
side requirement to have DELETE act in an iterative fashion.  There does
not seem to be any pressing client need for this and the possibility of
damage is great... even if it isn't the requester that suffers from that
damage.
The only other reason would be for a client to declare that it must *NOT*
act iteratively.  But of course old 2518 based servers wouldn't be looking
for this flag.  For this reason, the possibility of an UNBIND method
wouldn't be a bad thing, but the presence of this UNBIND method doesn't
mean that the DELETE method should be allowed to iteratively break bindings
other than the requested binding.


>>
But the behavior of the DELETE method is well defined by RFC 2518 and given
that there is no evidence that this behavior is broken in any way. In fact,
given that there is overwhelming evidence that non-atomic DELETE works
just fine and leads to interoperable implementations.
>>
It's very broken.  It deletes bindings other than the requested binding.
Even upon success.  This can affect folks accessing a tree via another
binding.  For this reason it is very broken even when the request succeeds.
And because DELETE is a legacy method, it may be used by folks that aren't
aware of the damage that can be done.
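
To make the failure mode concrete, here is a minimal Python sketch.  It is
an illustration only: the Collection class, the `locked` set and the
function names are hypothetical and come from neither RFC 2518 nor the
binding spec.  It assumes one collection reachable through two bindings,
/a/proj and /b/proj, and compares an iterative delete with the removal of
a single binding.

```python
class Collection:
    """A collection resource; its bindings map segment names to resources."""
    def __init__(self, **bindings):
        self.bindings = dict(bindings)


def iterative_delete(parent, name, locked=frozenset()):
    """2518-style sketch: remove members depth first, then the binding itself.

    Stops at the first member named in `locked` (a hypothetical stand-in for
    any per-member failure) and returns False, leaving whatever was already
    removed gone.
    """
    target = parent.bindings[name]
    if isinstance(target, Collection):
        for child in list(target.bindings):
            if not iterative_delete(target, child, locked):
                return False
    if name in locked:
        return False
    del parent.bindings[name]
    return True


def remove_binding(parent, name):
    """Binding-spec-style sketch: drop one binding, touch nothing below it."""
    del parent.bindings[name]


def build():
    """Two parents, /a and /b, each with a binding 'proj' to one collection."""
    shared = Collection(x="X", y="Y")
    return Collection(proj=shared), Collection(proj=shared), shared
```

With `a, b, shared = build()`, a successful `iterative_delete(a, "proj")`
leaves `b.bindings["proj"]` pointing at an emptied collection, and a run
with `locked={"y"}` fails after "x" is already gone.  `remove_binding(a,
"proj")` leaves the tree seen through /b/proj untouched.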

>>
Let me clarify that DELETE was defined to not be atomic with malice
aforethought. The non-atomic delete language was the result of nearly three
years of negotiations and represented a deep and broad consensus built up
amongst a huge community. Might I respectfully suggest to the BIND authors
that they should not be so ready to overthrow years of careful consensus
building.
>>
We've asked around.  Folks appreciate this behavior and are willing to
support it.

>>
Do not imagine that the lack of screaming on this issue reflects consensus.
Rather it reflects the fact that most of the WebDAV community is too busy
implementing RFC 2518 to pay much attention to BIND. The BIND
functionality, while I believe it will be important to WebDAV, is a bit
ahead of the majority of implementers so they just aren't reading or
reviewing it, yet.
>>
We have asked around and we didn't accept silence as a response.

>>
The key reason DELETE was not allowed to be atomic (which certainly would
have been a nice thing to be able to do) has to do with the way file
systems work. Most file systems do not support depth operations atomically.
So, for example, when you delete a directory what actually happens is that
the program does a depth first walk of the directory tree and deletes all
the individual files, walking backwards up the tree until finally deleting
the parent directory.
>>
You've pointed out that this is a problem on your file system.  We've found
no other.   We've tested it and our testing indicates that the MOVE option
you mention below works.  (We did not test with ACL's though... or multiple
bindings.)    We've also contacted folks at your organization and they
didn't see a problem.

>>
This brings us to our second issue, the argument has been made that a file
system could simulate an atomic DELETE by issuing a MOVE command against a
directory and ensuring that nothing exists at the destination. This would
make it appear that all the files have been deleted. The file system could
then later delete the files at its convenience. In addition this sort of
implementation would allow the type of functionality you see today in the
Windows GUI where deleted files are first placed into the waste basket
before actually being deleted.

This theory has several problems in practice.

Problem #1 - There exists a write ACL (which covers MOVE) and a delete ACL
(which covers DELETE).
It is possible, therefore, to have the right to delete a file but not the
right to move it. As such if a user is running a WebDAV server under their
own authentication then the server will have to fail the DELETE command,
even if they have the right access settings for delete, because the server
can't move the file first in order to simulate the atomic DELETE behavior.
This prevents file system based servers from implementing the "MOVE is an
atomic DELETE" hack.
>>
This ACL situation did not come up before, so I just tested it.  It is
indeed possible to prevent deletion of an object via ACL's.  By deletion, I
mean the file system's concept of DELETE.  On the other hand, my testing
has shown that ACL's in a tree do not prevent the movement of the tree.
The only ACL that can prevent this is an ACL on the collection resource
that is to be removed.  Testing that is quick.  It doesn't require
iterating through all the children to determine if there is at least one
down there with an uncooperative ACL.  The children's ACL's don't come into
play.
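
The difference in the permission check can be sketched as follows.  This is
a toy model in Python, not a description of any real file system's ACL
API: the Node class and its allow_delete / allow_move flags are assumptions
made for illustration.  It only shows that an iterative delete has to
consult every descendant, while a move of the tree consults the root
alone.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    allow_delete: bool = True   # hypothetical per-resource delete permission
    allow_move: bool = True     # hypothetical per-resource move permission
    children: list = field(default_factory=list)


def iterative_delete_allowed(node):
    """Walk the tree; return (allowed, number of ACL checks performed)."""
    checks = 1
    if not node.allow_delete:
        return False, checks
    for child in node.children:
        ok, n = iterative_delete_allowed(child)
        checks += n
        if not ok:
            return False, checks
    return True, checks


def move_allowed(node):
    """Only the ACL on the collection being moved is consulted."""
    return node.allow_move, 1


tree = Node("proj", children=[
    Node("a.txt"),
    Node("b.txt"),
    Node("c.txt", allow_delete=False),
])
print(iterative_delete_allowed(tree))  # (False, 4)
print(move_allowed(tree))              # (True, 1)
```

In this model the cost of the iterative check grows with the size of the
tree, and one uncooperative member blocks it; the move check is constant.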

The only way the ACL's might possibly come into play is if there was an ACL
combination that allowed movement, but didn't allow garbage collecting
subsequently.  I believe there is one, but I won't go to the trouble to check.
How and whether a server garbage collects isn't something we cover in the
spec though.  We've tried, but it was just a rat hole that defied
definition.  It is an interesting problem and it will be interesting to see
how creative servers will be addressing this issue.

Anyway, ACL's don't appear to pose a problem implementing the protocol.



>>
Problem #2 - The required mechanism could actually form a denial of service
attack. Imagine a server gives a space allocation to all of their users. A
user then shows up and knows that the server is running low on space.
Therefore the user decides to fill up their allocation with junk, delete
it, fill it up again, delete it, etc. Each time the server will have to
move all the user's junk to some temp space in order to simulate the atomic
delete. If the server can't delete the temp files fast enough then the
server's disk space will be overrun. One could attempt to counter this
attack by specifying that one can only DELETE something if one has enough
space left in one's space allocation for the MOVE. Of course this means
that if one has filled up one's allocation and wants to delete some files
to get more space the delete will fail because there is no room to do the
move. Of course, if there was room to do the move then one wouldn't have
wanted to delete the files since one would still have space. Catch-22.
>>
This is a bookkeeping issue.  Remember, the DELETE is not an allocation
issue.  It's removing the resource out of one or more places in the
namespace and that's the only verifiable semantic that we can talk about.
We've tried.  Whether the resource continues to exist after it loses its
last binding is up to the server.  If the server has a need, it is free to
delay returning from the DELETE until it has deallocated any resources it
feels it needs to deallocate.  Or it can delay a subsequent request from
that user.   It has these options.

I'll add, that for the time being, equating binding removal with the
freeing of resources should not be encouraged.   The person removing the
resource doesn't know if they have the last binding.   They don't know if
the server is allowed to deallocate a given unreachable resource.   They
may not be aware if the server keeps quotas.  Or if the resources that
became unreachable are part of its quota.  Or if a request by them could
ever result in an automatic deallocation of a resource owned by someone
else.  A client doesn't even know that a server automatically garbage
collects.   Even if a client knows how one server works,  making any
assumptions here is risking interoperability.
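
The separation between the namespace operation and deallocation can be
sketched like this.  The Store class and its methods are hypothetical, and
the collection policy shown (free everything unreachable whenever
collect() is called) is just one assumption among many a server could
make.

```python
class Store:
    """Toy server store: bindings name resources; resources live until collected."""

    def __init__(self):
        self.bindings = {}    # path -> resource id
        self.resources = {}   # resource id -> size in bytes

    def put(self, path, rid, size):
        self.resources[rid] = size
        self.bindings[path] = rid

    def bind(self, new_path, existing_path):
        """Add a second name for an existing resource."""
        self.bindings[new_path] = self.bindings[existing_path]

    def delete(self, path):
        """Namespace operation only: remove one binding, deallocate nothing."""
        del self.bindings[path]

    def unreachable(self):
        """Resource ids that no binding refers to any more."""
        return set(self.resources) - set(self.bindings.values())

    def collect(self):
        """Server policy, run whenever the server chooses; returns bytes freed."""
        freed = 0
        for rid in self.unreachable():
            freed += self.resources.pop(rid)
        return freed


s = Store()
s.put("/a/report", "r1", 100)
s.bind("/b/report", "/a/report")
s.delete("/a/report")
print(s.collect())   # 0, the resource is still reachable via /b/report
s.delete("/b/report")
print(s.collect())   # 100, freed only once the server decides to collect
```

The client that issued the first delete() cannot tell from the response
whether any space was freed, which is why it should not assume so.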

>>
Problem #3 - File system servers are now getting smart enough to add
eventing support to their file systems. The idea is that you can register
to be told when a file is deleted or moved. However the delete and move
events are different. Forcing WebDAV file system servers to simulate a
DELETE as a move would cause the wrong event to get triggered. This means
that one can never be sure if that MOVE event was received because the user
intended to MOVE a file or because it was part of a DELETE operation. This
is especially important when one has a store that is accessible through
multiple means, say WebDAV and FTP. The events are registered directly on
the file system, not in the WebDAV or FTP implementation. So there is no
way for the WebDAV implementation to say "I know I'm doing a MOVE but you
really should just trigger the delete event."
>>
At the WebDAV level, it is a namespace operation.  Not an allocation issue.
If it has to send a MOVE event, then that's what it needs to send.  If the
server then decides to deallocate the resource then it should send the
appropriate message at that time.   It can even do so right away if it
doesn't affect the semantics of the request.  What I mean by this last
statement is that the server is free to do the delete if it knows all the
requests will succeed and the object is going to be garbage collected in an
instant anyway.  That's just an optimization though.

I can imagine a situation where any resource that becomes unreachable
will appear in the trash can of the owner of that resource.  If they have
an advisory lock on it, they will be informed of the movement to the trash
can.   This will even work if someone else made it unreachable.  And if the
owner throws out this portion of his trash, he might increase his available
space.  This is similar to how some popular desktops work now.   And I
don't want to suggest that this is the only way to do this.


>>
There is then the third and final issue, WebDAV begins with a "D" for a
reason. Its goal is to be distributed. Requiring atomic DELETEs would
essentially hinder all but the most expensive of systems from being able to
implement distributed namespaces across multiple physical servers. The
reason being that the atomic requirement means that these systems will have
to establish transactioning systems between themselves in order to issue
DELETEs if they share namespace.
>>
It's only one binding.  The goal isn't to be atomic, that's just a
fortunate side effect.  A system would fulfill the request in the same
fashion that it would normally do that in the absence of WebDAV.  If that
system is also distributed, then it already has to deal with the
distributed part of that issue.  If the server wants to make it distributed
on top of a non-distributed repository, then it has to handle all this
itself.  Lacking evidence to the contrary, I'd expect deleting a single
binding would be easier to implement than deleting many of them.
Especially in a distributed environment.

>>
As such I move that the atomic DELETE language be struck from the BIND spec
on the grounds that it destroys interoperability, requires behavior that
would preclude file system based systems from supporting WebDAV and
significantly increases the cost of implementing WebDAV in a distributed
manner.
>>
I believe I've addressed all of these above.

It is very clear that DELETE should not be iterative or accept partial
results in a server that supports multiple bindings.  Legacy clients may
invoke DELETE without knowledge of the potential dangers.  Therefore, a
simple DELETE should delete a single binding.  --- There is no compelling
reason (so far) not to support single binding delete and it simplifies the
protocol and makes behavior consistent across all servers.  Single binding
delete should be the long term target behavior for WebDAV.

Jason.
Received on Tuesday, 18 January 2000 15:33:32 UTC