Re: DELETE Semantics

From: <jamsden@us.ibm.com>
Date: Fri, 24 Sep 1999 14:49:47 -0400
To: w3c-dist-auth@w3.org
Message-ID: <852567F6.00679E57.00@d54mta03.raleigh.ibm.com>

I agree with everything Geoff says below. The problems we're having result from
mixing the semantics of:
  1. a resource and its contents/properties
  2. URLs we use to access a resource. Note that a resource may have some
server-dependent objectId that distinguishes it from all other resources managed
by that server, but this is not a URL and is not exposed to HTTP clients. This
is the ID the server maps URL bindings to.
  3. membership in a collection

Here's a summary of what I think we agreed to:

1. all URL references to a resource are bindings, including the PUT or MKCOL
used to create the resource in the first place.

2. DELETE is effectively an UNBIND. A server is free to garbage collect and
actually destroy the resource if there are no remaining bindings, but this is
not defined by the protocol.

3. There is no DESTROY method that deletes the resource and all its bindings.

4. LOCK locks the resource, not the bindings. If the namespace needs to be
controlled, then the user should lock the applicable parent collections.

5. MOVE is really REBIND (or BIND followed by DELETE). So the resource in the
repository is guaranteed to be the same resource and locks can be retained.

6. There is no MOVE operation that is effectively COPY followed by DELETE or
GET/PROPFIND followed by PUT/PROPPATCH and DELETE. If a MOVE operation fails
because the binding to the destination cannot be created, then the user is free
to do a COPY followed by a DELETE if that meets their needs. Client applications
are free to hide these operations inside a move menu item if they desire.
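
As a rough illustration of points 1 and 2 above, here is a minimal in-memory
sketch in Python. It is not taken from any spec or server; the class name
`BindingStore`, its methods, and the integer object ids are all hypothetical.
It only shows URLs acting as bindings to a server-internal id, and DELETE
removing a single binding.

```python
import itertools


class BindingStore:
    """Toy model: URLs are bindings to server-internal object ids."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.resources = {}   # object id -> content (never exposed to clients)
        self.bindings = {}    # URL -> object id

    def put(self, url, content):
        """PUT to an unbound URL creates a resource plus its first binding."""
        if url in self.bindings:
            self.resources[self.bindings[url]] = content
        else:
            oid = next(self._ids)
            self.resources[oid] = content
            self.bindings[url] = oid

    def bind(self, new_url, existing_url):
        """Add another binding to the resource behind existing_url."""
        self.bindings[new_url] = self.bindings[existing_url]

    def delete(self, url):
        """DELETE removes exactly one binding (an UNBIND)."""
        oid = self.bindings.pop(url)
        # Garbage collection is this toy server's choice, not protocol behaviour.
        if oid not in self.bindings.values():
            del self.resources[oid]

    def get(self, url):
        return self.resources[self.bindings[url]]


store = BindingStore()
store.put("/a/b.html", "hello")
store.bind("/x/y.html", "/a/b.html")
store.delete("/a/b.html")          # removes one binding only
print(store.get("/x/y.html"))      # prints: hello
```

Whether the resource is destroyed once its last binding goes away is left to
the server in the summary above; the sketch simply picks one option.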

"Geoffrey M. Clemm" <gclemm@tantalum.atria.com> on 09/24/99 10:58:31 AM

To:   w3c-dist-auth@w3.org

Subject:  Re: DELETE Semantics

   From: jamsden@us.ibm.com

   <gmc/> I personally believe that the best answer is to fix the LOCK
   semantics so it *really* is just on the resource (and not on the
   name).  Then things are simpler and consistent, even in the case of
   multiple URL mappings to a resource.  Rather than "protecting" a URL
   to resource mapping, I'd propose that a locked resource be allowed to
   MOVE (this is just a change to the state of the parent collection, not
   to the state of the resource being moved), but that an attempt to
   access the MOVE'd resource with that lock just returns a 302 indicating
   where it has MOVE'd to.
<gmc/> Note: Amend this to use Edgar's proposal that the <URL, lock-token>
pair always access the locked resource.

   But some moves will result in a change in state of the resource being
   moved, and this is server dependent. The new parent collection may be
   in a collection that has different OPTIONS than the old parent, e.g.,
   in a different repository manager. It may also have different live
   properties. This isn't just a cross-server move issue.

<gmc/> A MOVE (as being proposed by the advanced collection protocol)
is not allowed to change the state of the resource - it just changes
the state of the source and destination collections that contain the
resource.  If a server cannot implement the MOVE without changing the
state of the resource, the MOVE MUST fail, and the client may resort
to a COPY/DELETE if it does not need MOVE (i.e. state-preserving) semantics.

   The semantics of MOVE can't be defined as rebind (rename)
   and copy/delete at the same time.

<gmc/> I completely agree.  It should be defined as a rebind (rename).
I hope we're not bringing back the "logically equivalent to a
COPY/fixup/DELETE" dead horse?  It has been thoroughly flogged (:-).

   MOVE can however be implemented that way.

<gmc/> Not if it is going to support advanced collections (and not if
it is going to support most people's intuition of how a MOVE differs
from a COPY).  A MOVE produces no new resources, but changes one of
the bindings to an existing resource.  A COPY/DELETE produces a new
resource with just one binding, and leaves an existing resource with
one less binding.
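
The difference in resource and binding counts can be sketched in Python.
This is an illustrative model only: the dictionaries, the integer resource
ids, and the function names are assumptions made for the example, not
anything defined by the protocol.

```python
def move(bindings, src, dst):
    """Rebind: dst now names the same resource id that src named."""
    bindings[dst] = bindings.pop(src)


def copy_then_delete(bindings, resources, src, dst):
    """COPY creates a new resource; DELETE then removes the src binding."""
    new_id = max(resources) + 1
    resources[new_id] = resources[bindings[src]]
    bindings[dst] = new_id
    del bindings[src]


def binding_count(bindings, rid):
    """Number of URLs currently bound to resource id rid."""
    return sum(1 for r in bindings.values() if r == rid)


# Case 1: MOVE. Still one resource, still two bindings to it.
resources = {1: "doc"}
bindings = {"/a/b.html": 1, "/x/y.html": 1}
move(bindings, "/a/b.html", "/a/oldb.html")
moved = (len(resources), binding_count(bindings, 1))

# Case 2: COPY/DELETE. Two resources; the old one has lost a binding.
resources2 = {1: "doc"}
bindings2 = {"/a/b.html": 1, "/x/y.html": 1}
copy_then_delete(bindings2, resources2, "/a/b.html", "/a/oldb.html")
copied = (len(resources2),
          binding_count(bindings2, 1),
          binding_count(bindings2, 2))
```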

   As a result, one doesn't know if the resource is new or not after
   a MOVE, and therefore locks can't be guaranteed to be retained.

<gmc/> I believe that if we leave the semantics of MOVE as vague
as they are in 2518 (i.e. some arbitrary "fixup" step is involved),
we will continue to see the confusion about what a MOVE does/should
mean that we see today. The proposed semantics are simple:
if you can't guarantee that locks are retained, the MOVE MUST fail.
If the client wants the locks to be removed, it can (and should)
explicitly remove them.

<gmc/> A key use case here is with multiple bindings.  You issue a
LOCK on /x/y.html.  It turns out that is bound to the same resource
as /a/b.html.  Now you move /a/b.html to /a/oldb.html.  So you now
lose your lock on /x/y.html?  I'm not a happy client if that's the case.
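
A small Python sketch of this use case, assuming the lock is stored on the
resource and MOVE is a pure rebind. The data layout, the function names, and
the placeholder token "token-1" are hypothetical.

```python
# One resource, reachable through two bindings.
resources = {1: {"body": "doc", "lock_token": None}}
bindings = {"/x/y.html": 1, "/a/b.html": 1}


def lock(url, token):
    """The lock is recorded on the resource, not on the URL."""
    resources[bindings[url]]["lock_token"] = token


def is_locked(url):
    return resources[bindings[url]]["lock_token"] is not None


def move(src, dst):
    """Rebind only; the resource and its lock are untouched."""
    bindings[dst] = bindings.pop(src)


lock("/x/y.html", "token-1")
move("/a/b.html", "/a/oldb.html")

still_locked_via_x = is_locked("/x/y.html")
still_locked_via_new_name = is_locked("/a/oldb.html")
```

Under these assumptions the lock taken through /x/y.html survives the rename
of the other binding, which is the behaviour the client in the example wants.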

   Therefore, the semantics must pick the conservative case and not move
   locks. Take for example moves in typical file systems. Sometimes the
   file actually moves (gets a new INODE in UNIX) and sometimes it doesn't.
   Users don't see this unless they are manipulating INODES directly which
   is playing with the implementation, not the protocol.

<gmc/> As a general comment, there is no reason for us to exactly
mimic Unix file behavior (although I agree that there is lots of
wisdom embedded in the Unix file system that we should learn from).
As a particular comment, as you point out, the INODE is part of the
file system implementation that is rarely exposed/used by a client.
The fact that the inode changes is largely not a visible state change
from a client's perspective, and that is the perspective that matters.

   Moving locks has lots of other problems too as there is a possible conflict
   with the potentially inherited lock from the new destination parent.

<gmc/> This is not a problem (although I am against inherited locks
for other reasons).  If there is a conflict, the server simply MUST
fail the operation that would cause the conflict.  Better that than
removing locks as a side-effect of the MOVE operation.

   Lock tokens are server dependent, and may be repository dependent too.
   Seems like losing the lock is the lesser of the evils.

<gmc/> What evil are we avoiding?  If the MOVE fails (because of
inability to keep the lock on the resource), the client is notified,
and is then free to explicitly remove the locks and then request the
MOVE again.
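
The client-side flow just described might look roughly like the following
Python sketch. Everything here is a toy stand-in: `ToyServer`, `MoveFailed`,
the rule that only destinations under "/a/" can keep a lock, and the token
"token-1" are assumptions for illustration, not real WebDAV behaviour or
status codes.

```python
class MoveFailed(Exception):
    """Raised by the toy server when a MOVE cannot keep the lock."""


class ToyServer:
    def __init__(self):
        self.bindings = {"/a/b.html": 1}     # URL -> resource id
        self.locks = {1: "token-1"}          # resource id -> lock token
        # Hypothetical rule: locks survive only for destinations under /a/.
        self.can_keep_locks_at = {"/a/"}

    def move(self, src, dst):
        rid = self.bindings[src]
        keeps_lock = any(dst.startswith(p) for p in self.can_keep_locks_at)
        if rid in self.locks and not keeps_lock:
            # Fail rather than drop the lock as a side effect.
            raise MoveFailed(dst)
        self.bindings[dst] = self.bindings.pop(src)

    def unlock(self, url, token):
        rid = self.bindings[url]
        if self.locks.get(rid) == token:
            del self.locks[rid]


def move_with_explicit_unlock(server, src, dst, token):
    """Try the MOVE; on failure, remove the lock explicitly and retry."""
    try:
        server.move(src, dst)
        return "moved, lock retained"
    except MoveFailed:
        server.unlock(src, token)
        server.move(src, dst)
        return "moved after explicit unlock"


server = ToyServer()
result = move_with_explicit_unlock(
    server, "/a/b.html", "/other/b.html", "token-1")
```

The point of the sketch is that the lock is only ever removed by an explicit
client request, never silently by the MOVE itself.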

   <gmc/> So there are really multiple threads here:
   - Should locking be on a resource or also/instead on a URL-to-resource
     mapping?  (we know what it is now, but what *should* it be)
   * I vote "on a resource".

   I agree. The resource is the thing being manipulated, not the URL. The
   URL is only a way to get to the resource. There may be other ways, or
   no way at all.

<gmc/> Whew ... at least we agree on that! (:-).

   - Does a DELETE delete all bindings to a resource, or just the one
   specified in the request-URL?
   * I vote "just the one named by the request-URL".

   I have to disagree with this one as it is not consistent with LOCK.

<gmc/> I disagree (see below).  But even if this were true, I'd
suggest that we fix the LOCK semantics rather than making DELETE
unusable against versioning servers.

   If LOCK, GET, PUT, etc. apply to the resource, then so should DELETE.

<gmc/> Why exactly?  I believe that what matters is getting the
semantics right so that clients and servers can interoperate.  I
believe it is important to have a definition of DELETE that works in
the presence of versioning, and the "delete-all-bindings" semantics
does not.

   If bindings are created with a BIND method, then they should be removed with
   an UNBIND method. Otherwise, URL to resource mappings (i.e., bindings)
   must be exposed as separate resources (direct and redirect references)
   so they can be managed discretely. DELETE should stick to manipulating
   resources as defined in HTTP/1.1.

<gmc/> Then a versioning server will have to refuse every DELETE operation
issued by a non-versioning aware client.  Roy Fielding has verified
that the single binding definition of DELETE matches his intentions
when the HTTP-1.1 spec was defined.  So what is the benefit we are
reaping that matches the cost of non-interoperability between versioning
servers and non-versioning aware clients?

   - Should a DELETE delete a LOCK?
   * I vote, "no".  A DELETE modifies the state of the collection containing
     the binding, not that of the resource.  In particular, all other
     mappings to that resource will continue to exist and display the
     LOCK'ed semantics.  If you want to prevent a DELETE, you put the
     LOCK on the collection whose state is being modified.

   I wish I could agree with this one, but I can't. DELETE deletes a resource
   and as a side effect it modifies the state of its parent collection(s).
   It is unfortunate that PUT and DELETE are resource behaviors instead
   of having addMember, removeMember be operations on the parent collection.
   It is hard to recover from the resulting mixed semantics, but WebDAV does
   a reasonable job already. I think we should leave this alone.

<gmc/> This is too broken to leave alone, and too easy to fix to not do so.
Define DELETE and MOVE as binding operations, and you get full compatibility
with existing HTTP behavior, simple semantics, and interoperability between
versioning/binding aware and versioning/binding unaware clients and servers.

