Re: Why members of a versioned collection revision must be versioned resources

jamsden@us.ibm.com
Sun, 17 Oct 1999 16:24:19 -0400


From: jamsden@us.ibm.com
To: ietf-dav-versioning@w3.org
Message-ID: <8525680D.0070A6BD.00@d54mta03.raleigh.ibm.com>
Date: Sun, 17 Oct 1999 16:24:19 -0400
Subject: Re: Why members of a versioned collection revision must be versioned resources




---------------------- Forwarded by Jim Amsden/Raleigh/IBM on 10/17/99 04:24 PM
---------------------------


Jim Amsden
10/17/99 04:24 PM

To:   "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
cc:

Subject:  Re: Why members of a versioned collection revision must be versioned
      resources

OK, I see a fundamental difference that we should clear up before this
discussion can make sense. You (Geoff) seem to be treating servers as binary:
versioning or non-versioning. One uses CHECKOUT/CHECKIN on one, LOCK/UNLOCK on
the other. Checkout/checkin isn't available in a non-versioning server, and
lock/unlock aren't necessary (at least for overwrite conflict avoidance) in a
versioning server. I have been assuming that a versioning server could also
support un-versioned resources. It can do both. So lock/unlock still apply to
unversioned resources just like they do in a non-versioning server. The user
decides if a resource is versioned or not, and therefore what operations apply.
For versioned resources, I completely agree that lock/unlock are unnecessary
except for perhaps shared workspaces to protect working resources. But that's a
policy decision that users make, not clients or servers.
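As a rough illustration of the point above, here is a minimal sketch of a client picking the update sequence from whether the resource is versioned. The function name update_sequence is hypothetical, and the sequences are only the ones named in this thread, not anything taken from the versioning draft.

```python
from typing import List


def update_sequence(is_versioned: bool) -> List[str]:
    """Return the methods a client would issue for one update.

    Hypothetical helper: versioned resources use CHECKOUT/CHECKIN,
    unversioned resources fall back to LOCK/UNLOCK, as described above.
    """
    if is_versioned:
        return ["CHECKOUT", "PUT", "CHECKIN"]
    return ["LOCK", "PUT", "UNLOCK"]


if __name__ == "__main__":
    print(update_sequence(True))
    print(update_sequence(False))
```

Whether a resource is versioned is taken here as an input; in the model above that is the user's choice, not something the client or server decides.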




"Geoffrey M. Clemm" <gclemm@tantalum.atria.com> on 10/16/99 11:26:29 PM

To:   Jim Amsden/Raleigh/IBM@IBMUS
cc:

Subject:  Re: Why members of a versioned collection revision must be
      versioned resources



   From: jamsden@us.ibm.com

   See <jra> below... I'm not proposing we need this, I just want to explore
   why we have the restriction. See the end of this note for a plausible use
   case.

Definitely worth exploring!

   On the client side, if an unversioned resource were a member of a
   versioned collection, the client would be faced with lost update
   problems.  The client can't checkout that resource because it is not
   versioned, and they can't put it under version control because
   that would require changing the binding in that collection revision
   (which you can't because a revision is immutable).  So the client
   would have to keep track of which members are versioned resources
   and which are not, since the update protocols for the two are very
   different.  In addition, when you try to make a baseline of that
   collection, do you allow it, knowing that the unversioned resource
   can change at any time and thus violate client expectations about
   the immutability of baselines?

   <jra>
   But this doesn't seem any worse than having an unversioned resource in
   an unversioned collection. The user will have to rely on locks, ACLs,
   or out of band control to manage updates on the resource. The client
   doesn't need to checkout the resource; it isn't versioned, so it's
   already updatable (subject to access control and locking of
   course). The client doesn't want to put it under version control, so
   the parent collection doesn't need to be checked out. The client could
   put bindings in other collections anyway. The client can easily keep
   track of which ones are versioned or not by setting menu
   selectabilities.

This is a cost.  The user has to keep the two access patterns in
mind.  Sometimes checkout.  Sometimes lock.  Remember that the
things that need to be locked will change out from underneath you
unless you lock them, while the revisions won't.

   Baselining a collection that has checked-out members
   must fail too, or else check in the collection and create the baseline
   in a single overloaded operation. (Deep checkin by definition combines
   these two operations, but I think they might be better kept
   separate.) Baselining a collection that has unversioned resources
   should fail because the baseline couldn't contain a revision of those
   resources.

Failure cases and user choices that need to be made.
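
The failure cases just listed can be pictured with a small sketch. It assumes a collection is modelled as a plain mapping from member name to state; MemberState, baseline_blockers and can_baseline are hypothetical names, and the sketch covers only the "fail" option, not the combined checkin-and-baseline operation.

```python
from enum import Enum
from typing import Dict, List


class MemberState(Enum):
    # Hypothetical member states, taken from the cases discussed above.
    CHECKED_IN = "checked-in"
    CHECKED_OUT = "checked-out"
    UNVERSIONED = "unversioned"


def baseline_blockers(members: Dict[str, MemberState]) -> List[str]:
    """Return the sorted names of members that would make a baseline fail.

    A member blocks the baseline unless it is a checked-in revision:
    checked-out members can still change, and unversioned members have
    no revision for the baseline to contain.
    """
    return sorted(
        name
        for name, state in members.items()
        if state is not MemberState.CHECKED_IN
    )


def can_baseline(members: Dict[str, MemberState]) -> bool:
    """True only when every member is a checked-in revision."""
    return not baseline_blockers(members)


if __name__ == "__main__":
    collection = {
        "a.html": MemberState.CHECKED_IN,
        "b.html": MemberState.CHECKED_OUT,
        "notes.txt": MemberState.UNVERSIONED,
    }
    print(can_baseline(collection), baseline_blockers(collection))
```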

   Baselining isn't the only reason for having versioned
   collections. It's a namespace change management mechanism.

I agree.

   So, I don't see any big problems here that the client doesn't already
   have.
   </jra>

Before versioning there was just one paradigm ... locking.
Replacing locking with versioning is good for a variety of reasons.
Doing locking *and* versioning and having users keep track of which
is which is probably worse than just locking.

   On the server side, unversioned collection revision members are
   very problematical since they are shared between every workspace
   that selects that revision, and thus prevent the standard optimizations
   that allow you to distribute workspaces to different servers.
   Working resources present no problem, because they are local
   to a workspace.  Revisions present no problem, because they are
   immutable so every workspace can just have its own copy.  But if
   you allow a collection revision to contain an unversioned resource,
   any change to that unversioned resource should be immediately visible
   to all other workspaces sharing that revision, making distributed
   workspaces (i.e. workspaces that live on different servers) infeasible.

   <jra>
   But this would be true for an unversioned resource in an unversioned
   collection too. I don't see why having it in a versioned collection
   changes anything.

Versioning is supposed to make things *better*.
With versioning, you've got the ability to do reliable parallel work
with workspaces on largely disconnected servers.  The server can make
all sorts of optimizations based on the immutability of revisions and
the local nature of working resources.  You lose all that if you
add back in unversioned resources.

   The workspace can cache the member names, and then
   just answer if this version of this collection contains this URL
   segment. If yes, then the server can get the resource using the same
   mechanisms it uses to get any unversioned resource. I don't see the
   extra cost.

The cost is that it doesn't have to do that if the workspace only contains
versioned resources.
   </jra>

   We need to balance these costs against the benefits provided by
   having unversioned resources in collection revisions.  No benefits
   come to mind ... (:-).

   <jra> It's a state thing. You have a versioned collection. You check
   it out thinking you're going to make a change, but you're not
   sure. You just want to experiment for a while. You create a new
   resource as part of your changes, but you don't want to version it yet
   because you're not sure you want to keep it.

The client can hide all that from the user.  It just automatically
creates a new versioned resource when the user tries to do a PUT
to a null resource in a versioned collection.  The user just "works".
They don't have to think about versioning.

   You're just thinking and
   using incremental design by discovery. I think this is a pretty common
   use case. By state thing, I mean a resource has a life-cycle starting
   from nonexistent to unversioned to versioned, etc. Methods cause state
   transitions. I like having the flexibility to decide later if I want
   something versioned or not. I don't want to have to version it just
   because the collection I want to package it in is versioned.
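
The life-cycle described above can be sketched as a small state machine. This is only an illustration: the State enum, the TRANSITIONS table and the event label "put-under-version-control" are hypothetical names rather than methods from the draft, and transitions not mentioned above (delete, checkout, checkin) are left out.

```python
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    NONEXISTENT = "nonexistent"
    UNVERSIONED = "unversioned"
    VERSIONED = "versioned"


# (current state, event) -> next state. The event names are illustrative
# labels, not protocol methods defined by the versioning draft.
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.NONEXISTENT, "PUT"): State.UNVERSIONED,
    (State.UNVERSIONED, "PUT"): State.UNVERSIONED,
    (State.UNVERSIONED, "put-under-version-control"): State.VERSIONED,
}


def step(state: State, event: str) -> State:
    """Apply one event, raising ValueError if it is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not allowed in state {state.value}")


if __name__ == "__main__":
    s = State.NONEXISTENT
    for event in ("PUT", "PUT", "put-under-version-control"):
        s = step(s, event)
        print(event, "->", s.value)
```

The resource stays unversioned across any number of PUTs, and becomes versioned only when the user decides to ask for it.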

You don't have to think about versioning just because it is a versioned
resource.  Until you decide to check it in, you just treat it as an
unversioned resource.  You update it, you get it, you update it --
just like any old non-versioned resource.  Why worry about "whether
it's a versioned resource" if it just doesn't matter?

   This
   seems like unnecessary coupling, especially if it's for some potential
   server optimization I might not know anything about, or other servers
   don't use.

All CM servers in use today take advantage of the fact that only
immutable revisions are shared between workspaces.  That's how they
do efficient distributed workspaces (or, for that matter, easy-to-implement
distributed workspaces for the low-end vendors).

Cheers,
Geoff