Re: CHECKIN/CHECKOUT

jamsden@us.ibm.com
Mon, 25 Jan 1999 08:49:58 -0500


From: jamsden@us.ibm.com
To: "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
cc: ietf-dav-versioning@w3.org
Message-ID: <85256704.004C8534.00@d54mta03.raleigh.ibm.com>
Date: Mon, 25 Jan 1999 08:49:58 -0500
Subject: Re: CHECKIN/CHECKOUT



Geoff,

Some comments and questions below in <jra> tags.





"Geoffrey M. Clemm" <gclemm@tantalum.atria.com> on 01/19/99 12:19:36 AM

To:   ietf-dav-versioning@w3.org
cc:    (bcc: Jim Amsden/Raleigh/IBM)
Subject:  CHECKIN/CHECKOUT





Even though we missed the Jan 11 deadline, Jim Amsden and I have been
hard at work on the WebDAV versioning model, and hope to have some
documents to send out by the end of January (Jim even flew up to
Mass in last week's ice storm for 2 days to work on it with me,
which definitely deserves a "beyond-the-call-of-duty" award).

Before I get my part of those documents finished up, I wanted to send
out the long-promised "CHECKOUT/CHECKIN" document.  This document has
been significantly affected by the results of our modeling efforts, and
so will probably bear only remote resemblance to what folks remember of
the various prior conversations on the subject.

So with that in mind, here's what I've got:

------------------------------------------------------------

CHECKOUT/CHECKIN


Putting a Resource under Version Control

When a resource is put under version control, it becomes unwriteable.
To modify a resource, you must first check it out; you can then modify
it one or more times, and finally check it back in to indicate that you
are done modifying it.  If your CHECKOUT fails, it means someone else is
currently modifying the document, so you should only do a GET, with the
understanding that the results are only temporarily valid.


Checkout vs. Lock

Note the distinction between a (write) LOCK and CHECKOUT.  The LOCK
takes a resource that is writeable by everyone and temporarily makes
it unwriteable by everyone except the lock holder (until it is
UNLOCK'ed).  A CHECKOUT takes a resource that is unwriteable by
everyone, and temporarily makes it writeable (until it is CHECKIN'd).
It is reasonable to apply a LOCK to a checked-out resource, but is
not required.  In particular, many systems will decide the LOCK
is irrelevant, since a "friendly" client will delay writing until
it can perform a CHECKOUT, and an "unfriendly" client can just wait
until the UNLOCK and then trash the resource contents at will.
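
The LOCK/CHECKOUT distinction above can be sketched as a toy
permission model.  This is only an illustration, not protocol text:
the Resource class, its attribute names, and the rule that any user
may write to a checked-out resource are assumptions made for the
example.

```python
class Resource:
    """Toy model of write permission only; not a WebDAV implementation."""

    def __init__(self, versioned):
        self.versioned = versioned    # is the resource under version control?
        self.checked_out = False      # set by CHECKOUT, cleared by CHECKIN
        self.lock_holder = None       # set by LOCK, cleared by UNLOCK

    def can_write(self, user):
        # A versioned resource is unwriteable unless it is checked out.
        if self.versioned and not self.checked_out:
            return False
        # A write lock excludes everyone except the lock holder.
        if self.lock_holder is not None and self.lock_holder != user:
            return False
        return True


# LOCK: writeable by everyone -> writeable only by the lock holder.
plain = Resource(versioned=False)
plain.lock_holder = "alice"

# CHECKOUT: unwriteable by everyone -> temporarily writeable.
versioned = Resource(versioned=True)
writeable_before_checkout = versioned.can_write("alice")
versioned.checked_out = True
```

In this sketch a checked-out resource with no LOCK is writeable by
any user, which is the case where a LOCK on top of the CHECKOUT
would add something.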

<jra>
But then CHECKOUT/CHECKIN are really the same as UNLOCK/LOCK where the user
or server had decided to leave the resource locked by default instead of
unlocked. I'm not suggesting this semantic should be exploited, just that
there is a correspondence.
</jra>

Immutable-Revisions

An immutable-revision is a revision whose contents (and immutable
properties) cannot be changed.  More precisely, an attempt to retrieve
the contents or immutable properties of an immutable-revision will
always return the same contents or will fail.  Therefore a server can
delete the contents or properties of an immutable-revision (resulting
in a failure when an attempt is made to retrieve those contents or
properties), but can never delete the immutable-revision itself.
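
A minimal sketch of the "same contents or failure" rule, assuming a
hypothetical ImmutableRevision class and ContentsDeleted exception
(neither name comes from the proposal):

```python
class ContentsDeleted(Exception):
    """Raised when the server has discarded a revision's contents."""


class ImmutableRevision:
    def __init__(self, contents):
        self._contents = contents   # fixed at creation; no setter is offered
        self._purged = False

    def purge(self):
        # The server may reclaim storage, but the revision object remains.
        self._contents = None
        self._purged = True

    def get(self):
        # Either return the original contents or fail; never anything else.
        if self._purged:
            raise ContentsDeleted("contents are no longer stored")
        return self._contents
```

A client that calls get() either sees the bytes it saw last time or
gets an error, which is the guarantee described above.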


Mutable-Revisions

A mutable-revision is a revision whose contents and properties can be
changed, although an attempt to change the contents or the "immutable
properties" of a mutable-revision must be preceded by an explicit
"checkout/thaw" operation, and then should be followed by a
"checkin/freeze" operation to return it to a read-only state.  This
then requires two flavors of checkout: a checkout that unfreezes an
existing mutable-revision (which I'll call CHECKOUT) and a checkout
that creates a new (unfrozen) mutable-revision that is based on an
existing mutable-revision (which I'll call CHECKOUT-NEW).
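
The two flavors of checkout for mutable-revisions might look like
this.  It is a sketch under stated assumptions: the MutableRevision
class, the derived_from attribute, and the choice of Python
exceptions are all hypothetical.

```python
class MutableRevision:
    def __init__(self, contents, derived_from=None):
        self.contents = contents
        self.derived_from = derived_from  # revision this one was based on
        self.frozen = True                # read-only until checked out

    def checkout(self):
        # CHECKOUT: thaw this existing revision in place.
        if not self.frozen:
            raise RuntimeError("already checked out")
        self.frozen = False
        return self

    def checkout_new(self):
        # CHECKOUT-NEW: create a new, unfrozen revision based on this one.
        new = MutableRevision(self.contents, derived_from=self)
        new.frozen = False
        return new

    def write(self, contents):
        if self.frozen:
            raise PermissionError("revision must be checked out first")
        self.contents = contents

    def checkin(self):
        # CHECKIN: freeze the revision again.
        self.frozen = True
```

For example, base.checkout().write("draft 2") changes base itself,
while base.checkout_new().write("alt") leaves base untouched.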


Branching

When a versioned resource supports immutable-revisions, it is still
necessary to support "change".  In particular, there must be some
resource that you can name, that will periodically take on new values.
For a versioned resource with immutable-revisions, this analogue to a
mutable-revision is called a "branch".  Like a mutable-revision, a
branch can be checked-out, changed, and then checked-back in.  The tip
of the branch then reflects this change.  Also as with mutable
revisions, you sometimes want to check out a new branch that is based
on (the tip of) an existing branch, which requires another flavor of
checkout (i.e. CHECKOUT-NEW).
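
The branch analogue can be sketched the same way.  The Branch class
and its method names are assumptions for illustration; revision
contents are plain strings standing in for immutable-revisions.

```python
class Branch:
    def __init__(self, name, revisions=()):
        self.name = name
        self.revisions = list(revisions)  # immutable snapshots, oldest first
        self.checked_out = False
        self.working = None               # contents while checked out

    @property
    def tip(self):
        return self.revisions[-1] if self.revisions else None

    def checkout(self):
        # CHECKOUT: start modifying this branch, beginning from its tip.
        if self.checked_out:
            raise RuntimeError("branch is already checked out")
        self.checked_out = True
        self.working = self.tip

    def write(self, contents):
        if not self.checked_out:
            raise PermissionError("branch must be checked out first")
        self.working = contents

    def checkin(self):
        # CHECKIN: the working contents become the new tip.
        if not self.checked_out:
            raise RuntimeError("branch is not checked out")
        self.revisions.append(self.working)
        self.checked_out = False
        self.working = None

    def checkout_new(self, name):
        # CHECKOUT-NEW: a new branch based on this branch's current history,
        # already checked out so it can be modified in parallel.
        new = Branch(name, self.revisions)
        new.checkout()
        return new
```

Existing revisions are never rewritten here; a checkin only appends
a new tip.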

<jra>
So how does this relate to activities where a revision is checked out in
the context of an activity? Also, what's the difference between checking
out a branch, and checking out-new on a revision? It seems confusing that
we checkout a branch resource in order to change some other resource. I
don't think the analogy above holds. For mutable revisions, checkout just
makes it writable in place. Checkout-new creates a new revision in the
context of an activity. Say there is no parallel development. Then all
checkout-news are done in the same activity and one gets a single
line-of-descent. This is what you described as checkout on a branch for
mutable revisions above.

Creating new activities to support parallel development is likely to be
something DMS vendors will want to do, even though the server can't give
them a reliable conflict report. As described above, CHECKOUT-NEW on an
immutable revision would be the same as checking out an immutable revision
in a different activity creating the potential for a merge. This seems very
different than checkout and create a new revision for mutable revisions.
</jra>

From a protocol perspective, this provides a way to unify the worlds
of mutable and immutable-revisions.  In each world, there is CHECKOUT,
CHECKOUT-NEW, and CHECKIN, where CHECKOUT modifies an existing
modifiable entity, while CHECKOUT-NEW creates a new modifiable
entity that can be modified in parallel with the original entity.
CHECKIN is used in either case to return the resource to a read-only state.

<jra>
Interesting, but it doesn't feel like it unifies it all that much. Here's
the semantics you described:

Mutable revisions:
  checkout: make it writeable
  checkout-new: create a writeable copy and set the ancestor/descendent
        was-derived-from relationship
  checkin: make it temporarily read-only
  no support for activities, single line-of-descent
  no configurations

Immutable revisions:
  checkout: create a writeable copy and set the predecessor/successor
         is-derived-from relationship
  checkout-new: create a branch, a new line-of-descent to support
         parallel development, and do a checkout on that branch
  checkin: make it read-only
  can support multiple activities, multiple lines-of-descent, and merging
  can support configurations

When you list it this way, it doesn't seem so uniform. I think the
uniformity comes from a concept that is too abstract and won't be of
sufficient interest to users. Another, more significant problem is that
there is no way to mix mutable and immutable revision semantics on the same
resource, which I think will inhibit DMS-style clients from gradually
migrating to CM semantics.
</jra>

The alternative is to provide THAW/FREEZE operations that can only be
applied to mutable-revisions, resulting in a lack of interoperability
between servers that support mutable-revisions and servers that support
immutable-revisions.

<jra>
Here's another alternative whose semantics come from merging our two views:

Mutable revisions:
  checkout: make it writeable
  checkout-new: create a writeable copy and set the ancestor/descendent
        was-derived-from relationship. Always done in the context of an
        activity, even if the server only supports one.
  checkin: make it temporarily read-only
  checkin-immutable: make the new revision immutable
  can support multiple activities, multiple lines-of-descent, but merge
  conflict report would be advisory only

Immutable revisions:
  checkout: always fails
  checkout-new: create a writeable copy and set the predecessor/successor
        is-derived-from relationship. Always done in the context of an
        activity, even if the server only supports one.
  checkin: make it temporarily read-only
  checkin-immutable: make the new revision immutable
  can support multiple activities, multiple lines-of-descent, and merge
  conflict report is reliable
  configurations can be supported for immutable revisions

This may not be as uniform in the abstract sense as what you describe
above, but it seems more uniform in the concrete. The only difference
between mutable and immutable revisions is the expected errors on checkout
that enforce the mutability of the resource. Parallel development is a
completely orthogonal concept in this case. If your server supports
multiple activities, you have parallel development. If not, the server
effectively supports only one activity, and there is no parallel
development support.

Users can decide on mutability on a revision by revision basis. For
example, it would be great if during the early stages of development on a
program one could make the revisions mutable. This would eliminate a lot of
useless history during the initial development and discovery stages. Not
only does this save space, but it simplifies the revision history of the
resource. Then when the user decides something is stable, he can checkin
immutable to permanently save the revision.

I would recommend implementing the protocol with two methods CHECKOUT and
CHECKIN. CHECKOUT would have a boolean header MakeNewRevision and CHECKIN
would have a boolean header MakeMutable. DMS clients would set the
MakeNewRevision header to false, and the MakeMutable header to true by
default. CM clients would do the opposite. Servers supporting both
semantics wouldn't care, but would probably provide the CM default in order
to support their richer semantics. Servers supporting only DMS or CM
semantics, but not both would give Bad Request responses for the methods
they can't support.

Next, let's look at client/server interoperability. Of all the possible
combinations, the ones in question are DMS client on CM server and CM
client on DMS server. In this case, I'm assuming a DMS server does not
support checkin(immutable), and a CM server does not support
checkin(mutable). The second case is simple and can be immediately
dismissed. A CM client cannot expect to get semantics from a server that
doesn't support them. This is an up-level compatibility we wouldn't expect.

So the only interesting case is a DMS client on a CM-only server. Given the
semantics above, checkout (no new revision) and checkin(mutable) would
fail, so the DMS client defaults would not work. This may be OK, but it
isn't ideal. Another alternative would be for CM-only servers to realize
they may be getting these requests, and to simulate them with multiple
revisions that the client doesn't see. This can easily be done because the
server can distinguish between is-derived-from and was-derived-from
relationships, even if the participating revisions are never mutable. So
the "mutable" revision history can be tracked even though there may be many
immutable revisions that are not shown. I think this is the same
implementation you described in your solution, except the same resource
cannot have both a mutable and immutable revision history.

I think this approach provides the flexibility DMS systems want for
mutability, but retains as much of the richer semantics of CM systems as
can be reliably supported. At the same time, the semantics of the methods
are exactly the same in both cases other than expected error conditions for
changing immutable revisions. Finally, DMS clients are free to
incrementally exploit the richer CM semantics without changing their
current semantics.
</jra>
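
The two-method alternative in the <jra> comment above (CHECKOUT with
a MakeNewRevision flag, CHECKIN with a MakeMutable flag) could be
sketched as follows.  Only the two flag names and the Bad Request
response come from the message; the Server and Revision classes, the
capability switches, and the CheckoutFailed exception are
assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


class BadRequest(Exception):
    """Stands in for a Bad Request response: server lacks the semantics."""


class CheckoutFailed(Exception):
    """Assumed error for an in-place checkout of an immutable revision."""


@dataclass
class Revision:
    contents: str
    mutable: bool = True
    writeable: bool = False
    predecessor: Optional["Revision"] = None


class Server:
    def __init__(self, supports_mutable=True, supports_immutable=True):
        self.supports_mutable = supports_mutable      # DMS semantics
        self.supports_immutable = supports_immutable  # CM semantics

    def checkout(self, rev, make_new_revision):
        if make_new_revision:
            # Create a writeable copy and record what it was derived from.
            return Revision(rev.contents, writeable=True, predecessor=rev)
        if not self.supports_mutable:
            raise BadRequest("in-place checkout is not supported")
        if not rev.mutable:
            raise CheckoutFailed("immutable revisions cannot be modified")
        rev.writeable = True
        return rev

    def checkin(self, rev, make_mutable):
        if make_mutable and not self.supports_mutable:
            raise BadRequest("mutable checkin is not supported")
        if not make_mutable and not self.supports_immutable:
            raise BadRequest("immutable checkin is not supported")
        rev.writeable = False
        rev.mutable = make_mutable
        return rev
```

With Server(supports_mutable=False) standing in for a CM-only
server, the DMS client defaults (make_new_revision=False,
make_mutable=True) both raise BadRequest, which is the
interoperability gap discussed in the comment.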

Specifying CHECKOUT/CHECKOUT-NEW Policy

For versioning individual documents, it is sufficient to just let the
user select CHECKOUT or CHECKOUT-NEW, as they see fit.  For
versioning sets of related resources that are being modified in
parallel by multiple users over the course of multiple sessions, it is
essential that the server provide a mechanism for the client to store
its versioning policy in a form that can be queried (and updated) by
multiple clients over multiple sessions.  This is especially true when
the CHECKOUT choice for a particular resource is dependent on the
current state of the revision graph for that resource.  I propose that
the "workspace" resource that stores the version-selection-rules is
the appropriate place to store this information, since the modification
and creation of new revisions must be synchronized with the
version-selection-rules, or else newly created revisions might
disappear from view as soon as they are checked in.
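
As a rough illustration of storing the CHECKOUT policy on the
workspace, the sketch below keeps the policy in a property that any
client can read or update.  Everything here is hypothetical: the
Workspace class, the "force-branch-on-new-version" property name,
and the rule that a revision which already has a successor triggers
CHECKOUT-NEW are assumptions, not part of the proposal.

```python
class Workspace:
    """Holds policy properties shared by all clients of the workspace."""

    def __init__(self, **properties):
        self.properties = dict(properties)

    def choose_checkout(self, revision_has_successor):
        # The choice depends on both the stored policy and the current
        # state of the revision graph for the resource.
        force_branch = self.properties.get("force-branch-on-new-version", False)
        if force_branch and revision_has_successor:
            return "CHECKOUT-NEW"
        return "CHECKOUT"


ws = Workspace(**{"force-branch-on-new-version": True})
method_for_tip = ws.choose_checkout(revision_has_successor=False)
method_for_old = ws.choose_checkout(revision_has_successor=True)
```

Because the policy lives on the server-side workspace rather than in
one client's session, a second client asking the same question gets
the same answer.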

Bottom line: A CHECKOUT, CHECKOUT-NEW, and CHECKIN command, with no
special CHECKOUT/IN headers (which should make Larry happy :-).

Note for Dave: The CHECKIN-NEW would be used to produce the "anonymous
new change-set" you wanted.

Note for Brad: A workspace property would be used to achieve the
-force_branch_on_new_version functionality that you wanted.

Note for All: This doesn't discuss UNCHECKOUT.

Cheers,
Geoff