From: jamsden@us.ibm.com
To: "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
cc: ietf-dav-versioning@w3.org
Message-ID: <85256704.004C8534.00@d54mta03.raleigh.ibm.com>
Date: Mon, 25 Jan 1999 08:49:58 -0500
Subject: Re: CHECKIN/CHECKOUT

Geoff,

Some comments and questions below in <jra> tags.

"Geoffrey M. Clemm" <gclemm@tantalum.atria.com> on 01/19/99 12:19:36 AM

To: ietf-dav-versioning@w3.org
cc: (bcc: Jim Amsden/Raleigh/IBM)
Subject: CHECKIN/CHECKOUT

Even though we missed the Jan 11 deadline, Jim Amsden and I have been hard at work on the WebDAV versioning model, and hope to have some documents to send out by the end of January (Jim even flew up to Mass in last week's ice storm for 2 days to work on it with me, which definitely deserves a "beyond-the-call-of-duty" award).

Before I get my part of those documents finished up, I wanted to send out the long-promised "CHECKOUT/CHECKIN" document. This document has been significantly affected by the results of our modeling efforts, and so will probably bear only remote resemblance to what folks remember of the various prior conversations on the subject.

So with that in mind, here's what I've got:

------------------------------------------------------------

CHECKOUT/CHECKIN

Putting a Resource under Version Control

When a resource is put under version control, it becomes unwriteable. In order to modify a resource, it must first be checked out, then can be modified one or more times, and then checked back in to indicate you are done modifying it. If your CHECKOUT fails, it means someone else is currently modifying the document, so you should only do a GET with the understanding that the results are only temporarily valid.

Checkout vs. Lock

Note the distinction between a (write) LOCK and CHECKOUT. The LOCK takes a resource that is writeable by everyone and temporarily makes it unwriteable by everyone except the lock holder (until it is UNLOCK'ed).
A CHECKOUT takes a resource that is unwriteable by everyone, and temporarily makes it writeable (until it is CHECKIN'd).

It is reasonable to apply a LOCK to a checked-out resource, but is not required. In particular, many systems will decide the LOCK is irrelevant, since a "friendly" client will delay writing until it can perform a CHECKOUT, and an "unfriendly" client can just wait until the UNLOCK and then trash the resource contents at will.

<jra>
But then CHECKOUT/CHECKIN are really the same as UNLOCK/LOCK where the user or server had decided to leave the resource locked by default instead of unlocked. I'm not suggesting this semantic should be exploited, just that there is a correspondence.
</jra>

Immutable-Revisions

An immutable-revision is a revision whose contents (and immutable properties) cannot be changed. More precisely, an attempt to retrieve the contents or immutable properties of an immutable-revision will always return the same contents or will fail. Therefore a server can delete the contents or properties of an immutable-revision (resulting in a failure when an attempt is made to retrieve those contents or properties), but can never delete the immutable-revision itself.

Mutable-Revisions

A mutable-revision is a revision whose contents and properties can be changed, although an attempt to change the contents or the "immutable properties" of a mutable-revision must be preceded by an explicit "checkout/thaw" operation, and then should be followed by a "checkin/freeze" operation to return it to a read-only state. This then requires two flavors of checkout: a checkout that unfreezes an existing mutable-revision (which I'll call CHECKOUT) and a checkout that creates a new (unfrozen) mutable-revision that is based on an existing mutable-revision (which I'll call CHECKOUT-NEW).

Branching

When a versioned resource supports immutable-revisions, it is still necessary to support "change".
In particular, there must be some resource that you can name, that will periodically take on new values. For a versioned resource with immutable-revisions, this analogue to a mutable-revision is called a "branch". Like a mutable-revision, a branch can be checked out, changed, and then checked back in. The tip of the branch then reflects this change. Also as with mutable-revisions, you sometimes want to check out a new branch that is based on (the tip of) an existing branch, which requires another flavor of checkout (i.e. CHECKOUT-NEW).

<jra>
So how does this relate to activities where a revision is checked out in the context of an activity? Also, what's the difference between checking out a branch, and checking out-new on a revision? It seems confusing that we checkout a branch resource in order to change some other resource.

I don't think the analogy above holds. For mutable revisions, checkout just makes it writable in place. Checkout-new creates a new revision in the context of an activity. Say there is no parallel development. Then all checkout-news are done in the same activity and one gets a single line-of-descent. This is what you described as checkout on a branch for mutable revisions above. Creating new activities to support parallel development is likely to be something DMS vendors will want to do, even though the server can't give them a reliable conflict report.

As described above, CHECKOUT-NEW on an immutable revision would be the same as checking out an immutable revision in a different activity, creating the potential for a merge. This seems very different than checkout and create a new revision for mutable revisions.
</jra>

From a protocol perspective, this provides a way to unify the worlds of mutable and immutable-revisions. In each world, there is CHECKOUT, CHECKOUT-NEW, and CHECKIN, where CHECKOUT modifies an existing modifiable entity, while CHECKOUT-NEW creates a new modifiable entity that can be modified in parallel with the original entity.
CHECKIN is used in either case to return the resource to a read-only state.

<jra>
Interesting, but it doesn't feel like it unifies it all that much. Here are the semantics you described:

Mutable revisions:
  checkout: make it writeable
  checkout-new: create a writeable copy and set the ancestor/descendent was-derived-from relationship
  checkin: make it temporarily read-only
  no support for activities, single line-of-descent
  no configurations

Immutable revisions:
  checkout: create a writeable copy and set the predecessor/successor is-derived-from relationship
  checkout-new: create a branch, a new line-of-descent to support parallel development, and do a checkout on that branch
  checkin: make it read-only
  can support multiple activities, multiple lines-of-descent, and merging
  can support configurations

When you list it this way, it doesn't seem so uniform. I think the uniformity comes from a concept that is too abstract and won't be of sufficient interest to users. Another more significant problem is that there is no way to mix mutable and immutable revision semantics on the same resource, which I think will inhibit DMS style clients from gradually migrating to CM semantics.
</jra>

The alternative is to provide THAW/FREEZE operations that can only be applied to mutable-revisions, resulting in a lack of interoperability between servers that support mutable-revisions and servers that support immutable-revisions.

<jra>
Here's another alternative whose semantics come from merging our two views:

Mutable revisions:
  checkout: make it writeable
  checkout-new: create a writeable copy and set the ancestor/descendent was-derived-from relationship. Always done in the context of an activity, even if the server only supports one.
  checkin: make it temporarily read-only
  checkin-immutable: make the new revision immutable
  can support multiple activities, multiple lines-of-descent, but merge conflict report would be advisory only

Immutable revisions:
  checkout: always fails
  checkout-new: create a writeable copy and set the predecessor/successor is-derived-from relationship. Always done in the context of an activity, even if the server only supports one.
  checkin: make it temporarily read-only
  checkin-immutable: make the new revision immutable
  can support multiple activities, multiple lines-of-descent, and merge conflict report is reliable
  configurations can be supported for immutable revisions

This may not be as uniform in the abstract sense as what you describe above, but it seems more uniform in the concrete. The only difference between mutable and immutable revisions is the expected errors on checkout that enforce the mutability of the resource. Parallel development is a completely orthogonal concept in this case. If your server supports multiple activities, you have parallel development. If not, the server effectively supports only one activity, and there is no parallel development support.

Users can decide on mutability on a revision-by-revision basis. For example, it would be great if during the early stages of development on a program one could make the revisions mutable. This would eliminate a lot of useless history during the initial development and discovery stages. Not only does this save space, but it simplifies the revision history of the resource. Then when the user decides something is stable, he can checkin immutable to permanently save the revision.

I would recommend implementing the protocol with two methods, CHECKOUT and CHECKIN. CHECKOUT would have a boolean header MakeNewRevision and CHECKIN would have a boolean header MakeMutable. DMS clients would set the MakeNewRevision header to false, and the MakeMutable header to true by default. CM clients would do the opposite.
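As a rough illustration of the header defaults just described, the two client profiles might be modelled as below. This is only a sketch: the "T"/"F" value encoding, the ClientDefaults type, and the build_request helper are assumptions made for the example and are not part of any draft.

```python
from dataclasses import dataclass

# Header names come from the proposal text; everything else here
# (value encoding, helper names) is hypothetical.
MAKE_NEW_REVISION = "MakeNewRevision"
MAKE_MUTABLE = "MakeMutable"


@dataclass(frozen=True)
class ClientDefaults:
    make_new_revision: bool  # sent with CHECKOUT
    make_mutable: bool       # sent with CHECKIN


# DMS clients: modify in place, keep the revision mutable.
DMS_CLIENT = ClientDefaults(make_new_revision=False, make_mutable=True)
# CM clients: the opposite defaults.
CM_CLIENT = ClientDefaults(make_new_revision=True, make_mutable=False)


def _flag(value: bool) -> str:
    # Assumed wire encoding for a boolean header.
    return "T" if value else "F"


def build_request(method: str, url: str, defaults: ClientDefaults) -> dict:
    """Return a plain dict describing the request a client would send."""
    if method == "CHECKOUT":
        headers = {MAKE_NEW_REVISION: _flag(defaults.make_new_revision)}
    elif method == "CHECKIN":
        headers = {MAKE_MUTABLE: _flag(defaults.make_mutable)}
    else:
        raise ValueError(f"unsupported method: {method}")
    return {"method": method, "url": url, "headers": headers}
```

For example, build_request("CHECKOUT", "/doc", DMS_CLIENT) carries MakeNewRevision set to false, while the same call with CM_CLIENT sets it to true.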
Servers supporting both semantics wouldn't care, but would probably provide the CM default in order to support their richer semantics. Servers supporting only DMS or CM semantics, but not both, would give Bad Request responses for the methods they can't support.

Next, let's look at client/server interoperability. Of all the possible combinations, the ones in question are DMS client on CM server and CM client on DMS server. In this case, I'm assuming a DMS server does not support checkin(immutable), and a CM server does not support checkin(mutable). The second case is simple and can be immediately dismissed. A CM client cannot expect to get semantics from a server that doesn't support them. This is an up-level compatibility we wouldn't expect.

So the only interesting case is a DMS client on a CM-only server. Given the semantics above, checkout (no new revision) and checkin(mutable) would fail, so the DMS client defaults would not work. This may be OK, but it isn't ideal. Another alternative would be for CM-only servers to realize they may be getting these requests, and to simulate them with multiple revisions that the client doesn't see. This can easily be done because the server can distinguish between is-derived-from and was-derived-from relationships, even if the participating revisions are never mutable. So the "mutable" revision history can be tracked even though there may be many immutable revisions that are not shown. I think this is the same implementation you described in your solution, except the same resource cannot have both a mutable and immutable revision history.

I think this approach provides the flexibility DMS systems want for mutability, but retains as much of the richer semantics of CM systems as can be reliably supported. At the same time, the semantics of the methods are exactly the same in both cases other than expected error conditions for changing immutable revisions.
Finally, DMS clients are free to incrementally exploit the richer CM semantics without changing their current semantics.
</jra>

Specifying CHECKOUT/CHECKOUT-NEW Policy

For versioning individual documents, it is sufficient to just let the user select CHECKOUT or CHECKOUT-NEW, as they see fit. For versioning sets of related resources that are being modified in parallel by multiple users over the course of multiple sessions, it is essential that the server provide a mechanism for the client to store its versioning policy in a form that can be queried (and updated) by multiple clients over multiple sessions. This is especially true when the CHECKOUT choice for a particular resource is dependent on the current state of the revision graph for that resource.

I propose that the "workspace" resource that stores the version-selection-rules is the appropriate place to store this information, since the modification and creation of new revisions must be synchronized with the version-selection-rules, or else newly created revisions might disappear from view as soon as they are checked in.

Bottom line: A CHECKOUT, CHECKOUT-NEW, and CHECKIN command, with no special CHECKOUT/IN headers (which should make Larry happy :-).

Note for Dave: The CHECKIN-NEW would be used to produce the "anonymous new change-set" you wanted.

Note for Brad: A workspace property would be used to achieve the -force_branch_on_new_version functionality that you wanted.

Note for All: This doesn't discuss UNCHECKOUT.

Cheers,
Geoff