From: jamsden@us.ibm.com
To: ietf-dav-versioning@w3.org
Message-ID: <8525677B.004A5F13.00@d54mta03.raleigh.ibm.com>
Date: Mon, 24 May 1999 08:53:53 -0400
Subject: WebDAV Versioning Overview from Model Document

The following is an excerpt from the WebDAV Versioning Model document that provides an updated overview of the versioning semantics. This is more detailed than the versioning summary I sent out last week, and contains some updates for the latest thinking on configurations and baselines. Please give this a thorough read and review before next week's design team meeting, as this forms the basis of what the protocol should support. If there are any errors, omissions, or issues, we can resolve them as we review the protocol. I'm hoping we can use this overview as the introductory section of the protocol document, after perhaps factoring it out so it can be used in both the model and protocol documents. I am assuming the definitions in the goals document. Perhaps those should be factored out and reused too.

WebDAV Versioning Semantics Overview

This section provides an overview of the WebDAV versioning semantics. Subsequent sections provide detailed methods and semantic rules.

Creating Versioned Resources

A resource is any potentially stateful entity that can be accessed on the web through a URL. The model below defines an interface, Resource, that abstracts a web resource and its behavior. It also defines a specialization of Resource called ResourceCollection that adds containment or grouping behavior for Resources and other ResourceCollections using the Composite pattern.

A resource may or may not be versioned. When a resource is first created by using a WebDAV PUT or MKCOL method, or Resource.setContents(), it is created as an unversioned resource. A resource may be checked in to make it a versioned resource, and to create the initial or first revision. A checked in revision cannot be modified by anyone at any time without checking it out first.
When a resource is put under version control, the server generates a URI for the versioned resource as a whole, and a URI for each revision that may be used to access that revision explicitly. These URIs are globally unique and are never reused for any other versioned resource or revision.

Naming Revisions: Revision Ids and Labels

Each revision of a versioned resource must be distinguished from other revisions of the same versioned resource. Revision names are used to distinguish revisions and consist of either a revision id or any number of revision labels.

A revision of a versioned resource is given a system assigned revision id when it is checked in. This revision id acts as a persistent, immutable identifier distinguishing this revision from all others of the same versioned resource. The revision id cannot be changed, assigned to another revision, or reused.

A user may assign other revision names called revision labels to a revision in order to distinguish it from other revisions using more meaningful names. The revision labels must be unique for any given versioned resource, but may be reassigned to any revision of the versioned resource at any time. Revisions of different versioned resources may have the same label.

Modifying a Versioned Resource

Subsequently, a client may reserve or check out a revision, which creates a working resource that is a copy of the checked out revision. Checking out a revision registers intent to modify the revision and prevents other users from modifying the same revision at the same time, which would produce conflicting or lost updates. By adhering to a checkout, update, checkin protocol, users are assured their updates will not be lost or conflict with those of other users.

A working resource is identical to an unversioned resource in all respects other than that it has one or more predecessors. It may be edited by setting its properties or contents any number of times.
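The naming rules from "Naming Revisions" above can be illustrated with a small in-memory model. This is only a sketch: the class name VersionedResource, the "r1", "r2" id format, and the method names are assumptions made for illustration and are not part of the model or the protocol.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedResource:
    """Hypothetical in-memory model of one versioned resource."""
    revisions: list = field(default_factory=list)  # revision ids, in checkin order
    labels: dict = field(default_factory=dict)     # label -> revision id
    _next_id: int = 1

    def checkin(self) -> str:
        # The system, not the user, assigns the id; ids are never reused.
        rev_id = f"r{self._next_id}"
        self._next_id += 1
        self.revisions.append(rev_id)
        return rev_id

    def set_label(self, label: str, rev_id: str) -> None:
        if rev_id not in self.revisions:
            raise KeyError(rev_id)
        # A label names at most one revision of this resource,
        # so assigning it again moves it to the new revision.
        self.labels[label] = rev_id

    def resolve(self, name: str) -> str:
        # A revision name is either a revision id or a label.
        if name in self.revisions:
            return name
        return self.labels[name]


doc = VersionedResource()
first = doc.checkin()
second = doc.checkin()
doc.set_label("released", first)
doc.set_label("released", second)   # the label moves; the ids stay fixed

other = VersionedResource()
other.set_label("released", other.checkin())  # same label, different resource
```

Keeping the label table inside each versioned resource is what lets two different resources carry the same label while a single resource never has two revisions with it.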
When the client is satisfied that the working resource is in a state that should be retained in the version history, the client checks the working resource in to create a new revision of the resource.

Users can use checkout/checkin to register intent to modify a versioned resource, similar to the way lock and unlock are used in DAV level 2. The sense is reversed though. A checked in revision cannot be changed without checking it out first, and revision histories are maintained.

The working resource may be checked in as either mutable or immutable. An immutable revision cannot be changed and provides a stable environment for history management, change recovery, merging, and configuration management. A mutable revision is more suitable for situations where versioning is treated more informally, and it is not necessary or desirable to maintain strict version histories, or to be guaranteed that it is always possible to backtrack to a previous point in time and recover. This form of versioning is typical of modern document management systems.

If the revision is mutable, a subsequent checkin may be done with overwrite, allowing the revision to be updated without creating a new revision. Any previous contents of the revision are lost. A mutable revision can also be checked in creating a new revision if the user wants to retain the previous revision. If the revision is immutable, a checkin must create a new revision; checkin with overwrite is not allowed.

Servers may choose to not allow revisions to be checked in as mutable, or they may not allow a revision to be checked in without creating a new revision. These constraints are typical of current configuration management systems. Document management systems typically allow revisions to be mutable and don't have these restrictions.

Adding or removing member URL segments modifies a revision of a collection.
Changing the contents of a member of a revision of a versioned collection does not imply a change to that revision of the versioned collection.

The ability to checkout a revision may be controlled. A user may checkout a revision specifying a revision scope of shared or exclusive. Shared scope implies that other users may checkout the same revision in some other activity, while exclusive scope prevents any parallel development on this revision. Checkout control is managed through locks on the versioned resource and/or the revision as described in "Controlling Versioned Resources".

Selecting Revision through the Workspace

Resources, working resources, versioned resources, and revisions of versioned resources are all accessed using a URL. When a user agent accesses a revision of a versioned resource, it is necessary to provide additional information to specify which revision of the versioned resource should be accessed.

Specifying the resource URL and a revision name can be used to access specific revisions of a versioned resource. However, this requires the user to add and remember labels for each revision, and does not provide a way of accessing revisions modified in an activity, or contained in a configuration. Nor does it enable non-versioning aware clients to access revisions. There must also be some way to distinguish working resources checked out from the same revision by different principals.

Revisions are usually accessed using a simple, human meaningful URL for a versioned resource. A workspace may be used to provide a mapping between versioned resources and specific revisions of versioned resources. Setting the revision selection rule of a workspace specifies the mapping. This allows versioned resources and unversioned resources to be accessed the same way. As a consequence, relative URLs continue to work, and DAV class 1 or 2 clients that are not versioning aware are able to access versioned resources through a default workspace.
The server maps the user URL to a versioned resource in a server-dependent way, while the workspace selects a particular revision of the versioned resource.

A workspace may contain a current activity and a revision selection rule. See the section "Parallel Development with Activities" for further details on activities.

When a workspace revision selection rule is used to perform revision selection for a versioned resource:

If the URL is to a checked out working resource, then it is selected. Working resources can only be accessed through a workspace.

If the URL is to a versioned resource that is not checked out, the workspace revision selection rule is applied to select the revision. If there is no matching revision, then a resource not found status is returned.

This rule is applied to collections to select the revision that determines their member names, and to other resources to determine the revision containing their contents.

A workspace revision selection rule can specify any number of revision labels, activities, configurations, or the revision selector "latest" to specify what revision to select. The rules are applied in order until the first match is found. Any subsequent potential matches are ignored. A label matches a revision with that label. An activity matches the latest revision in that activity, and may result in merge conflicts with changes made in other activities. A configuration matches a revision contained in that configuration. Latest matches the latest revision based on the last modified time. See section "Configurations" for further details on configurations. See section "Merging" for additional rules on revision selection when the revision selector is merged into the workspace revision selection rule.

If a request is made and no workspace is specified, a default workspace containing no activity and "latest" in the revision selection rule is used.
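The selection procedure described above (working resource first, then the ordered selectors, then "not found") can be sketched as follows. The Revision and Workspace classes, the (kind, value) tuple encoding of selectors, and the integer timestamps are all assumptions made for this illustration; the model document does not prescribe a representation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Revision:
    rev_id: str
    modified: int                        # last-modified time, any increasing unit
    labels: set = field(default_factory=set)
    activity: Optional[str] = None


@dataclass
class Workspace:
    rule: list                           # ordered selectors, e.g. ("label", "beta")
    working: dict = field(default_factory=dict)   # url -> working resource


def select(workspace, url, revisions, configurations=None):
    """Return the working resource or revision chosen for url, or None."""
    # A working resource checked out in this workspace is selected first.
    if url in workspace.working:
        return workspace.working[url]
    configurations = configurations or {}
    for kind, value in workspace.rule:
        if kind == "label":
            matches = [r for r in revisions if value in r.labels]
        elif kind == "activity":
            matches = [r for r in revisions if r.activity == value]
        elif kind == "configuration":
            members = configurations.get(value, set())
            matches = [r for r in revisions if r.rev_id in members]
        elif kind == "latest":
            matches = list(revisions)
        else:
            raise ValueError(f"unknown selector kind: {kind}")
        if matches:
            # The first selector with a match decides; later ones are ignored.
            return max(matches, key=lambda r: r.modified)
    return None                          # caller reports "resource not found"


revs = [
    Revision("r1", 10, {"released"}),
    Revision("r2", 20, activity="fix-42"),
    Revision("r3", 30),
]
ws = Workspace(rule=[("label", "beta"), ("activity", "fix-42"), ("latest", None)])
chosen = select(ws, "/doc.html", revs)            # no "beta" label, so the activity matches
default_ws = Workspace(rule=[("latest", None)])   # the default workspace described above
```

Here `chosen` is the revision made in the hypothetical activity "fix-42" even though a later revision exists, because the activity selector precedes "latest" in the rule.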
Administrators can change the current activity and revision selection rule of this default workspace to have such down-level client requests done in an activity, or to support access to more specific revisions of versioned resources.

A resource revision is checked out in the context of a workspace, which is used, with the resource URL, to subsequently access the working resource. Different users can checkout the same revision in different workspaces and not see each other's changes. If a workspace is not specified on checkout, the server selects a workspace and returns it in the checkout response. This workspace has no current activity, and no revision selection rule. It can only be used to access the checked out working resource. Client applications that wish to do their own URL to revision mappings and not rely on server workspaces may use these workspaces as checkout tokens to do so.

Revisions are checked out in the current activity of the workspace, if any. When the resource is checked back in, it remains visible in the workspace if the workspace revision selection rule contains the current activity.

In order to prevent checked in revisions from becoming invisible in the workspace if activities aren't used, a workspace might have a current label. This label is automatically applied to any revision when it is checked in. If the versioned resource already has a revision with this label, the label is moved to the new revision. Putting this label in the workspace revision selection rule will ensure that all checked in revisions are visible in the workspace.

Parallel Development with Activities

When a revision of a versioned resource is already checked out, another user cannot check it out again and therefore cannot make any changes. In order to increase resource availability, avoid serializing work, and allow multiple users to make changes to the same revision simultaneously, a server may support parallel development.
Parallel development allows users to choose to do work on a resource that is checked out by someone else in a different context, and to merge those changes together at some later time.

Resources are checked out in the context of an activity. An activity abstracts a set of related changes a user is making to versioned resources. Each activity represents a thread of development. Servers may support multiple activities that can be used to enable parallel development. These different activities can be merged together at some later time in order to integrate the changes.

A revision that is already checked out in an activity cannot be checked out again in the same activity. If parallel development is desired, a user can checkout the revision in another activity and merge them later. See the section "Merging" for further details.

Activities can be seen as adding complexity to both clients and servers that may not be desirable for some situations. Simple parallel development does not require users to create activities, and set a current activity in their workspace. Clients can choose to manage their own parallel development and merge manually. The user just wants to checkout a revision, make some changes and check it back in. There is no need to organize changes, or provide sophisticated merging. If there is a conflict, users will simply resolve it at checkin manually, or not bother with the merge at all.

Simple parallel development can be accomplished without using activities. A server may allow many checkouts of the same revision without using an activity. The workspace merge conflict report is not available to detect conflicts resulting from changes that were not made in the context of an activity. Client applications are responsible for detecting and integrating the changes.

In order to prevent checked in revisions from becoming invisible when activities are not used, the workspace supports a current label.
The current label is automatically moved to any new revision that is checked in to that workspace. Two workspaces with different current labels can work in parallel on the same versioned resource, and then simplified merging can be performed by adding both labels to the revision selection rule of the workspace incorporating the changes done in other workspaces.

Configuration Management

A workspace represents a volatile set of revisions. Any new checkouts in that workspace, changes to versioned resources that affect the revision selected by the revision selection rule, or changes to the revision selection rule itself, may result in the selection of different revisions or working resources for versioned resources.

A configuration is a versionable resource that represents a consistent, immutable set of revisions. A configuration contains a set of revisions, where a given versioned resource can have at most one revision in a given configuration. A configuration cannot contain a mutable revision because the semantics of configurations cannot be guaranteed. Different revisions of a configuration can select different revisions of the same versioned resources, or can select revisions of different versioned resources.

A configuration may be used as a revision selector in a workspace revision selection rule. A workspace whose revision selection rule contains a configuration will always return the same revisions as long as there are no revisions checked out.

A revision may be added to a configuration by a specific label, or the revision may be selected by a given workspace. When a revision of a versioned collection is added to a configuration, it, and recursively all its members, are included in the configuration. That is, a revision for the collection, and recursively revisions of all its members are selected.
This enables configurations to maintain the state of namespaces defined by versioned collections as well as the state defined by the contents and properties of resources. Adding a revision to a configuration that already contains that member replaces the selected revision.

The URL used to access a revision of a versioned resource in the context of a label or workspace when the revision is added to a configuration is not retained in the configuration. In order to access this revision at some later time, it is necessary to add the configuration to the revision selection rule of a workspace, and bind a name in the server's namespace to the versioned resource corresponding to the desired revision. Then the server uses the URL binding to access a versioned resource, and the workspace to select a particular revision as specified by the configuration in the revision selection rule. This allows flexibility in naming revisions in the context of how they are used. If the user URL of the revision is important, then it is possible to retain this information by putting a revision of the revision's parent collection in the configuration.

Configurations can depend on other configurations. The meaning of this dependency is that when a configuration is used as a revision selector in a workspace revision selection rule, its dependent configurations are also implicitly included. Dependent configurations cannot have overlapping members.

A versioned collection has an associated baseline which is a distinguished, versioned configuration containing the collection, and recursively, all its members. A new revision of a versioned collection baseline is created by baselining the collection. If a collection represents a component and its parts, a baseline of a collection represents a particular configuration of that component.
Baselines provide a convenient means of accessing versions of a configuration of a versioned collection and facilitate reuse by helping users discover which configuration to use.

Configurations are convenient for defining a persistent set of revisions that relate to each other in some specific way at some point in time. This can be useful for selecting consistent versions of resources to publish or deploy an application, or for recovering to a specific version state for legal or maintenance reasons.

Versioned Collections

A collection contains a set of member URL segments. For versioned collections, the members represent versioned resources, not particular revisions. To add or remove members from a revision of a versioned collection, it must be checked out just like any other resource. Creating a new revision of a member, or modifying a member has no effect on the collection.

Deleting a versioned resource that is a member of a collection does not delete the versioned resource; it only deletes the member from that version of the collection. The resource may still be a member of a previous or subsequent revision of the collection or some other collection.

The URL for a collection without a particular revision name is resolved to a particular revision using the workspace the same as any other resource. If the collection is part of a URL for some other resource, then its members are determined from the selected revision.

When a revision of a collection is added to a configuration, then recursively, so are all its members. This is similar to COPY and MOVE, which must specify infinite depth. As described in the section "Configurations", a versioned collection may have a baseline which is a versioned configuration selecting a revision of the versioned collection, and recursively revisions of all its members.

Revision History

A revision may have one predecessor, zero or more merge predecessors, and more than one successor.
A predecessor of a revision is a revision that this revision was derived from. A merge predecessor is a predecessor created by merging changes from a source predecessor resource into a target successor resource. A successor of a revision is a revision derived from this revision. Each revision has a line-of-descent that consists of a path from the initial revision of the resource to the selected revision along the successor/predecessor relationships. A line-of-descent specifies a portion of the overall history of the versioned resource.

Each revision has a predecessor relationship with the revision it was checked out from, a merge predecessor relationship with the revisions merged into it, and a successor relationship with revisions that were checked out from it. Revisions are related to their predecessor and merge predecessors through is-derived-from or merged-from relationships. The revision history of a versioned resource includes these relationships along with revision ids and labels, revision descriptions, checked out state, etc. The revision history contains sufficient information so that a client may display or sort the history by last modified properties.

Merging

Each activity represents a separate parallel thread of development. Users may make their changes in the context of an activity. Changes to the same revision must be done in separate activities or using no activity. At some point, a user may want to merge changes made to the same revision together to create a new revision containing the combined updates. This is accomplished by merging an activity into a workspace.

Revision selectors in the workspace revision selection rule are usually connected with an "or" relationship. That is, if a revision selector in the list denotes a matching revision, then that revision is selected and all the other revision selectors are ignored. When a merge is desired, the revision selector is added to the revision selection rule with a "merge" relationship.
In this case, when a revision selector denotes a matching revision, all other revision selectors are examined to determine if they would select a conflicting revision.

A merge conflict is determined by the following rules. In these rules, the merge source is the revision selected by the revision selector being merged into the workspace revision selection rule. The alternate revision is any other revision selected by some other revision selector in the workspace revision selection rule.

1. If the revision selected by the merge source specifies a predecessor of an alternate revision, then the alternate revision is selected.
2. If the merge source specifies a successor of an alternate revision, then the merge source revision is selected.
3. Otherwise the merge source and the alternate revision are revisions that are on different lines-of-descent, and a merge conflict exists. This merge conflict will be indicated when a conflicting revision is accessed through the workspace.

In order to do a merge, it is first necessary to determine what must be merged. A user determines the conflicts by merging the source activity (or any other revision selector) with the workspace. This enters the merge source revision selector into the workspace revision selection rule with a "merge" relationship, and introduces merge conflicts that must be resolved.

A merge conflict report lists the revisions that have been modified in parallel in different activities. The merge conflict report is generated by examining all resources selected by revision selectors merged into the workspace revision selection rule, and determining if those revisions conflict with any other revision selected by the workspace revision selection rule.

A user can request the differences between two revisions of a resource (servers may provide a differences report, but they must at least indicate if they are the same or not).
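The three merge conflict rules above amount to an ancestry test on the revision history. A minimal sketch follows, assuming the history is available as a mapping from each revision id to the ids of its predecessors and merge predecessors. The function names, the mapping format, and the treatment of identical revisions as "no conflict" are assumptions of this sketch, not something the model document states.

```python
def ancestors(rev, predecessors):
    """All revisions reachable from rev via predecessor or merge-predecessor links."""
    seen, stack = set(), list(predecessors.get(rev, ()))
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(predecessors.get(r, ()))
    return seen


def resolve_merge(source, alternate, predecessors):
    """Return (selected_revision, conflict) for a merge source and an alternate."""
    if source == alternate:
        return source, False          # assumption: the same revision never conflicts
    if source in ancestors(alternate, predecessors):
        return alternate, False       # rule 1: source is a predecessor of alternate
    if alternate in ancestors(source, predecessors):
        return source, False          # rule 2: source is a successor of alternate
    return None, True                 # rule 3: different lines-of-descent


# Hypothetical history: r1 -> r2 -> r3 on one line, r1 -> r4 on another.
history = {"r2": ["r1"], "r3": ["r2"], "r4": ["r1"]}
rule1 = resolve_merge("r2", "r3", history)
rule2 = resolve_merge("r3", "r2", history)
rule3 = resolve_merge("r3", "r4", history)
```

In this example `rule1` and `rule2` both select r3 without a conflict, while `rule3` reports a conflict because r3 and r4 only share the common ancestor r1.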
A user can request conflicts between an activity and the current workspace to generate a merge conflict report. A user can also request the differences between a configuration and the current workspace, which lists at least the activities that are contained in the configuration but not in the workspace and vice versa. So differences are detected at different levels: content differences for resources, revision differences for activities, and activity differences for configurations.

Once the merge conflicts are known, the conflicts are resolved by merging the revisions from the merge source into the revision selected by the workspace to create a new working resource. Servers may perform some default auto merging, but at a minimum, the merge is done by checking out the revision in the current activity and noting the merge from the merge source. This creates a merge successor/predecessor relationship between the merge source and workspace revisions called merged-from. The conflict is now removed because the working resource is now a successor of both the source and target revisions. It is the user's responsibility to apply the differences in the two revisions in an appropriate manner. The merge is complete when all the conflicts are resolved, all differences have been merged, and the resources are all checked back in.

When merging mutable revisions, the merge conflict report may be inaccurate as the source revision may change without the system being aware. Users are responsible for applying any changes to ancestor revisions to their descendants as appropriate. The system cannot determine if there are any changes that need to be applied other than by looking at the last-modified dates of the revisions.

In summary, merging activities is simply adding the activities to the revision selection rule of a workspace.
The workspace can then produce the potential revision conflicts by detecting activities or revision selectors that specify revisions on different lines of descent of the same versioned resource. These conflicts are available in the merge conflict report. Conflicts are resolved by merging the revisions, creating new working resources where the client suitably applies changes from conflicting revisions. The merge is complete when the merge conflict report is empty.

Locking Versioned Resources

Locking a versioned resource prevents any principal other than the owner of the lock from checking out any revision in any activity. Locking a revision of a versioned resource prevents any principal other than the owner of the lock from checking out just that revision in any activity. Shared locks allow multiple principals to control checkouts on the versioned resource or revision.

Locking an activity prevents any principal from making any further changes in the context of that activity. That is, it is not possible to checkout a resource using a locked activity. Locking a workspace prevents any principal from making any change to that workspace, including changing the revision selection rule, or checking out any resources in that workspace.
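As a rough illustration of how these locking rules combine at checkout time, the sketch below checks a single checkout request against a table of locks. The function name, the lock table format (locked object name mapped to the set of lock owners), and the example names are hypothetical. The sketch also reads "prevents any principal" for activity and workspace locks as including the lock owner, which is one possible interpretation of the text.

```python
def may_checkout(principal, resource, revision, activity, workspace, locks):
    """Decide whether principal may check out revision of resource.

    locks maps a locked object's name to the set of principals holding the lock;
    a shared lock is simply a set with more than one owner.
    """
    # Resource and revision locks exclude everyone except the lock owners.
    for target in (resource, revision):
        owners = locks.get(target)
        if owners is not None and principal not in owners:
            return False
    # Activity and workspace locks block checkout for every principal
    # (assumed here to include the owner of the lock).
    for target in (activity, workspace):
        if target is not None and target in locks:
            return False
    return True


locks = {
    "/spec.html": {"alice", "bob"},        # shared lock on the versioned resource
    "activity:release-1": {"carol"},       # locked activity
}
owner_ok = may_checkout("alice", "/spec.html", "/spec.html@r3", None, "ws:alice", locks)
outsider = may_checkout("dave", "/spec.html", "/spec.html@r3", None, "ws:dave", locks)
frozen = may_checkout("alice", "/spec.html", "/spec.html@r3",
                      "activity:release-1", "ws:alice", locks)
```

With this lock table, alice can check out as a lock owner, dave cannot, and nobody can check out into the locked activity.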