Re: Discussion Topic: Simple Version Selection and Checkout

jamsden@us.ibm.com
Fri, 29 Jan 1999 10:35:45 -0500


From: jamsden@us.ibm.com
To: ietf-dav-versioning@w3.org
Message-ID: <85256708.00560444.00@d54mta03.raleigh.ibm.com>
Date: Fri, 29 Jan 1999 10:35:45 -0500
Subject: Re: Discussion Topic: Simple Version Selection and Checkout




---------------------- Forwarded by Jim Amsden/Raleigh/IBM on 01/29/99
10:35 AM ---------------------------


Jim Amsden
01/29/99 10:13 AM

To:   "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
cc:
Subject:  Re: Discussion Topic: Simple Version Selection and Checkout

Some brief clarifications. Here are Geoff's results of using CM-style URL
resolution for DMS systems:

(1) The concept of a "workspace" is introduced at the document-management
versioning level.

(2) The version-selection rule is a property of the workspace, so if that
rule says to do something special for URL-1 (e.g. "for URL-1, pick the
revision with id R1.3.5"), and you MOVE URL-1 to URL-2, you need to
update the version-selection-rule to do this special thing to URL-2
instead of URL-1.  In configuration management, this is not a problem
because version selection is almost always done with a floating label
or "branch/LATEST", rather than a bunch of special cases for individual
resources.  Is it a problem for document-management?

(3) Although you can access an arbitrary revision (since every revision
has a URL), only a revision exposed in a workspace can be modified.
This seems like a pretty reasonable constraint to me, but I wanted to
make sure that wasn't just my configuration management background
coming through.

First, once multiple versions are introduced, there must be some way for
users to refer to a particular revision of a resource. We introduced labels
to provide human-meaningful names for particular revisions, and
configurations for referring to sets of particular revisions. But, we need
some way of using these labels and configurations to access the revision
they name. We also need a deterministic, controlled way of resolving
references to versioned resources that don't specify a particular label.
We've explored a number of approaches, none of which were without issues.

1. Munge the URL and include the label for the revision. For example,
http://host:8080/myprojects/index.html;r1.0.9. This was not considered
acceptable because it is not permissible to munge URLs, and labels would
have to be provided for each collection in the path, not just the leaf
resource.  Another problem is relative URLs would also have to contain
revision labels to get the right revision.

Note that HTTP/1.1 does allow ; in a URL, and the text following could be
used as a version label without violating any HTTP/1.1 rules. So this may
not be URL munging at all. Note also that index.html;r1.0.9 refers to a
specific revision of versioned resource index.html, no matter what versions
of collection myprojects it may be in. So unless index.html has been
removed from some version of myprojects, it's not really necessary to
specify a version of the collection in order to reference a version of one
of its members.
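
As a minimal sketch of what approach 1 asks of a server, the following
Python splits a ";label" suffix off the leaf segment of a URL. The function
name split_revision_label is hypothetical, and treating everything after the
first ";" in the leaf as the label is an assumption made for illustration,
not something we have agreed on.

```python
from urllib.parse import urlsplit


def split_revision_label(url):
    """Return (url_without_label, label); label is None when absent.

    Assumption: the label is whatever follows the first ';' in the
    last path segment. Labels on parent collections are not handled.
    """
    parts = urlsplit(url)
    # Separate the leaf resource name from its parent collection path.
    head, slash, leaf = parts.path.rpartition("/")
    name, sep, label = leaf.partition(";")
    base = parts._replace(path=head + slash + name).geturl()
    return base, (label if sep else None)


base, label = split_revision_label(
    "http://host:8080/myprojects/index.html;r1.0.9")
```

Only the leaf carries a label here, which matches the observation above that
a version of the collection usually need not be specified.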

2. Require users to use a redirect URL generated by the server when the
revision is created. This allows standard URLs to be used to access
revisions of versioned resources, but the URLs would likely have little
resemblance to the URL of the versioned resource, and would probably not be
meaningful to human users.

4. Put the revision label in as a header for each resource, and provide
some default if the revision is not specified. The default could be some
functor like "latest". This works well for a single resource, but it
doesn't scale to collections or to a large number of resources, because the
client has to keep too much revision information. You would need to use a
revision path in the header, not just a revision, in order to provide
revisions for parent collections. The header would have to include labels,
activities, configurations, and various functors to provide flexible URL
mapping. This would be a complicated header that would have to be retained
by the clients, and set for each request. The server would be unaware of
any versioning context it might be able to cache between requests.

5. Use the primary rule of patterns: factor out the thing that changes into
a separate object and delegate. This is the workspace approach. We leave
resource URLs alone; they are the same for all revisions. The URL of a
versioned resource and all its revisions is the same as the URL of the
resource before it was versioned. This requires no changes to existing
WebDAV clients, and supports back-level clients on versioning servers.
Instead of worrying about the particular revision of each resource
requested, the client creates a workspace which contains a revision
selection rule that is reused for each request in the context of that
workspace. The semantics of the revision selection rule are well defined,
support parallel development and configurations, and are supported by the
server. The client just sends a workspace URL in a header with each
request, and it is used to resolve URLs to specific revisions. There would
be a default workspace whose revision selection rule contained only
"latest" that is used if the header is not specified. Workspaces are
resources that clients can develop editors to examine and set.
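
The resolution step described above might look roughly like the following
Python sketch. The Revision and Workspace classes, the resolve function, and
the representation of a rule as an ordered list of selectors (label names
plus the functor "latest") are all assumptions made for illustration;
activities, configurations, and checked-out revisions are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    rev_id: str
    labels: set = field(default_factory=set)


@dataclass
class Workspace:
    # Ordered selectors: a label name, or the functor "latest".
    rule: list = field(default_factory=lambda: ["latest"])


# Used when a request names no workspace; its rule is only "latest".
DEFAULT_WORKSPACE = Workspace()


def resolve(history, workspace=None):
    """Pick the revision of one versioned resource that a workspace exposes.

    history holds the resource's revisions, oldest first. Selectors are
    tried in order and the first one that matches wins.
    """
    if workspace is None:
        workspace = DEFAULT_WORKSPACE
    for selector in workspace.rule:
        if selector == "latest":
            if history:
                return history[-1]
            continue
        for rev in reversed(history):
            if selector in rev.labels:
                return rev
    return None


history = [Revision("1", {"R1.0"}), Revision("2", {"R1.3.5"}), Revision("3")]
pinned = Workspace(rule=["R1.3.5", "latest"])
```

With this history, resolve(history) picks revision "3" through the default
workspace, while resolve(history, pinned) picks revision "2". The same rule
is reused for every URL, so the client keeps no per-resource state.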

This works for back-level clients, DMS clients, and CM clients in a uniform
way, is consistent with relative URLs, and is reasonably simple. It allows
checked out revisions, revisions in the current activity, revisions with
specific labels, in a specific activity, being merged in the current
activity, belonging to a configuration, latest, etc. to be accessed using a
single mechanism. The downside is that DMS client applications will have to
set the workspace. This shouldn't be too bad though because DMS systems
won't likely have multiple activities for parallel development and don't
support configurations. It is likely the default workspace with checked out
and latest in its revision selection rule will be adequate for most uses.

For Geoff's item 2), I don't think moving URL-1 to URL-2 would require any
changes to the revision selection rule in the workspace. Assume the
revision selection rule contains label R1.3.5, and URL-1 has a revision
with that label. When URL-1 is copied or moved to URL-2, the labels go with
the revisions. So a reference to URL-2 in the workspace would resolve to
the correct revision.
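
A small Python sketch of that argument, assuming R1.3.5 is a label carried
by a revision rather than a per-URL special case in the rule. The store
layout and the move and resolve_label helpers are hypothetical.

```python
def move(store, src, dst):
    """Rebind a resource's whole revision history from src to dst."""
    store[dst] = store.pop(src)


def resolve_label(store, url, label):
    """Return the id of the revision at url that carries label, or None."""
    for rev_id, labels in store.get(url, []):
        if label in labels:
            return rev_id
    return None


# Each URL maps to a list of (revision id, labels) pairs.
store = {"/URL-1": [("r1", {"R1.3.4"}), ("r2", {"R1.3.5"})]}
rule_label = "R1.3.5"  # the workspace rule, never edited below

move(store, "/URL-1", "/URL-2")
found = resolve_label(store, "/URL-2", rule_label)
```

The rule is untouched by the MOVE, yet found is "r2", because the label
travelled with the revision. If the rule instead named URL-1 explicitly, as
in Geoff's example, it would still need updating.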

Item 3) should also not be a problem. The workspace revision selection rule
can include the functor "latest", which applies to all resources. So the
workspace would resolve all URLs to some revision, one that could be
checked out and modified. On checkin, the new revision would be visible to
anyone using this workspace. In general, putting "latest" in a revision
selection rule should be avoided, and if it is included, it must be the
last entry. This is because the user has not been specific about what
revisions his workspace should expose, and latest is a "floating label"
that moves with each new revision. This makes the workspace potentially
unstable and may expose incompatible revisions. WebDAV must support latest
though, and it is the only acceptable default.
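
The "must be the last entry" constraint is easy to check mechanically. A
minimal Python sketch, assuming a rule is an ordered list of selector
strings; check_rule is a hypothetical name:

```python
def check_rule(rule):
    """Return rule unchanged, or raise if "latest" is not the final entry.

    "latest" matches every resource, so any selector placed after it
    could never be reached.
    """
    if "latest" in rule[:-1]:
        raise ValueError(
            '"latest" must be the last entry in a revision selection rule')
    return rule


ok = check_rule(["R1.3.5", "latest"])
```

A rule such as ["latest", "R1.3.5"] is rejected, while ["latest"] alone,
the default workspace's rule, passes.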

Well, I guess it wasn't so brief after all. This is hard stuff, but I think
we're getting there. I'm keen to be sure that DMS systems are as consistent
with CM systems as possible while providing the additional flexibility for
mutable revisions. This will allow these systems to co-exist, and for DMS
client applications to incrementally include CM capabilities over time
without changing existing semantics.