configurations and all that...

Jeff McAffer OTT (Jeff_McAffer@oti.com)
Tue, 06 Apr 1999 20:06:44 -0400


From: Jeff_McAffer@oti.com (Jeff McAffer OTT)
To: ietf-dav-versioning@w3.org ('deltav')
Message-ID: <1999Apr06.200300.1250.1136829@otismtp.ott.oti.com>
Date: Tue, 06 Apr 1999 20:06:44 -0400
Subject: configurations and all that...



I would like to present what might be a different perspective on   
configurations.  Basically our cut on this (uhh, what we want) is that   
configurations are "deep revisions of collections".

Revisions of non-collection resources give users a recoverable immutable   
state for that resource.  Revisioning a collection gets you a shallow   
(one level) fixing of content (i.e., the immediate member set is fixed   
but not their revisions or contents).  Configurations were introduced (I   
believe) to recognize the need for generating recoverable states   
including collections.
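
A minimal sketch of the shallow/deep distinction, assuming a toy in-memory workspace. The function names and the dict layout are made up for illustration and are not taken from any WebDAV draft.

```python
# Toy model only: names and data layout are illustrative assumptions,
# not WebDAV protocol elements.

def shallow_revision(workspace, path):
    """Fix only the immediate member names of the collection at `path`."""
    return {"members": tuple(sorted(workspace[path]["members"]))}


def deep_revision(workspace, path):
    """Fix the member set and, recursively, the selected revision of each member."""
    node = workspace[path]
    if "members" not in node:  # non-collection resource
        return {"revision": node["revision"]}
    return {
        "members": {
            name: deep_revision(workspace, f"{path}/{name}")
            for name in sorted(node["members"])
        }
    }


workspace = {
    "/src": {"members": ["a.c", "lib"]},
    "/src/a.c": {"revision": 3},
    "/src/lib": {"members": ["b.c"]},
    "/src/lib/b.c": {"revision": 7},
}

shallow = shallow_revision(workspace, "/src")
deep = deep_revision(workspace, "/src")

# The workspace later selects a newer revision of a.c.
workspace["/src/a.c"]["revision"] = 4
```

The shallow revision only remembers that /src contained a.c and lib. The deep one also remembers which revision of each member was selected, so it is unaffected by the later change.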

It appears that their origin was as a freezing of the state of a   
workspace such that whatever revision would have been selected at that   
moment was captured by the configuration.  Recognizing that people might   
not want to capture their entire workspace, there has been talk about   
operating on workspaces and scoping the operation to narrow the group of   
resources captured.  I get the feeling that while we acknowledge that   
configurations are not necessarily "the entire world" of immutable   
resources, people think of them as workspace things and assume they are   
relatively rare.  Certainly it is assumed there are few of them in the   
RSRs.

Let's turn the table a little and focus on the user view.  Users have   
(potentially numerous and deep) collections of resource revisions   
identified by workspace RSRs and they want to capture them (perhaps   
independently) for later reuse.  They might have all manner of stuff in   
their workspaces.  Some of it ready to go to production, some just   
starting prototyping.  The workspace is not the focus, the collections of   
resources are.  The workspace is the view onto, or context for, the   
resources (i.e., it specs revisions via RSRs) but that's it.

Looking at it this way, it's natural then to talk about revisioning these   
collections with depth infinity (i.e., "deep revisioning").  This is, I   
believe, the operation everyone has been talking about and calling   
"snapshot the workspace within some scope".  The distinction may appear   
subtle but it simplifies the explanation of the semantics.  People   
understand collections.  They understand deep and shallow.  Other WebDAV   
people have been working hard on collection semantics.  I suspect that   
versioning will have many of the same issues.  It would be great if we   
could derive our semantics from theirs so we appear as a simple variation   
(if at all).

Thought:  Is "snapshot" = "checkin a collection with depth infinity"?
    The collection you check-in is the scope (references the roots
    of the collections of interest).  The deep check-in updates the
    collection to contain all the mappings...

In the latest conference call there was discussion about scoping and   
building a collection of scope patterns or starting points to use when   
snapshotting.  Geoff has introduced DAV:versioned-collection.  I'm not   
sure but this mixed with the scope ideas looks to me like the collection   
we are deep versioning!  Check it in (or pass it to the snapshot method)   
and it is updated to record the correct revisions of the referenced   
resources and their children.

Think of the snapshot operation as "updating a deep revision of a   
collection (i.e., a configuration)".
  - pass in one indicating just the "roots" or "scope" and you get a full
    snapshot (within that scope).
  - pass in one from a previous snapshot and you get an incremental snapshot
    (i.e., it is updated).  This goes part way to addressing the issue of
    incremental updates of configurations etc.  I would guess that many
    servers could do interesting optimizations given the previous state and
    the current state.
  - pass in one with just / (the global root) and you get a snapshot of the
    whole workspace.
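
The three cases above can be sketched as one operation. This is only a rough illustration: it assumes the workspace is a flat map from path to selected revision, and `snapshot`, `in_scope`, and the "roots"/"revisions" keys are hypothetical names, not a proposed method or property.

```python
# Hypothetical model: a config is {"roots": [...], "revisions": {path: rev}}.

def in_scope(path, roots):
    """True if `path` is one of the roots or lies below one of them."""
    return any(
        root == "/" or path == root or path.startswith(root.rstrip("/") + "/")
        for root in roots
    )


def snapshot(selected, config):
    """Return an updated config plus the paths whose recorded revision changed.

    `selected` maps each workspace path to the revision selected right now.
    A config without "revisions" yields a full snapshot of its scope; a
    config from an earlier snapshot yields an incremental update.
    """
    roots = config["roots"]
    previous = config.get("revisions", {})
    current = {p: r for p, r in selected.items() if in_scope(p, roots)}
    changed = sorted(
        p for p in current.keys() | previous.keys()
        if current.get(p) != previous.get(p)
    )
    return {"roots": roots, "revisions": current}, changed


selected = {"/compA/x": 1, "/compA/sub/y": 2, "/compB/z": 5}

# Full snapshot within a scope.
full, changed_full = snapshot(selected, {"roots": ["/compA"]})

# Incremental snapshot: pass in the previous one after the workspace moved on.
selected["/compA/x"] = 2
incremental, changed_incr = snapshot(selected, full)

# Snapshot of the whole workspace.
whole, _ = snapshot(selected, {"roots": ["/"]})
```

Returning the changed paths is where a server could optimize: only those entries need new storage.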

Ok, none of this is radically different from what we have talked about.   
The reason I think of things this way is what we do here: software   
development environments.  I am sitting here with a pretty   
standard system which has several hundred individual, independent   
"components" loaded.  Each component is a collection of resources which   
needs to be individually and deeply revisioned.  While these components   
are often shared, exchanged, shipped, ... (whatever) in groups, the   
grouping may change from operation to operation or user to user.

Question: How do I ship a component to someone and retain revision info   
etc.?

On the issue of lots of configs in RSRs, let's assume that an individual   
user decided to try creating a small set of configs for the purposes of   
limiting the RSR count.  To do this, he creates a bunch of "super   
configurations" which specify "needed configurations" (we call them   
"prerequisites" or "required-maps").  This creates some prereq DAG rooted   
with a few super-configs.  Those configs are put in the RSRs.  BTW,   
everything had to be checked in (i.e., revisioned), as I understand it,   
for the RSRs to work properly?

Assume I have an RSR which refers to a revision of config C.  Consider   
what happens when I create a new revision of some resource A which is   
included in some config X.  If I want to avoid RSR hacking, I have to   
update (i.e., create a new revision of) X.  X is "needed" by Y.  So I   
crack open Y and update and revision it.  Y is "needed" by Z, ...  is   
needed by C.  I sure hope configurations have a lightweight   
implementation (in both speed and space).
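
To make the cascade concrete, here is a small sketch that walks a prerequisite DAG upward from a changed config. The `needs` map and the function name are assumptions for illustration, and the extra config W is invented just to make the graph a DAG rather than a chain.

```python
# Illustrative only: `needs` maps each config to the configs it "needs".

def forced_revisions(needs, changed_config):
    """List every config that must get a new revision once `changed_config` does."""
    # Invert the "needs" edges so we can walk from a prerequisite to its dependents.
    dependents = {}
    for config, prereqs in needs.items():
        for prereq in prereqs:
            dependents.setdefault(prereq, []).append(config)

    forced, seen, stack = [], {changed_config}, [changed_config]
    while stack:
        current = stack.pop()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                forced.append(dependent)
                stack.append(dependent)
    return forced


# C needs Z, Z needs Y and W, Y needs X.  Resource A lives in X.
needs = {"C": ["Z"], "Z": ["Y", "W"], "Y": ["X"], "X": [], "W": []}

cascade = forced_revisions(needs, "X")
total_new_revisions = 1 + len(cascade)  # X itself plus everything above it
```

One new revision of A costs four new config revisions here (X, Y, Z, and C), and the count grows with the depth of the chain.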

Note that this is roughly what we do now.  Manually!  It sucks.  It is so   
hard to manage direct revision dependencies like this that we had to   
remove some of the prerequisite identification requirements as a matter   
of practicality.  Our saving grace is that currently the environment   
allows us to spec and share unrevisioned configurations in our equivalent   
of RSRs so we don't have to do the revisioning very often (typically once   
a release cycle).

An automated mechanism would not be much better due to the revision   
bloat.  Revisions should be interesting user checkpoints.  A change to a   
dependee is only interesting when the dependent says it is.  If you force   
revisioning up the dependent chain, you end up with an explosion of   
useless revisions.  Manually/independently revisioning 5 prerequisites of   
Z should not create 5 different revisions of Z.  Who is going to name all   
these revisions?  How are users going to manage/understand/find out which   
revisions are which?  Why is a small change at a low level so immediately   
and obviously forced on people at the high level?

A reasonable alternative is to have the root super-config (i.e., C) in   
the RSRs and then add the new revision of X to the RSRs such that it   
overrides C.  This should work well but it leads us back to the beginning   
in that we may well have lots of configs spec'd in workspace RSRs.   
Every time I revision something in a different component (or whatever I   
define as my finest grained, deeply versioned collection) I add to the   
RSRs.
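
A sketch of the override, assuming (and this is only an assumption) that RSRs are an ordered list of configs where the first config mentioning a resource wins. The flattened config contents and the name `select_revision` are hypothetical.

```python
# Assumed semantics: earlier entries in the RSR list override later ones.

def select_revision(path, rsr):
    """Return the revision of `path` from the first config in `rsr` that lists it."""
    for config in rsr:
        if path in config:
            return config[path]
    return None  # no rule selects this resource


# Root super-config C, flattened to path -> revision for the sketch.
config_c = {"/x/a": 1, "/y/b": 4}

# New revision of X, created after revisioning resource A.
config_x_new = {"/x/a": 2}

# Put the new X ahead of C so it overrides C without touching Y, Z, or C.
rsr = [config_x_new, config_c]

rev_a = select_revision("/x/a", rsr)
rev_b = select_revision("/y/b", rsr)
```

No dependent config is re-revisioned, but the RSR list grows by one entry per overridden component.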

***NOTE:  Users are going to define this granularity.  For some, the   
collections they want to deep revision contain whole websites and they   
will have only one collection, for others they contain one part of one   
component and they have thousands.  It is whatever makes sense for the   
user's domain.  We would do well to not make too many assumptions about   
this.

I don't see having lots of configs in the RSRs as a problem.  One can see   
a number of ways for servers to optimize the config searching to make   
this a non-issue.  Prerequisites (i.e., needed configs) are useful ways   
for users to group/reuse logically coherent resource sets but BEWARE!   
 Maintaining these dependencies is a NON-TRIVIAL amount of work for the   
user.  Further, users should not be creating these to satisfy the system   
(i.e., webdav) but to help them solve their problems.  This structure may   
or may not fit into some nice hierarchy.  I have no problem with having   
this capability in WebDAV but we will not use it (much) and any problem   
that is solved only by using "needed configs" is not solved for us.

Anyway, this has gone on long enough.  The summary is that by changing   
the focus a bit (seeing the problem as "how do I deep revision   
collections of resources", assuming that there can be many, many of these   
deep revisions, and phrasing RSRs in terms of these deep revisions), we can   
leverage the user's understanding of collections as well as the work done   
by the collection people, solve more problems and end up with a model   
that is powerful and flexible without too much strain.

Jeff