configurations and all that...
Jeff McAffer OTT (Jeff_McAffer@oti.com)
Tue, 06 Apr 1999 20:06:44 -0400
From: Jeff_McAffer@oti.com (Jeff McAffer OTT)
To: ietf-dav-versioning@w3.org ('deltav')
Message-ID: <1999Apr06.200300.1250.1136829@otismtp.ott.oti.com>
I would like to present what might be a different perspective on
configurations. Basically our cut on this (uhh, what we want) is that
configurations are "deep revisions of collections".
Revisions of non-collection resources give users a recoverable immutable
state for that resource. Revisioning a collection gets you a shallow
(one level) fixing of content (i.e., the immediate member set is fixed
but not their revisions or contents). Configurations were introduced (I
believe) to recognize the need for generating recoverable states
including collections.
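
The shallow/deep distinction above can be sketched roughly as follows. This is a minimal illustration, not anything from the versioning drafts: the `Resource` class and the `shallow_revision` and `deep_revision` functions are hypothetical names, and integer revision numbers are an assumption made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical resource; `members` is non-empty only for collections."""
    name: str
    revision: int = 1
    members: dict = field(default_factory=dict)


def shallow_revision(coll):
    """Fix only the immediate member set: names, not revisions or contents."""
    return frozenset(coll.members)


def deep_revision(coll, prefix=""):
    """Fix the path and current revision of every descendant (depth infinity)."""
    frozen = {}
    for name, member in coll.members.items():
        path = f"{prefix}/{name}"
        frozen[path] = member.revision
        frozen.update(deep_revision(member, path))
    return frozen


src = Resource("src", members={"a.c": Resource("a.c", revision=3)})
root = Resource("root", members={"src": src, "README": Resource("README", revision=2)})

shallow = shallow_revision(root)  # member names only
deep = deep_revision(root)        # every descendant path with its revision
```

In this sketch a later change to `a.c` is invisible to the shallow revision, while a fresh deep revision would record it.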
It appears that their origin was as a freezing of the state of a
workspace such that whatever revision would have been selected at that
moment was captured by the configuration. Recognizing that people might
not want to capture their entire workspace, there has been talk about
operating on workspaces and scoping the operation to narrow the group of
resources captured. I get the feeling that while we acknowledge that
configurations are not necessarily "the entire world" of immutable
resources, people think of them as workspace things and assume they are
relatively rare. Certainly it is assumed there are few of them in the
RSRs.
Let's turn the table a little and focus on the user view. Users have
(potentially numerous and deep) collections of resource revisions
identified by workspace RSRs and they want to capture them (perhaps
independently) for later reuse. They might have all manner of stuff in
their workspaces. Some of it ready to go to production, some just
starting prototyping. The workspace is not the focus, the collections of
resources are. The workspace is the view onto, or context for, the
resources (i.e., specs revisions via RSRs) but that's it.
Looking at it this way, it's natural then to talk about revisioning these
collections with depth infinity (i.e., "deep revisioning"). This is, I
believe, the operation everyone has been talking about and calling
"snapshot the workspace within some scope". The distinction may appear
subtle but it simplifies the explanation of the semantics. People
understand collections. They understand deep and shallow. Other WebDAV
people have been working hard on collection semantics. I suspect that
versioning will have many of the same issues. It would be great if we
could derive our semantics from theirs so we appear as a simple variation
(if at all).
Thought: Is "snapshot" = "checkin a collection with depth infinity"?
The collection you check-in is the scope (references the roots
of the collections of interest). The deep check-in updates the
collection to contain all the mappings...
In the latest conference call there was discussion about scoping and
building a collection of scope patterns or starting points to use when
snapshotting. Geoff has introduced DAV:versioned-collection. I'm not
sure but this mixed with the scope ideas looks to me like the collection
we are deep versioning! Check it in (or pass it to the snapshot method)
and have it updated to record the correct revisions of the referenced
resources and their children.
Think of the snapshot operation as "updating a deep revision of a
collection (i.e., a configuration)".
- pass in one indicating just the "roots" or "scope" and you get a full
  snapshot (within that scope).
- pass in one from a previous snapshot and you get an incremental
  snapshot (i.e., it is updated). This goes part way to addressing the
  issue of incremental updates of configurations etc. I would guess that
  many servers could do interesting optimizations given the previous
  state and the current state.
- pass in one with just / (the global root) and you get a snapshot of
  the whole workspace.
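
The three cases in the list above can be sketched with one function. This is only an illustration under stated assumptions: the workspace is modeled as a plain mapping from path to currently selected revision, a configuration as a dict with `roots` and `members`, and the `snapshot` function itself is hypothetical rather than a proposed WebDAV method.

```python
def snapshot(workspace, config):
    """Return (updated_config, changed_paths) for the roots named in `config`.

    workspace: dict of resource path -> revision currently selected (assumed model)
    config:    {"roots": [...], "members": {path: revision}} (assumed model)
    """
    def in_scope(path):
        return any(
            root == "/" or path == root or path.startswith(root.rstrip("/") + "/")
            for root in config["roots"]
        )

    members = {p: r for p, r in workspace.items() if in_scope(p)}
    previous = config["members"]
    # Paths whose recorded revision differs from the previous snapshot.
    changed = sorted(
        p for p in members.keys() | previous.keys()
        if members.get(p) != previous.get(p)
    )
    return {"roots": list(config["roots"]), "members": members}, changed


workspace = {"/ui/a": 1, "/ui/b": 2, "/db/c": 5}

# Roots only: a full snapshot within that scope.
full, changed_full = snapshot(workspace, {"roots": ["/ui"], "members": {}})

# Previous snapshot passed back in: an incremental update.
workspace["/ui/b"] = 3
incremental, changed_incr = snapshot(workspace, full)

# Just the global root: the whole workspace.
everything, _ = snapshot(workspace, {"roots": ["/"], "members": {}})
```

Returning the list of changed paths is a design choice of the sketch; it hints at where a server could optimize when given both the previous and the current state.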
Ok, none of this is radically different from what we have talked about.
The reason I think of things this way is because of what we do here.
Software development environments. I am sitting here with a pretty
standard system which has several hundred individual, independent
"components" loaded. Each component is a collection of resources which
needs to be individually and deeply revisioned. While these components
are often shared, exchanged, shipped, ... (whatever) in groups, the
grouping may change from operation to operation or user to user.
Question: How do I ship a component to someone and retain revision info
etc.?
On the issue of lots of configs in RSRs, let's assume that an individual
user decided to try creating a small set of configs for the purposes of
limiting the RSR count. To do this, he creates a bunch of "super
configurations" which specify "needed configurations" (we call them
"prerequisites" or "required-maps"). This creates some prereq DAG rooted
with a few super-configs. Those configs are put in the RSRs. BTW,
everything had to be checked in (i.e., revisioned), as I understand it,
for the RSRs to work properly?
Assume I have an RSR which refers to a revision of config C. Consider
what happens when I create a new revision of some resource A which is
included in some config X. If I want to avoid RSR hacking, I have to
update (i.e., create a new revision of) X. X is "needed" by Y. So I
crack open Y and update and revision it. Y is "needed" by Z, ... is
needed by C. I sure hope configurations have a lightweight
implementation (in both speed and space).
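
The cascade described above can be made concrete with a small sketch. It assumes the "included in" and "needed by" relations can be flattened into a single hypothetical `needed_by` map, and `revisions_forced` is an illustrative name, not part of any proposal.

```python
from collections import deque


def revisions_forced(needed_by, changed):
    """List every config that must be re-revisioned after `changed` is revisioned.

    needed_by maps an item to the configs that directly include or need it.
    """
    forced = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in needed_by.get(current, []):
            if dependent not in forced:
                forced.append(dependent)
                queue.append(dependent)
    return forced


# A is included in X; X is needed by Y, Y by Z, Z by C (the chain from the text,
# with the "..." between Z and C collapsed to a single step for illustration).
needed_by = {"A": ["X"], "X": ["Y"], "Y": ["Z"], "Z": ["C"]}
forced = revisions_forced(needed_by, "A")
```

With this chain, one new revision of A forces four further revisions, one per level up to C.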
Note that this is roughly what we do now. Manually! It sucks. It is so
hard to manage direct revision dependencies like this that we had to
remove some of the prerequisite identification requirements as a matter
of practicality. Our saving grace is that currently the environment
allows us to spec and share unrevisioned configurations in our equivalent
of RSRs so we don't have to do the revisioning very often (typically once
a release cycle).
An automated mechanism would not be much better due to the revision
bloat. Revisions should be interesting user checkpoints. A change to a
dependee is only interesting when the dependent says it is. If you force
revisioning up the dependent chain, you end up with an explosion of
useless revisions. Manually/independently revisioning 5 prerequisites of
Z should not create 5 different revisions of Z. Who is going to name all
these revisions? How are users going to manage/understand/find out which
revisions are which? Why is a small change at a low level so immediately
and obviously forced on people at the high level?
A reasonable alternative is to have the root super-config (i.e., C) in
the RSRs and then add the new revision of X to the RSRs such that it
overrides C. This should work well but it leads us back to the beginning
in that we may well have lots of configs spec'd in workspace RSRs.
Every time I revision something in a different component (or whatever I
define as my finest grained, deeply versioned collection) I add to the
RSRs.
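
The override alternative can be sketched as an ordered rule list. The assumption here, which the text does not spell out, is that RSRs are searched in order and the first configuration that knows a resource wins; `select_revision`, `lookup`, and the config names such as `X.2` are hypothetical.

```python
def select_revision(path, rules, configs):
    """Return the revision of `path` chosen by the first rule that knows it."""
    for config_name in rules:
        found = lookup(path, config_name, configs)
        if found is not None:
            return found
    return None


def lookup(path, config_name, configs):
    """Search a config's own members, then its needed configs, depth first."""
    config = configs[config_name]
    if path in config["members"]:
        return config["members"][path]
    for needed in config["needs"]:
        found = lookup(path, needed, configs)
        if found is not None:
            return found
    return None


configs = {
    "X.1": {"members": {"/x/A": 1, "/x/B": 1}, "needs": []},
    "X.2": {"members": {"/x/A": 2, "/x/B": 1}, "needs": []},
    "C.1": {"members": {"/c/top": 1}, "needs": ["X.1"]},
}

before = select_revision("/x/A", ["C.1"], configs)         # via C.1 -> X.1
after = select_revision("/x/A", ["X.2", "C.1"], configs)   # X.2 overrides C.1
```

Nothing above C needs a new revision in this scheme; the cost is one more entry in the RSRs for each independently revisioned collection.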
***NOTE: Users are going to define this granularity. For some, the
collections they want to deep revision contain whole websites and they
will have only one collection, for others they contain one part of one
component and they have thousands. It is whatever makes sense for the
user's domain. We would do well to not make too many assumptions about
this.
I don't see having lots of configs in the RSRs as a problem. One can see
a number of ways for servers to optimize the config searching to make
this a non-issue. Prerequisites (i.e., needed configs) are useful ways
for users to group/reuse logically coherent resource sets but BEWARE!
Maintaining these dependencies is a NON-TRIVIAL amount of work for the
user. Further, users should not be creating these to satisfy the system
(i.e., webdav) but to help them solve their problems. This structure may
or may not fit into some nice hierarchy. I have no problem with having
this capability in WebDAV but we will not use it (much) and any problem
that is solved only by using "needed configs" is not solved for us.
Anyway, this has gone on long enough. The summary is that by changing
the focus a bit (seeing the problem as "how do I deep revision
collections of resources", assuming that there can be many, many of these
deep revisions, and phrasing RSRs in terms of these deep revisions), we can
leverage the user's understanding of collections as well as the work done
by the collection people, solve more problems, and end up with a model
that is powerful and flexible without too much strain.
Jeff