From: Jeff_McAffer@oti.com (Jeff McAffer OTT)
To: ietf-dav-versioning@w3.org ('deltav')
Message-ID: <1999Apr06.200300.1250.1136829@otismtp.ott.oti.com>
Date: Tue, 06 Apr 1999 20:06:44 -0400
Subject: configurations and all that...

I would like to present what might be a different perspective on configurations. Basically our cut on this (uhh, what we want) is that configurations are "deep revisions of collections".

Revisions of non-collection resources give users a recoverable immutable state for that resource. Revisioning a collection gets you a shallow (one level) fixing of content (i.e., the immediate member set is fixed but not their revisions or contents).

Configurations were introduced (I believe) to recognize the need for generating recoverable states including collections. It appears that their origin was as a freezing of the state of a workspace such that whatever revision would have been selected at that moment was captured by the configuration. Recognizing that people might not want to capture their entire workspace, there has been talk about operating on workspaces and scoping the operation to narrow the group of resources captured. I get the feeling that while we acknowledge that configurations are not necessarily "the entire world" of immutable resources, people think of them as workspace things and assume they are relatively rare. Certainly it is assumed there are few of them in the RSRs.

Let's turn the table a little and focus on the user view. Users have (potentially numerous and deep) collections of resource revisions identified by workspace RSRs and they want to capture them (perhaps independently) for later reuse. They might have all manner of stuff in their workspaces. Some of it ready to go to production, some just starting prototyping. The workspace is not the focus, the collections of resources are. The workspace is the view onto, or context for, the resources (i.e., specs revisions via RSRs) but that's it.

Looking at it this way, it's natural then to talk about revisioning these collections with depth infinity (i.e., "deep revisioning"). This is, I believe, the operation everyone has been talking about and calling "snapshot the workspace within some scope". The distinction may appear subtle but it simplifies the explanation of the semantics. People understand collections. They understand deep and shallow. Other WebDAV people have been working hard on collection semantics. I suspect that versioning will have many of the same issues. It would be great if we could derive our semantics from theirs so we appear as a simple variation (if at all).

Thought: Is "snapshot" = "checkin a collection with depth infinity"? The collection you check in is the scope (it references the roots of the collections of interest). The deep check-in updates the collection to contain all the mappings...

In the latest conference call there was discussion about scoping and building a collection of scope patterns or starting points to use when snapshotting. Geoff has introduced DAV:versioned-collection. I'm not sure, but this mixed with the scope ideas looks to me like the collection we are deep versioning! Check it in (or pass it to the snapshot method) and have it updated to record the correct revisions of the referenced resources and their children.

Think of the snapshot operation as "updating a deep revision of a collection (i.e., a configuration)".

- pass in one indicating just the "roots" or "scope" and you get a full snapshot (within that scope).
- pass in one from a previous snapshot and you get an incremental snapshot (i.e., it is updated). This goes part way to addressing the issue of incremental updates of configurations etc. I would guess that many servers could do interesting optimizations given the previous state and the current state.
- pass in one with just / (the global root) and you get a snapshot of the whole workspace.

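To make the three cases above concrete, here is a minimal sketch of "snapshot = deep check-in of a collection". Everything in it is an assumption made for illustration: the workspace is modelled as a plain mapping from path to the revision the RSRs currently select, and the names `Configuration`, `deep_checkin`, and `changed_paths` are hypothetical, not anything proposed on the list or defined by WebDAV.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Configuration:
    """Hypothetical deep revision of a collection: scope roots plus path -> revision."""
    roots: FrozenSet[str]
    members: Dict[str, str]


def in_scope(path: str, root: str) -> bool:
    """True if `path` is `root` itself or lies anywhere beneath it."""
    if root == "/":
        return True
    root = root.rstrip("/")
    return path == root or path.startswith(root + "/")


def deep_checkin(workspace: Dict[str, str], scope: Configuration) -> Configuration:
    """Record the currently selected revision of every resource under the scope roots.

    Passing a bare scope gives a full snapshot; passing a previous snapshot
    reuses its roots, which is the "incremental" case.
    """
    members = {
        path: rev
        for path, rev in workspace.items()
        if any(in_scope(path, root) for root in scope.roots)
    }
    return Configuration(scope.roots, members)


def changed_paths(old: Configuration, new: Configuration) -> Set[str]:
    """Paths whose recorded revision differs between two snapshots."""
    keys = old.members.keys() | new.members.keys()
    return {p for p in keys if old.members.get(p) != new.members.get(p)}


# Assumed workspace state: path -> revision selected by the workspace RSRs.
workspace = {
    "/compA/a.java": "r3",
    "/compA/sub/b.java": "r1",
    "/compB/c.java": "r7",
}

# 1. Pass in just the roots: full snapshot within that scope.
snap1 = deep_checkin(workspace, Configuration(frozenset({"/compA"}), {}))

# 2. Pass in the previous snapshot after a change: it is updated.
workspace["/compA/a.java"] = "r4"
snap2 = deep_checkin(workspace, snap1)

# 3. Pass in just "/": snapshot of the whole workspace.
whole = deep_checkin(workspace, Configuration(frozenset({"/"}), {}))
```

A server could use something like `changed_paths(snap1, snap2)` as the starting point for the optimizations mentioned in the second bullet, since only the differing mappings need new storage.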
Ok, none of this is radically different from what we have talked about. The reason I think of things this way is because of what we do here: software development environments. I am sitting here with a pretty standard system which has several hundred individual, independent "components" loaded. Each component is a collection of resources which needs to be individually and deeply revisioned. While these components are often shared, exchanged, shipped, ... (whatever) in groups, the grouping may change from operation to operation or user to user.

Question: How do I ship a component to someone and retain revision info etc.?

On the issue of lots of configs in RSRs, let's assume that an individual user decided to try creating a small set of configs for the purposes of limiting the RSR count. To do this, he creates a bunch of "super configurations" which specify "needed configurations" (we call them "prerequisites" or "required-maps"). This creates some prereq DAG rooted at a few super-configs. Those configs are put in the RSRs. BTW, everything had to be checked in (i.e., revisioned), as I understand it, for the RSRs to work properly?

Assume I have an RSR which refers to a revision of config C. Consider what happens when I create a new revision of some resource A which is included in some config X. If I want to avoid RSR hacking, I have to update (i.e., create a new revision of) X. X is "needed" by Y. So I crack open Y and update and revision it. Y is "needed" by Z, ... is needed by C. I sure hope configurations have a lightweight implementation (in both speed and space).

Note that this is roughly what we do now. Manually! It sucks. It is so hard to manage direct revision dependencies like this that we had to remove some of the prerequisite identification requirements as a matter of practicality.
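The ripple described above (A changes, so X, then Y, then Z, ... then C must all be re-revisioned) can be sketched as a walk up the prerequisite DAG. This is only an illustration under stated assumptions: the function name `forced_revisions` is hypothetical, membership of A in X is treated the same as a "needs" edge, and the chain is shortened so that Z is needed directly by C, whereas the text leaves the intermediate configs unspecified.

```python
from typing import Dict, List, Set


def forced_revisions(changed: str, needs: Dict[str, Set[str]]) -> List[str]:
    """List every config that must get a new revision when `changed` does.

    `needs` maps each config to the things it directly includes or requires.
    The result is in breadth-first order, starting with the direct dependents.
    """
    # Invert the "needs" edges so we can walk from a prerequisite to its dependents.
    needed_by: Dict[str, Set[str]] = {}
    for cfg, prereqs in needs.items():
        for prereq in prereqs:
            needed_by.setdefault(prereq, set()).add(cfg)

    order: List[str] = []
    seen: Set[str] = set()
    frontier = [changed]
    while frontier:
        next_frontier = []
        for item in frontier:
            for dependent in sorted(needed_by.get(item, ())):
                if dependent not in seen:
                    seen.add(dependent)
                    order.append(dependent)
                    next_frontier.append(dependent)
        frontier = next_frontier
    return order


# Assumed, shortened chain: A is in X, X is needed by Y, Y by Z, Z by C.
needs = {
    "X": {"A"},
    "Y": {"X"},
    "Z": {"Y"},
    "C": {"Z"},
}

ripple = forced_revisions("A", needs)
print(ripple)  # ['X', 'Y', 'Z', 'C']
```

One new revision of A costs four new configuration revisions here, and the count grows with the depth and fan-out of the DAG, which is why a lightweight implementation matters.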
Our saving grace is that currently the environment allows us to spec and share unrevisioned configurations in our equivalent of RSRs so we don't have to do the revisioning very often (typically once a release cycle).

An automated mechanism would not be much better due to the revision bloat. Revisions should be interesting user checkpoints. A change to a dependee is only interesting when the dependent says it is. If you force revisioning up the dependent chain, you end up with an explosion of useless revisions. Manually/independently revisioning 5 prerequisites of Z should not create 5 different revisions of Z. Who is going to name all these revisions? How are users going to manage/understand/find out which revisions are which? Why is a small change at a low level so immediately and obviously forced on people at the high level?

A reasonable alternative is to have the root super-config (i.e., C) in the RSRs and then add the new revision of X to the RSRs such that it overrides C. This should work well but it leads us back to the beginning in that we may well have lots of configs spec'd in workspace RSRs. Every time I revision something in a different component (or whatever I define as my finest-grained, deeply versioned collection) I add to the RSRs.

***NOTE: Users are going to define this granularity. For some, the collections they want to deep revision contain whole websites and they will have only one collection; for others they contain one part of one component and they have thousands. It is whatever makes sense for the user's domain. We would do well to not make too many assumptions about this.

I don't see having lots of configs in the RSRs as a problem. One can see a number of ways for servers to optimize the config searching to make this a non-issue.

Prerequisites (i.e., needed configs) are useful ways for users to group/reuse logically coherent resource sets but BEWARE! Maintaining these dependencies is a NON-TRIVIAL amount of work for the user.
Further, users should not be creating these to satisfy the system (i.e., WebDAV) but to help them solve their problems. This structure may or may not fit into some nice hierarchy. I have no problem with having this capability in WebDAV but we will not use it (much) and any problem that is solved only by using "needed configs" is not solved for us.

Anyway, this has gone on long enough. The summary is that by changing the focus a bit to:

- see the problem as "how do I deep revision collections of resources",
- assume that there can be many, many of these deep revisions, and
- phrase RSRs in terms of these deep revisions,

we can leverage the user's understanding of collections as well as the work done by the collection people, solve more problems and end up with a model that is powerful and flexible without too much strain.

Jeff