Message-ID: <01b501bf5a10$9d300610$9a114498@us.oracle.com>
From: "Eric Sedlar" <esedlar@us.oracle.com>
To: "Geoffrey M. Clemm" <geoffrey.clemm@rational.com>, <ietf-dav-versioning@w3.org>
Date: Sat, 8 Jan 2000 11:42:46 -0800
Subject: Re: Baselines vs. labels

Thanks for pointing out the caching benefit of baselines, and the way a
shared activity can be used to modify them. My biggest beef is not with
baselines, but with the number of ways of selecting a set of revisions:

* baseline
* configuration
* shared activity (with RSRs specifying the revisions)
* label

If you are willing to get rid of configurations, let's get rid of them.

--Eric

----- Original Message -----
From: "Geoffrey M. Clemm" <geoffrey.clemm@rational.com>
To: <ietf-dav-versioning@w3.org>
Sent: Tuesday, January 04, 2000 8:06 AM
Subject: Re: Baselines vs. labels

>
>    From: "Eric Sedlar" <esedlar@us.oracle.com>
>
>    Can someone give a bit of rationale for when you use baselines and when
>    you use a label applied recursively to all the elements of a
>    collection? Is a baseline just a specialized case of a label?
>
> One key difference between a baseline and a label is that you can
> ask a revision "what labels are on you" but you cannot ask a revision
> "what baselines include you". This can have significant performance
> impact on certain implementation choices.
>
> Another key difference is that a baseline is created by taking a "snapshot"
> of the state of a workspace, which can provide a very cheap implementation
> mechanism (i.e. based on the current value of the revision-selection-rule).
> A label is moved arbitrarily, so there is no revision-selection-rule
> optimization available.
>
> Another key difference is that a baseline captures the "history" of a
> set of revisions, i.e. you can find predecessor baselines and successor
> baselines. In contrast, a label just captures the current state of
> a collection of revisions. Where the label was in the past is lost.
>
> So a label and a baseline have very different properties (neither is
> a specialization of the other). Which one you use depends on which
> of these properties are more important.
>
>    When do you recommend using recursive labels vs. a baseline? For
>    typical configuration management needs, if I want to create release
>    1.0.1.1 of my product, it seems like I might want the ability to slip
>    in a new revision of a file at the last minute, which would mean
>    moving a label from version to version.
>
> The fact that you have a 1.0.1.1 means you probably have a history
> captured by naming conventions on your labels. With baselines, you
> have your history explicitly modeled via predecessor and successor
> properties (and therefore unlike naming conventions, interoperable
> between different clients and servers). In a baseline system, you
> use "activities" to capture dynamic change (i.e. slipping in a new
> version), and only create new baselines when you have a set of revisions
> that you want to capture in the history. You don't modify an
> existing baseline for the same reasons you don't modify an existing
> revision.
>
>    I couldn't do that with a
>    baseline, since I would need a new baseline, which would have a
>    different URI, so this wouldn't be transparent to people who were
>    already using release 1.0.1.1 of my product. It might be useful to
>    include some scenarios in the spec as to when to use either.
>
> You would have a "release-1.0.1.1" activity to share between folks
> that want to see the release-1.0.1.1 work in progress. You would
> create a baseline only when there is a state of release-1.0.1.1 that
> you want to capture for history.
>
>    Since version management systems like CVS use tags (i.e. labels) in
>    this way, I think some clarifications in the spec in this area would
>    be helpful.
>
> CVS uses labels to model both activities and baselines. A label that
> is being moved forward along some line of descent is modeled as an
> activity.
> A label that never moves is modeled as a baseline.
>
>    From: "Eric Sedlar" <esedlar@us.oracle.com>
>
>    It seems excessively complex to have three different ways to identify
>    a set of revisions (for use in revision selection rules). I don't see
>    much utility for baselines if you can never change the revision of a
>    particular file in a baseline.
>
> Baselines and activities are designed to be used together.
> Baselines capture the history of a set of related resources.
> Activities capture the changes that lead from one state of the
> history to the next. If you want to see a particular state of
> the history, you place a baseline in your revision-selection-rule.
> If you want to see changes from that baseline, you add the appropriate
> activity (activities) to your revision-selection-rule.
>
>    It seems to me that the performance benefits of baselines are based on
>    the fact that you have a contiguous subtree of revisions, where there
>    is no need to check the revision selection rules when traversing a
>    link (this often involves searching through the list of revisions in a
>    versioned resource to find the latest one or a particular label,
>    etc.). Each collection revision in a baseline can point directly to
>    the associated revision the next layer down.
>
> The key optimization is based on the fact that it is a snapshot of the
> state of a workspace. This means that you can just capture the state
> of the revision-selection-rule property, and not scan the resources at
> all.
>
>    What if you introduced a new concept like "baseline configuration"? A
>    baseline configuration would be rooted at a particular versioned
>    collection recursively, just like a baseline. However, you would be
>    allowed to change the revisions in a baseline configuration after
>    creating the configuration. Then you can get rid of baselines &
>    configurations, and simplify the spec.
>
> I'd be happy to get rid of "configurations", and just keep baselines
> and labels.
> But I would not be willing to get rid of baselines, since
> then the only way to capture the state of a set of versioned resources
> would be to enumerate the currently selected revisions, which would
> not scale.
>
>    Is the reason you consider labels less "reliable" than configurations
>    due to the assumption that you are protecting them with access control
>    on a bunch of different resources rather than access control on a
>    single resource?
>
> A label is just an XML element within a resource property. It is
> very unlikely that we will define access control down to that level
> (i.e. this element in this property is read-only by this individual).
> This means that access control on labels is unlikely to ever be provided.
>
>    Also, can an administrator rename baselines? (E.g. I create a
>    baseline from /amazon/catalogs/music at a particular point in time,
>    and store it in "/baselines/amazon/catalogs/music/dec6_99.base". Then
>    I modify the revision selection rules in the workspace I have
>    selected, and create a new baseline from /amazon/catalogs/music, which
>    includes a different set of revisions, and call it
>    "/baselines/amazon/catalogs/music/temp.base". Can I delete the first
>    baseline and rename the second one to have the same name as the first
>    one, thus changing the selected revisions for anyone who has
>    referenced "/baselines/amazon/catalogs/music/dec6_99.base" in their
>    revision selection rules?)
>
> A baseline is given an immutable name by the server, not a mutable
> name by the client (JimA has tried to argue otherwise, but we are
> vigorously resisting :-). The main reason is that it is the immutability
> of a baseline that leads to a variety of optimizations (i.e. I can cache
> the baseline locally, and not have to keep going back to the server
> to see if it has "changed").
>
>    From: "Eric Sedlar" <esedlar@us.oracle.com>
>
>    1) To justify having a "baseline" concept in the spec, I think we need to have
>    * a real customer scenario where the absolute guarantee a baseline will
>    never be changed is necessary, and
>
> If I ship a release to a customer, I want to know what was in that release,
> with no ifs, ands, or buts.
>
>    * show that this is a significant enough case to warrant the complexity
>
> If I can't reliably reproduce a shipped release, my attempts to reproduce
> and fix a customer problem are severely hampered.
>
>    2) Even if you can come up with 1), I would argue that the performance benefits
>    of a baseline should be available to whatever mechanism is used to represent a
>    release (currently a configuration)
>
> I would represent a release with a baseline.
> Performance benefits come from restrictions, not generalizations.
> In particular, the performance benefits of baselines over labels
> and configurations derive from the "snapshot the state of my workspace"
> characteristics of baselines.
>
>    since I think that is going to be far more
>    commonly used, hence why something like the "baseline configuration" concept
>    I'm proposing might be a good idea.
>
> If you want the performance benefits provided by baselines, you will
> *only* use baselines and activities (i.e. not labels or general
> configurations).
>
>    3) I don't see anything in the spec preventing any WebDAV resource (baselines,
>    configurations, etc.) from being renamed by a user. If you did, you would have
>    to reserve the entire namespace above the location of the baseline or whatever,
>    and make it appear as a read-only filesystem.
>
> We cannot prevent people with access to the server from administratively
> changing the URLs, but we can ensure that the protocol provides no means
> of doing so. In particular, we can require that a MOVE on one of these
> special URLs (e.g. revision and baseline URLs) fail.
>
>    From: jamsden@us.ibm.com
>
>    ...
>    I've never
>    quite understood the use case for baselines either. ... As far as
>    performance and optimization is concerned, a server is free to examine the
>    contents of a configuration when it encounters it in a workspace revision
>    selection rule, and based on its contents, perform any optimizations it
>    wants.
>
> The optimization isn't at reference time, but at creation time.
> When you create a baseline, you can snapshot the state of a revision
> selection rule. This is not feasible for a general configuration which
> can be created and modified out of the context of a workspace.
>
> Cheers,
> Geoff
>
> --
> Geoffrey M. Clemm
> Chief Engineer, Configuration Management Business Unit
> Rational Software Corporation
> (781) 676-2684 geoffrey.clemm@rational.com http://www.rational.com
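
The baseline/label distinction discussed in this thread can be sketched in a
few lines of Python. This is only an illustration under assumptions: the class
names, the "baseline-N" naming scheme, and the string form of the rule entries
are hypothetical and are not taken from the versioning draft. It shows a
baseline as an immutable copy of a workspace's revision-selection-rule with a
predecessor link, and a label as a movable marker whose earlier position is
not recorded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Baseline:
    """Immutable snapshot of a workspace's revision-selection-rule."""
    name: str                               # server-assigned (hypothetical scheme)
    rule: Tuple[str, ...]                   # copy of the rule at creation time
    predecessor: Optional["Baseline"] = None


@dataclass
class Workspace:
    rule: List[str]                         # mutable revision-selection-rule
    history: List[Baseline] = field(default_factory=list)

    def create_baseline(self) -> Baseline:
        # Copy only the rule; no versioned resource is scanned.
        prev = self.history[-1] if self.history else None
        baseline = Baseline(f"baseline-{len(self.history) + 1}",
                            tuple(self.rule), prev)
        self.history.append(baseline)
        return baseline


class Labels:
    """Label -> revision map per resource; moving a label drops its old position."""

    def __init__(self) -> None:
        self._by_resource: Dict[str, Dict[str, str]] = {}

    def move(self, resource: str, label: str, revision: str) -> None:
        self._by_resource.setdefault(resource, {})[label] = revision

    def labels_on(self, resource: str, revision: str) -> List[str]:
        # A revision can be asked "what labels are on you".
        labels = self._by_resource.get(resource, {})
        return sorted(name for name, rev in labels.items() if rev == revision)


# Usage: two baselines chained by predecessor, and a label that has moved.
ws = Workspace(["activity:release-1.0.1.1"])
b1 = ws.create_baseline()
ws.rule.insert(0, "activity:late-fix")      # slip in a change via an activity
b2 = ws.create_baseline()

labels = Labels()
labels.move("/src/a.c", "rel", "r1")
labels.move("/src/a.c", "rel", "r2")        # r1 no longer carries the label
```

Changing the workspace rule after b1 was created leaves b1 untouched, and b2
records b1 as its predecessor, whereas the label store keeps no trace of "rel"
ever having been on r1.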