Re: Baselines vs. labels

From: Eric Sedlar (esedlar@us.oracle.com)
Date: Sat, Jan 08 2000
Next message: Chris Kaler: "proposed versioning usage scenario for client-managed checkouts"
Previous message: Geoffrey M. Clemm: "http://dav.ics.uci.edu/vdt/draft-ietf-deltav-versioning-01.2"
In reply to: Geoffrey M. Clemm: "Re: Baselines vs. labels"
Next in thread: jamsden@us.ibm.com: "Re: Baselines vs. labels"
Message-ID: <01b501bf5a10$9d300610$9a114498@us.oracle.com>
From: "Eric Sedlar" <esedlar@us.oracle.com>
To: "Geoffrey M. Clemm" <geoffrey.clemm@rational.com>, <ietf-dav-versioning@w3.org>
Date: Sat, 8 Jan 2000 11:42:46 -0800
Subject: Re: Baselines vs. labels

Thanks for pointing out the caching benefit of baselines, and the way a
shared activity can be used to modify them.  My biggest beef is not with
baselines, but with the number of ways of selecting a set of revisions:

* baseline
* configuration
* shared activity (with RSRs specifying the revisions)
* label

If you are willing to get rid of configurations, let's get rid of them.

--Eric

----- Original Message -----
From: "Geoffrey M. Clemm" <geoffrey.clemm@rational.com>
To: <ietf-dav-versioning@w3.org>
Sent: Tuesday, January 04, 2000 8:06 AM
Subject: Re: Baselines vs. labels


>
>    From: "Eric Sedlar" <esedlar@us.oracle.com>
>
>    Can someone give a bit of rationale for when you use baselines and when
>    you use a label applied recursively to all the elements of a
>    collection?  Is a baseline just a specialized case of a label?
>
> One key difference between a baseline and a label is that you can
> ask a revision "what labels are on you" but you cannot ask a revision
> "what baselines include you".  This can have significant performance
> impact on certain implementation choices.
>
> Another key difference is a baseline is created by taking a "snapshot"
> of the state of a workspace, which can provide a very cheap implementation
> mechanism (i.e. based on the current value of the revision-selection-rule).
> A label is moved arbitrarily, so there is no revision-selection-rule
> optimization available.
>
> Another key difference is that a baseline captures the "history" of a
> set of revisions, i.e. you can find predecessor baselines and successor
> baselines.  In contrast, a label just captures the current state of
> a collection of revisions.  Where the label was in the past is lost.
>
> So a label and a baseline have very different properties (neither is
> a specialization of the other).  Which one you use depends on which
> of these properties are more important.
>
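As a rough illustration of the three differences described above, here is a minimal Python sketch. The names (`Revision`, `Baseline`, `move_label`, `history`) are hypothetical and do not come from the draft. It models only the properties discussed: a revision can report its labels, a moved label leaves no trace of where it was, and a baseline is an immutable copy of a revision-selection-rule with a predecessor link.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Revision:
    """One revision of a versioned resource (hypothetical model)."""
    resource: str
    number: int
    labels: Set[str] = field(default_factory=set)  # answers "what labels are on you"


def move_label(label: str, old: Optional[Revision], new: Revision) -> None:
    """Move a label to another revision; the previous position is not recorded."""
    if old is not None:
        old.labels.discard(label)
    new.labels.add(label)


@dataclass(frozen=True)
class Baseline:
    """Immutable snapshot of a workspace's revision-selection-rule."""
    rule: Tuple[str, ...]                    # a copy of the rule, not a scan of revisions
    predecessor: Optional["Baseline"] = None


def history(baseline: Baseline) -> List[Baseline]:
    """Follow predecessor links from the newest baseline to the oldest."""
    chain: List[Baseline] = []
    current: Optional[Baseline] = baseline
    while current is not None:
        chain.append(current)
        current = current.predecessor
    return chain
```

Note that `Revision` has no field for baselines, which mirrors the point that a revision cannot be asked "what baselines include you".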
>    When do you recommend using recursive labels vs. a baseline?  For
>    typical configuration management needs, if I want to create release
>    1.0.1.1 of my product, it seems like I might want the ability to slip
>    in a new revision of a file at the last minute, which would mean
>    moving a label from version to version.
>
> The fact that you have a 1.0.1.1 means you probably have a history
> captured by naming conventions on your labels.  With baselines, you
> have your history explicitly modeled via predecessor and successor
> properties (and therefore unlike naming conventions, interoperable
> between different clients and servers).  In a baseline system, you
> use "activities" to capture dynamic change (i.e. slipping in a new
> version), and only create new baselines when you have a set of revisions
> that you want to capture in the history.  You don't modify an
> existing baseline for the same reasons you don't modify an existing
> revision.
>
>    I couldn't do that with a
>    baseline, since I would need a new baseline, which would have a
>    different URI, so this wouldn't be transparent to people who were
>    already using release 1.0.1.1 of my product.  It might be useful to
>    include some scenarios in the spec as to when to use either.
>
> You would have a "release-1.0.1.1" activity to share between folks
> that want to see the release-1.0.1.1 work in progress.  You would
> create a baseline only when there is a state of release-1.0.1.1 that
> you want to capture for history.
>
>    Since version management systems like CVS use tags (i.e. labels) in
>    this way, I think some clarifications in the spec in this area would
>    be helpful.
>
> CVS uses labels to model both activities and baselines.  A label that
> is being moved forward along some line of descent is modeled as an
> activity.  A label that never moves is modeled as a baseline.
>
>    From: "Eric Sedlar" <esedlar@us.oracle.com>
>
>    It seems excessively complex to have three different ways to identify
>    a set of revisions (for use in revision selection rules).  I don't see
>    much utility for baselines if you can never change the revision of a
>    particular file in a baseline.
>
> Baselines and activities are designed to be used together.
> Baselines capture the history of a set of related resources.
> Activities capture the changes that lead from one state of the
> history to the next.  If you want to see a particular state of
> the history, you place a baseline in your revision-selection-rule.
> If you want to see changes from that baseline, you add the appropriate
> activity (activities) to your revision-selection-rule.
>
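The way a baseline and an activity combine in a revision-selection-rule can be sketched as follows. This is only an illustration under stated assumptions: the rule is modeled as an ordered list in which the first selector that mentions a resource wins, and the resource names, revision identifiers, and the `select_revision` helper are all hypothetical rather than taken from the draft.

```python
from typing import Dict, List, Optional

# Hypothetical model: each selector maps a resource name to a revision id.
Selector = Dict[str, str]


def select_revision(resource: str, rule: List[Selector]) -> Optional[str]:
    """Return the revision chosen by the first selector that mentions the resource."""
    for selector in rule:
        if resource in selector:
            return selector[resource]
    return None


# A captured state of the history (assumed example data).
baseline_1_0: Selector = {"main.c": "main.c@3", "util.c": "util.c@7"}

# A change made on top of that baseline (assumed example data).
fix_activity: Selector = {"util.c": "util.c@8"}

# Activity first, then baseline: the change overrides the captured state.
rule: List[Selector] = [fix_activity, baseline_1_0]
```

With `rule` as above, `util.c` resolves through the activity and `main.c` falls through to the baseline; with `[baseline_1_0]` alone, both resolve to the captured state.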
>    It seems to me that the performance benefits of baselines are based on
>    the fact that you have a contiguous subtree of revisions, where there
>    is no need to check the revision selection rules when traversing a
>    link (this often involves searching through the list of revisions in a
>    versioned resource to find the latest one or a particular label,
>    etc.).  Each collection revision in a baseline can point directly to
>    the associated revision the next layer down.
>
> The key optimization is based on the fact that it is a snapshot of the
> state of a workspace.  This means that you can just capture the state
> of the revision-selection-rule property, and not scan the resources at
> all.
>
>    What if you introduced a new concept like "baseline configuration".  A
>    baseline configuration would be rooted at a particular versioned
>    collection recursively, just like a baseline.  However, you would be
>    allowed to change the revisions in a baseline configuration after
>    creating the configuration.  Then you can get rid of baselines &
>    configurations, and simplify the spec.
>
> I'd be happy to get rid of "configurations", and just keep baselines
> and labels.  But I would not be willing to get rid of baselines, since
> then the only way to capture the state of a set of versioned resources
> would be to enumerate the currently selected revisions, which would
> not scale.
>
>    Is the reason you consider labels less "reliable" than configurations
>    due to the assumption that you are protecting them with access control
>    on a bunch of different resources rather than access control on a
>    single resource?
>
> A label is just an XML element within a resource property.  It is
> very unlikely that we will define access control down to that level
> (i.e. this element in this property is read-only by this individual).
> This means that access control on labels is unlikely to ever be provided.
>
>    Also, can an administrator rename baselines?  (E.g. I create a
>    baseline from /amazon/catalogs/music at a particular point in time,
>    and store it in "/baselines/amazon/catalogs/music/dec6_99.base".  Then
>    I modify the revision selection rules in the workspace I have
>    selected, and create a new baseline from /amazon/catalogs/music, which
>    includes a different set of revisions, and call it
>    "/baselines/amazon/catalogs/music/temp.base".  Can I delete the first
>    baseline and rename the second one to have the same name as the first
>    one, thus changing the selected revisions for anyone who has
>    referenced "/baselines/amazon/catalogs/music/dec6_99.base" in their
>    revision selection rules?)
>
> A baseline is given an immutable name by the server, not a mutable
> name by the client (JimA has tried to argue otherwise, but we are
> vigorously resisting :-).  The main reason is that it is the immutability
> of a baseline that leads to a variety of optimizations (i.e. I can cache
> the baseline locally, and not have to keep going back to the server
> to see if it has "changed").
>
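The caching argument above can be sketched in a few lines of Python. This is a hedged illustration, not a client design from the draft: `BaselineCache` and the `fetch` callable are hypothetical, and the only assumption is the one stated in the message, that a baseline URL always names the same immutable content.

```python
from typing import Callable, Dict


class BaselineCache:
    """Client-side cache that never revalidates, relying on baseline immutability."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                  # hypothetical function that asks the server
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        """Return the baseline body, contacting the server only on first use."""
        if url not in self._store:
            self._store[url] = self._fetch(url)
        return self._store[url]
```

If baselines could be renamed or modified, every `get` would need a round trip to check freshness, which is the cost the immutable name avoids.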
>    From: "Eric Sedlar" <esedlar@us.oracle.com>
>
>    1) To justify having a "baseline" concept in the spec, I think we need
>       to have
>        * a real customer scenario where the absolute guarantee a baseline
>          will never be changed is necessary, and
>
> If I ship a release to a customer, I want to know what was in that
> release, with no ifs, ands, or buts.
>
>        * show that this is a significant enough case to warrant the
>          complexity
>
> If I can't reliably reproduce a shipped release, my attempts to reproduce
> and fix a customer problem are severely hampered.
>
>    2) Even if you can come up with 1), I would argue that the performance
>    benefits of a baseline should be available to whatever mechanism is used
>    to represent a release (currently a configuration)
>
> I would represent a release with a baseline.
> Performance benefits come from restrictions, not generalizations.
> In particular, the performance benefits of baselines over labels
> and configurations derive from the "snapshot the state of my workspace"
> characteristics of baselines.
>
>    since I think that is going to be far more
>    commonly used, hence why something like the "baseline configuration"
>    concept I'm proposing might be a good idea.
>
> If you want the performance benefits provided by baselines, you will
> *only* use baselines and activities (i.e. not labels or general
> configurations).
>
>    3) I don't see anything in the spec preventing any WebDAV resource
>    (baselines, configurations, etc.) from being renamed by a user.  If you
>    did, you would have to reserve the entire namespace above the location
>    of the baseline or whatever, and make it appear as a read-only filesystem.
>
> We cannot prevent people with access to the server from administratively
> changing the URLs, but we can ensure that the protocol provides no means
> of doing so.  In particular, we can require that a MOVE on one of these
> special URLs (e.g. revision and baseline URLs) fail.
>
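A server-side check of the kind described above might look like the following sketch. Everything specific here is an assumption made for illustration: the protected URL prefixes, the `handle_move` function, the in-memory namespace, and the choice of 403 as the failure status are not specified in the message or the draft.

```python
from typing import Dict

# Hypothetical layout: the server keeps revisions and baselines under these prefixes.
PROTECTED_PREFIXES = ("/baselines/", "/revisions/")


def handle_move(namespace: Dict[str, bytes], source: str, destination: str) -> int:
    """Apply a MOVE to an in-memory namespace and return an HTTP-style status code."""
    # Refuse any MOVE from or onto a protected URL (status code is an assumption).
    if source.startswith(PROTECTED_PREFIXES) or destination.startswith(PROTECTED_PREFIXES):
        return 403
    if source not in namespace:
        return 404
    existed = destination in namespace
    namespace[destination] = namespace.pop(source)
    # 204 when an existing destination was replaced, 201 when a new one was created.
    return 204 if existed else 201
```

Rejecting moves onto a protected URL as well as from one covers the delete-and-rename scenario raised earlier in the thread, where a second baseline would take over the first baseline's name.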
>    From: jamsden@us.ibm.com
>
>    ... I've never
>    quite understood the use case for baselines either. ... As far as
>    performance and optimization is concerned, a server is free to examine
>    the contents of a configuration when it encounters it in a workspace
>    revision selection rule, and based on its contents, perform any
>    optimizations it wants.
>
> The optimization isn't at reference time, but at creation time.
> When you create a baseline, you can snapshot the state of a revision
> selection rule.  This is not feasible for a general configuration which
> can be created and modified out of the context of a workspace.
>
> Cheers,
> Geoff
>
> --
> Geoffrey M. Clemm
> Chief Engineer, Configuration Management Business Unit
> Rational Software Corporation
> (781) 676-2684   geoffrey.clemm@rational.com   http://www.rational.com
>
>