- From: Babich, Alan <ABabich@filenet.com>
- Date: Fri, 24 Jul 1998 16:44:47 -0700
- To: "'ejw@ics.uci.edu'" <ejw@ics.uci.edu>, John Stracke <francis@netscape.com>, Chris Kaler <ckaler@microsoft.com>, Bradley Sergeant <bradley_sergeant@intersolv.com>, Alan Babich <ABabich@felix.filenet.com>, Sam Ruby <rubys@us.ibm.com>, Bruce Cragun <Cragun.Bruce@gw.novell.com>, David Durand <dgd@cs.bu.edu>, Sridhar Iyengar <sridhar.iyengar@mv.unisys.com>
- Cc: Alex Hopmann <alexhop@microsoft.com>, "'webdav'" <w3c-dist-auth@w3.org>
> a) Prepare at least one (and ideally many more than one) scenario. Please
> email it out to the rest of the design team before the meeting (by the
> 6th) -- you should send it to the general WebDAV mailing list as well.

OK, Jim, as per your request, here's a scenario from the real world.

WARNING: This is a long e-mail (about 345 lines).

A software company is porting several hundred thousand lines of C code to several different platforms. The company intends to have exactly one source base that handles all platforms. Therefore, there will be #ifdef's in some of the source files that control selection of text for platform-dependent stuff like I/O.

There are two field releases being supported as linear lines of development, and the current development release is being supported as a linear line of development. The field release lines of development branch off the development release line of development when a change is made to the development release but not a field release.

First, let's consider how we start out. Then we will consider what happens during the port, i.e., parallel editing.

Consider one source file x.c . It started out:

    1.1 --> 1.2 --> 1.3

Those are the good old RCS version labels. (First number is number of times the same node was branched. Second number is consecutive linear line of development change number.) We attach user version labels, because we don't care about no stinking RCS version labels. :-) They are irrelevant to us humans. So, we invent a convention where our user version labels are of the form

    r<major release number>_<minor release number>_<build number>_<change number for the major/minor release>

User version labels are a necessity in order to be able to recreate or initially create any base level of any release. In order to do so, you have to select a whole collection of files, and the exact versions you need must be specified in a simple way, e.g., release number and build level.
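A minimal sketch of the label convention just described, in Python. The names Label, parse_label, and pick_for_build are hypothetical and not part of any tool mentioned here. The sketch assumes that a file's labels are available as plain strings, and that when several labels match the requested release and build, the one with the highest change number is wanted.

```python
import re
from typing import Iterable, NamedTuple, Optional


class Label(NamedTuple):
    """One user version label: r<major>_<minor>_<build>_<change>."""
    major: int
    minor: int
    build: int
    change: int


# Hypothetical pattern for the "r" labels only; other prefixes are ignored.
_LABEL_RE = re.compile(r"^r(\d+)_(\d+)_(\d+)_(\d+)$")


def parse_label(text: str) -> Optional[Label]:
    """Parse a label string, or return None if it is not an r-label."""
    m = _LABEL_RE.match(text)
    if m is None:
        return None
    return Label(*(int(g) for g in m.groups()))


def pick_for_build(labels: Iterable[str], major: int, minor: int,
                   build: int) -> Optional[str]:
    """Among labels for the given release and build, return the one
    with the largest change number, or None if nothing matches."""
    best: Optional[Label] = None
    for text in labels:
        label = parse_label(text)
        if label is None:
            continue
        if (label.major, label.minor, label.build) != (major, minor, build):
            continue
        if best is None or label.change > best.change:
            best = label
    if best is None:
        return None
    return f"r{best.major}_{best.minor}_{best.build}_{best.change}"
```

Running pick_for_build over every file in the source base, with the same release and build arguments, would give the set of versions to check out read only. A file with no label for that exact build returns None here; how a real tool handles that case is not something this sketch tries to settle.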
(For example, in order to do that for build 7 of release 1.0, you merely check out a read only copy of the version of every file that has the version label of the form r1_0_7_x where x is maximal. There is no possibility of such a simple algorithm against the hardwired RCS version labels.)

The layer on top of sccs puts the user version labels on automatically. So, the version structure for x.c is actually

    1.1      --> 1.2      --> 1.3
    r1_0_0_0     r1_0_1_1     r1_0_2_2

1.1 was the initial version of x.c for build 0 of release 1.0.
1.2 was change 1 for build 1 of release 1.0.
1.3 was change 2 for build 2 of release 1.0.

OK. So now we release 1.0 to the field and start work on release 1.1. Nothing happens until we change x.c . There are two possibilities. Either we are making a pure new development change, or we are fixing a bug in the field and we want that exact same fix in the development release.

First, we fix a bug for build 3 of the field release, and the fix (and all of x.c) is exactly the same for build 0 of the development release (release 1.1). The change is automatically "rolled forward" by the tools by simply putting multiple version labels on the new version of the file. This can optionally be done only when checking out and in the tip of a line of development.

    1.1      --> 1.2      --> 1.3      --> 1.4
    r1_0_0_0     r1_0_1_1     r1_0_2_2     r1_0_3_3
                                           r1_1_0_0

Next, we put in a piece of pure new development for build 8 of the development release (i.e., 1.1).

    1.1      --> 1.2      --> 1.3      --> 1.4      --> 1.5
    r1_0_0_0     r1_0_1_1     r1_0_2_2     r1_0_3_3     r1_1_8_1
                                           r1_1_0_0

Next, we fix a bug in the field release (i.e., 1.0) build 7. The change can not "roll forward" to the development release, because we have made a development only change. In other words, we are now checking out (and back in) a node that is not at the tip of the line of development. This causes the version tree to branch.

    1.1      --> 1.2      --> 1.3      --> 1.4      --> 1.5
    r1_0_0_0     r1_0_1_1     r1_0_2_2     r1_0_3_3     r1_1_8_1
                                           r1_1_0_0
                                              |
                                              v
                                           1.4.1.1
                                           r1_0_7_4

Then we fix another bug for build 9 of the field release (i.e., 1.0).

    1.1      --> 1.2      --> 1.3      --> 1.4      --> 1.5
    r1_0_0_0     r1_0_1_1     r1_0_2_2     r1_0_3_3     r1_1_8_1
                                           r1_1_0_0
                                              |
                                              v
                                           1.4.1.1  --> 1.4.1.2
                                           r1_0_7_4     r1_0_9_5

It should be clear how the lines of development progress from here. Note that the development release is always the main trunk (i.e., has RCS version numbers of the form "1.x".) If we ever check out and in node 1.4 again, the RCS number would be 2.4.1.1, and there would be two direct offspring from node 1.4 (1.4.1.1, and 2.4.1.1). Branching the same node more than once is unusual, and I'm not going to bother to illustrate that.

---

OK. So much for preliminaries. The above is slightly simplified from what we actually did, but that's OK. Now for the port to multiple platforms (parallel editing). To simplify things, let's only show the end of the line of development for the current development release (i.e., 1.1) for file x.c .

    1.5
    r1_1_8_1

Now Christine comes along and starts to port x.c to Solaris. She checks out a copy of r1_1_8_1 of x.c in her private working directory. (She also makes copies of lots of other source files, of course.) She does not lock any files. She makes a copy and doesn't leave x.c locked, because it's going to take her quite a while (weeks or months) to finish porting what she is porting. Joe, who is adding new features to the product, may need to continue the main line of development in the interim. He can not be stopped dead in his tracks by Christine checking out x.c and leaving it locked for weeks or months.

Now Joe makes a change to x.c on the main line of development on the original platform (AIX) for build 10 of the development release. Joe is not coordinating with Christine, and Christine is not coordinating with Joe.

    1.5      --> 1.6
    r1_1_8_1     r1_1_10_2

This doesn't affect Christine, who has her own copies of all the files.
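
The situation at this point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class LineOfDevelopment and its methods are hypothetical names, a line of development is modeled as a simple list of user version labels, and the lock is a single owner name. It shows a private copy taken without a lock, followed by a short locked checkout and checkin that moves the tip.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LineOfDevelopment:
    """Hypothetical model: labels oldest first, the last one is the tip."""
    labels: List[str]
    locked_by: Optional[str] = None

    @property
    def tip(self) -> str:
        return self.labels[-1]

    def copy_without_lock(self) -> str:
        """Record which version a private copy is based on; no lock taken."""
        return self.tip

    def checkout_with_lock(self, who: str) -> str:
        """Exclusive checkout of the tip."""
        if self.locked_by is not None:
            raise RuntimeError(f"already locked by {self.locked_by}")
        self.locked_by = who
        return self.tip

    def checkin(self, who: str, new_label: str) -> None:
        """Extend the line of development and release the lock."""
        if self.locked_by != who:
            raise RuntimeError("checkin requires holding the lock")
        self.labels.append(new_label)
        self.locked_by = None

    def is_behind(self, base_label: str) -> bool:
        """True if the tip has moved since the private copy was taken."""
        return base_label != self.tip


main = LineOfDevelopment(["r1_1_8_1"])

# Christine takes a private copy and leaves nothing locked.
christine_base = main.copy_without_lock()

# Joe holds the lock only for the duration of his own change.
main.checkout_with_lock("joe")
main.checkin("joe", "r1_1_10_2")
```

After these steps main.is_behind(christine_base) is true, which is the signal that Christine will have to compare against the new tip before she checks in.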

Now Sam comes along and starts to port x.c to HPUX. So Sam checks out a copy of r1_1_10_2 of x.c (and a bunch of other source files) into his private directory and goes to town on the port. Just as Christine didn't leave any files locked, Sam doesn't leave any files locked either. Joe, Christine, and Sam are all working in parallel and not coordinating with each other.

Joe makes another change to x.c for build 12 of the development release.

    1.5      --> 1.6       --> 1.7
    r1_1_8_1     r1_1_10_2     r1_1_12_3

Now, Christine finishes her port. So, she checks out x.c (r1_1_12_3) and leaves it locked. She compares the r1_1_12_3 version against her private copy of x.c (based on r1_1_8_1). If Joe did anything to x.c that interferes with what she did to it, Christine resolves the discrepancies by editing her private copy. Once she has decided that all discrepancies are resolved, she checks in x.c to build 15 of the development release using her final copy of x.c . Version r1_1_12_3 is only locked for the duration of her merge.

    1.5      --> 1.6       --> 1.7       --> 1.8
    r1_1_8_1     r1_1_10_2     r1_1_12_3     r1_1_15_4

Now x.c can theoretically run on AIX and Solaris, and Joe and Sam are working in parallel and not coordinating with each other.

Now Sam finishes his HPUX port. Sam checks out a copy of x.c (r1_1_15_4) and leaves it locked. He looks to see that what he has done against his copy of r1_1_10_2 is still valid against r1_1_15_4. Sam resolves any discrepancies in his private copy. Then Sam checks in his private copy against build 17 of the development release.

    1.5      --> 1.6       --> 1.7       --> 1.8       --> 1.9
    r1_1_8_1     r1_1_10_2     r1_1_12_3     r1_1_15_4     r1_1_17_5

Now, x.c can theoretically run on AIX, Solaris, and HPUX, and the binaries for all the platforms can be compiled from the same source base. The ports are done.

OK. Now several things should be clear:

(0) Simple linear lines of development are critical. You can never lose track of the lines of development.

(1) User labels are necessary in order to retrieve a coordinated set of files to reproduce an arbitrary build. The RCS labels are totally inadequate for this purpose, since there is no dependable pattern across a large set of files.

(2) Multiple user labels must be assignable to the same version of a file in order to support multiple releases (e.g., multiple field releases and new development).

(3) Parallel editing requires a merge. There is no general algorithm that can perform this merge for you. Human insight is required. Tools such as diff can help, but, in the end, there are lots of situations in which a human has to check the results regardless of the tools used.

(4) It is not reasonable to expect N versions to be merged all at the same time. That makes the problem exponentially more complicated, and humans don't do well at things that get exponentially more complicated with N. So, merges should be done pairwise.

(5) One can not keep the main line of development locked for a very long period of time.

(6) Yet exclusive locking is necessary for ordinary development, and to protect the decisions made during a merge.

(7) Exclusive locking is necessary and sufficient to do parallel editing.

(8) Using the approach of this example, part of the history of the derivation was lost. (From the final version graph, you can't tell that Christine worked against r1_1_8_1 or that Sam worked against r1_1_10_2. You would need checkin comments to tell you that.) This may not be desirable. (This issue is addressed in the next section.)

---

In the above example, it may be desirable to be more explicit about the history of how a version was derived. For example, when Sam checked in his version, the new version he created was dependent on both the version he originally checked out and the one that was current when he finished the port. Furthermore, it may be desirable for Sam to check in intermediate versions of his files periodically.
These are regarded as "work in progress" versions, because they aren't considered finished yet. Yet, it may be a good idea to check in such work in progress versions periodically, if only to get them into the safekeeping of the source code control system. Backups are one possible consideration. Having copies on multiple disks is, in general, safer than just having a copy of your work on one disk, even if no backups are done.

In order to accomplish these goals, we only need one additional thing -- the ability to indicate that a line of development merges into another one.

One way to do this is as follows. When Christine started her port, she could have forced an identical version of x.c to be created as the next version in the main line of development. Then, x.c will be locked for an extremely brief time. Then, Christine can check out the next to last version. When she checks it in, the version graph will branch. She can keep checking in and out versions on her very own branch until the port works. Then, she can do a checkin that (a) terminates her "port to Solaris" branch, and (b) extends the main line of development branch. A version label convention will have to be adopted for her "port to Solaris" branch. Sam can do the same thing for his "port to HPUX" branch.

Let's just look at what the final result might be for Christine's port starting against r1_1_8_1:

    1.5      --> 1.6       --> 1.7       --> 1.8       --> 1.9
    r1_1_8_1     r1_1_10_2     r1_1_12_3     r1_1_15_4     r1_1_17_5
      |          (same as                                      ^
      |          r1_1_8_1)                                    /
      v                                                      /
    1.5.1.1  --> 1.5.1.2 ------------------------------------
    s1_1_0_0     s1_1_1_1

Here Christine forced the creation of r1_1_10_2 to be exactly the same as r1_1_8_1 by checkout with lock and checkin with no changes. Then she checked out r1_1_8_1 with lock for the Solaris release s1_1. Then she checked it in, creating s1_1_0_0, a work in progress version. Note that since she didn't check out the tip, she forced a branch.
Then she checked s1_1_0_0 out with lock and in again to create s1_1_1_1, her final version before the merge. Meanwhile, Joe created r1_1_12_3 from r1_1_10_2, and created r1_1_15_4 from r1_1_12_3 on the main line of development to implement new features. Finally, Christine checked out and locked r1_1_15_4, made the necessary adjustments to her private copy of s1_1_1_1 based on her private copy of r1_1_15_4, and finally checked this private copy in as r1_1_17_5.

The new thing that happened on this checkin is that an arc from s1_1_1_1 to r1_1_17_5 was created. The normal arc from r1_1_15_4 plus the new arc indicate that r1_1_17_5 was derived from both of those versions. Thus, we have a complete history of Christine's port and Joe's new features, and all the derivation relationships.

It's clear how to add Sam to this scenario. Since all checkouts with lock are exclusive, checkin with merge is done pairwise -- there are no more than 2 incoming arcs.

Alan Babich
Received on Friday, 24 July 1998 19:47:55 UTC