Re: [long] Re: I-D ACTION:draft-ietf-webdav-versioning-01.txt from Geoffrey M. Clemm on 1999-02-08 (w3c-dist-auth@w3.org from January to March 1999)

From: Geoffrey M. Clemm <gclemm@tantalum.atria.com>
Date: Mon, 8 Feb 1999 00:35:30 -0500
To: max@glyphica.com
Cc: w3c-dist-auth@w3.org
Message-Id: <9902080535.AA27540@tantalum>
   From: Max Rible <max@glyphica.com>

Note:
Indented lines with a ">" are from Chris Kaler.
Indented lines without a ">" are from Max.

   ... the "configurations" from the versioning
   draft sound a lot like the "projects" we've been debating over here.
   After much hammer-and-tongs haranguing about the natures of projects, 
   we settled on the notion that a project would have a lot of links but 
   could own entities in its own right, in order to simplify matters
   of hierarchy.

There are two very different notions of project currently in common use.

The first (and the one I prefer) is "a team of people working together
to produce a particular product" (e.g. a project as defined by the tool
"Microsoft Project").

The second is "a set of resources" (e.g. a project as defined by the
tool "Microsoft Visual Studio", although it appears likely that
Visual Studio will stop using the term in this way).

The major reason for this confusion probably stems from the common
simple case where a set of resources is worked on by only one team of
people.  In this case it is natural to associate the set of resources
with the team that is responsible for it.  Unfortunately, when life
gets more complicated and several teams work on the same set of
resources, significant confusion results from this ambiguous use
of the term project.

So I'd strongly recommend reserving the word "project" for the team of
people building a particular release of a product, and using some other
term (such as "component" or "package" or "subsystem") to refer to "a
set of resources".

In particular, I'll be using this definition of project in the rest of
this message.  I'll also be using the term "workspace" to refer to the
resource that specifies how version selection is done for client
requests, and the term "configuration" to refer to the resource that
is an immutable set of revisions (sometimes called a "baseline" or a
"checkpoint").

   I'm trying to figure out if a project is likely to 
   *be* a configuration or *use* a configuration.

A project (as defined above) would "use" a configuration (as defined above)
to capture a particular state of the project.  The "product" produced
by the project would be the final configuration produced by the project.
A project would also use a set of workspaces, some for developers working
on that project, others for system integrators, testers, and release engineers.

   >[CK] Configurations are very similar, but also very different.  A
   >     configuration collection can be referenced in the Configuration-Id
   >     header.  That is not true of all MKREF collections.  As well, changes
   >     to the resources in the context of a configuration are automatically
   >     represented inside the configuration collection.  That is, if I
   >     rename foo.htm to bar.htm using MOVE in the /c/1 configuration, then
   >     inside /c/1 there will be a reference to bar.htm.

Note: Chris uses the term "configuration" to cover both what I would call
a "workspace" and what I would call a "configuration".  I believe that it
is essential that we clearly separate these two concepts (although I'd
be happy to not use the term "configuration" for the latter, since "baseline"
and "checkpoint" are probably more commonly used terms for this concept).

   >Should configurations be able to contain other configurations, or
   >simply references to them?  I can easily see that a configuration's
   >user might wish to partition it when it gets large and cluttered.

   >[CK] This is a really interesting question.  Conceptually, why not?
   >     However, that is really hard to represent in the resources.

   More than a collection is?  I may be missing something here...

I think that it is hard to represent when it is a broadly defined concept.
When it is more narrowly defined (i.e. an immutable set of revisions),
then it is quite simply represented.

   >                                                                  As
   >     well, some of the semantics start to get really messy.  What does
   >     it mean to have nested configurations?  What does it mean for a
   >     resource to be "in" nested configurations.  We opted to say that a
   >     configuration can be derived from another, but there isn't a notion 
   >     of containment.

There are two senses of the term "derived" ... one which I am for,
and one which I am against.

The one I am "for" is a relationship between configurations, namely,
that you can select a configuration in a workspace, make some changes
in that workspace, and then create a new configuration.  I would then
say that the new configuration is a "descendent" of the original one
(in the same way that one revision is a descendent of a prior
revision).
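
A minimal sketch of this "descendent" relationship, assuming a
configuration can be modelled as a frozen record with a link to the
configuration it was created from.  The class, the function, and the
revision ids below are illustrative assumptions, not part of the draft.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Configuration:
    """An immutable set of revisions, plus the configuration it came from."""
    name: str
    revisions: Tuple[Tuple[str, str], ...]  # (resource URL, revision id) pairs
    parent: Optional["Configuration"] = None

    def is_descendant_of(self, other):
        # Walk the parent chain, as one would for revisions of a resource.
        node = self.parent
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


def new_configuration(name, base, changes):
    """Create a descendant of `base` with `changes` applied; `base` is untouched."""
    merged = dict(base.revisions)
    merged.update(changes)
    return Configuration(name, tuple(sorted(merged.items())), parent=base)


# Hypothetical data: V2 is created after changes were made on top of V1.
v1 = Configuration("V1", (("/src/foo.htm", "1.3"),))
v2 = new_configuration("V2", v1, {"/src/foo.htm": "1.4", "/src/new.htm": "1.0"})
```

Here V2 records V1 as its parent, and V1 itself is never modified.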

The one I am "against" is a relationship between workspaces, namely,
that you somehow automatically see any change from one workspace in
another.  I believe you can achieve the same effect in a more
controlled fashion by having two workspaces specify common
configurations and activities in their revision-selection-rules.
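
To illustrate the controlled alternative, here is a rough sketch in
which a revision-selection-rule is an ordered list of sources and the
first source that covers a URL wins.  That ordering policy, the
dictionary representation, and all names and revision ids are my own
assumptions for illustration.

```python
def select(rule, url):
    """Return the revision chosen by the first rule entry that covers `url`."""
    for source in rule:
        if url in source:
            return source[url]
    return None


# Hypothetical data: one shared configuration and one activity (a set of changes).
release_v1 = {"/src/foo.htm": "1.3", "/src/bar.htm": "1.1"}
fix_activity = {"/src/foo.htm": "1.4"}

# Both workspaces name the common configuration; only one names the activity.
dev_rule = [fix_activity, release_v1]
test_rule = [release_v1]

dev_sees = select(dev_rule, "/src/foo.htm")
test_sees_before = select(test_rule, "/src/foo.htm")  # unaffected by dev's work

# The tester opts in explicitly by adding the activity to their own rule.
test_rule.insert(0, fix_activity)
test_sees_after = select(test_rule, "/src/foo.htm")
```

Nothing leaks from one workspace into the other; the second workspace
sees the change only once its own rule names the activity.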

   >[CK] Another way to think of this is that the collections in the
   >     namespace represent the project and sub-project relationships
   >     and that configurations represent various "releases" of those
   >     projects.  In this way the "V2" configuration can be derived
   >     from the "V1" configuration.

   So you have:

   1.  A bunch of actual files in a namespace, representing the source tree.

Well, you have a bunch of resources accessible in a URL namespace ...
whether the server actually implements that as a bunch of actual files
in a namespace is up to the server (many will in fact do just that).

   2.  A number of configurations, each representing a separate release of the
       code, with links into multiple levels of the source tree.

Yes, this is the sense in which I would use the term "configuration",
although the links are to specific revisions, not into a "source tree".

   3.  Many more configurations, each depending on the release configurations,
       representing developer workspaces.

And these are what I would call "workspaces", not "configurations".
And a workspace "depends" on a (release) configuration by specifying
that configuration in its revision-selection-rule.
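
As a rough sketch of the distinction (the Python class and field names
are my own illustration, and the "first match wins" lookup is an
assumed policy, not something the draft specifies): a configuration is
a read-only mapping from resource to revision, while a workspace holds
a changeable revision-selection-rule that names configurations.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import List, Mapping


@dataclass(frozen=True)
class Configuration:
    """An immutable set of revisions ("baseline" / "checkpoint")."""
    name: str
    revisions: Mapping[str, str]  # resource URL -> revision id (hypothetical ids)

    def __post_init__(self):
        # Copy into a read-only view so the set of revisions cannot change.
        object.__setattr__(self, "revisions", MappingProxyType(dict(self.revisions)))


@dataclass
class Workspace:
    """Specifies how version selection is done for client requests."""
    name: str
    rule: List[Configuration] = field(default_factory=list)  # revision-selection-rule

    def select(self, url):
        # Assumed policy: the first configuration in the rule that covers the URL wins.
        for config in self.rule:
            if url in config.revisions:
                return config.revisions[url]
        return None


v1 = Configuration("V1", {"/src/foo.htm": "1.3", "/src/bar.htm": "1.1"})
ws = Workspace("dev-ws", [v1])
```

The workspace "depends" on V1 only through its rule, so pointing the
workspace at a different configuration never touches V1 itself.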

   The developer would then use the actual URIs from (1) for dealing with
   the files

Yes!

   and personal (3) workspace as a header to select the
   appropriate versions from (1),

Yes!

   and seldom (if ever) actually see the contents of (3)?

Yes, assuming by "the contents of (3)", you mean the flat list of references
to revisions.

   (Logging in to a workspace becomes interesting:
   how to choose from among a bunch of configurations?  

When the time comes to change the configuration selected by your
workspace, you normally would pick a descendent of that configuration
(where a "descendent" of a configuration is as described earlier).

   Thus, most WebDAV operations would not occur on configurations
   directly.  All the PUTs and GETs and PROPPATCHes and LOCKs and 
   UNLOCKs would happen using the URIs from (1) and the configuration
   ID from (3).

Yes!  The only time you'd refer to server-defined URLs would be when
you wanted to do things like label revisions you don't currently see
in your workspace (possibly, to make them visible in your workspace).
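
A small sketch of what such a request might look like from the client
side, built but not sent.  The Configuration-Id header name is the one
Chris mentions above; the host, the paths, and the form of the header
value are assumptions made up for this example.

```python
from urllib.request import Request


def workspace_request(method, url, workspace_id, body=None):
    """Build (but do not send) a request scoped to a workspace."""
    req = Request(url, data=body, method=method)
    # Assumed value format: the draft's exact syntax for this header may differ.
    req.add_header("Configuration-Id", workspace_id)
    return req


# The URL is the ordinary, human-understandable one from (1);
# the header carries the workspace from (3).
req = workspace_request("GET", "http://example.com/src/foo.htm", "/ws/dev")
```

The same URL with a different header value would select a different
revision, which is why the workspace contents are seldom seen directly.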

   The configuration would occasionally be added to
   through MKREF and removed from via DELETE, but all of this would
   happen behind the scenes.

Configurations are immutable, but workspaces would be changed by these
MKREFs and DELETEs.

   This makes a big difference in my visualization of configurations.  The
   impression I got from the document was that a user might want to create a
   configuration in their home directory on a server and use that as their
   local copy of the data in the source tree, that configurations would be
   something that you navigated into just like a normal directory tree.

   What I'm seeing now is something more like a lock token with a lot more
   state packed into it.  A user would "log in" to a configuration and
   begin sending the configuration-id header along with almost all
   requests to the server.  This would provide a context for their
   particular view of the source tree.

Yes!  (Replacing the term "configuration" with the term "workspace").

   Could a user ever wind up being "logged in" to multiple configurations?
   [re: BRANCH]

Not to multiple workspaces, but a given workspace could specify multiple
configurations and activities in its revision-selection-rule.


   >[CK] This is something I should have made clearer.  Both are valid.
   >     The VER:... notion is used to refer to a specific revision via
   >     a URI.  You can also specify the URL and the Revision-Id header.
   >     Chalk this up to a bad example.

   Hopefully, the VER:... URI functionality will be optional?  Should there
   be a way to discover this functionality through OPTIONS?

I would restrict the use of these server-generated URIs to CM
metadata manipulations (such as labeling and doing merges) on
revisions not visible in the workspace.  Standard user commands such
as CHECKOUT/CHECKIN/UNCHECKOUT should be restricted to the
human-understandable URIs.

   >>Regarding SETDEFAULT:  why is it specified as sending an XML body?

   >[CK] The idea was to allow the user to specify additional information
   >     as well as allow the DAV:none to cancel.  We could do this all 
   >     through headers, but XML seemed appropriate.

   What other information would go in with SETDEFAULT?  For that matter,
   is SETDEFAULT really doing anything you couldn't with PROPPATCH?
   If I'd read that SETDEFAULT functionality was to be implemented with
   a live property named DAV:defaultversion, I wouldn't have blinked...

Once you have the notion of a default workspace for a web site, you no
longer need the SETDEFAULT method ... you just make changes to the
revision-selection-rule of the default workspace, or modify the labels
or branches specified in that revision-selection-rule.

   Another question that comes to mind:  why are there properties using
   comma-separated lists (such as DAV:revisionlabel, DAV:mergedfrom, ...) 
   instead of an XML representation of such a list?  I would think
	   <D:revisionlabel>
	     <D:label>Beta test</D:label>
	     <D:label>Release</D:label>
	   </D:revisionlabel>
   would be more in keeping with the standard.

I share your distaste for something as mungy as a comma-separated list.
Although an XML property would be an improvement, I believe what we
really should do is represent a resource-list property as a collection.
One of the key advantages of using a collection is that it has an
incremental update mechanism, which circumvents the "lost update" problem
(i.e. two clients read a property list, make different changes, and then
write back their change, with the second writer overwriting the changes
made by the first writer).
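
The "lost update" problem can be sketched in a few lines.  This is
only a toy in-memory simulation: the dictionary stands in for the
server, the set stands in for a collection, and the label "Shipped" is
made up for the example.

```python
def whole_property_write(server, key, client_view):
    """Overwrite the entire list property with what the client computed."""
    server[key] = list(client_view)


# List-valued property: each client reads, modifies, and writes back the whole list.
server = {"revisionlabel": ["Beta test"]}
a_view = server["revisionlabel"] + ["Release"]   # client A reads, adds a label
b_view = server["revisionlabel"] + ["Shipped"]   # client B reads the same state
whole_property_write(server, "revisionlabel", a_view)
whole_property_write(server, "revisionlabel", b_view)  # A's "Release" is overwritten
lost_update_result = server["revisionlabel"]

# Collection: each client adds one member; neither rewrites the whole list.
collection = {"Beta test"}
collection.add("Release")   # client A
collection.add("Shipped")   # client B
```

With the whole-property write, the list ends up without "Release";
with incremental additions to a collection, both labels survive.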

Max: Another set of excellent observations/suggestions!

Cheers,
Geoff
Received on Monday, 8 February 1999 00:35:33 UTC