Re: Configurations, sets and collections

Geoffrey M. Clemm (gclemm@tantalum.atria.com)
Tue, 20 Apr 1999 00:15:47 -0400


Date: Tue, 20 Apr 1999 00:15:47 -0400
Message-Id: <9904200415.AA25853@tantalum>
From: "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
To: Jeff_McAffer@oti.com
Cc: ietf-dav-versioning@w3.org
In-Reply-To: <1999Apr19.175810.1250.1151130@otismtp.ott.oti.com>
Subject: Re: Configurations, sets and collections


   From: Jeff_McAffer@oti.com (Jeff McAffer OTT)

   Consider the following simple structure in a particular workspace.  Shown   
   are the human URL segments and the versioned resource (VRxx) to which the   
   URLs map.
     /  (VR4)
       A/  (VR7)
         B/  (VR6)
           index.html  (VR9)
       X/  (VR2)
         cool.gif  (VR3)
       Y/  (VR1)
         ...

To avoid confusion, we probably need a bit more information.
In particular, we need to specify which revision of each
versioned-collection is selected in the workspace, since it
is a revision of a versioned-collection that has members,
not the versioned-collection itself.
So let's assume the following mapping defined by the workspace:

VR4 -> R5
VR7 -> R23
VR6 -> R3
VR9 -> R9
VR2 -> R12
VR3 -> R11
...

Then the state of some key versioned-collection revisions would be:

R5: [A->VR7, X->VR2, Y->VR1]
R23: [B->VR6]
R3: [index.html->VR9]
R12: [cool.gif->VR3]
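
For illustration only, the two tables above can be sketched in Python.
The names `workspace`, `collection_revisions` and `resolve` are made up
for this example and are not part of any proposal; the point is just
that a URL is resolved by alternating between "which revision does the
workspace select for this versioned-resource" and "which
versioned-resource does that revision bind this name to".

```python
# Workspace: the revision selected for each versioned-resource (VR -> R).
# Only the mappings listed above are included; the entry for VR1 is elided there.
workspace = {
    "VR4": "R5",
    "VR7": "R23",
    "VR6": "R3",
    "VR9": "R9",
    "VR2": "R12",
    "VR3": "R11",
}

# Versioned-collection revisions: member name -> versioned-resource.
collection_revisions = {
    "R5": {"A": "VR7", "X": "VR2", "Y": "VR1"},
    "R23": {"B": "VR6"},
    "R3": {"index.html": "VR9"},
    "R12": {"cool.gif": "VR3"},
}


def resolve(path, root="VR4"):
    """Walk a '/'-separated path from the root and return the (VR, R) reached."""
    vr = root
    for segment in [s for s in path.split("/") if s]:
        revision = workspace[vr]                  # revision the workspace selects
        members = collection_revisions[revision]  # KeyError if not a collection
        vr = members[segment]                     # names bind to versioned-resources
    return vr, workspace[vr]


print(resolve("/A/B/index.html"))
print(resolve("/X/cool.gif"))
```

Note that the members are bound to versioned-resources, never directly
to revisions; the revision is always supplied by the workspace.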

   What this structure really implies is that VR4 (/) is a collection   
   resource which has three named elements, A, X and Y where A => VR7, X =>   
   VR2 and Y => VR1.

Actually, VR4 is a versioned-collection, for which the workspace currently
selects R5.  R5 is a versioned-collection revision which has
three members named A, X, and Y which are bound to VR7, VR2, and VR1,
respectively.

   Similarly, VR7 is a collection which has one element   
   such that B => VR6.  and so on.

More precisely, VR7 is a versioned-collection, for which the workspace
currently selects R23, and R23 has a single member named B that
is bound to VR6.

   Below is a description of the two proposals for configurations ("sets"   
   and "collections") in terms of the above example.

   SETS
   ====
   The "Sets" proposal said that a configuration was a *set* of resource   
   revisions.  A couple things to note:
    - revisions have a property which indicates their versioned-resource.
    - "setness" here is based on versioned-resource not revision.
      That is, there can only be one entry in the configuration
      for a particular versioned-resource.

The "setness" was intended just to say that a configuration is a "set
of revisions".  It is true that this set is constrained so that there
can only be one entry in the configuration for a particular
versioned-resource.  A revision can be treated as defining a mapping,
namely, a mapping from the versioned-resource containing that revision
to that revision.

    - As such, the "set" can be seen as a table where a key is a
      versioned-resource id and the corresponding value is a revision
      id.  A given key can only occur in a configuration once.

Yes.
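
A minimal sketch of that table view, in Python, assuming each revision
records the versioned-resource it belongs to.  The names
`revision_owner` and `build_configuration` are invented for this
example, and "R24" is a hypothetical second revision of VR7 that exists
only to show the constraint being violated.

```python
# Which versioned-resource each revision belongs to, following the
# workspace mapping given earlier. "R24" is hypothetical (see lead-in).
revision_owner = {
    "R23": "VR7",
    "R24": "VR7",
    "R3": "VR6",
    "R9": "VR9",
    "R12": "VR2",
    "R11": "VR3",
}


def build_configuration(revisions):
    """Turn a set of revisions into a VR -> R table, one entry per VR."""
    table = {}
    for revision in revisions:
        vr = revision_owner[revision]
        if vr in table and table[vr] != revision:
            raise ValueError(
                f"{vr} already selects {table[vr]}, cannot also add {revision}"
            )
        table[vr] = revision
    return table


config = build_configuration(["R23", "R3", "R9", "R12", "R11"])
print(config)
```

Calling `build_configuration(["R23", "R24"])` raises `ValueError`,
since both revisions belong to VR7.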

   In the above example, a configuration encompassing A and X (but not /)   
   would look something like this (where Rxx is a revision id).
    VR7 = R23
    VR6 = R3
    VR9 = R9
    VR2 = R12
    VR3 = R11

   Because VR7 and VR2 are collections and collections map member names to
   versioned-resources, the namespace *below* A and X is preserved.

Rather "Because R23 and R12 are versioned-collection revisions,
they immutably define the names of their internal members."

   That is, the *relative* path B/index.html still exists if you can get
   to VR7.

Actually, R23 just guarantees that VR6 is selected by the name "B".
You would need both R23 and R3 to be in your configuration to guarantee that
B/index.html still exists.
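
To make that concrete, here is a rough Python sketch (the function name
`reachable_vr` and the dictionary layout are assumptions made for this
example only).  A relative path can be followed one step further only
if the configuration selects a revision for the collection reached so
far.

```python
# Versioned-collection revisions from the example: name -> versioned-resource.
collection_revisions = {
    "R23": {"B": "VR6"},
    "R3": {"index.html": "VR9"},
}


def reachable_vr(config, root_vr, path):
    """Return the versioned-resource named by path, or None if the
    configuration does not contain the revisions needed to get there."""
    vr = root_vr
    for segment in path.split("/"):
        revision = config.get(vr)  # does the configuration select a revision?
        if revision is None:
            return None
        vr = collection_revisions.get(revision, {}).get(segment)
        if vr is None:
            return None
    return vr


only_r23 = {"VR7": "R23"}
r23_and_r3 = {"VR7": "R23", "VR6": "R3"}

print(reachable_vr(only_r23, "VR7", "B"))
print(reachable_vr(only_r23, "VR7", "B/index.html"))
print(reachable_vr(r23_and_r3, "VR7", "B/index.html"))
```

With only R23 in the configuration, "B" resolves to VR6 but
"B/index.html" does not resolve at all; adding R3 makes it resolve to
VR9.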

   The configuration roots can be relocated by "mounting" them at some
   point in the user URL space (e.g., putting VR7 at /foo/A).

Yes.

   There is a problem though.  Because VR7 (A) and VR2 (X) are local roots,   
   they have no containing collection within the configuration and their   
   names are not preserved.

I'm not sure how this can be a "problem".  If you care about
their names, you would include a versioned-collection
revision in your configuration that gave them a name.  For example,
the revision R5 named above.

   So when I mount the configuration in URL space,   
   I have to somehow come up with names for these (potentially numerous)   
   root versioned-resources.  For example, how did I know to use 'A' when   
   mounting VR7 at /foo/A in my example above?  This seems like the 80% case   
   to me and yet there is no place in the system to put this information.

What is wrong with storing it in a versioned collection revision, just as
you did with the other names you wanted to remember?

   One solution is to logically (if not physically) put these "roots" into a   
   "root-collection" which maintains the namespace for VR7 and VR2 (A and X   
   respectively).

How is this any different from binding a name to a versioned-resource
in any other collection revision?

   I'm not sure if this is DAV server functionality or   
   something that clients do on their own.  I hope it is the former so that   
   it is standardized and interop between tools is enabled.  Opinions?

Perhaps you are wondering about how you got to the first versioned
resource (e.g. VR4 in this example)?  That's easy (:-).  You just
create a versioned-resource explicitly at some URL.  You can create
a versioned-collection at "/" if you want to version the entire
namespace.

   BTW, it was proposed that a new "set" protocol be added.  Would it be
   reasonable to use regular collection protocol and give the   
   versioned-resource id as the name and the revision-id as the element when   
   putting something into a configuration?  This avoids having to add   
   methods etc to the protocol.

There currently is no versioned-resource-id defined, except in the
context of a configuration-management "repository".  We could require
one, but I believe that the "level 1 versioning" folks didn't want to
have to come up with one (is that true?).  If we do have
versioned-resource-id's, then that would be the only sensible choice
for the configuration member name.

   COLLECTIONS
   ===========

   The "Collections" proposal said that a configuration was a collection of   
   revision ids.  Being a collection means there must be member names.  I   
   was confused on this and did not really understand what these names would   
   be.  One choice was that they were autogen'd/random.  In this case,
   the names don't mean anything so why bother? Just to satisfy some API?

My impression was that it was to avoid the need for new add/delete methods.

   Another choice (as I understood it) was that the names were the URL   
   fragments relative to the roots.  Note that in either case, the values in   
   the collection are the complete revision ids (including the   
   versioned-resource id) of the resource revision in the config.  For the   
   example above, the configuration is:
    A = VR7.R23
    A/B = VR6.R3
    A/B/index.html = VR9.R9
    X = VR2.R12
    X/cool.gif = VR3.R11
   Note that the names of VR7 and VR2 are maintained.  Unfortunately, there   
   are problems.  First the information is redundant.  The namespace,   
   exclusive of the names A and X, is maintained by the   
   versioned-collections themselves.  This leads to the problem that if I   
   rename B in A, I have to go and change all the entries in the   
   configuration.  If A is high up in a large configuration, this is   
   expensive.  In the Sets example, you only have to change the revision of   
   VR7.  It does however allow you to maintain the collection protocol for   
   configurations.

Redundancy is an annoyance, but could be outweighed by convenience.

A minor problem is that this doesn't give you unique names, since
several of the roots could have the same name, e.g. if your tree was:

/X/A/...
/Y/A/...

and you wanted the two "A"s to be the roots of your configuration.
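
A tiny Python sketch of the collision, assuming each root is named by
the last segment of its path (that naming rule, and the variable names,
are assumptions made for this example):

```python
from collections import Counter

# The two roots from the tree above.
root_paths = ["/X/A", "/Y/A"]

# Name each root by its final path segment.
names = [p.rstrip("/").rsplit("/", 1)[-1] for p in root_paths]

# Any name used by more than one root cannot serve as a unique member name.
collisions = [name for name, count in Counter(names).items() if count > 1]
print(collisions)
```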

More importantly, such a member naming scheme does not produce a consistent
WebDAV namespace.  To represent the example configuration with this
naming scheme, config/A should be bound to R23 and config/A/B be bound
to R3.  But if config/A is bound to R23, since R23 binds the name B
to VR6, config/A/B *must* be bound to VR6, *not* R3.

In other words, a versioned-collection revision cannot be used both
to remember a set of versioned-resource names, *and* be used to give
names to specific revisions in a configuration.
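
The inconsistency can be checked mechanically.  The following Python
sketch is illustrative only (`path_config` and `conflicts` are invented
names): for every nested member of the path-named configuration, it
compares what the parent collection revision itself binds the name to
with what the configuration wants that name to denote.

```python
# Versioned-collection revisions from the example: name -> versioned-resource.
collection_revisions = {
    "R23": {"B": "VR6"},
    "R3": {"index.html": "VR9"},
    "R12": {"cool.gif": "VR3"},
}

# The path-named configuration from the proposal: path -> (VR, R).
path_config = {
    "A": ("VR7", "R23"),
    "A/B": ("VR6", "R3"),
    "A/B/index.html": ("VR9", "R9"),
    "X": ("VR2", "R12"),
    "X/cool.gif": ("VR3", "R11"),
}


def conflicts(config):
    """Return (path, bound_by_parent, wanted_by_config) for each nested path
    where the parent revision's own binding differs from the config entry."""
    found = []
    for path, (_vr, wanted_revision) in config.items():
        if "/" not in path:
            continue  # roots have no parent inside the configuration
        parent, name = path.rsplit("/", 1)
        parent_revision = config[parent][1]
        bound = collection_revisions.get(parent_revision, {}).get(name)
        if bound != wanted_revision:
            found.append((path, bound, wanted_revision))
    return found


for entry in conflicts(path_config):
    print(entry)
```

Every nested path is reported, because the parent revision always binds
the name to a versioned-resource while the configuration wants it to
denote a specific revision.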


   SUMMARY
   =======
   Overall I vote for Sets with the ability to maintain root names in some
   standard way.  I do not have to use them when mounting a configuration
   but it sure would be nice to have that starting point.  If we are really
   hot about putting individual revisions into a configuration, use the
   versioned-resource id as the name and the revision-id as the value on a
   regular collection put operation.

I don't think it is meaningful to talk about "maintaining root names"
of a configuration (except trivially by adding a versioned-collection
revision that gives them names, in which case they are no longer roots).

   BTW, we still need to talk more about "required/needed configurations".  

If we have "configurations" (which can have more than one root), then
there is no need for required/needed configurations.  And recently,
I have become convinced that in any case, it is sufficient to have
required/needed *activities* to handle the relevant scenarios.

A topic for next week (:-).

Cheers,
Geoff