Re: Working collections from Geoffrey M. Clemm on 2000-11-19 (ietf-dav-versioning@w3.org from October to December 2000)

From: Geoffrey M. Clemm <geoffrey.clemm@rational.com>
Date: Sun, 19 Nov 2000 12:59:12 -0500 (EST)
To: ietf-dav-versioning@w3.org
Message-Id: <200011191759.MAA18766@tantalum.atria.com>
   From: Greg Stein <gstein@lyra.org>

   On Sat, Nov 18, 2000 at 04:47:56PM -0500, Geoffrey M. Clemm wrote:
   > We should first confirm that we agree on the semantics of a collection
   > version and a collection version selector, namely, that each member of
   > a collection version is a version history, while each member of a
   > collection version selector is either an unversioned resource or a
   > version selector.

   Sounds fine to me.

Great!  Then we're at least reading the same book, although perhaps not
quite yet on the same page (:-).

   > By analogy with collection versions, the members could be version
   > histories.  By analogy with collection version selectors, the members
   > could be unversioned resources or other working resources.

   ... or version selectors.

Even better!

   Let's call the first one, Model A. And the second one Model B.
   ... (read: avoid Model C)
   ... Of these, I'd rather take the flexibility of Model B.

Great, now we're on the same chapter of the same book (:-).

   Concretely: Subversion has several server-defined namespaces. One each for
   activities, working resources, versions, and histories.

That corresponds precisely to what is currently in the protocol.
In fact, the protocol has just one additional namespace: workspaces
(where a workspace is really just a root for a tree of version selectors).

   The history resources cannot be moved out of their space.

The protocol has this restriction as well.  Note that if you support
collection versions, a history resource has multiple names
(multiple bindings).  In particular, it has its name in the "history
namespace", but it also has names in the "version namespace".
For example, if http://gmc.repo/his/id73 identifies a history
resource, and http://gmc.repo/ver/c23 is a collection version with
a member named "foo.html" that identifies this history resource,
then http://gmc.repo/his/id73 and http://gmc.repo/ver/c23/foo.html
are two names for the same history resource, one in the history
namespace and the other in the version namespace.
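
As a rough illustration of the multiple-bindings point (a sketch only,
not anything defined by the protocol): the binding table and the
HistoryResource class below are hypothetical, and only the two URLs
come from the example above.

```python
# Hypothetical model: a binding table that maps URLs to resource objects.
class HistoryResource:
    def __init__(self, history_id):
        self.history_id = history_id


history = HistoryResource("id73")

# Two names (bindings) for one and the same history resource.
bindings = {
    "http://gmc.repo/his/id73": history,          # history namespace
    "http://gmc.repo/ver/c23/foo.html": history,  # version namespace
}


def same_resource(url_a, url_b):
    """Return True if both URLs are bound to the same resource object."""
    return bindings[url_a] is bindings[url_b]


print(same_resource("http://gmc.repo/his/id73",
                    "http://gmc.repo/ver/c23/foo.html"))
```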

   If any server uses server-defined namespaces for histories, then a MOVE
   would be impossible if you tried to enforce "working collections must
   contain version histories."

I don't think working collections should contain version histories,
but I'll point out that you could allow version histories to be
moved around in the "working collection" namespace, even though you
can't move them around in the "history" namespace.

   >    The working resource for the collection is also handy for
   >    deleting or for replacing the collection:
   > 
   >        DELETE http://host.name/repo/$svn/wrk/100.3
   >        (note that this request also requires a checked-out parent)
   > 
   Note that Tim also pointed out that we probably would not delete the working
   collection itself, but delete the member from the parent collection. The
   above request is ambiguous with "do an UNCHECKOUT".

Oops, I didn't read this carefully in my first response.  UNCHECKOUT
(as currently defined) only applies to a checked out version selector.
You just DELETE a working resource if you don't want it any
more (i.e. don't want to check it in).  So the above request is not
ambiguous with "do an UNCHECKOUT", but rather *is* the way you tell
the server that you no longer are interested in that working resource.
(And the parent, "http://host.name/repo/$svn/wrk", is not under version
control, and therefore does not have to be checked out prior to the
deletion.)

On the other hand, 

 DELETE http://host.name/repo/$svn/wrk/100.3/foo.html

is the way that you delete the member named "foo.html" from the
checked out collection.

   I'm still a bit woozy on deletion of resources, both in WebDAV and DeltaV.
   In WebDAV, you lock and delete the target.

I don't see any reason to lock the target of a DELETE (the lock is
just blown away by the DELETE).  I'd just issue the DELETE.

   In DeltaV, do we checkout the
   target to make sure we are allowed to delete it? (e.g. watch out for
   reserved checkouts and stuff).

There's no reason to checkout a resource before deleting it.
The deletion is a modification to the collection containing
the resource, not to the resource itself.  A checkout of a
resource (reserved or not) has no effect on whether or not
you can delete it.

   I'm tempted to just say "check out both of them, then delete relative to the
   parent collection".  (sigh... but checking out the target is probably
   redundant, then) ... Ideas, anyone?

Just repeat the mantra "a delete is an operation on the parent collection
of a resource", and all will be well (:-).
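
To make the mantra concrete, here is a minimal sketch of the rule as I
understand it.  The class names, the version_controlled flag, and the
NotCheckedOut exception are all hypothetical; the only thing taken from
the discussion is that the check is made against the parent collection
and never against the member being deleted.

```python
class NotCheckedOut(Exception):
    """Raised when a version-controlled parent is not checked out."""


class Resource:
    def __init__(self, checked_out=False):
        self.checked_out = checked_out


class Collection(Resource):
    def __init__(self, checked_out=False, version_controlled=True):
        super().__init__(checked_out)
        self.version_controlled = version_controlled
        self.members = {}

    def delete_member(self, name):
        # A delete modifies the parent collection, so only the parent's
        # state is examined.  The member's own checkout state is ignored.
        if self.version_controlled and not self.checked_out:
            raise NotCheckedOut("parent collection must be checked out")
        del self.members[name]


# The parent is checked out, the member is not: the delete succeeds.
parent = Collection(checked_out=True)
parent.members["foo.html"] = Resource(checked_out=False)
parent.delete_member("foo.html")
print(sorted(parent.members))
```

An unversioned parent (version_controlled=False in this sketch) never
needs a checkout, which is the case for the working resource namespace
discussed earlier.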

   >        COPY http://host.name/repo/somedir/
   >        Destination: http://host.name/repo/$svn/wrk/100.3
   >        Overwrite: T
   > 
   > This wouldn't work, because deleting a working resource (as is done by
   > an Overwrite:T) cancels the checkout.  You would need to use the new
   > "update" value for the Overwrite header.

   Nits. You know what I meant, and the server really should, too.
   This is where people read the letter of the law too strictly and
   screw up clients because the client wasn't as anal as the server.

Well, the server knows what you said (i.e. "get rid of the working
resource thereby canceling the checkout"), and it's really important
that the client and the server agree on this (or else the client will
be surprised by what the server ends up doing, because it will do what
the client said, not what it meant).

   But for discussion. Sure... let's assume that I used "update" in there.
   The point is that I need to check out the collection because I'm about to
   replace its contents.

If you are going to modify its contents (i.e. add/delete/change a
member), then sure, you need to check it out.  But if you are going
to delete the collection, then you are operating on the *parent*
of the collection, i.e. deleting a member of the parent of the
collection, so it is the *parent* that needs to be checked out for
the DELETE.

   >    DELETE of a member is a CHECKOUT of two items: the thing to
   >    DELETE and the parent collection.
   > 
   > Only the parent collection needs to be checked-out in order to
   > DELETE a member of that collection.

   Yah, this was my rambling above. What about reserved checkouts and stuff for
   the member? e.g. is it possible to just delete the thing without having it
   checked out?

Yes.  Checking something out prior to deleting it is as unnecessary as
locking it before deleting it.  If you cared about the state of that resource,
you wouldn't be deleting it.

   >    Without being able to do a CHECKOUT on a collection, there
   >    wouldn't be a way to do any of the above.
   > 
   > You could just use a VERSION-CONTROL request to create a new version
   > selector whose target is the desired collection version.  Then you can
   > do all your operations on that version selector.  It is one extra round
   > trip, but if that really matters, we could easily add a "DAV:keep-checked-out"
   > option to the VERSION-CONTROL request.

   That feels *way* bogus. Use VERSION-CONTROL? And just how do I get that to
   participate within an activity? Or to MERGE it? etc

The version selector created by VERSION-CONTROL has all the activity
behavior that a working resource has.  When you check it out, the
DAV:activity-set of the version selector is set to be the specified
activity(s), and when you check it in, the new version is added to the
DAV:version-set of the activity.
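
A small sketch of that bookkeeping, assuming a toy in-memory model: the
Activity and VersionSelector classes and their attribute names are
hypothetical stand-ins for the DAV:activity-set and DAV:version-set
properties, and the version names are made up.

```python
class Activity:
    def __init__(self, name):
        self.name = name
        self.version_set = []      # stands in for DAV:version-set


class VersionSelector:
    def __init__(self, target):
        self.target = target       # the version currently selected
        self.checked_out = False
        self.activity_set = []     # stands in for DAV:activity-set

    def checkout(self, activities):
        # On checkout, record the activities named in the request.
        self.checked_out = True
        self.activity_set = list(activities)

    def checkin(self, new_version):
        # On checkin, the new version joins each activity's version set.
        for activity in self.activity_set:
            activity.version_set.append(new_version)
        self.target = new_version
        self.checked_out = False


fix = Activity("fix-typo")
selector = VersionSelector(target="v1")
selector.checkout([fix])
selector.checkin("v2")
print(fix.version_set)
```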

   > This would allow us to avoid all the extra complexity that would
   > result from having two ways to manipulate versioned collections.

   I'm not seeing the complexity. Check out a collection version; you get a
   working collection; modify it at will; check it in.
   The working collection operates just the same as a checked-out collection
   version selector. And this is right: they're both representative of a
   checked-out [collection] resource. Seemed pretty obvious to me.

If we make working collections act like checked-out collection version
selectors, then I agree that we can avoid the complexity.  It appears
that the main question we have to resolve is what precisely are the
members of a working collection.  If they are version selectors, then
we are almost there, since the working collection then just becomes
the root of a tree of version selectors.  The only thing left to do
to unify the models is to make the root *also* be a version selector,
and then we have a uniform tree of version selectors.

Which is a good transition into your next message:

   From: Greg Stein <gstein@lyra.org>

   > But if you've got versioned collections, it's natural to get some
   > kind of tree when you checkout the collection, so it might as
   > well be a tree of version selectors.

   Yes. I was going to make a collection's internal members available via a
   working collection, but only for limited operations (namely: DELETE).
   I understand that the spec needs to be a bit more lucid than the hand-waving
   that I'm doing with those internal members :-)

How were you planning on enforcing this limitation?  Or perhaps more
importantly, why were you planning on introducing such a limitation?
If the working collection creates a tree of version selectors, why not
just use those version selectors (which are local to you, and therefore
won't be checked out by anyone else) for versioning operations on those
members?

The model then becomes: Check-out a collection (which then creates
a tree of version selectors), and then checkout one of those version
selectors when you want to change that member of the tree.

   In my model, if the parent is a working collection, then the PUT
   [into the working collection] succeeds and becomes a working
   resource. (and an analogous behavior for MKCOL)

To be careful about terminology, it becomes a version selector, not
a working resource, since auto-versioning creates version selectors,
not working resources.  (You indicate as much below, so this is just
me being pedantic :-).

   >    Yes, it becomes version-controlled, as that is the behavior
   >    that my server imposes when a resource is PUT there. This is
   >    allowed by the spec.

   > When something becomes version-controlled, it becomes a version
   > selector. If you were using version selectors,

   I'm using version selectors. Kind of mandatory :-)

Well, one could invent a new kind of resource for this, but using
version selectors would certainly keep the model simpler.

   > this would all be
   > consistent (i.e. you'd be creating a new version selector
   > as a member of an existing version selector).

   Assuming auto-versioning, then yes. The net result is a new version
   selector.

OK, we're in good shape then.

   > If you are using
   > working collections, we'd have to define the semantic interactions
   > of version selectors nested within working collections.

   Unless you state that a PUT/MKCOL creates a working resource within the
   working collection.

That would not work very well, since when you checkin a working resource,
it is deleted, which is not the behavior you would want (it would make
it appear that you wanted to delete that member of the working collection).

   [ and thus: they also create a version history; not sure when a version
     would be created, though ??? ]

When you first create a version history, it is given an initial version.
When you put an existing (non-versioned) resource under version control,
the initial version is a copy of the state of the non-versioned resource.

   > This becomes
   > complex because working resources are deleted when they are checked
   > in, while version selectors are not.  So if you check in a working
   > collection, what happens to the possibly checked-out version selectors
   > inside that working collection?

   Yes, this could be complicated :-). You could say that they just continue to
   hang out, but what does the parent collection (the working collection that
   you checked in) become? You can't just get rid of it because of the
   namespace consistency issue.

Yes.  This is why checked-out version selectors are preferable for
collections, because when you checkin a checked-out version selector,
it just becomes a checked-in version selector, and all of its members
are still visible and accessible.
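
The difference can be sketched with a toy namespace (a dict from URL to
resource).  Everything here is an assumption made for illustration: the
class names, the reachable() helper, and the "/ws/dir" URL are
hypothetical, and "/wrk/100.3" only echoes the earlier example.

```python
class WorkingResource:
    """Removed from the namespace when it is checked in."""

    def __init__(self, namespace, url):
        self.namespace = namespace
        self.url = url
        namespace[url] = self

    def checkin(self):
        del self.namespace[self.url]


class VersionSelector:
    """Stays in the namespace on checkin; only its state changes."""

    def __init__(self, namespace, url, checked_out=False):
        self.url = url
        self.checked_out = checked_out
        namespace[url] = self

    def checkin(self):
        self.checked_out = False


def reachable(namespace, parent_url, member_url):
    """A member is reachable only while its parent is still bound."""
    return parent_url in namespace and member_url in namespace


# Working collection: checking it in orphans the nested selector.
ns_a = {}
working = WorkingResource(ns_a, "/wrk/100.3")
VersionSelector(ns_a, "/wrk/100.3/foo.html", checked_out=True)
working.checkin()
print(reachable(ns_a, "/wrk/100.3", "/wrk/100.3/foo.html"))

# Collection version selector: members remain visible after checkin.
ns_b = {}
selector = VersionSelector(ns_b, "/ws/dir", checked_out=True)
VersionSelector(ns_b, "/ws/dir/foo.html", checked_out=True)
selector.checkin()
print(reachable(ns_b, "/ws/dir", "/ws/dir/foo.html"))
```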

   > With collection version selectors,
   > this is not an issue because they are not deleted when you check them
   > in so that nested version selectors do not get orphaned by checkin.

   Yup. But checked-out version selectors are not "thread safe" if you will :-)

Remember that you got "your own" tree of version selectors when you
checked out the desired collection.  So nobody else is checking out
your version selectors (they'd be checking out *their* version selectors,
that were created when they did *their* checkout of the collection).

So the only constraint here is that you can't checkout the same resource
twice at the same time yourself, without first creating a new tree
of version selectors, but that seems like a pretty reasonable
(and perhaps even desirable) restriction.

   >    3) At MERGE time, a version history is created for the PUT'd resource and a
   >       new collection version is created which references that.
   > 
   > MERGE modifies the properties of existing version selectors
   > (e.g. DAV:target and DAV:merge-predecessor-set), but
   > does not create any new versions (only CHECKIN does that).

   Euh... when did that happen? Damn.

Well, actually, it's always been that way.

   Just how is a transacted change set supposed to happen, then?
   Before, it was "accumulate the changes into an activity, then merge
   the activity." What now?

I'd support the introduction of the ability to CHECKIN an activity,
to mean "atomically checkin all checked-out resources for that
activity".  Does anyone object?  (pretty sneaky burying this
question deep in this long message ... :-).
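
One way to picture "atomic" here is an all-or-nothing check-in: verify
every checked-out resource first, and change state only if all of them
pass.  This is a sketch of the proposal, not of any existing protocol
behavior; the can_checkin flag is a hypothetical stand-in for whatever
conditions a server would actually test.

```python
class CheckinError(Exception):
    """Raised when the activity as a whole cannot be checked in."""


class CheckedOutResource:
    def __init__(self, name, can_checkin=True):
        self.name = name
        self.can_checkin = can_checkin   # hypothetical precondition
        self.checked_out = True


def checkin_activity(resources):
    """Check in every resource of an activity, or none of them."""
    # Phase 1: verify everything before touching any state.
    blocked = [r.name for r in resources if not r.can_checkin]
    if blocked:
        raise CheckinError("cannot check in: " + ", ".join(blocked))
    # Phase 2: all checks passed, so apply the check-ins.
    for r in resources:
        r.checked_out = False


change_set = [CheckedOutResource("foo.html"), CheckedOutResource("bar.html")]
checkin_activity(change_set)
print([r.checked_out for r in change_set])
```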

   > ... Because of the problems with checking in working
   > collections that have version selector members, it would be much
   > simpler to just use collection version selectors.

   So the issue isn't about "what goes into a working collection" as much as
   "what happens when you check it in?"

Or "can we just use collection version selectors, since they already
have the appropriate behavior on checkin?"

   Getting all academic with this-contains-that just peeves me. The
   goal is to make things work, rather than have a pure model that
   doesn't work.

Well, collections are all about what contains what, so I believe that
"making things work" for version-controlled collections hinges on
getting the "what contains what" questions answered rigorously.

   >    Describe it how you want, but it is still going to work as any
   >    rational person is going to expect: check out the collection,
   >    copy something in, then merge that back.

   > I'm with you on the "checkout" and "copy into", but that needs to be
   > followed by a "checkin", not a MERGE.

   Can I do a CHECKIN on an activity?
   I see no other way to create a transacted change set.

I agree that having a transacted change set is an important
capability, and that doing a CHECKIN on an activity is the
right way to model it.  I'll raise this in a separate mail
message (for those that didn't make it all the way into
this one :-).

   If you check out '/some/collection/' into a working collection, then
   '/working/collection/foo.html' may not be available.
   In truth, I'll probably make the URL available for DELETE, but nothing else.
   You'd need to go to the version (selector) to refer to the contents.

That seems like an unnecessary complication to the model ... you can
remove this restriction, and the model becomes simpler, and the client's
life becomes simpler as well.

   >    I guess it would be possible to DELETE the version selector,
   >    have the server identify the parent, see that a working
   >    resource is available, and then go and modify that working
   >    resource. Hrm. I'll have to think about that.
   > 
   > You can only DELETE the version selector if it is checked out
   > (which is what I'd want you to do, so actually I'm all for that :-).

   Wait a minute.
   People keep saying "you only need to CHECKOUT the parent. you don't need to
   CHECKOUT the parent *and* the target." But now you're saying that I have to
   check out the version selector (the target) to delete it.
   So which is it?

My apologies ... my comment was very poorly worded.  The "it" I was
referring to (that must be checked out) was the parent collection that
contains the version selector.

   And if I check out the target and DELETE that, then that is just an
   UNCHECKOUT, right?

Yes.

   Hmm. The spec is totally silent on deleting any of the new resource types.
   What happens when you delete an activity? a working resource? a workspace?

Good point.  There are things that we should say here about what happens
to references to those objects when you delete them.  I'll try to get
something written up this week.

   >    > <tim>
   >    > The operations you describe are more appropriate to a checked out
   >    > collection version selector; and I agree that they are essential.
   >    > </tim>

   >    A working resource is a working resource. It shouldn't matter
   >    whether it came from a checked out version selector or a
   >    checked out version.

   > Well, to be precise, the result of checking out a version
   > selector is a checked out version selector, not a working
   > resource (you only get a working resource by checking out a
   > version).  A working resource has some very different behavior
   > from a checked out version selector.  In particular, checking in
   > a working resource deletes the working resource, while checking
   > in a checked out version selector just changes its state from
   > checked out to checked in.

   Fine. But all other operations: PUT, PROPFIND, etc work the same for a
   checked-out version selector and a working resource, right? You can still do
   all the same stuff to it. A CHECKIN treats them a bit differently, but they
   seem pretty much the same otherwise.

That would depend on how we are handling "working collections".
Earlier, you were suggesting that we only allow DELETE on members
of working collections, and not PROPFIND, CHECKOUT, etc.  If so,
a working collection would be very different from a checked out
collection version selector.  But I believe that the checked out
collection version selector semantics are the ones that we want,
so we should use those.

One way to think about this is that I'm suggesting that checking
out a collection version should create a workspace (i.e. a tree
of version selectors), or in other words, that a working collection
is a workspace.  In fact, I'd be happy to allow you to create a
workspace by issuing a checkout against a collection version.

The only difference between such a workspace and one that was created
by a MKWORKSPACE is that the server selects the name for the workspace,
while with a MKWORKSPACE, the client specifies the name for the workspace.
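
A sketch of that one difference, under the assumption of a toy server
object: the Server class, its method names, and the "ws-N" naming
scheme are all hypothetical.  The point is only who picks the name.

```python
import itertools


class Server:
    def __init__(self):
        self.workspaces = {}                  # name -> member names
        self._counter = itertools.count(1)

    def mkworkspace(self, name):
        # MKWORKSPACE: the client specifies the workspace name.
        self.workspaces[name] = []
        return name

    def checkout_collection_version(self, member_names):
        # CHECKOUT of a collection version: the server selects the name
        # and populates the workspace with one selector per member.
        name = "ws-%d" % next(self._counter)
        self.workspaces[name] = list(member_names)
        return name


server = Server()
print(server.mkworkspace("geoff"))
print(server.checkout_collection_version(["foo.html"]))
```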

So, if we say that you can CHECKOUT a collection version to create
a workspace, and you can CHECKIN an activity to get transactioning,
would you have what you need?

Cheers,
Geoff
Received on Sunday, 19 November 2000 13:00:00 UTC