Versioning spec review - 02.3

jamsden@us.ibm.com
Mon, 18 Oct 1999 16:44:26 -0400


To: ietf-dav-versioning@w3.org
Message-ID: <8525680E.007205BD.00@d54mta03.raleigh.ibm.com>



Here are my comments on the 02.3 spec:

1.1, 2nd paragraph: Versioned Collection versioning -> Versioned Collection

Also, core is described as providing versioning of largely independent
resources. Independent with respect to what? Most web applications are highly
linked resources, so this level of versioning by implication isn't particularly
useful. What exactly is meant by this statement? I think independent in this
case refers at least in part to independent changes in the resources, not
necessarily just independent resources.

Last paragraph: How about using components of versioning support instead of
"levels"?

1.2, Versionable Resource: Might want to put what it means to place a null
resource under version control. Does it become a resource with an empty body?

Target might want to be Target Revision to make it more specific. What else can
the Target-Selector be? Might not be needed here in the definition, but that's
where the question came up for me.

Baseline: (look for . . in a number of places in the document) 1st paragraph,
last sentence: ..."a baseline contains a revision of the versioned collection
and a revision or baseline".... Which is it, a revision of a member, or a
baseline? A revision of a non-collection and a baseline of a collection? What's
the difference between a revision and a baseline of a collection in this
context? (answer: the baseline is just a deep revision). Maybe no change
required, I'm just thinking.

I'll be looking for where repository is used. If it never appears in the client
space, then the protocol shouldn't need the concept. It should be an
implementation concept only. Note I continue to believe the protocol should not
be specifying where activities and configurations go. Versioned resources yes,
because they are owned by the server, the client uses a mapped URL to get to
them. But the protocol should only know about this URL, not the physical
location of the versioned resource. But I'm still open.

2.1 The initial revision MAY become the target depending on the workspace
current label, current activity, or RSR.

2.2 WebDAV level 2 should be class 2. Does a server have to support immutable
revisions? Last paragraph says mutable revision support is optional, but doesn't
say anything about immutable revisions.

2.4 sets of related resource -> resources

2.5 Use "revision selector" instead of "rule element".

2.7 The logic isn't consistent. Say you have a working collection of a versioned
collection. Now you want to add a new resource to that collection. You use PUT
to create the new resource, but since it's not versioned, the PUT fails. So the
only way to add a new resource is to CHECKIN a null resource. But this seems
more like an error than the PUT. You're checking in a resource that you never
checked out, and doesn't exist. I have less problem if the members have to be
checked in before the collection can be checked in.

2.8 makes a reasonable case for repositories, but it seems like the protocol is
getting pretty deep into implementations here. Some servers might not have
repositories at all, so there won't be any of these restrictions. Others (like
DAV4J) may have very different semantics for partitions of the namespace because
the server has multiple underlying repository managers with restrictions on
relationships between the resources in them. OPTIONS on a resource should
provide the necessary information. Users might not be aware of these boundaries
inside the server until an error occurs (mostly through bindings across
repository managers). But this will be true for many non-versioning methods too.

Section 2 is missing the locking semantics that were in the versioning model
introduction. We should include it here and address the issues it raises.

3.2.1 Boolean in other WebDAV headers (DAV:overwrite) is T and F. Versioning
should be consistent.

3.2.2 should be a reference, not a copy.

3.3.1, what are the empty parens for? (mutability, etc.?)

3.4.3 (and others), last sentence: what XML document? Do you mean the value of
the DAV:default-workspace element in an entity body?

3.4.7 PUTPROP->PROPPATCH

3.5.4 Seems more natural for this to be labels, not labeled. Labeled looks more
boolean, and labels is consistent with the property on a versioned resource that
plays the same role.

3.5.6 What is DAV:workspaces used for? Just for reporting?

3.5.9 merge-predecessors: the client should be restricted to merge only
revisions of the same versioned resource, and only if they are on different
lines of descent. Otherwise merge relationships are not meaningful. This and
merge-successors are good examples of how property collections don't work well.
There are a lot of semantics associated with merging and integrity relationships
that must be maintained. Doing this by allowing clients to directly edit
collections is inappropriate. Instead there should be a MERGE method that
maintains the semantics. It's ok to have the properties for reporting purposes,
but not for establishing the relationships.

3.5.10 indicates merge-successors is readonly, but describes how to add and
delete new merge successors.

3.6.1: The revision selector for a baseline would have to include the URL of the
associated collection, and the baseline id. A baseline is the only revision
selector that has a compound name. This will complicate the revision selection
rules.

A baseline doesn't have labels? The duplication between baseline, revision, and
configuration looks suspicious. Maybe we need some factoring.

3.7.1 Wouldn't the request have to know the workspace in order to get the
workspace property of a working resource?

3.7.3: doesn't every baseline have to have a corresponding revision of the
versioned collection?

I think the XML element definition for checkin-policy is wrong. It implies each
item can be specified more than once, but this doesn't make sense. And there's
no extensibility built in, say PCDATA.

3.7.4 Need more control over merge. To ensure semantics, should use a MERGE
method instead of directly editing properties which are likely live and not the
implementation of persistence for the merge successor/predecessors.

3.8.2: rsr-baseline: must have the URL for the collection and the id of the
baseline. A baseline is like a revision and must be addressed by its id.

General question on the introduction of conflicts in the RSR: many of the
revision selectors indicate they don't ever create conflicts, or only create
conflicts in certain circumstances. Aren't conflicts created not by a particular
revision selector, but by the presence of more than one revision selector in the
RSR, each of which might pick a revision that is not on the same line of
descent?
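
To make the question concrete, here is a rough Python sketch of that reading:
a conflict is a property of the set of revisions the RSR ends up selecting, not
of any one selector. The revision graph, the revision ids, and the function
names are all made up for illustration; none of them come from the spec.

```python
# Hypothetical revision history for one versioned resource:
# each revision id maps to the ids of its direct predecessors.
PREDECESSORS = {
    "r1": [],
    "r2": ["r1"],
    "r3": ["r2"],      # main line: r1 -> r2 -> r3
    "r2.1": ["r2"],    # branch off r2
}


def ancestors(rev, preds):
    """Return every revision reachable from rev through predecessor links."""
    seen = set()
    stack = list(preds[rev])
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(preds[cur])
    return seen


def same_line_of_descent(a, b, preds):
    """True if a and b are equal or one is an ancestor of the other."""
    return a == b or a in ancestors(b, preds) or b in ancestors(a, preds)


def rsr_conflicts(selected, preds):
    """Given the revisions picked by each selector in an RSR, return the
    pairs that are not on the same line of descent."""
    conflicts = []
    for i, a in enumerate(selected):
        for b in selected[i + 1:]:
            if not same_line_of_descent(a, b, preds):
                conflicts.append((a, b))
    return conflicts
```

With a single selector the result is always empty; selecting "r3" and "r2.1"
together reports that pair, since neither descends from the other.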

rsr-configuration: this one is a problem as the RSR must be used to select the
revision of the configuration that the RSR uses to see if the configuration
contains a revision of the target resource. Since configurations can't contain
configurations, this isn't a problem, but it may have undesirable implementation
consequences. The configuration used by the RSR can change without anything
changing in the RSR itself. Say the configuration selected is one labeled Foo,
and Foo moves to a different configuration. Perhaps we need to restrict a
revision selector to a revision of a configuration, just like baselines.
Whenever a specific revision is required, the workspace isn't used, and a
specific id is required. Can't use a label because that could move.
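
A tiny Python illustration of why a label is not stable enough here. The
label table and the revision ids are hypothetical stand-ins, not anything the
spec defines.

```python
# Hypothetical server state: labels are movable pointers to revision ids.
labels = {"Foo": "config-rev-1"}


def select_by_label(label):
    """Resolve a label to whatever revision it points at right now."""
    return labels[label]


def select_by_id(revision_id):
    """A revision id always names the same revision."""
    return revision_id


before_label = select_by_label("Foo")
before_id = select_by_id("config-rev-1")

# Someone moves the label; the RSR itself is untouched.
labels["Foo"] = "config-rev-2"

after_label = select_by_label("Foo")
after_id = select_by_id("config-rev-1")
```

The label-based selection changes underneath an unchanged RSR, while the
id-based selection does not.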

3.8.3 and 3.8.4 should be current-label and current-activity to highlight their
role.

3.9.3 Why can't an activity be used in more than one workspace at a time?
Workspaces keep working resources separate, so why can't more than one user be
working on the same activity at the same time? This is common in branching
systems where branches represent some larger unit of work.

3.10 I'd like to see if there's a way this section could be removed. It sounds
like implementation detail. I realize it's just using the protocol to describe
some behavior, but I think this is overly restrictive and nothing the client
needs to know. For example, activities, workspaces, and configurations are
resources created by users for user purposes. The fact that the server uses them
too is not important to users. Users will want to put their activities in their
own collections, not be forced to put them in some server-specific location,
perhaps mixed up with a lot of other unrelated activities.

Section 4, 1st paragraph: methods inherit all of the WebDAV functionality should
be methods have all of... We should avoid inherit as it has other uses and
carries lots of expectations.

4.1 GET on resources with no body returns an empty body with no MIME type.

4.2 last paragraph: seems like PUT to null resource in a working collection is
how one would begin to add a new member to a versioned collection.

4.5 Seems like it should be OK to copy:
workspace: it would be a new workspace with the same RSR, but none of the
working resources
activity: it would just create a new activity with some of the properties copied
configuration: should work fine
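
As a sketch of the workspace case only, assuming a workspace is nothing more
than an RSR plus a set of working resources (the class and field names below
are hypothetical):

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Workspace:
    # Revision selection rule, modelled here as a plain list of selectors.
    rsr: List[str]
    # Working resources keyed by URL; the values are opaque placeholders.
    working_resources: Dict[str, str] = field(default_factory=dict)


def copy_workspace(source):
    """COPY as suggested above: same RSR, none of the working resources."""
    return Workspace(rsr=list(source.rsr))
```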

4.9 checks out the target, not the versioned resource. This doesn't seem
consistent with lock. Lock is a dynamic access control mechanism. Locking a
versioned resource should be the same as setting the single-checkout property.
Only the lock owner can do the checkout. Lock on a revision does the same thing
for the revision. Lock on a workspace prevents any checkouts in that workspace
(because only the lock owner can update the properties), etc. These are the
semantics from the model introduction I think.

4.11 OPTIONS is on the resource too, not just the server. I hope the client
doesn't need to know repositories on the server.

5.1 I don't know what a standard data container is. I think it's a resource
without a resourcetype property.

Seems too bad that MKRESOURCE can't initialize the whole state of a resource in
a single atomic operation. We don't need it, but users might. This could be
done if we used multi-part entity request bodies.

5.5 Why can't a REPORT be on a resource? It would just return if that one
resource changed, added (compare-request doesn't exist), or was deleted (request
resource doesn't exist). Isn't this the base case for the recursion implied in
the other resources?
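
Here is roughly what I mean, as a Python sketch. A collection is modelled as a
dict of members and a non-collection as its body; None means the resource
doesn't exist on that side. The function names and the result format are my
own invention, not the REPORT marshalling.

```python
def compare_resource(request, compare):
    """Base case: one resource. Returns 'added', 'deleted', 'changed' or None."""
    if request is None and compare is None:
        return None
    if compare is None:
        return "added"      # compare resource doesn't exist
    if request is None:
        return "deleted"    # request resource doesn't exist
    return "changed" if request != compare else None


def report(request, compare, path="/"):
    """Recurse over collections; fall back to the single-resource case."""
    if isinstance(request, dict) and isinstance(compare, dict):
        results = {}
        for name in sorted(set(request) | set(compare)):
            child = path.rstrip("/") + "/" + name
            results.update(report(request.get(name), compare.get(name), child))
        return results
    status = compare_resource(request, compare)
    return {path: status} if status else {}
```

Calling report on two plain resources is then just the degenerate case of
calling it on two collections.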

5.4.6 looks like the conflicts-response should have been
conflicts-report-response in the example. Or it's wrong in 5.4.2.

6.1, so unlike all other methods, CHECKOUT doesn't use the default workspace.
Irregularity creeps in... Perhaps the client should be required to do MKRESOURCE
first to create the workspace (or checkout token), and then provide it on the
CHECKOUT. This is not a significant overhead, and the client is very likely to
use this same workspace for other CHECKOUTs. I don't like servers implicitly
creating workspaces. This does not imply that the workspace has to have a
revision selection rule, or that the server has to support extended workspaces.

2nd to last sentence should be "A subsequent request on the same URL that
specifies that workspace in a Target-Selector header will be applied to that
working resource."

6.1 4th precondition, DAV:activity and/or DAV:label (or current-activity,
current-label) must be set. It's OK to use both.

Missing precondition, if DAV:activity is specified, the resource cannot be
already checked out in that activity.

The preconditions should be specified more logically too. For example, If the
DAV:single-checkout property of the selected versioned resource is set, the
resource must not be already checked out in any other workspace. Last
precondition: a revision cannot be checked out twice in the same workspace.
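
To show what I mean by stating them logically, here is a Python sketch of the
CHECKOUT preconditions discussed in these comments as one check. The data
model (Checkout, VersionedResource) and the message strings are assumptions of
mine, not definitions from the spec.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Checkout:
    revision: str
    workspace: str
    activity: Optional[str] = None


@dataclass
class VersionedResource:
    single_checkout: bool = False
    checkouts: List[Checkout] = field(default_factory=list)


def checkout_violations(resource, revision, workspace, activity=None, label=None):
    """Return the violated preconditions; an empty list means CHECKOUT may proceed."""
    existing = resource.checkouts
    violations = []
    if activity is None and label is None:
        violations.append("an activity and/or a label must be set")
    if activity is not None and any(c.activity == activity for c in existing):
        violations.append("already checked out in this activity")
    if resource.single_checkout and any(c.workspace != workspace for c in existing):
        violations.append("single-checkout: checked out in another workspace")
    if any(c.workspace == workspace and c.revision == revision for c in existing):
        violations.append("revision already checked out in this workspace")
    return violations
```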

Marshalling, how is the Target-Selector overridden with a specific label or id?
We need the Target-Selector to specify the workspace for collections in the URL
path, while overriding the Target-Selector for the leaf element of the path.

Why is there a propertyupdate element on a CHECKOUT? Shouldn't there be a
PROPPATCH after the checkout? If this is for checkout policies, then perhaps we
should simplify CHECKOUT, and let clients do CHECKOUT, PROPPATCH, and UNCHECKOUT
if the PROPPATCH (or anything else they want) fails as they wish. I don't see
why we need this in the CHECKOUT protocol. There doesn't seem to be any reason
this needs to be atomic, and there certainly shouldn't be any performance issues
as CHECKOUTS are not done that often. I'm especially against this if there are a
whole lot of restrictions on what can be in the propertyupdate to restrict the
updates to things having to do with checkout and a subsequent checkin.

I don't like how specific the postconditions are. They should say:

The revision is checked out in the selected workspace in the current activity if
any. All the rest sounds like implementation detail and tends to hide the
meaning of the method. Perhaps we need to include both the logical and physical
pre and post conditions.

Result: the checkout response must be in a multistatus, not just a response
element.

6.2 The sentence "If the server supports mutable revisions." appears out of
context. I don't think we should overload checkin with uncheckout semantics.

Again, CHECKIN is doing PROPPATCH work. This is not a propertyupdate, it's a set
of parameters for the CHECKIN method. We should not reuse the propertyupdate,
but rather create a new element, specific to the method. Otherwise we have to
specify a whole bunch of restrictions about what can go in the propertyupdate.

DAV:uncheckout is a control couple. This is not good style. Use a separate
method. There's no reason to conserve them. Control couples appear to make the
protocol smaller, but they really add complexity. Notice that most of these
checkin policies could be marshalled in simple headers.

7. The paragraph saying that Target-Selector specifies a revision id or label is
incorrect. The selector "self" cannot be applied to collections on the path
because it's a revision of the collection that's needed, not the versioned
collection as a whole. The revision says what the members of that revision of
the collection are, which can be used to validate the next entry in the path. So
we need two headers, the Target-Selector containing the workspace, and a
Revision-Selector that overrides it for the leaf resource. It's the
Revision-Selector that can have "self", not the Target-Selector.
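
A Python sketch of the two-header resolution I have in mind. The store layout,
the workspace table standing in for the RSR, and the resolve function are all
hypothetical; Revision-Selector is the header proposed in this comment, not one
the draft defines, and "self" is not modelled.

```python
# Hypothetical store: versioned resource id -> {revision id -> state}.
# A collection revision's state maps member names to versioned resource ids;
# a non-collection revision's state is its body.
STORE = {
    "root": {"1": {"docs": "docs"}},
    "docs": {"1": {"a.txt": "a"}, "2": {"a.txt": "a", "b.txt": "b"}},
    "a": {"1": "first draft", "2": "second draft"},
    "b": {"1": "notes"},
}

# Stand-in for the workspace named in Target-Selector: the revision of each
# versioned resource that its RSR currently selects.
WORKSPACE = {"root": "1", "docs": "2", "a": "2", "b": "1"}


def resolve(path, workspace, revision_selector=None, store=STORE):
    """Resolve a path to (versioned resource id, revision id).

    Collections along the path are always resolved through the workspace;
    only the leaf may be overridden by revision_selector.
    """
    resource = "root"
    for name in [s for s in path.split("/") if s]:
        members = store[resource][workspace[resource]]
        if not isinstance(members, dict) or name not in members:
            raise KeyError(f"{name!r} is not a member of the selected revision of {resource!r}")
        resource = members[name]
    return resource, revision_selector or workspace[resource]
```

Here resolve("/docs/a.txt", WORKSPACE, "1") still walks /docs through the
workspace's revision of the collection and only overrides the leaf.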





Issues:

1. Do members of a versioned collection have to be versioned resources?

2. Should the server specify where activities, workspaces, and/or configurations
are located in the URL namespace?

3. Are revision ids and revision labels in the same namespace (i.e., specified
in the same header and XML elements)?

4. What does it mean to LOCK a workspace, activity, configuration, baseline,
versioned resource, revision?

5. Property resources aren't really resources or collections. You can do a PUT
or MKCOL in them, GET, etc. We're trying to reuse some of the WebDAV methods to
specify the protocol for new method semantics. We don't want two ways of
specifying these semantics, XML and property resources. Perhaps neither is
correct and we should be using additional methods.

6. Can an RSR contain a revision selector that is a versioned resource (e.g., a
configuration)? No. Have to specify a particular revision using the revision id
(labels can move).

7. Can a revision selector have a compound name?

8. LOCK in a versioning server needs to be better defined.
Versioned resource
revision
working resource
configuration
activity
workspace
baseline