Date: Thu, 18 May 2000 13:27:28 -0400 (EDT) Message-Id: <200005181727.NAA05925@tantalum.atria.com> From: "Geoffrey M. Clemm" <geoffrey.clemm@rational.com> To: ietf-dav-versioning@w3.org Subject: Re: draft-ietf-deltav04.5 now available From: Juergen Reuter <reuterj@ira.uka.de> Thanks for the review, Juergen! This is a partial review of draft-ietf-deltav-versioning-04.5.htm. It only covers those sections that are relevant for basic versioning. I'd like to encourage everyone to try to review the core versioning section (it's only 20 pages long). This is the section that needs to be suitable for the widest range of implementations (and will be the most commonly implemented part). You should *NOT* need to be experienced with versioning to understand and implement the basic versioning protocol. I apologize for this late review and for currently having not enough time for a full review; I also had not enough time to sort out those issues that others may already have pointed out. Nevertheless, I hope that this review helps to improve the protocol. The timing was fine ... I'd especially appreciate reviews from anyone who will not be attending the design meeting next week, since the more input we have for that meeting, the better off we will be! Table of Contents ----------------- Many headlines end with a period where it probably should not occur (at least, the "." do not occur consistently). I didn't see any headlines that end with a period. I assume you are not referring to the "..." in the table of contents? Section 1 --------- For the reader, it would be quite helpful to see a commented listing or table/figure of the protocols and their features that this protocol directly builds upon. Versioning is designed to be orthogonal to most features in the HTTP protocol. The only exceptions are those explicitly identified in the section on "Versioning and Existing Methods". I'll add this statement to section 1.1. I believe this implies answers the following questions, but I'll answer them explicitly just to make sure. For example, is the Bindings protocol required if a server just wants to support basic versioning? No. Is it required for advanced versioning? Just for versioned collections. I'll state this explicitly in 1.1. Is WebDAV locking required? No. If locking is required, why? It isn't. Version control systems usually have some kind of built-in locking mechanism, depending on their versioning model. Is a DeltaV server supposed to map such internal locks to externally visible WebDAV locks No. (which might be difficult)? The semantics of built-in locking mechanisms for most versioning systems is very different from the WebDAV locks, so I'd use the word "impossible" rather than just "difficult" (:-). What about the redirection resources protocol which was mentioned in early versions of this protocol? It seems to have disappeared. Hmmm. I'm not sure what this is referring to. Do you have a particular protocol draft version you can point me at? (I of course keep the protocol drafts under version control :-). Section 1.2 ----------- > A "versioned resource" ... , it must be checked out ... > A "working resource" ... by applying a CHECKOUT request ... > A "revision" ... by applying a CHECKIN request ... The intention of this section is to introduce terms. I think it is dangerous to refer here to methods like CHECKOUT or CHECKIN which are specified later, because this might lead to circular definitions. Well, these references were added in response to an earlier reviewer that wanted the terms tied to explicit protocol elements. I currently am leaning towards that earlier reviewers perspective, since in general the terms are in fact tied to specific protocol elements. The circularity is deliberate, since the only semantics that are required are those defined in terms of relationships between the methods and properties exposed by the protocol. (In fact, the term "versionable resource" is defined circularly; cp. my comments to section 6.1). This is a good example. A "versionable resource" is any resource that implements the VERSION method. We try to provide an underlying model for understanding what that might mean, but the real definition of the term is as stated. Therefore, I propose to prefer wording like "checking out" rather than "applying a CHECKOUT request". None the less, I do notice that wording like "checking out" is unprecise and therefore may lead the reader to some misconception, depending on the reader's a priori conception of versioning models; but I think this is still better than refering to definitions that appear later in the document and thereby pretending precision which is not really there. I believe that careful choice of the method names (e.g. CHECKOUT for the concept "check out") allows us to both have a correct definition while at the same time appeal to a user's prior knowledge. > Predecessor, Successor > ... Ancestor and descendant are defined in terms of predecessor and successor. Bearing object inheritence in mind, it would be nice to notice that a predecessor can be viewed as an immediate ancestor and a successor as an immediate descendent. That would require us to introduce a term like "immediate", which I think should be avoided if we already have terms (i.e. predecessor and successor) that capture that concept. > Foo.htm <sarcasm>CONGRTLS.W95?</sarcasm> :-) Fixed (:-). > Target DeltaV uses the concept of a single target which may be a a versioned resource, a revision, or a working resource; this is reflected in the Target-Selector header. Alternately, one could introduce a concept that uses a revision-target, a versioned-resource-target and a working-resource-target, reflected by three distinct headers. Just wondering: what was the design rationale for the concept of a single target? Since those three choices are mutually exclusive, it seemed to make more sense to make that selection in a single header, rather than defining multiple headers and then having to say which takes precedence over which, or having to say that "you can only have one of the xxx, yyy, or zzz headers in a request" Section 1.3 ----------- > Because this augmented BNF uses the basic production rules provided in > Section 2.2 of [RFC2068], those rules apply to this document as well. Except for the LWS production, the augmented BNF in section 2.1 of [RFC2068] does *not* use any of the basic production rules provided in section 2.2. of [RFC2068]; hence, the argumentation is weak. I propose to change the wording e.g. into the following: "The basic production rules provided in Section 2.2 of [RFC2068] apply to this document as well." At various times we did use some of the augmented BNF rules, but we may well not do so at the moment. If this is still true when the dust settles, I agree that we should make the change you describe. > The term "protected" is placed in parentheses following the definition > of a property that cannot up updated with a PROPPATCH request. Typo: replace "cannot up updated" with "cannot be updated". Fixed. Btw, what does "cannot" mean? Return a 403 (Forbidden) or 409 (Conflict) as suggested in section 8.2.1 of [WebDAV] for read-only properties? We're planning on doing a final "status code cleanup" once we agree on the semantics. This should be decided at that time (do you have a preference for this particular case?). What is the difference between a "protected" property and a "read-only" property? Yes, I know, a "protected" property may change e.g. as a side effect of a CHECKIN, but not via a PROPPATCH. Yes, that is why I decided to not use the term "read-only". Section 4.1 of [WebDAV] says: > Live properties include cases where a) the value of a property is > read-only, maintained by the server, and b) the value of the > property is maintained by the client, but the server performs syntax > checking on submitted values. So, a read-only property may also change. Hence I do not see any difference between "protected" and "read-only". Yes, but a protected property can be changed by the client, just not with a PROPPATCH. I'm concerned that people will interpret "readonly" as "only can be changed by the server". Section 2.1 ----------- > An unversioned resource that can be put under version control is > called a versionable resource. What does "can be put under version control" mean? For a suggestion, see my comments to section 6.1. That was defined in the preceding terms section, i.e. "can be put under version control" means that it supports the VERSION method. Section 2.2 ----------- > The default target is also selected for any versioned resource other > than the one identified by the request-URL. For example? For example, server side includes. I'll add this example to this section. In advanced versioning, the most common example would be all of the versioned collections identified by prefix's of the request-URL, i.e. for a request-URL /x/y/foo.html, the collections "/", "/x", and "/x/y" may be versioned and require target selection. Section 2.2.1 ------------- > A working resource id does not distinguish working resources from > different versioned resources ... since the same working resource id > can be assigned ... This probably should read: "A working resource id does not need to distinguish working resources from different versioned resources ... since the same working resource id may be assigned ..." Hmm. The phrase "does not need to ..." doesn't seem right in this context. This statement is not implied by the "since" clause ... i.e. the fact that the same working resource id can be assigned to working resources of several versioned resources implies that it *can't* be used to distinguish, not that it doesn't need to distinguish. Section 2.3 ----------- Could you give an example? Suppose you perform a LOCK request on a versioned resource, and the Target-Selector header of that request contains a label for some revision. Is this equivalent with LOCKing the stable URL of that revision? I rewrote this section, and moved it to a section on the LOCK method. Hopefully it is clearer now. In particular, it is not the same as LOCK'ing the stable URL, since a lock also makes sure that the locked URL continues to identify the same resource. This means that it matters which URL you use in your LOCK request. Section 3.1.2 ------------- > When an id is marshaled in the header of an HTTP request, the > characters are encoded using the UTF-8 encoding scheme. It may be worth noting that, according to the syntax of [HTTP/1.1], the encoded message header field may not contain any CTL character except LWS. I think I'd prefer to have someone refer to the appropriate section of the HTTP protocol version they are using, rather than tie this to any particular constraints that are in place for a particular version of HTTP. Section 3.1.4 ------------- The href element is frequently used throughout WebDAV and should be known to the reader of this protocol, which is an extension to WebDAV; hence I think this section can be dropped. I always like to drop things ... does anyone object? Section 3.2 ----------- WebDAV introduces a format to specify properties: it presents a table with name, namespace, purpose, value, and description of each property. It would be nice to continue this format throughout sections 3.2 ... 3.5, maybe extended by an additional item "type" that distinguishes between "protected", "read-only", and "read/write". The "namespace" field is redundant, since all properties introduced in this protocol are in the DAV namespace. The distinction between "purpose" and "description" always seemed rather arbitrary to me. And we've tried to just use the DTD as the "value" definition. So I guess I'm not that fond of the WebDAV format (:-). But a common convention is better than no convention. Anyone else feel strongly one way or the other on this? Section 3.2.1 ------------- > This property is used to track the author of a resource. Each revision > of a versioned resource can have a different DAV:author values, > indicating who authored that revision. This sounds as if DAV:author is a revision property, but not a versioned resource property or working resource property. Then this section should be moved to section 3.4. Or is the property DAV:author available on revisions, versioned resources and working resources? Then, this should be clearly stated, e.g. by providing a definition of "resource property" in the introduction of section 3.2, for example: "For the purpose of this protocol, a resource property is defined to be a property as specified in [RFC2518] that is available on versioned resources, revisions and working resources." Fixed. Note: these properties are intended to be used for any resource, not just for the new resource types introduced by versioning. Section 3.3 ----------- > The DAV:resourcetype of a versioned resource ... This probably should read: "The value of the DAV:resourcetype property of a versioned resource ..." In most places, I have appreviated the phrase "the value of the xxx property of a versioned resource" as simply "the xxx of a versioned resource". I believe this abbreviation is unambiguous, and produces less tortured sounding sentences. Was there some issue of ambiguity here? Section 3.3.1 ------------- > <!ELEMENT revision-set (href*)> If a versioned resource may be empty (i.e. contain no revision), then "(href*)" seems appropriate. Otherwise, I would like to vote for "(href+)". Section 2.1 of this protocol should clarify if a versioned resource may be empty. Changing it to href+ is fine with me. > <!ELEMENT working resource id-set (working-resource-id*)> Typo: replace this with: "<!ELEMENT working-resource-id-set (working-resource-id*)>" Fixed. > <!ELEMENT working-resource-id (#PCDATA) Typo: missing ">": "<!ELEMENT working-resource-id (#PCDATA)>" Fixed. Section 3.4 ----------- > ... all revision properties except ... are immutable. "immutable" should be defined somewhere and compared with "read-only" and "protected". "immutable" implies "protected", but not vice-versa, right? "immutable" implies "read-only", but not vice-versa, right? Immutable is not being used in a technical sense ... it just means "cannot be changed". Protected properties can be changed, but not with PROPPATCH. Section 3.4.2 ------------- > Since a revision id can be specified in a target selector header, the > length of a revision id should be compatible with common header length > constraints. Could you give a reference for "common header length constraints"? Were you thinking of section 3.4.8 of RFC 822? This statement was left intentionally vague. Those constraints will vary over time (or eventually just go away), so I didn't want to get too specific. Section 3.4.3 ------------- > The DAV:predecessor-set of a revision contains a stable URL for the > revision that was checked out to produce this revision. The > DAV:predecessor-set property can contain multiple revisions if the > advanced versioning MERGE method is supported. > > <!ELEMENT predecessor-set (href*)> As far as I understand, the DAV:predecessor-set for the initial revision contains no href element, which should be mentioned. I try to avoid repeating things that are said elsewhere unless it is something that is likely to be misunderstood. Since most/all versioning systems have this characteristic, I didn't think it needed to be repeated, but I could be convinced otherwise. Otherwise, the predecessor-set element should be defined as "<!ELEMENT predecessor-set (href+)>". How about the following wording: "The DAV:predecessor-set of a revision contains a stable URL for each revision that is a predecessor as specified in section 1.2. Except for the initial revision which does not have any predecessor, each revision either has a single predecessor that was checked out to create this revision, or it has multiple predecessors which it was merged from." That sounds fine to me. I'll make the change. Section 3.4.4 ------------- > This property contains a stable URL for each of the revisions whose > DAV:predecessor-set identifies this revision. This probably should read: "This property contains a stable URL for each of the revisions whose DAV:predecessor-set contains a URL that identifies this revision." This is another shorthand I tend to use, i.e. to say that "Property xxx identifies a revision" rather than saying "Property xxx contains a URL that identifies this revision". Do you think that this could be misunderstood? Again, it is a question of trying to make the sentence a bit less tortured. Section 3.4.5 ------------- > Value: id Yes, I know, I am nitpicking, but the "value" item usually refers to the property's value, while here it refers to the label element, which itself is the property's value. Hence, correctly, it should be read: <!ELEMENT label-set (label*)> Value: <!ELEMENT label (#PCDATA)>, with the child content of this element representing an id. I use the "Value:" notation only to describe the contents of #PCDATA (otherwise I use a DTD). To make this clearer, I'll replace "Value:" with "PCDATA value:" > For efficiency ... may report only a subset of the labels ... Oops! Then there is, just because of efficiency reasons, no reliable way to recover all labels, even if I have full access rights? I think, this s a flaw. OK, I'll remove the "efficiency" qualifier, but I believe we need to leave the "for security reasons" qualifier. Section 4 --------- It should be mentioned that all versioning headers obey the message header format as specified in section 4.2 of [HTTP/1.1]. Otherwise, the field-name, for example, does not need to be a token, but could be regarded as a literal string. But that would mean that there is no implied LWS between the field-name and the ":" (just as, for example, there is no implied LWS in the HTTP-Version production in section 3.1 of [HTTP/1.1]). Section 4.2 of [HTTP/1.1] also explicitly allows LWS between the ":" and the field-value. I would like to propose the following introduction in section 4: "This section defines some header fields additional to those defined in [WebDAV] and [HTTP/1.1]. They follow the format and conventions defined in section 4.2 of [HTTP/1.1]." I think it is better to just put the LWS explicitly in these productions. So I'll just add LWS? after the ":" and add LWS before the id. Section 4.1 ----------- > ... there is no target for the versioned resource (primarily used with > the SET-TARGET method). For the reader's convenience, it would be nice to add a reference like: "... with the SET-TARGET method; see section 6.2)." Chris and others objected to the "none" Target-Selector value, so that has been moved into the SET-TARGET body. Section 5 --------- > Note that if a method modifies only a binding to a resource and not > the resource itself (e.g. DELETE, MOVE, BIND, UNBIND), the method is > being applied to the collection containing that binding, so the method > is not affected by versioning unless that collection is under version > control. Ok, but what happens, for example, when you try to DELETE a stable URL? Does that mean that the associated revision is to be deleted? That currently is up to the server. We want to give a server a great deal of freedom in how to implement stable URL's, so I believe we need to avoid defining behavior such as DELETE. Trying to MOVE a stable URL should fail (e.g. 405 (Method not allowed)), right? Yes, it must fail, but which error code it should produce is up for debate. The same questions arise when you apply DELETE or MOVE on a versioned resource whose target is a revision. No, in that case, the paragraph you quoted applies, and the DELETE or MOVE is applied to the binding to that versioned resource, not to the revision. Section 5.2 ----------- > When PROPFIND is applied to a versioned resource... This probably should read: "When PROPFIND is applied to a binding that binds a versioned resource..." I tend to use the shorthand "the method XXX is applied to a yyy" to mean "the method XXX is applied to a URL that identifies a resource of type yyy". I believe this is unambiguous, and produces a simpler statement. Note that a PROPFIND is not "applied to a binding", it is applied to a resource. As Roy has pointed out, one *could* have a URL that identifies a particular binding, but usually a URL does not identify a binding. In particular, if you have a collection named /foo, the URL's /foo/x and /foo/y do not identify the bindings of foo, but rather the resources that those bindings map to. > When PROPFIND is applied to a stable URL for a revision, the DAV:urn > for the revision will be returned. This probably should read: "When PROPFIND is applied to a binding that binds a stable URL for a revision, the DAV:urn for the revision will be returned." As above, a PROPFIND is not applied to a binding, but rather it is applied to a URL, which causes the PROPFIND to be applied to the resource identified by that URL. Section 5.4 ----------- > When a resource supports core versioning, the DAV response header for > an OPTIONS request MUST contain "VERSION". In the DAV response header, WebDAV uses a production called "extend" which is actually undefined. This has been put on the WebDAV issues list. On the w3c-dist-auth mailing list, there was a proposal to define: extend = "<" absoluteURI ">" However, the literal "VERSION" would not satisfy this production. I strongly recommend to achive agreement about the extend production (at least on the w3c-dist-auth mailing list) before using it. I just posted a suggestion that we define extend to be: extend = token | ( "<" absoluteURI ">" ) so that DAV defined extensions can be specified in a terse form. This would then make VERSION a legal extension. Section 5.5 ----------- > 4xx (No Such Target) Just to remind to: Do not forget to finally assign a number for this error. Yes! Btw, in section 6, you may want to list this error in the response status code sections, where appropriate. Since this error can occur in any request, it probably would be needlessly redundant to specify it in every method. > Return Status Codes: Why do you put the return status codes between the preconditions and postconditions? I would expect them behind the postconditions, because the postconditions are only relevant for the 200 (Ok) status, and hence the return status codes altogether present a more global view on what may happen and thus should be outside the scope of the pre-/postconditions block. I put them there to place them near the pre-conditions, which is what they refer to (except for the 2xx conditions). Probably a better thing to do would just be to associate the appropriate status code to each pre-condition (indicating what status code should be returned if that pre-condition is violated). Section 6.2 ----------- > SET-TARGET Although "SET-TARGET" is a valid token as required by the extension-method production in section 5.1.1 of [HTTP], I do not know of any method name that uses a hyphen to divide words (for example, cp. "MKCOL"). Is there a specific reason to break this silent convention? For readability. I believe the use of hyphens in header names is the relevant precedent. > A SET-TARGET request can be applied to a versioned resource whose > target is a revision to specify that revision to be the target when > the specified label is used in a Target-Selector header. Ugly wording -- hard to understand! Based on earlier feedback, I've split the SET-TARGET method into SET-TARGET and SET-LABEL-TARGET methods, and rewritten the intro, so hopefully this is now fixed. What about the following: "A SET-TARGET request can be applied to a versioned resource to specify a revision to be the target. The label of the target must be specified in a Target-Selector header." (Is this what you intended to say?) Yes, this works for the SET-LABEL-TARGET method. The tortured wording from before was probably due to trying to get a single sentence that applied to both "set default target" and "set label target". > Preconditions: Maybe you want to add: "The request must contain a valid Target-Selector header." This information is now in the body of the SET-LABEL-TARGET method. > <!ELEMENT label (#PCDATA)) Typo: replace "))" with ")>". Fixed. > 404 (Not Found): The request-URL did not identify a resource. A null resource is a resource, but I suppose 404 should be returned also if the request-URL identifies a null resource. Hence, the wording shoud be slightly modified. I believe that the concept of a "null resource" is a major source of confusion, so I make a point of never using it. I believe that it is clearer to say "The request-URL did not identify a resource" than it is to say "The request-URL identifies the null resource". And don't get me started on "lock null resource" (:-). It may be considerable to mention that 423 (Locked) might be returned. I tried to avoid specifying error status returns that are not affected by versioning (there are a large number of them!). Section 6.2.2 ------------- > <? xml version="1.0" encoding="utf-8" ?> The white space between the "?" and "xml" is invalid; see rule [23] of [XML1.0]. Fixed. Section 6.3 ----------- > A CHECKOUT request does not require a lock token, since it does not > modify an existing target. Yes, but even if it did modify an existing target, it would not require a lock token, as long as the target is not locked, right? Yes. (I believe this fact is compatible with the quoted statement). > 405 (Method Not Allowed): The request-URL did not identify a versioned > resource whose target is a revision. "whose" is misleading: the target is determined not only from the versioned resource, but also from the target selector. Hence, I would like to suggest the following wording: "The request-URL MUST identify a versioned resource. The target of this resource as determined from the Target-Selector header MUST be a working resource". The target actually can depend on a variety of things (i.e. the Target-Selector header, the default target, the default workspace for the URL, and a Workspace header). Since the term "target" is defined to include all those things, I believe it is unambiguous to refer to the "target of a versioned resource". Section 6.4 ----------- > If the request-URL is write-locked for the specified target selector, > the CHECKIN request MUST include the appropriate lock token. This is (at least for me) misunderstanding. How about the following wording: "The request-URL together with the target selector determine a stable URL for the new revision. If this stable URL is write-locked, the CHECKIN request MUST include the appropriate lock token." The request-URL with the Target-Selector do not form a stable reference to a particular version (which is what a stable URL is), since the Target-Selector could specify a label. When you say it is a misunderstanding, did you mean that it wasn't clear what was being said, or that you thought it meant something else? > If the request body of the CHECKIN request contains a > DAV:keep-checked-out element, then instead of deleting the working > resource, CHECKIN will modify the DAV:predecessor-set of the working > resource to contain a stable URL for the newly created revision. Does that imply that you can do a series of CHECKIN requests without any intermediate CHECKOUT? Yes. > <!ELEMENT checkin-policy (keep-checked-out|overwrite|ANY)* > This is an invalid element type declaration: mixed content can not be mixed with "ANY"; see rule [46] of [XML1.0]. I'll just get rid of the ANY. None of the WebDAV DTD's are suitable for a validating parser, so adding an "ANY" doesn't have much point (extensions will be adding in new types of nodes to everything we're defining here). > 201 (Created): The revision was successfully created. Why not use 200 (Ok)? Creating a new revision occurs in conjunction with deleting the working resource (unless keep-checked-out is set); hence, isn't it a transport of data rather than a creation? The revision is very different from the working resource in behavior (for one thing, it has a stable URL), so it's probably worth flagging this as a creation event (and returning the stable URL in a Location header). > 405 (Method Not Allowed): The request-URL does not identify a > versioned resource whose target is a working resource, or the > specified DAV:checkin-policy elements are invalid. ... or DAV:overwrite was applied on a server that does not support mutable revisions. This will get fixed when we merge the status returns into the precondition section. Section 6.6 ----------- > Unlike a resource property, a report depends on more than the state of > the resource identified by the request-URL. Could you give an example? The DAV:property-report can depends on not only the resource itself, but also the resources identified by href's that are values of properties of that resource. Section 6.6.1 ------------- > >>RESPONSE > > HTTP/1.1 200 OK > > Host: www.webdav.org Host is a request header only, not a response header. Fixed. Section 6.6.2 ------------- > Many properties consist of a set of one or more href elements. Probably this should read: "Many of the properties specified in sections 3.2 through 3.5 consist ... ". Or is property-report designed to handle user-defined properties as well? It is designed to handle user-defined properties as well. > The elements of a DAV:property-report identify which properties of the > resource are to be reported. "the resource" should be replaced with "the resources" or "each resource". Fixed. > >>REQUEST > ... The example still contains identifiers from earlier versions of the draft: replace "<D:revisions>" with "<D:revision-set>" (2 times) replace "<D:revisions/>" with "<D:revision-set/>" (2 times) replace "<D:revision>" with "<D:revision-id>" (2 times) replace "<D:revision/>" with "<D:revision-id/>" (2 times) Fixed. Section 18 ---------- > All other IANA considerations mentioned in [RFC2518] also applicable Typo: replace "applicable" with "apply". Fixed. ########################################################################## Bye, Juergen And many thanks for a careful and detailed review!! Cheers, Geoff