From: jamsden@us.ibm.com
To: ietf-dav-versioning@w3.org
Message-ID: <8525678A.0077C4E2.00@d54mta03.raleigh.ibm.com>
Date: Tue, 8 Jun 1999 17:45:38 -0400
Subject: 5/26 Working Group Meeting - Concord MA

---------------------- Forwarded by Jim Amsden/Raleigh/IBM on 06/08/99 05:43 PM ---------------------------

Discussion Main Topic: Jim Amsden/Raleigh/IBM, 05/26 08:45 AM
Subject: 5/26 Working Group Meeting - Concord MA
Category: Working Group Meetings

The DELTA-V (WebDAV Versioning design team) held a meeting on 5/26 and 5/27 in Concord MA. Thanks to Rational and Geoff Clemm for excellent accommodations in a wonderfully historic site. The following contents are the meeting notes.

Attendees: Chris Kaler, Jeff McAffer, Bruce Kragan, Jim Amsden, Geoff Clemm, David Durand, John Vasta Rational

** indicates action item
*** issue needing discussion before adjourning

Discussed minor changes to Goals document.

** Chris will merge the definitions in goals document with the versioning overview in model document into the introduction part of the protocol document.

We will try to get updates of all documents ready for Oslo IETF.

Versioning Overview

Chris doesn't like checkin to create a new versioned resource. Wants a separate method to create a new versioned resource to avoid overloading checkin, simplify implementation, and allow a depth header to check in a whole collection. Do we even want to allow shallow checkin of a collection to create a new versioned resource? Can we have a versioned collection without versioned members?

It's not clear that checking in an unversioned resource is that much of an overload of checkin. The semantics are similar in that a new revision is created and put in the checked in state.
There is no ambiguity or extra information required for a server to detect that the checkin is being applied to an unversioned resource. A depth header could be added to checkin in order to check in all checked out members in a single operation. Adding new methods is a significant overhead in HTTP because the interaction with all headers must be defined.

How do URLs map after checkin: the original, human URL that was used to checkin the revision, a particular revision URL, a URL corresponding to the versioned resource as a whole and its history? There are three possible URLs to consider:

1) The URL that was used to create the versioned resource in the first place, known in the protocol as the versioned resource URL. That is, the human meaningful URL that is known to users. This URL must be used in conjunction with a revision selector to access a particular revision. The server maps the URL to a versioned resource while the revision selector specifies the target revision.

2) The server generated URL for a particular revision of the versioned resource. This can be used by down-level, non-versioning aware clients to access particular revisions outside of that selected by the default workspace.

3) A URL of the versioned resource as a whole that can be used to access meta-data (properties) about all revisions such as the revision history, versioning properties, deletion of the versioned resource as a whole including all its revisions, etc. This URL is called the "history" resource in the current protocol document.

There is no URL to a versioned resource without a workspace. Do we need one? For down-level clients?

Can a checkout create a working resource on some other platform, a local copy? We are proposing a server-generated URL for each revision that could play this role. Checkout creates a working resource on the server. A client is free to copy this resource to a local file or WebDAV repository if desired.

Auto versioning isn't mentioned.
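As an illustration of the three kinds of URL discussed in these notes, here is a minimal sketch in Python. It is not taken from the protocol document: the `ToyServer` class, the `/_revisions` and `/_history` URL layouts, integer revision ids, and the rule that an absent selector picks the latest revision are all assumptions made for the example.

```python
class ToyServer:
    """Toy in-memory server; every URL layout here is a hypothetical choice."""

    def __init__(self):
        # human-meaningful URL -> list of revision contents, oldest first
        self.versioned = {}

    def checkin(self, url, content):
        """Store a new revision and return its 1-based revision id."""
        revisions = self.versioned.setdefault(url, [])
        revisions.append(content)
        return len(revisions)

    def get(self, url, revision=None):
        """(1) Versioned resource URL plus a revision selector.

        Assumption: with no selector, the latest revision is the target.
        """
        revisions = self.versioned[url]
        index = len(revisions) if revision is None else revision
        return revisions[index - 1]

    def revision_url(self, url, revision):
        """(2) Server-generated URL naming exactly one revision."""
        return f"/_revisions{url}/{revision}"

    def get_by_revision_url(self, revision_url):
        """Resolve a revision URL without any workspace or selector."""
        path, _, rev = revision_url[len("/_revisions"):].rpartition("/")
        return self.get(path, int(rev))

    def history_url(self, url):
        """(3) URL of the versioned resource as a whole."""
        return f"/_history{url}"

    def history(self, history_url):
        """List the revision ids recorded for the versioned resource."""
        url = history_url[len("/_history"):]
        return list(range(1, len(self.versioned[url]) + 1))


server = ToyServer()
server.checkin("/docs/a.html", "first draft")
server.checkin("/docs/a.html", "second draft")
print(server.get("/docs/a.html", revision=1))
print(server.get_by_revision_url(server.revision_url("/docs/a.html", 2)))
print(server.history(server.history_url("/docs/a.html")))
```

A down-level client would only ever use the second kind of URL, which is why the sketch resolves it without consulting a selector.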
"or" and "merge" not described well in Selecting Revisions Through the Workspace. Grafted in without adequate description. This needs a better write-up.

The overview doesn't describe how to create a workspace or activity. Use PROPPATCH to create resources? Protocol document uses MKRESOURCE for this.

Need to explain versioned collections. Review versioned collections write-up especially with respect to baselines.

Chris thinks merging should be optional. He thinks we might not be defining the conflicts properly or there may be many ways to determine conflicts. Too complex. Could be shortened up. The current model does make the assumption that a successor logically contains its predecessor. This was considered to be a core concept for versioning model and for determining merge conflicts.

** Chris will merge this overview section into the protocol document and will put in sections introducing and describing levels. Note below that we decided to split versioning into core versioning capabilities with options instead of defining multiple levels. See below for details.

Advanced Collections overview

Geoff gave an update on the advanced collections protocol, especially the new BIND method and semantics.

Can say the collection is ordered. Ordered will be maintained, and client can insert and delete in an order. The ordering is client defined, server maintained.

Binding is the ability to cause two URLs to map to the same resource under client control. Mapping: URL-->resource. State of a collection is a set (possibly ordered) of bindings, a URI segment --> resource. A null resource is represented by a URL that is not mapped to any resource. PUT or COPY to a null resource creates a new resource and a binding in the parent collection.

In addition to PUT and COPY which create new resources, advanced collections introduced the BIND method to create a new binding to an existing resource. BIND <URL existing resource> with <new destination URL>.
For example: BIND /home/test/x.html with /x/y/zz.html would fail if /home/test/x.html is not currently mapped to a resource. Problem with direction of resource vs. new URL. BIND implies opposite direction. MKALIAS might be better. Thinking of it as bind with, not bind to, helps. If /x/y/zz.html already exists, it will be rebound. Can be controlled by overwrite header. Say /home/test/x.html is mapped to resource R13. After bind, /x/y/zz.html will also be mapped to R13. The collections x and y would have to exist.

PROPFIND depth infinity must detect cycles.

GET on any URL bound to a resource returns the same resource. Same with PROPFIND. There are no separate properties.

DELETE removes the binding from its parent collection. There is no implied destruction of the resource state. Server is free to remove resource only if there are no further bindings. There is an optional back reference property so a resource can know its bindings. Absence of this property cannot be used to determine non-existence of bindings. Garbage collection is the server's problem. Can refuse to bind or allow delete if server can't enforce semantics.

There is a new header on DELETE to remove all bindings. Server can fail this if it can't do it. There is a problem with proxies removing headers they don't understand. This is a generic problem with WebDAV that may require updated proxies for authoring servers.

BIND /x/y/ with /a/b/ means if /x/y/ maps to collection CR-97, then after the bind, /a/b/ is also mapped to CR-97. CR-97's state is a set of bindings. Say CR-97 has members a.html --> R21. Then /a/b/a.html is mapped to R21.

BIND /x/y/ with /x/y/subx/ can create a cycle. /x/y/ --> CR-97. So is /x/y/subx/ --> CR-97. Note CR-97's state has been changed to contain itself. Must detect these cycles on any depth infinity operation.

Redirect references are retained. Is a separate resource with its own properties. Redirect is responsibility of client.
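The binding model in the BIND discussion above (a collection's state is a set of segment-to-resource bindings, BIND adds a second binding to an existing resource, and depth infinity operations must detect cycles) can be sketched roughly as follows. This is an illustrative Python model only, not the advanced collections protocol: the class and function names are hypothetical, and the `overwrite` flag stands in for the overwrite header mentioned in the notes.

```python
class Resource:
    """A resource has identity of its own; URLs are only bindings to it."""

    def __init__(self, name):
        self.name = name


class Collection(Resource):
    """A collection's state is a set of bindings: URI segment -> resource."""

    def __init__(self, name):
        super().__init__(name)
        self.bindings = {}


def split(url):
    """Break a URL path into its non-empty segments."""
    return [seg for seg in url.split("/") if seg]


def resolve(root, url):
    """Map a URL to a resource, or return None for a null resource."""
    node = root
    for seg in split(url):
        if not isinstance(node, Collection) or seg not in node.bindings:
            return None
        node = node.bindings[seg]
    return node


def bind(root, existing_url, new_url, overwrite=True):
    """BIND <existing_url> with <new_url>: bind new_url to the same resource."""
    target = resolve(root, existing_url)
    if target is None:
        raise LookupError(f"{existing_url} is not mapped to a resource")
    *parent_segs, last = split(new_url)
    parent = resolve(root, "/".join(parent_segs))
    if not isinstance(parent, Collection):
        raise LookupError(f"parent collection of {new_url} does not exist")
    if last in parent.bindings and not overwrite:
        raise FileExistsError(f"{new_url} is already bound")
    parent.bindings[last] = target  # rebinds if the segment was already bound


def walk(resource, seen=None):
    """Depth-infinity traversal that visits each resource once.

    Tracking visited resources by identity is what stops a cycle; it also
    means a resource reachable through two bindings is reported only once.
    """
    seen = set() if seen is None else seen
    if id(resource) in seen:
        return []
    seen.add(id(resource))
    names = [resource.name]
    if isinstance(resource, Collection):
        for child in resource.bindings.values():
            names.extend(walk(child, seen))
    return names
```

With /x/y/ mapped to a collection named CR-97, calling `bind(root, "/x/y/", "/x/y/subx/")` makes CR-97 contain itself, and `walk(root)` still terminates because CR-97 is only visited the first time it is reached.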
Server never implicitly does redirects as was the case with direct references.

Versioning Protocol

Next we walked through the draft-ietf-webdav-versioning-01.3 document.

Organization of document needs to be considered. Need to put in template for each method specifying preconditions, semantics, postconditions, error codes, etc. Complete one method to establish the template and set expectation.

Need to think about how the organization should be influenced by leveling. The current spec splits versioning and configurations across leveling but leveling issues have been re-introduced so this organization may not reflect leveling properly. We are considering a core set of versioning capabilities with options for more advanced functions.

Use of "-" in property names and terms needs to be consistent with DAV and English.

Names for levels vs. classes for DAV conformance. Should we be consistent with the established mechanism for denoting conformance, or does versioning need to introduce a new mechanism? We'll probably want to introduce a class 3 server which supports core versioning and some means of specifying the supported options. Perhaps simply through the methods returned by OPTIONS.

Need to specify valid data types for each property.

Auto-versioning must also support DAV class 1 and 2 clients, not just HTTP/1.1.

How does a null resource relate to a lock null resource? It's a URL that responds with a 404 not found for all methods. A lock null resource can respond to PROPFIND.

Is a revision name (id or label) a string? If so what is the encoding? How will encodings be handled in labels? Will they be URIs?

Does a working resource have state on the server? Most members hope it does. Bruce has some issues with this. Clients are free to create local copies for disconnected work and local editing. However, we would like to have it be possible to access the working resource remotely too to support remote authoring.

Do we need a definition of "Default Target"?
This is part of the issue of defining target selection for the human meaningful URL for a versioned resource in the context of a workspace.

Can regular resources (not collections) have baselines? A configuration for a non-collection resource is the same as a label of the non-collection resource, so baseline is not necessary.

Will level 2 have lots of options? Need another level? Activities, configurations, versioned collections, baselines? Consider deleting anything that might be an option to simplify the protocol. See the discussion below on core versioning and options.

History URL is the URL that refers to the versioned resource as a whole. It is not mapped by the workspace. Revision selectors have no effect. This is not a versionable resource. Is generated by the server. Can be used to get the history. Can only do PROPFIND and PROPPATCH on the resource bound to a history URL.

Have 3 URLs: the versioned resource URL, which is the human URL and is mapped by the workspace to a revision; the URL of a particular revision; and the URL of the history resource. See discussion below on new names for these three concepts.

Consider removing repository. Not clear it is needed, or the only way to accomplish its role. Related to putting properties on the document root, "/". Need a place to put server properties. Probably could use OPTIONS. See below in the issues section for further details.

How are property attributes discovered and set? This is schema management which may not be a specification issue. However, we need to define the schema for our properties, and it would be useful if such schema specifications were discoverable by clients. It is also difficult to implement servers without knowing these property attributes in some inter-operable way.

*** Property collections could be replaced by using XML documents. Chris proposed that the concept of property collections should be removed from the protocol.
A property collection is a read-only, or live property, managed by the server whose value is the (href) URL of some collection.

Issues: Introduces a new resource type with restrictions. Introduces URLs that must be managed by the server. Requires extra round trips. But goes both ways. Collections aren't as extensible as a property. Editing and updating is difficult through a collection. But there is no way to specify updates to an XML document. Encapsulation is better with XML document. Extensibility can be accomplished by adding properties to the collections. There may be bandwidth and processing issues. Intuitiveness? Chris's server implementation cannot use property collections.

** Geoff and Chris will get together to resolve this. David has some interest in helping.

Resolution: any properties that were property collections will be XML elements that may contain an optional href that will let that XML document be treated as a collection which can be updated with the usual collection operations. Servers that don't support the href must provide the information in the XML document. Is it one property with an href, or is it two properties? Is redundant information being maintained? Not necessarily. These are just two representations of the same data maintained by the server. They provide different ways to interchange and update versioning data.

DAV:workspaces is the same as DAV:checkout-discovery.

Missing a property for DAV:history-resource? Is DAV:linear-history the same thing? Should this be DAV:history? Does this require 2 round trips? We had a goal to get it in one.

Can DAV:successors contain the revision ids instead of bindings to the revisions? Same for DAV:predecessor and DAV:merge-predecessors? That is, for the href to the property collection, can the members of that collection be the revision id?

Is DAV:last-checkin the same as DAV:creation-date?

When is the revision id allocated, checkin or checkout? Or does this need to be defined?
*** Two questions on revision history that need to be resolved: how do we get it (property of a revision, or a resource one does a PROPFIND on) and what does it contain (fixed information, or extensible).

DAV:checkin-policy is a possible candidate for removal for protocol simplification. Clients can effect these policies if they want, but other clients might not. So some feel it should be an enforced server policy so all clients are forced to do it.

Need to examine the interaction with baseline creation (deep checkin).

Need to define what properties are used to establish equality of revisions with working resources. For example, do live properties or mutable properties get included in determining equality?

Checkin must be atomic.

Be consistent about id, URL, and URN.

What is DAV:history-id for? To determine equality across servers.

Introduce both DAV:current-label and DAV:current-activity. Level 2 servers support both.

Does a member of a versioned collection need to be a versioned resource? Current spec says that a PUT to create a new member of a versioned collection (that is checked out to a working collection resource) creates a versioned resource. Is it checked in? What are its contents? Empty? Properties?

Put on dav mailing list to allow PROPPATCH to create a resource. See if MKRESOURCE could be removed and just use PROPPATCH. Some properties must be set at resource creation because they should not be changed once the resource has been created. For example, collection, activity, workspace, configurations, etc. These are resource types defined by WebDAV and WebDAV needs to manage them.

Default target didn't seem like a good term for the resource selected by the workspaces. Consider using selection instead of target for the thing the workspace selects, either a working resource or a selected revision? Need to decide on a name and define it properly. See below for a resolution of this issue.
Need to define what GET and PUT mean for workspaces, activities, and configurations.

Might need a confirmation header for delete on a workspace to prevent lots of working resources from being deleted.

Can an activity be the current activity in more than one workspace at a time?

Not clear what properties are copied when a revision is copied.

Consider putting a depth header on checkin/checkout as a convenience. Checking in a collection with depth infinity would checkin this collection, and any of its checked out members.

If we are restricting where in the URL namespace some resource types can be created, how do we find out where? Can the server restrict where resource types can be created? Would this break functional cohesion desired by resource authors? Can we resolve this by allowing servers to create bindings if they want to access resource types from some well-known place? Can servers deny binds to workspaces and activities?

*** Creation of workspaces as light-weight tokens needs to be written up in more detail. The light-weight workspaces are created when? Deleted when? Reusable by the client after the first one is returned? We decided to introduce simple workspace and extended workspace where extended workspace allows RSR. Simple workspaces are created with MKRESOURCE or on checkout, can be reused and specified on other checkouts. When are they deleted?

Internationalization issues?

Configuration Management Properties

*** DAV:workspaces introduces a problem in that we are specifying how servers have to organize workspaces so they can be located. This is a general query problem that applies to all resource types. Doing it for workspaces is a special case.

History resource id is a reference to something that contains a list of the revisions. It is the versioned resource this revision is a revision of.

Chris indicated there may be a need to maintain other kinds of history like "URL histories", what a URL referenced at some time. For example, /a/b --> vr75.
At some point, b is deleted and recreated to be /a/b --> vr99. Want to maintain the history of what /a/b was. This is similar to maintaining the history of what revisions were selected by a workspace. We agreed that servers are free to extend workspaces with properties to maintain this history, but this would not be discussed in the protocol. These properties could be published in order to facilitate interoperability. The above example is captured in the history of the parent collection /a.

Have to define the format of all versioning properties.

History resource id could be either an id or URL or both. Need to decide which. Agreed that hrefs would be better because a client can do something with them, and down-level clients don't have to interpret ids.

There is no merge method because all that is needed is to update the DAV:merge-predecessors property. Updating the merge-predecessor must automatically update the corresponding merge-successor. Alternatively make merge-successor read only. Decided update can only be done on merge-predecessor.

Needed-activities might be better named required activities or prerequisites. Dependents is direction ambiguous. Putting a required activity in a workspace RSR implies merging its required activities. Need to use the same name for dependent configurations.

An activity can only be the current activity of one DAV:workspace at a time. Activities capture a unit of work which support better conflict detection.

** Geoff will examine the places in the protocol where he counts on an activity only being current in one workspace at a time and determine the effect of relaxing this constraint. Allowing users to share activities in different workspaces enables proactive parallel development. It's dangerous to allow checkouts to automatically create branches that will need to be merged later.

The current protocol does not specify a method for adding or removing a revision to/from a configuration.
Consider using BIND to put a revision in a configuration. The BIND puts an entry in the DAV:roots property and puts the selected revision in the configuration. Putting a collection into the configuration recursively puts all its members into the configuration but only adds one entry to the DAV:roots.

The issue of leveling has come up again now that we have a clearer understanding of workspaces, activities, and configurations. The issue needs to be examined again as there is a possibility that level 2 branding may apply to a very small number of servers, compromising interoperability. See the issues section below for a resolution of this issue.

*** Think about another name for History Resource. It corresponds to the versioned resource as a whole, not a particular revision or working resource. See issues section below for a resolution.

Some of the properties of a repository are really properties of the server, independent of any particular repository. For example, default workspace, workspaces, etc. Perhaps these need to be in some well-known place like /server/workspaces, etc. Users could create the workspaces in some other more logical location, but the server would create a binding in its well-known place. The difference between a workspace and say a .html file is that workspaces are used by WebDAV semantics while .html files aren't. This is the motivation for being able to locate these resources.

*** Need to find a place to get server properties like where are the workspaces, etc. Use OPTIONS? See the issues section for further details.

GET-CONFLICTS should be just CONFLICTS. Can't start a method name with a prefix equal to some other method.

Merge support is another candidate for being optional.

COMPARE might be another option candidate. We're starting to get a lot of these. Perhaps compare is an example of a more general problem for getting reports. This method could be REPORT with a header describing the report type.
Then the report type and result body can be defined as extensions that are described in a manner similar to property schemas. Web servers already have extensibility mechanisms like CGI or servlets. We shouldn't create another extensibility mechanism specific to DAV. Another option is to allow functions to be defined using XML whose bodies are either scripts, or server implemented. Jim Amsden has a proposal for extending XML with behavior called Dynamic XML (DXML) which might be useful for supporting generic, user-extensible WebDAV reports.

Issues/Actions for next rev of spec:

Do one method completely to show the method template and establish an expectation for all the methods. Add the template to all other methods, but it is not necessary to fill them all in for the next revision of the draft.

Decide on leveling, especially as it affects spec organization

More levels? Optional functionality? Delete optional functionality and let servers add it? The issue is that level two is too hard to do in a scalable way, or is not required by most users so there won't be many servers that will ever implement it.

Jeff: define base versioning and have each additional capability be optional. This could result in combinatoric expansion of client implementations. Each feature would need to be orthogonal to make this tractable.

Decision: start with DAV versioning core (what's currently in level 1) with options:
- extended workspaces (with modifiable revision selection rules)
- activities
- configurations
- versioned collections
- baselined collections: requires configurations and versioned collections
- merging: requires extended workspaces
- compare/reporting: requires configurations as defined
x checkin policy: discovery is in level 1 and all entries are always optional
x collections behind "property collections": discovery information must be in level 1 anyway

Resolve use of '-' in property name and terminology

Property names should look like DAV.
See if the mailing list cares if '-'s are used in property names. The documents should not use words in inconsistent ways whether they are hyphenated or not. Use proper English conventions for '-'s in the document.

For the property collection issue, is there one property with an href to the collection, or separate properties

Why are two ways to get this information required? There is a set of information we want to get and sometimes edit. For example the merge-predecessors of a resource. This information is accessed through properties. Accessing the properties returns an XML document containing the data. For example:

    <predecessors>
      <predecessor href="..." revisionId="..."/>
    </predecessors>

This has all the information needed and is extensible and internationalized. Modifying this requires lock, propfind, proppatch, and unlock, lots of round trips. Document may be large, and editing through DOM is an issue. But it is easy to return to the client and display.

Solve these problems by introducing a property resource to identify a resource that captures the same information and is edited with existing collection and resource editing mechanisms.

    <predecessors>
      <property-resource>http://server/repo/aa/bb/</property-resource>
      <predecessor href="..." revisionId="..."/>
    </predecessors>

or double the number of properties

    <property-resource>http://server/repo/aa/bb/</property-resource>
    <predecessors>
      <predecessor href="..." revisionId="..."/>
    </predecessors>

Edit by adding or removing bindings in the collection. No need to lock because bind is atomic. Limits the number of round trips. Allows PROPFIND to get extended information.

Having both is not necessarily redundant data. These are just different representations of the same data.

Resolution is to present and manipulate the data through both views, and see if the mailing list members prefer one or the other or both. Use a header to select which view to return. Never return both.
Propfind allprop returns the default which is the XML document.

Decide on name for selected revision (target).

Resolution: keep target but define it a little better. Overview should introduce it.

How is server versioning meta-data discovered? Properties on a distinguished collection? On '/'? With OPTIONS?

Consider two approaches: implicit property on all resources, or OPTIONS on a resource or *. Properties can be structured, options can't. Could consider extending options to take and return an XML request/response body. Use PROPFIND semantics to specify the body. Options exists because properties didn't. OPTIONS * is the only way to talk to the server. Everything else is to a resource.

Resolution: use OPTIONS with PROPFIND entity request body on * for server and resources for meta-data on resources. Returns server/versioning meta-data. Don't want this information on allprop. Provides extensibility for OPTIONS.

Geoff wants to use a repository object so it can be a resource to provide extensibility. Want to put in the repository: workspaces, activities, history resources, configurations, ...? Discover the location of the repositories from OPTIONS *.

DAV:checkout-discovery vs. DAV:workspaces: what is the name of the property and what is returned.

Two questions: 1) what workspace is this resource checked out in, and 2) what resources are checked out in this workspace? DAV:resources answers 1, DAV:checkout-discovery answers both.

Resolution: do DAV:checkout-discovery, Geoff will see if there are any problems.

Decide on a name for the versioned resource as a whole, currently called a history resource. Use the new name in the *-id property names.

Resolution: use resource, revision, and versioned resource instead of versioned resource, revision, history resource. In cases where the resource must be a revision, note it.

compare vs. generic report.

Chris's suggestion: REPORT is a method that has a request URL. URL determines what the report is on.
Entity request body specifies report type and any additional required parameters. Entity response body is returned. Have some way to discover what reports are available. Only server implementer can extend report types. Specify a report type for compare. Probably should be doing history this way too. Don't use it for something that can be done with a PROPFIND. That is, there are no executable semantics associated with the report type that need to be calculated. Looks like there is some overlap with this and PROPFIND. Reports need to be read-only, require calculation, need parameters, don't need to be searched with DASL, etc. Otherwise use property. Use a property unless you can't.

Consider using DXML for a formalized, extensible way of handling reports and other behavior.

Can servers restrict where resource types can be mapped? Workspaces, activities, etc. If so, how do we find them? (Related to OPTIONS/repository/meta-data properties issue) Can servers deny BIND to them?

Can users create these resources (with MKRESOURCE) in their namespaces and the server creates its own binding for its own use? Geoff will see if this restriction is required.
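
To make Chris's REPORT suggestion from the notes above more concrete, here is a rough Python sketch of how a client might assemble such a request: the request URL says what the report is on, and the entity body carries the report type plus any parameters. The element names (`report`, `type`, `with`), the use of the DAV: namespace for them, and the function name are all assumptions for illustration; the notes do not define a body format.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # namespace used for WebDAV XML elements


def build_report_request(url, report_type, params):
    """Return (request_line, body) for a hypothetical REPORT request.

    The request URL determines what the report is on; the XML body names
    the report type and carries any additional parameters.
    """
    root = ET.Element(f"{{{DAV_NS}}}report")
    ET.SubElement(root, f"{{{DAV_NS}}}type").text = report_type
    for name, value in params.items():
        # Parameter element names are invented for this sketch.
        ET.SubElement(root, f"{{{DAV_NS}}}{name}").text = value
    body = ET.tostring(root, encoding="unicode")
    return f"REPORT {url} HTTP/1.1", body


request_line, body = build_report_request(
    "/docs/a.html", "compare", {"with": "/docs/b.html"}
)
print(request_line)
print(body)
```

Keeping the report type in the body rather than in the method name is what would let a server implementer add report types without adding methods, which matches the concern about COMPARE being one of many optional methods.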