5/26 Working Group Meeting - Concord MA

jamsden@us.ibm.com
Tue, 8 Jun 1999 17:45:38 -0400


From: jamsden@us.ibm.com
To: ietf-dav-versioning@w3.org
Message-ID: <8525678A.0077C4E2.00@d54mta03.raleigh.ibm.com>
Date: Tue, 8 Jun 1999 17:45:38 -0400
Subject: 5/26 Working Group Meeting - Concord MA




---------------------- Forwarded by Jim Amsden/Raleigh/IBM on 06/08/99 05:43 PM
---------------------------
Discussion Main Topic: Jim Amsden/Raleigh/IBM, 05/26 08:45 AM
Subject:  5/26 Working Group Meeting - Concord MA
Category: Working Group Meetings




The DELTA-V (WebDAV Versioning) design team held a meeting on 5/26 and 5/27
in Concord MA. Thanks to Rational and Geoff Clemm for excellent accommodations
in a wonderfully historic site. The meeting notes follow.

Attendees:

Chris Kaler
Jeff McAffer
Bruce Kragan
Jim Amsden
Geoff Clemm
David Durand
John Vasta Rational

** indicates action item
*** issue needing discussion before adjourning

Discussed minor changes to Goals document.

**Chris will merge the definitions in goals document with the versioning
overview in model document into the introduction part of the protocol document.

We will try to get updates of all documents ready for Oslo IETF.

Versioning Overview

Chris doesn't like checkin to create a new versioned resource. Wants a separate
method to create a new versioned resource to avoid overloading checkin, simplify
implementation, and allow a depth header to check in a whole collection. Do we
even want to allow shallow checkin of a collection to create a new versioned
resource? Can we have a versioned collection without versioned members? It's not
clear that checking in an unversioned resource is that much of an overload of
checkin. The semantics are similar in that a new revision is created and put in
the checked in state. There is no ambiguity or extra information required for a
server to detect that the checkin is being applied to an unversioned resource. A
depth header could be added to checkin in order to check in all checked out
members in a single operation. Adding new methods is a significant overhead in
HTTP because the interaction with all headers must be defined.

How do URLs map after checkin: the original, human URL that was used to checkin
the revision, a particular revision URL, a URL corresponding to the versioned
resource as a whole and its history? There are three possible URLs to consider:
1) The URL that was used to create the versioned resource in the first place,
known in the protocol as the versioned resource URL. That is, the human
meaningful URL that is known to users. This URL must be used in conjunction with
a revision selector to access a particular revision. The server maps the URL to
a versioned resource while the revision selector specifies the target revision.
2) The server generated URL for a particular revision of the versioned resource.
This can be used by down-level, non-versioning aware clients to access
particular revisions outside of that selected by the default workspace. 3) A URL
of the versioned resource as a whole that can be used to access meta-data
(properties) about all revisions such as the revision history, versioning
properties, deletion of the versioned resource as a whole including all its
revisions, etc. This URL is called the "history" resource in the current
protocol document.
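
As a rough illustration of the three kinds of URL listed above, the following
Python sketch models a single versioned resource. The class name, the field
names, and the URL shapes (/_rev/... and /_hist/...) are all hypothetical; the
notes only say that the revision and history URLs are server generated.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedResource:
    """Toy model of one versioned resource and its three kinds of URL."""
    human_url: str      # 1) versioned resource URL known to users
    history_url: str    # 3) server-generated URL for the resource as a whole
    revisions: dict = field(default_factory=dict)   # revision id -> content
    labels: dict = field(default_factory=dict)      # label -> revision id

    def revision_url(self, rev_id: str) -> str:
        # 2) server-generated URL for one revision (the shape is made up)
        return f"/_rev{self.human_url}/{rev_id}"

    def select(self, selector: str) -> str:
        """Resolve a revision selector (an id or a label) to a revision id."""
        if selector in self.revisions:
            return selector
        if selector in self.labels:
            return self.labels[selector]
        raise KeyError(f"no revision matches selector {selector!r}")


doc = VersionedResource("/home/test/x.html", "/_hist/17")
doc.revisions.update({"1": "first draft", "2": "second draft"})
doc.labels["released"] = "1"

rev = doc.select("released")
print(rev, doc.revision_url(rev))
```

A down-level client would be handed the result of revision_url() directly,
while a versioning-aware client would send the human URL plus a selector.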

There is no URL to a versioned resource without a workspace. Do we need one? For
down-level clients? Can a checkout create a working resource on some other
platform, a local copy? We are proposing a server-generated URL for each
revision that could play this role. Checkout creates a working resource on the
server. A client is free to copy this resource to a local file or WebDAV
repository if desired.

Auto versioning isn't mentioned.

"or" and "merge" not described well in Selecting Revisions Through the
Workspace. Grafted in without adequate description. This needs a better
write-up.

The overview doesn't describe how to create a workspace or activity. Use
PROPPATCH to create resources? Protocol document uses MKRESOURCE for this.

Need to explain versioned collections. Review versioned collections write-up
especially with respect to baselines.

Chris thinks merging should be optional. He thinks we might not be defining the
conflicts properly or there may be many ways to determine conflicts. Too
complex. Could be shortened up. The current model does make the assumption that
a successor logically contains its predecessor. This was considered to be a core
concept for versioning model and for determining merge conflicts.

** Chris will merge this overview section into the protocol document and will
put in sections introducing and describing levels. Note below that we decided to
split versioning into core versioning capabilities with options instead of
defining multiple levels. See below for details.


Advanced Collections overview

Geoff gave an update on the advanced collections protocol, especially the new
BIND method and semantics.

Can say the collection is ordered. Order will be maintained, and the client can
insert and delete in an order. The ordering is client defined, server
maintained.

Binding is the ability to cause two URLs to map to the same resource under
client control. Mapping: URL-->resource. State of a collection is a set
(possibly ordered) of bindings, a URI segment --> resource. A null resource is
represented by a URL that is not mapped to any resource. PUT or COPY to a null
resource creates a new resource and a binding in the parent collection.

In addition to PUT and COPY which create new resources, advanced collections
introduced the BIND method to create a new binding to an existing resource.

BIND <URL existing resource> with <new destination URL>. For example: BIND
/home/test/x.html with /x/y/zz.html would fail if /home/test/x.html is not
currently mapped to a resource. Problem with direction of resource vs. new URL.
BIND implies opposite direction. MKALIAS might be better. Thinking of it as bind
with, not bind to, helps. If /x/y/zz.html already exists, it will be rebound.
Can be controlled by overwrite header. Say /home/test/x.html is mapped to
resource R13. After bind, /x/y/zz.html will also be mapped to R13. X and Y would
have to exist. PROPFIND depth infinity must detect cycles.
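
The BIND example above can be sketched in Python as follows. This is only a
toy in-memory model under our own assumptions: the Resource class and the
resolve() and bind() helpers are hypothetical names, and the overwrite flag
stands in for the overwrite header.

```python
class Resource:
    def __init__(self, name, is_collection=False):
        self.name = name
        # A collection's state is its bindings: URI segment -> resource.
        self.members = {} if is_collection else None


def resolve(root, url):
    """Walk the URL one segment at a time; None means a null resource."""
    node = root
    for segment in [s for s in url.split("/") if s]:
        if node.members is None or segment not in node.members:
            return None
        node = node.members[segment]
    return node


def bind(root, existing_url, new_url, overwrite=True):
    """BIND <existing_url> with <new_url>: a second binding, same resource."""
    target = resolve(root, existing_url)
    if target is None:
        raise LookupError(f"{existing_url} is not mapped to a resource")
    parent_url, _, segment = new_url.rstrip("/").rpartition("/")
    parent = resolve(root, parent_url)
    if parent is None or parent.members is None:
        raise LookupError(f"parent collection {parent_url}/ does not exist")
    if segment in parent.members and not overwrite:
        raise FileExistsError(f"{new_url} is already bound")
    parent.members[segment] = target   # rebinds if the segment was bound


root = Resource("root", is_collection=True)
home, test = Resource("home", True), Resource("test", True)
x, y = Resource("x", True), Resource("y", True)
r13 = Resource("R13")
root.members.update({"home": home, "x": x})
home.members["test"] = test
x.members["y"] = y
test.members["x.html"] = r13

bind(root, "/home/test/x.html", "/x/y/zz.html")
print(resolve(root, "/x/y/zz.html").name)   # both URLs now map to R13
```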

GET on any URL bound to a resource returns the same resource. Same with
PROPFIND. There are no separate properties. DELETE removes the binding from its
parent collection. There is no implied destruction of the resource state. Server
is free to remove resource only if there are no further bindings. There is an
optional back reference property so a resource can know its bindings. Absence of
this property cannot be used to determine non-existence of bindings. Garbage
collection is the server's problem. Can refuse to bind or allow delete if server
can't enforce semantics.
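
A minimal sketch of the DELETE semantics described above, assuming a server
that reclaims resource state eagerly once the last binding is gone. The notes
only say the server is free to do so, so the eager policy is an assumption, and
the Store class and its methods are hypothetical names.

```python
class Store:
    """Toy server state: URL bindings plus the resources they point at."""

    def __init__(self):
        self.bindings = {}    # URL -> resource id
        self.resources = {}   # resource id -> content

    def put(self, url, res_id, content):
        self.resources[res_id] = content
        self.bindings[url] = res_id

    def bind(self, existing_url, new_url):
        self.bindings[new_url] = self.bindings[existing_url]

    def delete(self, url):
        """DELETE removes one binding; the resource may outlive it."""
        res_id = self.bindings.pop(url)
        if res_id not in self.bindings.values():
            # No bindings remain, so this server chooses to reclaim the state.
            del self.resources[res_id]


store = Store()
store.put("/home/test/x.html", "R13", "<html/>")
store.bind("/home/test/x.html", "/x/y/zz.html")

store.delete("/home/test/x.html")
print("R13" in store.resources)   # still reachable through /x/y/zz.html
store.delete("/x/y/zz.html")
print("R13" in store.resources)   # last binding removed
```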

There is a new header on DELETE to remove all bindings. Server can fail this if
it can't do it. There is a problem with proxies removing headers they don't
understand. This is a generic problem with WebDAV that may require updated
proxies for authoring servers.

BIND /x/y/ with /a/b/ means if /x/y/ maps to collection CR-97, then after the
bind, /a/b/ is also mapped to CR-97. CR-97's state is a set of bindings. Say
CR-97 has members a.html --> R21. Then /a/b/a.html is mapped to R21.

BIND /x/y/ with /x/y/subx/ can create a cycle. /x/y/ --> CR-97. So is
/x/y/subx/ --> CR-97. Note CR-97's state has been changed to contain itself.
Must detect these cycles on any depth infinity operation.
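
One way a server might detect such a cycle during a depth infinity operation is
to remember the collections on the current path and refuse to re-enter one of
them. The sketch below is an illustration only: plain dicts stand in for
collections, and walk() is a hypothetical helper.

```python
def walk(collection, url, ancestors=()):
    """Depth infinity listing that never re-enters a collection on the path."""
    entries = []
    on_path = ancestors + (collection,)
    for segment, member in sorted(collection.items()):
        if not isinstance(member, dict):
            entries.append((url + segment, member))
        elif any(member is seen for seen in on_path):
            entries.append((url + segment + "/", "cycle"))
        else:
            entries.append((url + segment + "/", "collection"))
            entries.extend(walk(member, url + segment + "/", on_path))
    return entries


cr97 = {"a.html": "R21"}   # collection CR-97 with one member
cr97["subx"] = cr97        # BIND /x/y/ with /x/y/subx/: CR-97 contains itself

for entry in walk(cr97, "/x/y/"):
    print(entry)
```

Tracking only the current path, rather than every collection ever visited,
means a collection that is merely bound at two unrelated URLs is still listed
under both and is not mistaken for a cycle.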

Redirect references are retained. Is a separate resource with its own
properties. Redirect is responsibility of client. Server never implicitly does
redirects as was the case with direct references.


Versioning Protocol

Next we walked through the draft-ietf-webdav-versioning-01.3 document.

Organization of document needs to be considered. Need to put in template for
each method specifying preconditions, semantics, postconditions, error codes,
etc. Complete one method to establish the template and set expectation. Need to
think about how the organization should be influenced by leveling. The current
spec splits versioning and configurations across leveling but leveling issues
have been re-introduced so this organization may not reflect leveling properly.
We are considering a core set of versioning capabilities with options for more
advanced functions.

Use of "-" in property names and terms needs to be consistent with DAV and
English.

Names for levels vs. classes for DAV conformance. Should we be consistent with
the established mechanism for denoting conformance, or does versioning need to
introduce a new mechanism? We'll probably want to introduce a class 3 server
which supports core versioning and some means of specifying the supported
options. Perhaps simply through the methods returned by OPTIONS.

Need to specify valid data types for each property.

Auto-versioning must also support DAV class 1 and 2 clients, not just
HTTP/1.1.

How does a null resource relate to a lock null resource? It's a URL that responds
with a 404 not found for all methods. A lock null resource can respond to
PROPFIND.

Is a revision name (id or label) a string? If so what is the encoding? How
will encodings be handled in labels? Will they be URIs?

Does a working resource have state on the server? Most members hope it does.
Bruce has some issues with this. Clients are free to create local copies for
disconnected work and local editing. However, we would like to have it be
possible to access the working resource remotely too to support remote
authoring.

Do we need a definition of "Default Target"? This is part of the issue of
defining target selection for the human meaningful URL for a versioned resource
in the context of a workspace.

Can regular resources (not collections) have baselines? A configuration for a
non-collection resource is the same as a label of the non-collection resource,
so baseline is not necessary.

Will level 2 have lots of options? Need another level? Activities,
configurations, versioned collections, baselines? Consider deleting anything
that might be an option to simplify the protocol. See the discussion below on
core versioning and options.

History URL is the URL that refers to the versioned resource as a whole. It is
not mapped by the workspace. Revision selectors have no effect. This is not a
versionable resource. Is generated by the server. Can be used to get the
history. Can only do PROPFIND and PROPPATCH on the resource bound to a history
URL. Have 3 URLs, the versioned resource URL which is the human URL and is
mapped by the workspace to a revision; the URL of a particular revision, and the
URL of the history resource. See discussion below on new names for these three
concepts.

Consider removing repository. Not clear it is needed, or the only way to
accomplish its role. Related to putting properties on the document root, "/".
Need a place to put server properties. Probably could use OPTIONS. See below in
the issues section for further details.

How are property attributes discovered and set? This is schema management which
may not be a specification issue. However, we need to define the schema for our
properties, and it would be useful if such schema specifications were
discoverable by clients. It is also difficult to implement servers without
knowing these property attributes in some inter-operable way.

*** Property collections could be replaced by using XML documents. Chris
proposed that the concept of property collections should be removed from the
protocol. A property collection is a read-only, or live property, managed by the
server whose value is the (href) URL of some collection. Issues: Introduces a
new resource type with restrictions. Introduces URLs that must be managed by the
server. Requires extra round trips. But goes both ways. Collections aren't as
extensible as a property. Editing and updating is difficult through a
collection. But there is no way to specify updates to an XML document.
Encapsulation is better with XML document. Extensibility can be accomplished by
adding properties to the collections. There may be bandwidth and processing
issues. Intuitiveness? Chris's server implementation cannot use property
collections.
** Geoff and Chris will get together to resolve this. David has some interest in
helping. Resolution: any properties that were property collections will be XML
elements that may contain an optional href that will let that XML document be
treated as a collection which can be updated with the usual collection
operations. Servers that don't support the href must provide the information in
the XML document. Is it one property with an href, or is it two properties? Is
redundant information being maintained? Not necessarily. These are just two
representations of the same data maintained by the server. They provide
different ways to interchange and update versioning data.

DAV:workspaces is the same as DAV:checkout-discovery.

Missing a property for DAV:history-resource? Is DAV:linear-history the same
thing? Should this be DAV:history? Does this require 2 round trips? We had a
goal to get it in one.

Can DAV:successors contain the revision id's instead of bindings to the
revisions? Same for DAV:predecessor and DAV:merge-predecessors? That is, for the
href to the property collection, can the members of that collection be the
revision id?

Is DAV:last-checkin the same as DAV:creation-date?

When is the revision id allocated, checkin or checkout? Or does this need to be
defined?

*** Two questions on revision history that need to be resolved: how do we get it
(property of a revision, or a resource one does a PROPFIND on) and what does it
contain (fixed information, or extensible).

DAV:checkin-policy is a possible candidate for removal for protocol
simplification. Clients can effect these policies if they want, but other
clients might not. So some feel it should be an enforced server policy so all
clients are forced to do it. Need to examine the interaction with baseline
creation (deep checkin). Need to define what properties are used to establish
equality of revisions with working resources. For example, do live properties or
mutable properties get included in determining equality?

Checkin must be atomic.

Be consistent about id, URL, and URN. What is DAV:history-id for? To determine
equality across servers.

Introduce both DAV:current-label and DAV:current-activity. Level 2 servers
support both.

Does a member of a versioned collection need to be a versioned resource? Current
spec says that a PUT to create a new member of a versioned collection (that is
checked out to a working collection resource) creates a versioned resource. Is
it checked in? What's its contents? Empty? Properties?

Put on dav mailing list to allow PROPPATCH to create a resource. See if
MKRESOURCE could be removed and just use PROPPATCH. Some properties must be set
at resource creation because they should not be changed once the resource has
been created. For example, collection, activity, workspace, configurations, etc.
These are resource types defined by WebDAV and WebDAV needs to manage them.

Default target didn't seem like a good term for the resource selected by the
workspaces. Consider using selection instead of target for the thing the
workspace selects, either a working resource or a selected revision? Need to
decide on a name and define it properly. See below for a resolution of this
issue.

Need to define what GET and PUT means for workspaces, activities, and
configurations.

Might need a confirmation header for delete on a workspace to prevent lots of
working resources from being deleted.

Can an activity be the current activity in more than one workspace at a time?

Not clear what properties are copied when a revision is copied.

Consider putting a depth header on checkin/checkout as a convenience. Checking
in a collection with depth infinity would checkin this collection, and any of
its checked out members.
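
A rough sketch of what a depth header on checkin could mean. The Node class and
the depth_infinity flag are hypothetical, and descending through members that
are not themselves checked out is our assumption. Atomicity, which the notes
require of checkin, is not modelled here.

```python
class Node:
    """A resource that may be checked out; collections carry members."""

    def __init__(self, name, checked_out=False, members=None):
        self.name = name
        self.checked_out = checked_out
        self.members = members or []


def checkin(node, depth_infinity=False):
    """Check in node; with depth infinity, also check in checked out members."""
    done = []
    if node.checked_out:
        node.checked_out = False
        done.append(node.name)
    if depth_infinity:
        for member in node.members:
            done.extend(checkin(member, depth_infinity=True))
    return done


docs = Node("docs/", True, [
    Node("a.html", True),
    Node("b.html", False),
    Node("img/", False, [Node("logo.png", True)]),
])
print(checkin(docs, depth_infinity=True))
```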

If we are restricting where in the URL namespace some resource types can be
created, how do we find out where? Can the server restrict where resource types
can be created? Would this break functional cohesion desired by resource
authors? Can we resolve this by allowing servers to create bindings if they want
to access resource types from some well-known place? Can servers deny binds to
workspaces and activities?

*** Creation of workspaces as light-weight tokens needs to be written up in more
detail. The light-weight workspaces are created when? Deleted when? Reusable by
the client after the first one is returned? We decided to introduce simple
workspace and extended workspace where extended workspace allows RSR. Simple
workspaces are created with MKRESOURCE or on checkout, can be reused and
specified on other checkouts. When are they deleted?

Internationalization issues?

Configuration Management Properties

*** DAV:workspaces introduces a problem in that we are specifying how servers
have to organize workspaces so they can be located. This is a general query
problem that applies to all resource types. Doing it for workspaces is a special
case.

History resource id is a reference to something that contains a list of the
revisions. It is the versioned resource this revision is a revision of.

Chris indicated there may be a need to maintain other kinds of history like "URL
histories", what a URL referenced at some time. For example, /a/b --> vr75. At
some point, b is deleted and recreated to be /a/b --> vr99. Want to maintain the
history of what /a/b was. This is similar to maintaining the history of what
revisions were selected by a workspace. We agreed that servers are free to
extend workspaces with properties to maintain this history, but this would not
be discussed in the protocol. These properties could be published in order to
facilitate interoperability. The above example is captured in the history of the
parent collection /a.

Have to define the format of all versioning properties. History resource id
could be either an id or URL or both. Need to decide which. Agreed that href's
would be better because a client can do something with them, and down-level
clients don't have to interpret ids.

There is no merge method because all that is needed is to update the
DAV:merge-predecessors property. Updating the merge-predecessor must
automatically update the corresponding merge-successor. Alternatively make
merge-successor read only. Decided update can only be done on merge-predecessor.
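
The decision above (only merge-predecessors is updatable, and the matching
merge-successors follow automatically) could be modelled as in this sketch. The
Revision class and its method names are hypothetical and only illustrate the
bookkeeping a server would need.

```python
class Revision:
    """Toy revision whose merge-successors are derived, never set directly."""

    def __init__(self, rev_id):
        self.rev_id = rev_id
        self._merge_predecessors = []
        self._merge_successors = []

    @property
    def merge_predecessors(self):
        return [r.rev_id for r in self._merge_predecessors]

    @property
    def merge_successors(self):
        # Read-only view, maintained by set_merge_predecessors() on others.
        return [r.rev_id for r in self._merge_successors]

    def set_merge_predecessors(self, predecessors):
        # Drop this revision from the successor lists of the old predecessors,
        # then add it to the successor lists of the new ones.
        for old in self._merge_predecessors:
            old._merge_successors.remove(self)
        self._merge_predecessors = list(predecessors)
        for new in self._merge_predecessors:
            new._merge_successors.append(self)


r1, r2, r3 = Revision("r1"), Revision("r2"), Revision("r3")
r3.set_merge_predecessors([r1, r2])
print(r1.merge_successors, r2.merge_successors)
r3.set_merge_predecessors([r2])
print(r1.merge_successors, r2.merge_successors)
```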

Needed-activities might be better named required activities or prerequisites.
Dependents is direction ambiguous. Putting a required activity in a workspace
RSR implies merging its required activities. Need to use the same name for
dependent configurations.

An activity can only be the current activity of one DAV:workspace at a time.
Activities capture a unit of work, which supports better conflict detection.
** Geoff will examine the places in the protocol where he counts on an activity
only being current in one workspace at a time and determine the effect of
relaxing this constraint. Allowing users to share activities in different
workspaces enables proactive parallel development. It's dangerous to allow
checkouts to automatically create branches that will need to be merged later.

The current protocol does not specify a method for adding or removing a revision
to/from a configuration. Consider using BIND to put a revision in a
configuration. The BIND puts an entry in the DAV:roots property and puts the
selected revision in the configuration. Putting a collection into the
configuration recursively puts all its members into the configuration but only
adds one entry to the DAV:roots.

The issue of leveling has come up again now that we have a clearer understanding
of workspaces, activities, and configurations. The issue needs to be examined
again as there is a possibility that level 2 branding may apply to a very small
number of servers, compromising interoperability. See the issues section below
for a resolution of this issue.

*** Think about another name for History Resource. It corresponds to the
versioned resource as a whole, not a particular revision or working resource.
See issues section below for a resolution.

Some of the properties of a repository are really properties of the server,
independent of any particular repository. For example, default workspace,
workspaces, etc. Perhaps these need to be in some well-known place like
/server/workspaces, etc. Users could create the workspaces in some other more
logical location, but the server would create a binding in its well-known place.
The difference between a workspace and say a .html file is that workspaces are
used by WebDAV semantics while .html files aren't. This is the motivation for
being able to locate these resources.
*** Need to find a place to get server properties like where are the workspaces,
etc. Use OPTIONS? See the issues section for further details.

GET-CONFLICTS should be just CONFLICTS. Can't start a method name with a prefix
equal to some other method. Merge support is another candidate for being
optional.

COMPARE might be another option candidate. We're starting to get a lot of these.
Perhaps compare is an example of a more general problem for getting reports.
This method could be REPORT with a header describing the report type. Then the
report type and result body can be defined as extensions that are described in a
manner similar to property schemas. Web servers already have extensibility
mechanisms like CGI or servlets. We shouldn't create another extensibility
mechanism specific to DAV. Another option is to allow functions to be defined
using XML whose bodies are either scripts, or server implemented. Jim Amsden has
a proposal for extending XML with behavior called Dynamic XML (DXML) which might
be useful for supporting generic, user-extensible WebDAV reports.


Issues/Actions for next rev of spec:

Do one method completely to show the method template and establish an
expectation for all the methods.

Add the template to all other methods, but it is not necessary to fill them all
in for the next revision of the draft.

Decide on leveling, especially as it affects spec organization
More levels? Optional functionality? Delete optional functionality and let
servers add it? The issue is that level two is too hard to do in a scalable way,
or is not required by most users so there won't be many servers that will ever
implement it. Jeff: define base versioning and have each additional capability
be optional. This could result in combinatoric expansion of client
implementations. Each feature would need to be orthogonal to make this
tractable. Decision: start with DAV versioning core (what's currently in level 1)
with options:
   - extended workspaces (with modifiable revision selection rules)
   - activities
   - configurations
   - versioned collections
   - baselined collections: requires configurations and versioned collections
   - merging: requires extended workspaces
   - compare/reporting: requires configurations as defined
   x checkin policy: discovery is in level 1 and all entries are always optional
   x collections behind "property collections": discovery information must be in
level 1 anyway
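
The dependencies between the options listed above can be written down as a
small table and checked mechanically. The sketch below is illustrative only:
the option tokens are names made up for this example, not identifiers from the
protocol document, and missing_prerequisites() is a hypothetical helper.

```python
# Option -> options it requires, as recorded in the list above.
REQUIRES = {
    "extended-workspaces": set(),
    "activities": set(),
    "configurations": set(),
    "versioned-collections": set(),
    "baselined-collections": {"configurations", "versioned-collections"},
    "merging": {"extended-workspaces"},
    "compare-reporting": {"configurations"},
}


def missing_prerequisites(supported):
    """Return {option: missing prerequisites} for a set of supported options."""
    supported = set(supported)
    problems = {}
    for option in supported:
        missing = REQUIRES[option] - supported
        if missing:
            problems[option] = sorted(missing)
    return problems


print(missing_prerequisites({"merging", "activities"}))
print(missing_prerequisites({"configurations", "versioned-collections",
                             "baselined-collections"}))
```

A client could run the same check against whatever a server advertises to
decide which features it can safely rely on.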

Resolve use of '-' in property name and terminology
Property names should look like DAV. See if the mailing list cares if '-'s are
used in property names. The documents should not use words in inconsistent ways
whether they are hyphenated or not. Use proper English conventions for hyphens
in the document.

For the property collection issue, is there one property with an href to the
collection, or separate properties?
Why are two ways to get this information required? There is a set of information
we want to get and sometimes edit. For example the merge-predecessors of a
resource. This information is accessed through properties. Accessing the
properties returns an XML document containing the data. For example:
  <predecessors>
      <predecessor href="..." revisionId="..."/>
  </predecessors>
This has all the information needed and is extensible and internationalized.
Modifying this requires lock, propfind, proppatch, and unlock, lots of round
trips. Document may be large, and editing through DOM is an issue. But it is
easy to return to the client and display. Solve these problems by introducing a
property resource to identify a resource that captures the same information and
is edited with existing collection and resource editing mechanisms.
  <predecessors>
      <property-resource>http://server/repo/aa/bb/</property-resource>
      <predecessor href="..." revisionId="..."/>
  </predecessors>
or double the number of properties
  <property-resource>http://server/repo/aa/bb/</property-resource>
  <predecessors>
      <predecessor href="..." revisionId="..."/>
  </predecessors>
Edit by adding or removing bindings in the collection. No need to lock because
bind is atomic. Limits the number of round trips. Allows PROPFIND to get
extended information. Having both is not necessarily redundant data. These are
just different representations of the same data. Resolution is to present and
manipulate the data through both views, and see if the mailing list members
prefer one or the other or both. Use a header to select which view to return.
Never return both. Propfind allprop returns the default which is the XML
document.
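
To show how a client might consume the XML view while still coping with servers
that omit the optional property-resource element, here is a short Python
sketch. The href and revisionId values are invented for the example (the notes
leave them as "..."), and read_predecessors() is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Example property value; the href and revisionId values are made up.
PROPERTY_VALUE = """
<predecessors>
    <property-resource>http://server/repo/aa/bb/</property-resource>
    <predecessor href="http://server/repo/r/1" revisionId="1"/>
    <predecessor href="http://server/repo/r/2" revisionId="2"/>
</predecessors>
"""


def read_predecessors(xml_text):
    """Return (collection URL or None, [(revision id, href), ...])."""
    root = ET.fromstring(xml_text)
    # None when the server does not offer the collection view.
    collection = root.findtext("property-resource")
    revisions = [(p.get("revisionId"), p.get("href"))
                 for p in root.findall("predecessor")]
    return collection, revisions


collection, revisions = read_predecessors(PROPERTY_VALUE)
print(collection)
print(revisions)
```

If collection is not None, the client can edit the data by adding or removing
bindings there; otherwise it falls back to editing the XML property itself.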

Decide on name for selected revision (target). Resolution: keep target but
define it a little better. Overview should introduce it.

How is server versioning meta-data discovered? Properties on a distinguished
collection? On '/'? With OPTIONS?
Consider two approaches: implicit property on all resources, or OPTIONS on a
resource or *. Properties can be structured, options can't. Could consider
extending options to take and return an XML request/response body. Use PROPFIND
semantics to specify the body. Options exists because properties didn't. OPTIONS
* is the only way to talk to the server. Everything else is to a resource.
Resolution: use OPTIONS with PROPFIND entity request body on * for server and
resources for meta-data on resources. Returns server/versioning meta-data. Don't
want this information on allprop. Provides extensibility for OPTIONS. Geoff
wants to use a repository object so it can be a resource to provide
extensibility. Want to put in the repository: workspaces, activities, history
resources, configurations, ...? Discover the location of the repositories from
OPTIONS *.

DAV:checkout-discovery vs. DAV:workspaces: what is the name of the property and
what is returned. Two questions: 1) what workspace is this resource checked out
in, and 2) what resources are checked out in this workspace? DAV:resources
answers 1, DAV:checkout-discovery answers both. Resolution: do
DAV:checkout-discovery, Geoff will see if there are any problems.

Decide on a name for the versioned resource as a whole, currently called a
history resource. Use the new name in the *-id property names.
Resolution: use resource, revision, and versioned resource instead of versioned
resource, revision, history resource. In cases where the resource must be a
revision, note it.

Compare vs. generic report. Chris's suggestion: REPORT is a method that has a
request URL. URL determines what the report is on. Entity request body specifies
report type and any additional required parameters. Entity response body is
returned. Have some way to discover what reports are available. Only server
implementer can extend report types. Specify a report type for compare. Probably
should be doing history this way too. Don't use it for something that can be
done with a PROPFIND. That is, there are not executable semantics associated
with the report type that need to be calculated. Looks like there is some
overlap with this and PROPFIND. Reports need to be read-only, require
calculation, need parameters, don't need to be searched with DASL, etc.
Otherwise use property. Use a property unless you can't. Consider using DXML for
a formalized, extensible way of handling reports and other behavior.
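
A sketch of how a server might dispatch a generic REPORT along the lines of
Chris's suggestion: the request URL names the resource, the body names the
report type and its parameters, and the set of available report types is
discoverable. Everything here is an assumption made for illustration, including
the dict-shaped request body, the "with" parameter, and the ";1" revision URL
suffix; the notes do not define a wire format.

```python
import difflib

# Toy revision store; the URL shapes are made up for this example.
REVISIONS = {
    "/doc.txt;1": "alpha\nbeta\n",
    "/doc.txt;2": "alpha\ngamma\n",
}
REPORT_TYPES = {}   # report type name -> handler, extended only by the server


def report_type(name):
    """Decorator that registers a server-implemented report type."""
    def register(func):
        REPORT_TYPES[name] = func
        return func
    return register


@report_type("compare")
def compare(url, params):
    """Read-only, computed report: diff the target against another revision."""
    old = REVISIONS[url].splitlines()
    new = REVISIONS[params["with"]].splitlines()
    return list(difflib.unified_diff(old, new, url, params["with"],
                                     lineterm=""))


def handle_report(url, body):
    """Dispatch on the report type named in the request body."""
    handler = REPORT_TYPES.get(body["type"])
    if handler is None:
        raise ValueError(f"unsupported report type {body['type']!r}")
    return handler(url, body.get("params", {}))


def available_reports():
    """Discovery: which report types this server implements."""
    return sorted(REPORT_TYPES)


request = {"type": "compare", "params": {"with": "/doc.txt;2"}}
for line in handle_report("/doc.txt;1", request):
    print(line)
```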

Can servers restrict where resource types can be mapped? Workspaces, activities,
etc. If so, how do we find them? (Related to OPTIONS/repository/meta-data
properties issue) Can servers deny BIND to them? Can users create these
resources (with MKRESOURCE) in their namespaces and the server creates its own
binding for its own use? Geoff will see if this restriction is required.