Additional DAV Requirements

Since my summary of DMA didn't elicit much interest except among members of
the DMA technical committee, I thought I would try a more direct and
provocative approach to putting more document management functionality into
DAV.  I would like to propose some additions to the DAV requirements.  Since
our schedule doesn't show us submitting requirements as an Internet Draft
until January, we should have time to discuss these changes and think about
how to implement them.

----------------

Here are my proposed changes to "Requirements on HTTP for Distributed
Content Editing":

4a.  No-Modify Locks.  It should be possible, via HTTP, to indicate to the
HTTP server that the contents of a resource should not be modified until all
no-modify locks are released.  . . .

4b.  Existence Locks.  It should be possible, via HTTP, to indicate to the
HTTP server that the resource should not be deleted until all existence
locks are released.

11.  Attributes.  Attributes are internet resources.  Operations on
attributes can be partitioned into three groups, corresponding to three
levels of compliance.  For all three levels of compliance, support must be
available for multi-valued attributes and for composite (structured)
attributes.  It must be possible for attributes to attach to any internet
resource, including other attributes.  

The first group of operations supports basic authoring and includes the
following operations:  setting attribute values, reading attribute values.
It should be possible to set / read the value of a single attribute, a list
of attributes, or all the attributes of a resource in a single operation. 

The second group makes authoring easier, by providing information that would
only become known through error messages if these operations were not
present:  reading an attribute definition (name, description, id, data type,
max length, editable, multi-valued, etc.), discovering what attributes are
present for a given resource, and discovering what attributes can
legitimately be attached to a given resource.

The third group provides the site administration operations related to
attributes:  Define a new attribute, remove an attribute definition, modify
an attribute definition, define an attribute set, define the valid attribute
set(s) for a given resource, collection of resources, or site.

(Rationale as before)

In order for attributes to be the basis of searches, particularly searches
of multiple collections or sites, some mechanism needs to be defined for
determining which attributes are semantically equivalent.  It also needs to
be possible to find out for a given collection or site, what attributes can
legitimately be attached to the resources it contains.

12.  Containers.  Containers can be used to model collections or compound
documents.  (URL hierarchies and file system hierarchies are particular
implementations of collections.)  There are at least these differences
between collections and compound documents:

· A collection typically has no inherent ordering.  The components of a
compound document often do have a defined ordering.

· A collection may have a bibliography or catalog.  A compound document is
more likely to have a table of contents or index.

· There is no expectation that the objects in a collection will have a
common style when printed or displayed.  It is expected that a compound
document will have a consistent style throughout, and sequential page
numbering throughout, sequential chapter numbering, etc.

· There is no expectation that all the objects in a collection will have the
same representations available, but there is likely to be the expectation
that all components of a compound document will be available in the same
representations.

Containers are resources that can contain other resources (including other
containers).  Containers can be nested to any depth.  In principle, a given
resource can be in more than one container.  Some sites or containers may be
constrained, however, to allow a given resource to exist in only a single
container.  Resources in a container may reside anywhere -- a single
container may hold resources from many different sites.

It should be possible to get such information as who put a given resource
into a given container and when.  This information belongs neither to the
resource nor to the container, but rather to the resource-in-the-container.

As in the case of attributes, operations on containers can be partitioned
into groups corresponding to levels of compliance.

The first group of operations is the most basic:  add a resource to a
container (possibly at a particular location in the container), remove a
resource from a container, list resources in a container, list container
hierarchy.  The list operations should allow the caller to specify what
attributes of the list items should be returned.

The second group of operations is more likely to be used by administrators:
create container, delete container.  The delete operation must allow the
caller to specify whether the resources in the container are also to be
deleted, and whether the operation is to be recursive.  (In cases where
resources are allowed to be in more than one container, the server or the
repository behind it may enforce referential integrity as it sees fit.  It
may, for example, decline to delete a resource that still belongs to some
other container.)

Navigating a large (wide and deep) container hierarchy must be efficient; so
must searching for objects within a container hierarchy.

It must be possible for many people to be adding, removing, and modifying
the contents of a container at once, as we do with file systems.

---------------- 

I propose the following changes to "Functional Requirements and Framework
for Versioning on the WWW":

3.3 and 3.4 are not independent:  Versioning policy at the server (or the
repository behind it) will always win out when there is a conflict.  The
repository may treat reserve / lock / get as an atomic operation.  In that
case, the user will not be able to perform these operations separately.  Or
the repository may require a reserve before a get, in which case an attempt
to perform a get without first reserving the resource will fail.  (None of
this alters the fact that the protocol should allow the user to make these
requests separately and in any order.  It just means that the protocol must
provide a syntax for the server to tell the user what it actually did or
report an appropriate error.)

3.7 Any resource can in principle be versioned, though particular sites may
constrain which resources can be versioned. (For example, a site might not
allow attributes to be versioned.)

4.3 There are several nodes that it would be useful to be able to request
without knowing their identifiers.  For editing, it is useful to be able to
get the tip of the primary series or of any branch.  (This means that it
must be possible to identify the branch we care about.)  In addition, there
may be a node that is the "published" node -- the one that should be
returned to a general reader who does not specify a version.  This published
node need not be the tip of any series.

4.5 . . . (As before) . . . Topologies that include branching, merging,
threaded versions, or any combination of these should be expected.     In
these complicated topologies one path may be identified as the primary
version series.  Branching and merging are familiar from software projects
where new functionality is added along a primary path, while bug fixes are
made along a branch and eventually merged into the primary path.  Threaded
versioning might occur, for example, when one series contains all internal
and external releases of a publication, while another series contains only
the external releases.

A single resource may participate in more than one version series (for
example, if it is the node where a branch or merge occurs), in the same
version series at more than one point (for example, if it were decided to
reject the changes made in the most recent versions and copy an earlier
version to the tip), or in more than one completely separate topology (for
example, if different organizations want to use the resource as a starting
point for very different projects).  So potentially many different
version-specific URLs point to the same resource.

When requesting a topology, it should be possible to specify what attributes
should be returned for (1) the topology as a whole, (2) each version series
in the topology, (3) each node in each version series, and (4) each
resource.  (1) might include such attributes as title, authors, create date
that apply to the entire resource including all its versions.  (2) might
include series name, whether it is the primary series, whether there are any
reservations or locks on it, etc.  (3) might include a version number or
label, editor, comments (reason for changes), date added to series, etc.
(4) might include such attributes as authors, title, lock information, etc.
(possibly different from the authors, title, etc., of the topology as a whole).

(We might want to talk about levels of compliance here, and make the lowest
level of compliance be support for just simple linear versioning, where only
a single version series is allowed for any versioned resource.) 

12b.  Revoke a reservation.  A way to declare the end of a reservation
without committing any updates or creating a new version.

19.  A way to delete a version or a list of versions.  A way to request that
all but the most recent n versions be deleted.

20.  A way to get reservation information and lock information about any
resource.

21.  A way to find out whether a given resource is a versioned resource or not.

22.  A way to specify, when creating a new resource, whether it will be a
versioned resource or not.
Name:		Judith A. Slein
E-Mail:		slein@wrc.xerox.com
Phone:  	8*222-5169
Fax:		(716) 265-7133
MailStop:	128-29E

Received on Tuesday, 26 November 1996 15:28:23 UTC