Draft Combined Requirements Document from Judith Slein on 1997-02-10 (w3c-dist-auth@w3.org from January to March 1997)

From: Judith Slein <slein@wrc.xerox.com>
Date: Mon, 10 Feb 1997 11:23:55 PST
To: w3c-dist-auth@w3.org
Message-Id: <2.2.32.19970210192355.0092fa34@pop-server.wrc.xerox.com>
Here is a draft of a merged requirements document.  I tried not to make many
substantive changes, since we seemed to have consensus on the content of the
two original papers.  Fabio Vitali provided lots of help and good advice in
merging the two papers, although I did not follow his advice on all points.

Comments from Fabio and me are marked *** in this draft.  I have tried to
mark points that conflict with the current specification or that seem to be
subjects of controversy within the group.

Please review this document and provide comments.  If we are to meet our
schedule, which calls for the requirements to be an Internet Draft by the
end of February, we need to try to resolve outstanding issues by February 24.

I'm also sending out a separate mailnote that summarizes what I take to be
the areas where the requirements and the specification are in conflict, and
areas that are controversial.

Thanks for your help.

--Judy

--------------------------

WEBDAV Working Group				J.A. Slein
INTERNET-DRAFT      				Xerox Corporation                         
< >						E.J. Whitehead, Jr.
						U.C. Irvine
						D.G. Durand
						Boston University
						F. Vitali
						University of Bologna              
						February 1997

Expires August 1997

   Requirements on HTTP for Distributed Authoring and Versioning


Status of this Memo

This document is an Internet draft. Internet drafts are working
documents of the Internet Engineering Task Force (IETF), its areas and
its working groups. Note that other groups may also distribute working
information as Internet drafts.

Internet Drafts are draft documents valid for a maximum of six months
and can be updated, replaced or obsoleted by other documents at any
time. It is inappropriate to use Internet drafts as reference material
or to cite them other than as "work in progress".

To learn the current status of any Internet draft please check the
"1id-abstracts.txt" listing contained in the Internet drafts shadow
directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East coast) or
ftp.isi.edu (US West coast). Further information about the IETF can be
found at URL: http://www.ietf.org/

Distribution of this document is unlimited. Please send comments to the
WWW Distributed Authoring and Versioning mailing list,
<w3c-dist-auth@w3.org>, which may be joined by sending a message with
subject "subscribe" to <w3c-dist-auth-request@w3.org>. Discussions are
archived at URL:
http://www.w3.org/pub/WWW/Archives/Public/w3c-dist-auth/. The HTTP
working group at <http-wg@cuckoo.hpl.hp.com> also discusses the HTTP
protocol.  Discussions of the HTTP working group are archived at URL:
http://www.ics.uci.edu/pub/ietf/http/. General discussions about HTTP
and the applications which use HTTP should take place on the
<www-talk@w3.org> mailing list.


Abstract

The HyperText Transfer Protocol, version 1.1 (HTTP/1.1), provides
simple support for applications which allow remote editing of typed
data. In practice, the existing capabilities of HTTP/1.1 have proven
inadequate to support efficient, scalable remote editing free of
overwriting conflicts.  This document presents a list of features in
the form of requirements which, if implemented, would improve the
efficiency of common remote editing operations, provide a locking
mechanism to prevent overwrite conflicts, improve relationship
management support between non-HTML data types, provide a simple
attribute-value metadata facility, provide for the creation and
reading of container data types, and integrate versioning into the WWW.


1. Introduction

This document describes functionality which, if provided in the
HyperText Transfer Protocol (HTTP) [4], would support the
interoperability of tools which allow remote loading, editing and
saving (publishing) of various media types using HTTP. As much as
possible, this functionality is described without suggesting a proposed
implementation, since there are many ways to perform the functionality
within the HTTP framework. It is also possible that a single mechanism
within HTTP could simultaneously satisfy several requirements.

***Fabio - Many of the versioning requirements call for extensions to
URLs, not to HTTP.

***Judy - There is controversy in the group about whether we should be
extending HTTP or defining a separate protocol.

2. Rationale

The HTTP protocol contains functionality which enables the editing of 
web content at a remote location, without direct access to the storage 
media via an operating system. This capability is exploited by several 
existing HTML distributed authoring tools, and by a growing number of 
mainstream applications (e.g. word processors) which allow users to 
write (publish) their work to an HTTP server. To date, experience from 
the HTML authoring tools has shown they are unable to meet their users' 
needs using the facilities of the HTTP protocol. The consequence of 
this is either postponed introduction of distributed authoring 
capability, or the addition of nonstandard extensions to the HTTP 
protocol.  These extensions, developed in isolation, are not 
interoperable.

Other authoring applications have wanted to access document repositories 
or version control systems through Web gateways, and have been similarly
frustrated.  Where this access is available at all, it is through
nonstandard extensions to HTTP that force clients to use a different
interface for each vendor's service.

This document describes requirements for a set of standard extensions
to HTTP that would allow distributed Web authoring tools to provide
the functionality their users need by means of the same standard
syntax across all compliant servers. The broad categories of 
functionality that need to be standardized are:

	Attributes
	Relationships
	Locking
	Notification of Intent to Edit
	Retrieval of Unprocessed Source for Editing
	Partial Write
	Name Space Manipulation
	Collections
	Versioning

3. General Principles

This section describes a set of general principles that the HTTP
extensions should follow.  These principles cut across categories of
functionality.

3.1. User Agent Interoperability

All clients should be able to work with any WebDAV-compliant HTTP
server. It is acceptable for some client/server combinations to provide
special features that are not universally available, but the protocol
should be sufficient that a basic level of functionality will be
universal. It should be possible for servers and clients to negotiate
the use of optional features.

3.2. Legacy Client Support

WebDAV-compliant servers should be able to interoperate with non-WebDAV
clients.

3.3. Data Format Compatibility

WebDAV-compliant servers should be able to work with existing resources 
and URLs. Special additional information should not become a mandatory 
part of document formats.

3.4. HTTP Compatibility (new)

Our aim is to make extended authoring capabilities available through
HTTP.  In extending HTTP, we are obligated to follow its design
conventions and stay within its spirit.  This means, for example, that
methods should operate only on resources.  It means that parameters
should be communicated in headers.  These and other conventions should
be observed in the design of the extensions.

3.5. Replicated, Distributed Systems (new)

Distribution and replication are at the heart of the Internet.  All
WebDAV extensions should be designed to allow for distribution and
replication.  Version trees should be able to be split across multiple
servers.  Collections may have members on different servers.  Resources
may have attributes on different servers.  Any resources may be cached
or replicated for mobile computing or other reasons.  Consequently, we
must keep these issues in mind through all our design efforts.

4. Requirements

In the requirement descriptions below, the requirement will be stated,
followed by its rationale. If any distributed authoring tools
currently implement the requirement, this is also mentioned. It is
assumed that "server" means "a program which receives and responds to
HTTP requests," and that "distributed authoring tool" or "intranet
enabled tool" means "a program which can retrieve a source entity via
HTTP, allow editing of this entity, and then save/publish this entity
to a server using HTTP." A "client" is "a program which issues HTTP
requests and accepts responses."

(Get rid of references to current tools altogether, or do more thorough
research.)

4.1. Attributes

Via HTTP, it should be possible to create, modify, query, read and 
delete arbitrary attributes on resources of any media type.

***Judy - Query is not supported in the specification.

Attributes can be used to define fields such as author, title, subject, 
and organization, on resources of any media type. These attributes have 
many uses, such as supporting searches on attribute contents, and the 
creation of catalog entries as a placeholder for an object which is 
not available in electronic form, or which will be available later.
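
As an illustration only, the five attribute operations can be sketched
as a small in-memory store. The class AttributeStore and its method
names are hypothetical and are not taken from any specification; a real
server would expose the same operations through HTTP.

```python
from typing import Dict, List, Optional


class AttributeStore:
    """Hypothetical in-memory model of attributes attached to resources."""

    def __init__(self) -> None:
        # Maps a resource URL to its attribute name/value pairs.
        self._attrs: Dict[str, Dict[str, str]] = {}

    def set(self, url: str, name: str, value: str) -> None:
        """Create the attribute, or modify it if it already exists."""
        self._attrs.setdefault(url, {})[name] = value

    def read(self, url: str, name: str) -> Optional[str]:
        """Return the attribute value, or None if it is not defined."""
        return self._attrs.get(url, {}).get(name)

    def delete(self, url: str, name: str) -> bool:
        """Remove the attribute and report whether it was present."""
        return self._attrs.get(url, {}).pop(name, None) is not None

    def query(self, name: str, value: str) -> List[str]:
        """Return the URLs of all resources whose attribute matches."""
        return sorted(u for u, a in self._attrs.items() if a.get(name) == value)
```

Because the store is keyed by URL rather than by document content, the
same calls apply to a GIF image, an HTML page, or a catalog entry that
has no body at all.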

4.2. Relationships

Via HTTP, it should be possible to create, query, and delete typed 
relationships between resources of any media type.

A hypertext link is a relationship between resources which is browsable 
using a hypertext style point-and-click user interface. Relationships, 
whether they are browsable hypertext links, or simply a means of 
capturing an interrelation between resources, have many purposes.  
Relationships can support pushbutton printing of a multi-resource 
document in a prescribed order, jumping to the access control page for 
a resource, and quick browsing of related information, such as a table 
of contents, an index, a glossary, help pages, etc. While relationship 
support is provided by the HTML "LINK" element, it is limited to 
HTML resources and does not cover bitmap images or other non-HTML 
media types.

AOLpress from America Online [1] currently "allows pages to add toolbar 
buttons on the fly using the HTML 3.2 <LINK REL....> tag. For example, 
your page can add toolbar buttons that link to a home page, table of 
contents, index, glossary, copyright page, next page, previous page, 
help page, higher level page, or a bookmark in the document."

***Fabio - The definition of locking here conflicts with the one that
was used in the versioning requirements paper.  More in a separate
mail note.

4.3. Locking

4.3.1. General Principles

4.3.1.1. Independence of locks. It should be possible to lock a resource
without re-reading the resource, and without committing to editing the 
resource.

4.3.1.2. Multi-Resource Locking. It should be possible to take out a 
lock on multiple resources in the same action, and this locking 
operation must be atomic across these resources.

***Judy - Multi-resource locking is not in the specification

4.3.1.3. Partial-Resource Locking. It should be possible to take out a 
lock on subsections of a resource.

***Judy - Controversy on this issue at Irvine.

4.3.1.4. Multi-Person Locking.  It should be possible to assign a lock
to a single person or to multiple persons with a single action.

***Judy - Multi-person locking is not in the specification.

***Fabio - Add a statement that support for locking is optional.  Also
say that systems that do not support locking should provide some other
type of consistency management.

4.3.2. Functional Requirements

4.3.2.1.  Write Locks. It should be possible, via HTTP, to restrict 
modification of a resource to a specific person, or list of persons.

***Fabio - The definition of a write lock should be this:  A write lock
states that no consistency problem will ever occur by changing the
resource, not that no one else is allowed access to that resource.  On
the other hand, it can be said that access rights to successfully
Unlocked resources should be allowed to all authorized users.

4.3.2.2.  Read Locks. It should be possible, via HTTP, to indicate to 
the HTTP server that the contents of a resource should not be modified 
until the read lock is released.

***Judy - Read locks are not in the specification.

4.3.2.3. Lock Query. It should be possible to query for whether a given 
URL has any active modification restrictions, and if so, who currently
has modification permission.

***Judy - Should add Unlock.

4.3.3. Rationale

At present, HTTP provides limited support for preventing two or more 
people from overwriting each other's modifications when they save to a 
given URL. Furthermore, there is no way for people to discover if 
someone else is currently making modifications to a resource. This is 
known as the "lost update problem," or the "overwrite problem." Since 
there can be significant cost associated with discovering and repairing 
lost modifications, preventing this problem is crucial for supporting 
distributed authoring. A "write" lock ensures that only one person (or 
list of persons) may modify a resource, preventing overwrites.
Furthermore, locking support is also a key component of many versioning 
schemes, a desirable capability for distributed authoring.

An author may wish to lock an entire web of resources even though he 
is editing just a single resource, to keep the other resources from 
changing. In this way, an author can ensure that if a local hypertext 
web is consistent in his distributed authoring tool, it will then be 
consistent when he writes it to the server. Because of this, it should 
be possible to take out a lock without also causing transmission of the 
contents of a resource. A locked resource will not necessarily be 
modified, and many people may want a simultaneous guarantee that a 
resource will not change without wanting to modify it themselves, so 
it is desirable to have a "read" lock capability. A read lock, being 
less restrictive, is better suited than a write lock to guaranteeing 
that a resource will not be modified. 
Put differently, a read lock states that the resource is guaranteed not 
to change for the duration of the lock. A write lock states that a 
resource is guaranteed not to change only if the owner of the lock 
does not change it, and only the owner of the lock may change it.
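
The difference between the two kinds of lock can be sketched as
follows. This is a minimal model under stated assumptions, not a
proposed design: the class LockTable and its methods are hypothetical,
lock release is omitted, and the sketch assumes that an unlocked
resource may be modified by anyone.

```python
from typing import Dict, Set


class LockTable:
    """Hypothetical lock bookkeeping for a single server."""

    def __init__(self) -> None:
        self._readers: Dict[str, Set[str]] = {}  # URL -> holders of read locks
        self._writer: Dict[str, str] = {}        # URL -> owner of the write lock

    def read_lock(self, url: str, who: str) -> bool:
        # Many people may hold read locks at once, but a read lock
        # cannot be granted while the resource is write locked.
        if url in self._writer:
            return False
        self._readers.setdefault(url, set()).add(who)
        return True

    def write_lock(self, url: str, who: str) -> bool:
        # Refused while any read lock exists or another person owns the write lock.
        if self._readers.get(url) or self._writer.get(url, who) != who:
            return False
        self._writer[url] = who
        return True

    def may_modify(self, url: str, who: str) -> bool:
        # A read lock guarantees that the resource does not change at all.
        if self._readers.get(url):
            return False
        # A write lock lets only its owner change the resource.
        return self._writer.get(url, who) == who
```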

It is often necessary to guarantee that a lock or unlock operation 
occurs at the same time across multiple resources, a feature which is 
supported by the multiple-resource locking requirement. This is useful 
for preventing a collision between two people trying to establish locks 
on the same set of resources, since with multi-resource locking, one of 
the two people will get the lock on the entire set. If the same 
scenario were repeated using atomic single-resource lock operations 
iterated across the resources, the result could be a splitting of the 
locks between the two people, based on resource ordering and race 
conditions.
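
The contrast between the two approaches can be sketched as follows.
The function names lock_all and lock_each are hypothetical, and the
lock table is a plain dictionary from URL to lock owner. The sketch is
single-threaded; a real server would also have to serialize concurrent
requests for lock_all to be atomic.

```python
from typing import Dict, Iterable, List


def lock_all(locks: Dict[str, str], urls: Iterable[str], who: str) -> bool:
    """Lock every URL for `who`, or lock none of them."""
    wanted = list(urls)
    # Check everything first, so that a failure leaves the table untouched.
    if any(locks.get(u, who) != who for u in wanted):
        return False
    for u in wanted:
        locks[u] = who
    return True


def lock_each(locks: Dict[str, str], urls: Iterable[str], who: str) -> List[str]:
    """Lock URLs one at a time and return those actually obtained."""
    got = []
    for u in urls:
        # setdefault records `who` only if the URL was not already locked.
        if locks.setdefault(u, who) == who:
            got.append(u)
    return got
```

If two people interleave calls to lock_each over the same set of
resources, each may end up holding part of the set. With lock_all the
second caller is refused outright and holds nothing.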

Partial resource locking provides support for collaborative editing 
applications, where multiple users may be editing the same resource
simultaneously. Partial resource locking also allows multiple people to 
simultaneously work on a database type resource.

4.4. Notification of Intention to Edit. 

It should be possible to notify the HTTP server that a resource is about 
to be edited by a given person. It should be possible to query the HTTP 
server for the list of people who have notified the server of their 
intent to edit a resource.

***Judy - It should be possible to notify the server that one no longer
intends to edit the resource.

***Judy - Support for notification of intent to edit is found in the
specification only in the context of version management.  The 
specification does not allow such notification for non-versioned
resources.

Experience from configuration management systems has shown that people 
need to know when they are about to enter a parallel editing situation. 
Once notified, they either decide not to edit in parallel with the 
other authors, or they use out-of-band communication (face-to-face, 
telephone, etc.) to coordinate their editing to minimize the difficulty 
of merging their results. Notification is separate from locking, since 
a write lock does not necessarily imply a resource will be edited, and 
a notification of intention to edit does not carry with it any access 
restrictions. This capability is supportive of versioning, since a 
check-out typically involves taking out a write lock, making a 
notification of intention to edit, and getting the resource to be 
edited.

4.5. Retrieval of Unprocessed Source for Editing

The source of any given entity should be retrievable via HTTP.

***Judy - Not in the specification.

There are many cases where the source stored on a server does 
not correspond to the actual entity transmitted in response to an HTTP 
GET. Current known cases are server side include directives, and 
Standard Generalized Markup Language (SGML) source which is
converted on the fly to HyperText Markup Language (HTML) [2] output 
entities. There are many possible cases, such as automatic conversion 
of bitmap images into several variant bitmap media types (e.g. GIF, 
JPEG), and automatic conversion of an application's native media type 
into HTML. As an example of this last case, a word processor could 
store its native media type on a server which automatically converts 
it to HTML. A GET of this resource would retrieve the HTML. Retrieving 
the source would retrieve the word processor native format.

This requirement should be met by a general mechanism which can handle 
both the "single-step" source processing described above, where the 
source is converted into the transmission entity via a single 
conversion step, as well as "multi-step" source processing, where there 
are one or more intermediary processing steps and outputs. An example 
of multi-step source processing is the relationship between an 
executable binary image, its object files, and its source language 
files. It should be noted that the relationship between source and 
transmission entity could be expressed using the relationship 
functionality described above in "4.2. Relationships."

4.6. Partial Write. 

After editing a resource, it should be possible, via HTTP, to write 
only the changes to the resource, rather than retransmitting the entire 
resource.

***Judy - Not in the specification.

During distributed editing which occurs over wide geographic separations
and/or over low bandwidth connections, it would be extremely inefficient
(and frustrating) to rewrite a large resource after minor changes, such 
as a one-character spelling correction. Ideally, support will be 
provided for transmitting "insert" (e.g., add this sentence in the 
middle of a document) and "delete" (e.g. remove this paragraph from the 
middle of a document) style updates. Support for partial resource 
updates will make small edits more efficient, and allow distributed 
authoring tools to scale up for editing of large documents.
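
As a sketch of what "insert" and "delete" style updates might look
like, the following applies a list of edits to a document held as a
string. The Insert and Delete records and the apply_edits function are
hypothetical; the requirement does not prescribe any particular
representation of changes.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Insert:
    offset: int  # position at which the new text is inserted
    text: str


@dataclass
class Delete:
    offset: int  # position of the first character to remove
    length: int


Edit = Union[Insert, Delete]


def apply_edits(document: str, edits: List[Edit]) -> str:
    """Apply edits in order; offsets refer to the text left by the previous edit."""
    for e in edits:
        if not 0 <= e.offset <= len(document):
            raise ValueError("offset outside document")
        if isinstance(e, Insert):
            document = document[:e.offset] + e.text + document[e.offset:]
        else:
            if e.length < 0 or e.offset + e.length > len(document):
                raise ValueError("delete runs past end of document")
            document = document[:e.offset] + document[e.offset + e.length:]
    return document
```

A one-character spelling correction then costs a single Insert record
on the wire, regardless of the size of the resource.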

4.7. Name Space Manipulation

***Fabio - A general treatment of server's name space management from
clients should be introduced here.

***Judy - Need more details of the semantics of copy and move,
especially for collections, versioned resources, and resources with
attributes.

***Judy - In the specification, but not mentioned here:  Destroy,
Undelete, CopyHead, MoveHead.

4.7.1. Copy. 

Via HTTP, it should be possible to make a byte-for-byte duplicate of a 
resource without a client loading, then resaving the resource. This copy 
should leave an audit trail.

There are many reasons why a resource might need to be duplicated, such 
as change of ownership, a precursor to major modifications, or to make 
a backup. In combination with delete functionality, copy can be used to 
implement rename and move capabilities, by performing a copy to a new 
name, and a delete of the old name. Due to network costs associated 
with loading and saving a resource, it is far preferable to have a 
server perform a resource copy than a client. If a copied resource 
records which resource it is a copy of, then it would be possible for 
a cache to avoid loading the copied resource if it already locally 
stores the original.

4.7.2. Move/Rename. 

Via HTTP, it should be possible to change the URL of a resource without 
a client loading, then resaving the resource under a different name.

It is often necessary to change the name of a resource, for example due 
to adoption of a new naming convention, or if a typing error was made 
entering the name originally. Due to network costs, it is undesirable 
to perform this operation by loading, then resaving the resource,
followed by a delete of the old resource. Similarly, a single rename 
operation is more efficient than a copy followed by a delete operation.
Ideally an HTTP server should record the move operation, and issue a 
"301 Moved Permanently" status code for requests on the old URL. A move 
operation, if implemented with attribute support, should also preserve 
most attributes across a move. Note that moving a resource is considered 
the same function as renaming a resource.
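
The copy and move behavior described in this section can be sketched
with a toy name space. The class NameSpace is hypothetical, attributes
and audit trails are left out, and the new location is returned in the
response body purely to keep the example short.

```python
from typing import Dict, Tuple


class NameSpace:
    """Hypothetical server name space mapping URLs to stored bytes."""

    def __init__(self) -> None:
        self._content: Dict[str, bytes] = {}
        self._moved: Dict[str, str] = {}  # old URL -> new URL

    def put(self, url: str, body: bytes) -> None:
        self._content[url] = body
        self._moved.pop(url, None)  # the URL is in use again

    def copy(self, src: str, dst: str) -> None:
        # Server-side duplicate: the body never crosses the network.
        self._content[dst] = self._content[src]
        self._moved.pop(dst, None)

    def move(self, src: str, dst: str) -> None:
        # One operation instead of a copy followed by a delete.
        self._content[dst] = self._content.pop(src)
        self._moved.pop(dst, None)
        self._moved[src] = dst  # remembered so the old URL can redirect

    def get(self, url: str) -> Tuple[int, bytes]:
        if url in self._content:
            return 200, self._content[url]
        if url in self._moved:
            return 301, self._moved[url].encode("ascii")
        return 404, b""
```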

4.8. Collections

4.8.1. List Collection. A listing of all resources, along with
their media type, and last modified date, which are located in a
specific collection should be accessible via HTTP.

***Judy - Not in the specification.

As stated in [3], "some URL schemes (such as the ftp, http, and 
file schemes) contain names that can be considered hierarchical." 
Especially for HTTP servers which directly map all or part of their URL 
name space into a filesystem, it is very useful to get a listing of all 
resources located at a particular hierarchy level. This functionality 
supports "Save As..." dialog boxes, which provide a listing of the 
entities at a current hierarchy level, and allow navigation through 
the hierarchy. It also supports the creation of graphical visualizations
(typically as a network) of the hypertext structure among the entities 
at a hierarchy level, or set of levels. It also supports a tree
visualization of the entities and their hierarchy levels.

In addition, document management systems may want to make their 
documents accessible through HTTP.  They typically allow the 
organization of documents into collections, and so also want their users
to be able to view the collection hierarchy through HTTP.

There are many instances where there is not a strong correlation between
a URL hierarchy level and the notion of a collection. One example is a 
server in which the URL hierarchy level maps to a computational process 
which performs some resolution on the name. In this case, the contents 
of the URL hierarchy level can vary depending on the input to the 
computation, and the number of resources accessible via the computation 
can be very large. It does not make sense to implement a directory 
feature for such a namespace. However, the utility of listing the 
contents of those URL hierarchy levels which do correspond to 
collections, such as the large number of HTTP servers which map their 
namespace to a filesystem, argues for the inclusion of this capability, 
even though it is not meaningful in all cases. If listing the contents of 
a URL hierarchy level does not make sense for a particular URL, then 
a "405 Method Not Allowed" status code could be issued.
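
A listing operation of this kind might behave as sketched below. The
function list_collection, the resource table, and the "listable" flag
are all hypothetical; the flag stands in for whatever means a server
uses to decide that a hierarchy level is computed rather than stored.
Only direct members are returned in this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical resource table: URL -> (media type, last modified date).
Resources = Dict[str, Tuple[str, str]]
Entry = Tuple[str, str, str]


def list_collection(resources: Resources, collection: str,
                    listable: bool = True) -> Tuple[int, Optional[List[Entry]]]:
    """Return (status, entries) for the direct members of `collection`."""
    if not listable:
        # The hierarchy level is computed, so a listing is not meaningful.
        return 405, None
    prefix = collection.rstrip("/") + "/"
    entries: List[Entry] = []
    for url, (media_type, modified) in sorted(resources.items()):
        rest = url[len(prefix):]
        # Keep direct members only: same prefix, nothing nested deeper.
        if url.startswith(prefix) and rest and "/" not in rest:
            entries.append((url, media_type, modified))
    return 200, entries
```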

AOLpress from America Online currently supports "Save As..." dialog 
boxes, and graphical network visualization of a portion of a site's 
hypertext structure, which they term a "mini-web." FrontPage from 
Microsoft [6] also currently supports a graphical network visualization 
and additionally supports a tree visualization of a portion of a 
site's structure.

4.8.2. Make Collection. Via HTTP, it should be possible to
create a new collection.

The ability to create collections to hold related resources supports 
management of a name space by packaging its members into small, related 
clusters. The utility of this capability is demonstrated by the broad 
implementation of directories in recent operating systems. The ability 
to create a collection also supports the creation of "Save As..." 
dialog boxes with "New Level/Folder/Directory" capability, common in 
many applications.

AOLpress from America Online currently supports this capability 
through their "Save As..." dialog box, and their custom MKDIR method.

4.9. Versioning

In the following discussion, "versioned resource" means a resource that
has the structure of a directed acyclic graph, each node of which is 
a version. "Version" means a node in this structure, which is itself 
a resource. Each version typically stands in a "derived from" 
relationship to its predecessor(s).

***Judy - new definitions

4.9.1. General Principles

4.9.1.1. Stableness of versions. Most versioning systems are intended to
provide an accurate record of the history of evolution of a document. 
This accuracy is ensured by the fact that a version eventually becomes 
"frozen" and immutable. Once a version is frozen, further changes will 
create new versions rather than modifying the original. In order for 
caching and persistent references to be properly maintained, a client 
must be able to determine that a version has been frozen. We require 
that unlocked resource versions be frozen. This enables the common 
practice of keeping unfrozen "working versions". Any successful attempt 
to retrieve a frozen version of a resource will always retrieve exactly 
the same content, or return an error if that version (or the resource 
itself) is no longer available.  Since URLs may be reassigned at a 
server's discretion, this requirement applies only for that period of 
time during which a URL identifies the same resource. HTTP/1.1 entity 
tags will need to be integrated into the versioning strategy in order 
for caching to work properly.

***Judy - Does the specification support this?

4.9.1.2. Policy-free Versioning. Haake and Hicks [5] have identified 
the notion of versioning styles (referred to here as versioning 
policies, to reflect the nature of client/server interaction) as one 
way to think about the different policies that versioning systems 
implement. Versioning policies include decisions on the shape of 
version histories (linear or branched), the granularity of change 
tracking, locking requirements made by a server, etc. The protocol 
should not unnecessarily restrict version management policies to any 
one paradigm. For instance, locking and version number assignment 
should be inter-operable across servers and clients, even if there are 
some differences in their preferred models.

4.9.1.3. Separation of resource retrieval and concurrency control. The 
protocol must separate the reservation and release of versioned 
resources from their access methods. Provided that consistency 
constraints are met before, during and after the modification of a 
versioned resource, no single policy for accessing a resource should be 
enforced by the protocol. For instance, a user may declare an intention 
to write before or after retrieving a resource via GET, may PUT a
resource without releasing the lock, and might even request a lock via
HTTP, but then retrieve the document using another communication
channel such as FTP.

***Judy - The specification assumes that it's the server, not the user,
that determines the policy -- order of operations and what operations
are required.

***Judy - "Separation of resource retrieval and concurrency control" is 
supported by the Request-Lock, Request-Intent, and Request-Working-Loc 
parameters to the CheckOut method and the discovery mechanism. This is 
all embroiled in the controversy over how much latitude we want to give 
servers, how simple we want to make things for clients, whether we want 
to rely on the discovery mechanism, etc.

4.9.2. Functional Requirements

***Judy - In the specification, but not mentioned here: Diff/Merge,
ServerMerge, UnVersion.

4.9.2.1. Access to specific versions via a URL. For each version of a 
resource, on a server, there should be a URL to refer to that version.
That is, a version is itself a resource. 

This is required for version-specific linking, and for non-versioning 
client support.

4.9.2.2. A URL to denote a versioned resource itself, rather than 
specific versions of it.

This identifier is needed for queries about the versioning status of a
resource, that do not apply only to one version of that resource. It is
also used to perform operations (such as adjusting attributes, changing
locks, or reassigning URLs) that affect all versions of a resource,
rather than any specific version.

4.9.2.3. Direct access to a server-defined "default", "current" or "tip"
version of a resource.

This is one of the simplest ways to guarantee non-versioning client
compatibility. If no special version information is provided, the
server will provide a default. This does not rule out the possibility
of a server returning an error when no sensible default exists, but it
does provide a standard way to support non-versioning clients, and one
of the most common version access disciplines.

4.9.2.4. A way to access common related URLs from the URL of a 
particular version or of a versioned resource:
   o root version(s) of this document
   o predecessor version(s) of this document
   o successor version(s) of this document
   o default version of this document
It must be possible in some way for a versioning client to access 
versions related to a resource whose URL it has. In particular, access 
to the "default" version of a resource is an extremely important 
operation, that a client should be able to perform at any time that 
a URL for a particular version or for a versioned resource is seen.
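
The four navigation paths listed above can be sketched over a simple
graph of versions. Everything here is an assumption made for the sake
of illustration: the class VersionedResource, the ";1.0" style of
version URL used in the usage example, and the policy that the most
recently added version becomes the default.

```python
from typing import Dict, List, Optional, Sequence


class VersionedResource:
    """Hypothetical versioned resource: a directed acyclic graph of versions."""

    def __init__(self, url: str) -> None:
        self.url = url
        self._preds: Dict[str, List[str]] = {}  # version URL -> predecessor URLs
        self.default: Optional[str] = None

    def add_version(self, version_url: str, predecessors: Sequence[str] = ()) -> None:
        if version_url in self._preds:
            raise ValueError(f"version {version_url} already exists")
        for p in predecessors:
            if p not in self._preds:
                raise KeyError(f"unknown predecessor {p}")
        # New versions may only point at existing ones, so no cycle can form.
        self._preds[version_url] = list(predecessors)
        self.default = version_url  # assumed policy, see lead-in

    def roots(self) -> List[str]:
        return [v for v, ps in self._preds.items() if not ps]

    def predecessors(self, version_url: str) -> List[str]:
        return list(self._preds[version_url])

    def successors(self, version_url: str) -> List[str]:
        return [v for v, ps in self._preds.items() if version_url in ps]
```

For example, a resource with one root, two parallel variants, and a
merged version:

```python
doc = VersionedResource("/spec")
doc.add_version("/spec;1.0")
doc.add_version("/spec;1.1", ["/spec;1.0"])
doc.add_version("/spec;1.0.1", ["/spec;1.0"])
doc.add_version("/spec;2.0", ["/spec;1.1", "/spec;1.0.1"])
```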

***Judy - Specification provides some, but not all, of these navigation
paths.

4.9.2.5. A way to retrieve the complete version topology for a resource.
There should be a way to retrieve information about all versions of a
resource. The format for this information must be standardized so that
the basic information can be used by all clients. Other specialized
formats should be accommodated, for servers and clients that require
information that cannot be included in the standard topology.

4.9.2.6. A way to determine whether a given URL points to a version 
of a versioned resource.

***Judy - Are we requiring that you be able to tell this just by
examining the URL?

4.9.2.7. A way to distinguish, given a URL of a version, the part of
the URL that identifies the version from the part that identifies the
versioned resource.

***Judy - Do we really have to (want to) require that you be able to
find out the URL of the versioned resource by examining the URL of the
version?  Is the requirement really just that there be some way to find
out, for any version, the URL of its versioned resource?

***Judy - Specification does not provide a way to find out the URL of
the versioned resource(s) to which a version belongs.  

Being able to determine the URL of the versioned resource makes it 
possible to implement browsing the version tree. 

It also supports some comparison operations: It makes it possible to
determine whether two URLs designate versions of the same versioned 
resource. However, given the phenomenon of URL aliasing, it 
is insufficient to determine that they are not versions of the same 
resource.

***Judy - If 4.9.2.8 - 14 are intended to require separate operations
for each of these functions, they conflict with the approach taken in
the WEBDAV specification.

4.9.2.8. A way to request exclusive access to a version of a resource 
(Lock). (See Section 4.3 "Locking" above.)

Since not all systems implement lock-based access, the protocol should
not require clients to take out a lock before editing, nor should it
require servers to support locking. 

4.9.2.9. A way to release exclusive access to a resource (Unlock). This 
is the inverse of Lock.

4.9.2.10. A way for a client to declare an intention to modify a 
resource (Reserve). (See Section 4.4 "Notification of Intent to Edit"
above.)

This operation is required before any versioned update. Its effects may
vary depending on server policy, from locking a resource, to forking a
new variant, to a NOOP on servers that do not track sessions or restrict
updates. If this operation returns a version number, the client is
required to make sure that it uses a copy of the data associated with
that version number of the resource for any update operations it
carries out. Servers that wish to enforce a mandatory GET operation
before update should simply return a fresh version identifier from
this operation.
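
The range of server policies can be sketched as follows. The class,
the policy names "lock", "fork" and "noop", and the identifier scheme
are all assumptions made for illustration; they are not part of the
requirements or of the specification.

```python
import itertools


class ReserveServer:
    """Toy server whose Reserve behaviour depends on its policy."""

    def __init__(self, policy):
        self.policy = policy          # "lock", "fork", or "noop" (hypothetical)
        self.locked = set()           # resources currently reserved under "lock"
        self._counter = itertools.count(1)

    def reserve(self, resource, version):
        """Return the version identifier the client must use, or None."""
        if self.policy == "lock":
            if resource in self.locked:
                raise RuntimeError("resource already reserved")
            self.locked.add(resource)
            return version
        if self.policy == "fork":
            # Hand back a fresh identifier for a new variant of `version`.
            return f"{version}.{next(self._counter)}"
        return None                   # "noop": nothing tracked, nothing returned
```

A client sees the same Reserve call in all three cases; only the
returned identifier, if any, tells it which version to work from.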

4.9.2.11. A way to declare the end of an intention to write a resource 
(Release). This is the inverse of Reserve. Typically, servers will 
commit updates at this time, and return a final version identifier if 
possible and if it was not already returned.

4.9.2.12. A way to submit a new version of a resource (PUT). The server 
should be able to attach it to the correct part of the version tree, 
based on the version number associated with the resource before its 
modification.
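
A minimal sketch of that attachment step, assuming the server keeps
the version tree as parent and child maps and that the client supplies
the identifier of the version it started from. The class and method
names are hypothetical.

```python
class VersionTree:
    """Toy version tree keyed by version identifier."""

    def __init__(self, root_id):
        self.parent = {root_id: None}
        self.children = {root_id: []}

    def put(self, base_id, new_id):
        """Attach new_id as a successor of base_id and return new_id."""
        if base_id not in self.parent:
            raise KeyError(f"unknown base version {base_id!r}")
        if new_id in self.parent:
            raise ValueError(f"version {new_id!r} already exists")
        self.parent[new_id] = base_id
        self.children[base_id].append(new_id)
        self.children[new_id] = []
        return new_id
```

Two submissions based on the same version simply become sibling
variants under that version.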

4.9.2.13. A way for a client to request a version identifier for a 
checked out version. Such an identifier will not be used by any other 
client in the meantime. The server may refuse the request.

4.9.2.14. A way for a client to propose a version identifier upon 
submitting a version of a resource. The server may refuse to use 
the client's suggested version identifier.
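
One way a server might handle a proposed identifier is sketched below.
The function name, the acceptance rule (accept any unused proposal) and
the fallback "vN" numbering are assumptions for the example; a real
server is free to refuse on other grounds.

```python
def assign_version_id(used_ids, proposed=None):
    """Return (identifier, accepted) and record the identifier as used.

    `accepted` is True only when the client's proposal was taken.
    """
    if proposed is not None and proposed not in used_ids:
        chosen, accepted = proposed, True
    else:
        # Fall back to a server-chosen identifier not yet in use.
        n = len(used_ids) + 1
        while f"v{n}" in used_ids:
            n += 1
        chosen, accepted = f"v{n}", False
    used_ids.add(chosen)
    return chosen, accepted
```

Because the chosen identifier is recorded immediately, no other client
can be given it in the meantime.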

4.9.2.15. A way for a client to supply metadata to be associated with 
a version. (See Section 4.1 "Attributes" above.)

The kinds of data supplied here might be simple textual comments or
more structured data. An ability to attach arbitrary fields and content
is probably required, but a standard set of attributes that would
enable interoperation would be useful.  At a minimum, it must be 
possible, when a version is checked in, to associate comments with it
explaining what changes were made.

4.9.2.16. A way for a server to provide a version identifier to be used 
for a resource in further operations.

This identifier must accompany client requests to manipulate the
resource. In particular, if a resource is being modified, the identifier
must be used when submitting an update. This allows servers to track 
active sessions by assigning version identifiers when documents are 
retrieved, locked, or reserved.

4.9.2.17. A way to track resources that have been Reserved (Session 
Tracking).  This allows the server to ensure that the user operating
on a resource is the same one who Reserved it.
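
As a rough sketch of session tracking, assume the server hands out an
opaque identifier on Reserve and requires it on every later update.
The class, its methods and the token format are hypothetical and are
not taken from the specification.

```python
import secrets


class SessionTracker:
    """Toy record of which user Reserved which resource."""

    def __init__(self):
        self._sessions = {}  # token -> (user, resource)

    def reserve(self, user, resource):
        """Issue an identifier that must accompany later requests."""
        token = secrets.token_hex(8)
        self._sessions[token] = (user, resource)
        return token

    def may_update(self, user, resource, token):
        """True only if this user Reserved this resource with this token."""
        return self._sessions.get(token) == (user, resource)

    def release(self, token):
        """End the session; later updates with this token are refused."""
        self._sessions.pop(token, None)
```

A server that does not track sessions would simply skip the check,
which is consistent with Reserve being a NOOP under that policy.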

***Judy -- Not in the specification.

***Judy - Uncheckout is neither in the requirements nor in the
specification.  Do we need it?

4.9.3. Rationale

Versioning in the context of the world-wide web offers a variety of
benefits:

It provides infrastructure for efficient and controlled management of 
large evolving web sites. Modern configuration management systems are 
built on some form of repository that can track the revision history of
individual resources, and provide the higher-level tools to manage 
those saved versions. Basic versioning capabilities are required to 
support such systems.

It allows parallel development and update of single resources. Since 
versioning systems register change by creating new objects, they
enable simultaneous write access by allowing the creation of variant
versions. Many also provide merge support to ease the reverse operation.

It provides a framework for access control over resources. While 
specifics vary, most systems provide some method of controlling or 
tracking access to enable collaborative resource development.

It allows browsing through past and alternative versions of a resource.
Frequently the modification and authorship history of a resource is
critical information in itself.

It provides stable names that can support externally stored links for
annotation and link-server support. Both annotation and link servers 
frequently need to store stable references to portions of resources 
that are not under their direct control. By providing stable states of 
resources, version control systems allow not only stable pointers into 
those resources, but also well-defined methods to determine the 
relationships of those states of a resource.

It allows explicit semantic representation of single resources with 
multiple states. A versioning system directly represents the fact that 
a resource has an explicit history, and a persistent identity across 
the various states it has had during the course of that history.

5. Acknowledgements (Get current mailing list)

Our understanding of these issues has emerged as the result of much
thoughtful discussion, email, and assistance by many people, who
deserve recognition for their effort.

Martin Cagan, Continuus Software, Marty_Cagan@continuus.com
Dan Connolly, World Wide Web Consortium, connolly@w3.org
Ron Fein, Microsoft, ronfe@microsoft.com
David Fiander, Mortice Kern Systems, davidf@mks.com
Roy Fielding, U.C. Irvine, fielding@ics.uci.edu
Yaron Goland, Microsoft, yarong@microsoft.com
Phill Hallam-Baker, MIT, hallam@ai.mit.edu
Dennis Hamilton, Xerox PARC, hamilton@parc.xerox.com
Andre van der Hoek, University of Colorado, Boulder,
  andre@bigtime.cs.colorado.edu
Gail Kaiser, Columbia University, kaiser@cs.columbia.edu
Rohit Khare, World Wide Web Consortium, khare@w3.org
Dave Long, America Online, dave@sb.aol.com
Henrik Frystyk Nielsen, World Wide Web Consortium, frystyk@w3.org
Ora Lassila, Nokia Research Center, ora.lassila@research.nokia.com
Larry Masinter, Xerox PARC, masinter@parc.xerox.com
Murray Maloney, SoftQuad, murray@sq.com
Jim Miller, World Wide Web Consortium, jmiller@w3.org
Andrew Schulert, Microsoft, andyschu@microsoft.com
Christopher Seiwald, Perforce Software, seiwald@perforce.com
Richard Taylor, U.C. Irvine, taylor@ics.uci.edu
Robert Thau, MIT, rst@ai.mit.edu

6. References

[1] America Online, "AOL Web Tools -- AOLpress 1.2 Features." WWW page.
http://www.aolpress.com/press/1.2features.html.

[2] T. Berners-Lee, D. Connolly. "HyperText Markup Language
Specification - 2.0." RFC 1866, MIT/LCS, November 1995.

[3] T. Berners-Lee, L. Masinter, M. McCahill. "Uniform Resource
Locators (URL)." RFC 1738, CERN, Xerox PARC, University of Minnesota,
December 1994.

[4] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and
T. Berners-Lee.  "Hypertext Transfer Protocol -- HTTP/1.1." RFC 2068,
U.C. Irvine, DEC, MIT/LCS, January 1997.

[5] A. Haake, D. Hicks. "VerSE: Towards Hypertext Versioning Styles", 
Proc. Hypertext'96, the Seventh ACM Conference on Hypertext, 1996,
pages 224-234.

[6] Microsoft. "Microsoft FrontPage for Windows Data Sheet." WWW page.
http://www.microsoft.com/msoffice/frontpage/productinfo/brochure/
default.htm.

[7] K. Osterbye. "Structural and Cognitive Problems in Providing Version
Control for Hypertext", Proceedings of the ACM Conference on Hypertext,
Milano, Italy, 1992, pp 33-42.

[8] "Version Control in Hypermedia Databases" Technical report
TAMU-HRL-91-004, Hypertext Research Lab, Texas A&M University. 1991.

Authors' Addresses

Judith Slein
Xerox Corporation
800 Phillips Road 128-29E
Webster, NY 14580

EMail: slein@wrc.xerox.com

E. James Whitehead, Jr.
Department of Information and Computer Science
University of California
Irvine, CA 92697-3425

Fax: 714-824-4056
EMail: ejw@ics.uci.edu

David G. Durand
Department of Computer Science
Boston University
Boston, MA

EMail: dgd@cs.bu.edu

Fabio Vitali
Department of Computer Science
University of Bologna
ITALY

EMail: fabio@cs.unibo.it

-------------------------------
Name:			Judith A. Slein
E-Mail:			slein@wrc.xerox.com
Internal Phone:  	8*222-5169
External Phone:		(716) 422-5169
Fax:			(716) 265-7133
MailStop:		128-29E
Received on Monday, 10 February 1997 14:20:32 UTC