David Durand and Fabio Vitali
As the WebDAV group is now preparing to consider versioning in detail, and we will not be attending the meeting this June, we felt that our best contribution would be to record a list of possible decisions that should be avoided for versioning on the WWW. Our experience is that there are a large number of pitfalls in this area. We've certainly fallen prey to some of them ourselves (as well as occasionally falling prey to systems that have them). This list is intended to point out some of the problems that lie in wait.
By arguing that these decisions should be pre-rejected, we intend to forestall possibilities. This is not an accusation, but a warning. The reaction that would please us most would be for everyone to say, "Why are they wasting our time with this obvious stuff?"
Furthermore, this document is not meant to provide answers to the problems we list. We do feel that our VTML format, properly applied, would help solve most of these problems, but we would like to avoid partisanship, and concentrate on ensuring that the issues are fully discussed and then dismissed.
Imposing a linear style of versioning allows very simple applications to be set up: each author checks out the last version of the document, modifies it and checks it back in; all other authors have to wait for the first to check the document in; old versions are read-only; it is not possible to create variants of a document without making a copy and editing that.
Even if we decide that linear versioning is all that DAV must require in the first revision, the time will come when more complex versioning schemes will be needed, and the protocols must not require fundamental change to accommodate more sophisticated servers. Linear versioning protocols forbid parallel editing by multiple authors, impose particular (and rigid) turn-taking and workflow management policies on the authors, and require the server to lock files (or allow "lost updates"). The need for strong locking protocols strongly suggests a single-server implementation that will prevent disconnected operation.
Protocols and data formats that are only capable of linear versioning will very soon come to their limits in any real world situation. Locking the standard into linear versioning implies that the whole standard will need to be re-discussed very soon. On the other hand, allowing branching versions in the protocols and data formats, even if first generation servers only deal with linear versions, allows future releases to be developed without changing the fundamental levels of the standard.
Branching versions could be allowed both explicitly (a specific operation is necessary to create a branch, or alternate version, of a document) and implicitly (if more than one author edits the same document, their versions will be parallel branches of the original version). We strongly encourage considering implicit branching, i.e., support for parallel edits, right at the core of the discussion on versioning styles.
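Implicit branching can be illustrated with a very small version store. This is only a sketch under our own assumptions: the class and method names (`VersionStore`, `check_in`, `heads`) are hypothetical and are not taken from any DAV or VTML proposal. The point it shows is that a check-in names only the version it was derived from, so two authors working from the same base produce sibling versions without any explicit branch operation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    vid: str
    parent: Optional[str]  # None for the root version
    author: str
    content: str


@dataclass
class VersionStore:
    """Toy in-memory store; all names here are hypothetical."""
    versions: Dict[str, Version] = field(default_factory=dict)
    counter: int = 0

    def check_in(self, parent: Optional[str], author: str, content: str) -> str:
        # A check-in only names the version it was derived from. If that
        # parent already has a child, the new version becomes a sibling:
        # the branch is created implicitly, and no lock is needed.
        if parent is not None and parent not in self.versions:
            raise KeyError(f"unknown parent version: {parent}")
        self.counter += 1
        vid = f"v{self.counter}"
        self.versions[vid] = Version(vid, parent, author, content)
        return vid

    def children(self, vid: str) -> List[str]:
        return [v.vid for v in self.versions.values() if v.parent == vid]

    def heads(self) -> List[str]:
        # Versions with no children: the tips of all current branches.
        parents = {v.parent for v in self.versions.values()}
        return [vid for vid in self.versions if vid not in parents]


store = VersionStore()
base = store.check_in(None, "alice", "draft")
a = store.check_in(base, "alice", "draft + intro")
b = store.check_in(base, "bob", "draft + appendix")  # parallel edit of the same base
```

A server that only supports linear versioning could reject the second check-in against `base`; the data model itself does not have to change to allow it later.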
Parallel editing is required for collaboration and is fundamental for many kinds of detached and disconnected authoring. While parallel editing implies branching versions, it need not imply lack of locks: where workflows need to be controlled and guided, locking policies control how many people, and which ones, can edit a document simultaneously. DAV should allow parallel editing to happen at any moment based on the policies required by particular clients and servers, rather than by "one size fits all (almost)" policies wired into the basic protocol.
Requiring serial edits forces a particular and restrictive collaboration policy that is inadequate in many situations. We believe embedding specific policies in the protocols is the worst possible move the group could make, and that forcing serial editing is the kind of decision that could haunt the standard for its entire life.
If a markup language (such as XML or HTML) is used as part of the protocol (for instance to describe and transmit updates of versioned documents), the versioning markup should be kept completely separate from the structural markup of the documents themselves.
There are several good reasons for not mixing versioning and structure:
All these reasons lead us to suggest a strong separation between versioning information and document structure: there is just no simple and satisfying way to use tags for versioning that interact and mix with tags for structure. Versioning information should be kept separate and dealt with in a separate step in the processing of the document, in much the same way as content-encodings or compression are. Only in this manner can we safely create versioning protocols and languages for any kind of document, data format and versioning structure. We already tried this mix in the first VTML paper, but it became overly complex whenever we tried to integrate the needs of versioning-unaware tools and more complex versioning policies.
If there is one lesson we should learn from the success of the WWW (and the prior history of SGML-based markup), it is the power of text-based protocols and data formats. Texts (even ones not intended for human processing) are much easier to understand, debug and proof. Text-based tools can be built in a fraction of the time required by similar binary-based tools, can be tested without requiring special environments (e.g. one can test an HTTP server by simply telnetting to it), and can be debugged more easily. The only drawback of text-based formats is that they may be verbose.
Of course, change information may become quite large after even just a few editing operations on a file. For instance, a single global change may make thousands of changes in a file, each of which has to be stored individually. This is a very appropriate place to apply compression technology. It can reduce storage and transmission costs considerably. But, while the temptation to create a binary format may be strong, probably the best way to compress change information is simply to use compression algorithms on the set of changes. Furthermore, HTTP 1.1 provides a standard way to do this already. Compression should be an additional, optional step in the processing of versioning information. There are several solutions, from ICCPC to MIME's Content-Transfer-Encoding to plain and simple zip files, that all allow consistent space savings without mixing concerns. Once space-saving for storage and transmission is dealt with independently from the issue of describing versioning information, then we believe there are no more good reasons not to choose a text-based format.
We believe that XML is the best bet for text-based formats for the future (not just a fashionable one), and that choosing an XML-based data format for describing changes in documents is a good candidate for a solution.
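The separation of concerns argued for here can be shown in a few lines. The sketch below builds a text-based change list for a single global edit and then compresses it as an independent step with gzip, one of the codings HTTP/1.1 allows in `Content-Encoding`. The `<replace>` element and its attributes are invented for illustration and are not VTML syntax.

```python
import gzip

# Hypothetical text-based change list: one global edit recorded as a
# thousand small, very similar records.
changes = "\n".join(
    f'<replace at="{i * 40}" old="colour" new="color"/>' for i in range(1000)
)

raw = changes.encode("utf-8")

# Compression is a separate, optional step applied to the finished text;
# the change format itself knows nothing about it.
packed = gzip.compress(raw)
restored = gzip.decompress(packed).decode("utf-8")

print(len(raw), len(packed))
```

Because the records are highly repetitive, the compressed form is far smaller than the raw text, and the change list remains readable and debuggable before compression and after decompression.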
It is hard to design things so that systems with very different architectural and design criteria can interoperate, but it is particularly important for versioning, despite the fact that it is also particularly hard for versioning. There are so many different applications and needs that any particular behavior or policy may block an effective solution for some class of applications. The need to write simple-minded servers and clients has to be respected, but it is important not to hardwire specific behaviors (e.g., locking, check-out/check-in sequences, branching operations, etc.) that would prevent advanced systems from even existing.
There are two kinds of strategy to ensure this: (1) implicit fallback rules for client and server behavior when they prefer differing models; or (2) explicit feature testing operations, so that clients and servers can match capabilities before making requests.
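As a rough sketch of how the two strategies might combine, the function below takes a client's ordered list of preferred feature sets and the set a server advertises (the explicit feature test), and falls back to a baseline when none of the preferences is available (the implicit fallback rule). The feature names and the `choose_mode` function are hypothetical; no such vocabulary has been defined by the group.

```python
from typing import FrozenSet, List, Optional

# Hypothetical feature names, for illustration only.
LINEAR = "linear-versions"
LOCKING = "locking"
BRANCHING = "branching"
PARALLEL = "parallel-edit"


def choose_mode(
    preferences: List[FrozenSet[str]],
    server_features: FrozenSet[str],
    fallback: FrozenSet[str] = frozenset({LINEAR}),
) -> Optional[FrozenSet[str]]:
    """Return the first preferred feature set the server fully supports.

    If none matches, return the fallback when the server supports it,
    otherwise None (client and server cannot interoperate).
    """
    for wanted in preferences:
        if wanted <= server_features:  # subset test: every feature is offered
            return wanted
    return fallback if fallback <= server_features else None


simple_server = frozenset({LINEAR, LOCKING})
advanced_server = frozenset({LINEAR, LOCKING, BRANCHING, PARALLEL})
prefs = [frozenset({BRANCHING, PARALLEL}), frozenset({LINEAR, LOCKING})]

mode_on_advanced = choose_mode(prefs, advanced_server)
mode_on_simple = choose_mode(prefs, simple_server)
```

The same client ends up with parallel editing on the advanced server and with locked, linear versioning on the simple one, without either behavior being hardwired into the exchange.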
If we use XML as a basis for a change description language, as we would suggest, then extra care is required to handle the versioning of different data formats. XML is defined in terms of characters (using the character set defined by Unicode and ISO). This means that XML documents can be transcoded to totally different byte-streams and still convey the same information. This also means that the kind of internal format used in most diff languages will cause problems for binary data.
In particular, it's not possible to apply the trick common in many ASCII formats of just using a special byte escape sequence, and then ensuring that applications process all bytes equally. When an XML document is represented in UTF-16 or another 16-bit encoding, not all sequences of bytes are legal, and even if they were, the escaping rules for octet sequences would become unworkably complex. This could be a fatal blow to the use of XML syntax, were it not that abandoning XML would forfeit the benefits of widely available parsing software, familiarity, and so forth.
We believe that the best approach is to keep the XML philosophy and possibly even XML tools (reusing existing parsers and taking advantage of current XML popularity) and use a Multipart MIME type to separately package change information (in XML) and the data (in an application/binary part). This allows great freedom in managing changes to any kind of text-based or binary data. The key thing is not to assume that it's easy to pack arbitrary bytes directly into XML documents. It's not practical, and it also violates the intent of the specification by making the correctness of the document dependent on a particular character encoding.
An important application of versioning systems is to be able to determine the position, content and identity of all subparts of a resource in all its versions in which they exist in some form. Ideally it should be possible to retrieve and identify any single character of a document in all the versions in which it existed.
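A minimal sketch of such a package, using Python's standard `email` library, might look as follows. Several details are our assumptions for illustration: the `<changes>` and `<insert>` elements, the `cid:` reference from the XML to the data part, and the use of `multipart/related`. The binary part uses the registered `application/octet-stream` type in place of the "application/binary" named above.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Arbitrary bytes, including sequences that are not legal UTF-8 or UTF-16.
payload = bytes([0x00, 0xFF, 0xFE, 0xD8, 0x00, 0x80])

# Hypothetical change description: it points at the data, never contains it.
change_xml = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<changes>\n"
    '  <insert at="128" length="6" data="cid:data-1"/>\n'
    "</changes>\n"
)

package = MIMEMultipart("related")
package.attach(MIMEText(change_xml, "xml", "utf-8"))

blob = MIMEApplication(payload)  # defaults to application/octet-stream, base64
blob.add_header("Content-ID", "<data-1>")
package.attach(blob)

wire = package.as_bytes()

# Receiving side: the XML part goes to an XML parser, the data part is
# handled as opaque bytes.
received = message_from_bytes(wire)
xml_part, data_part = received.get_payload()
recovered_xml = xml_part.get_payload(decode=True).decode("utf-8")
recovered_bytes = data_part.get_payload(decode=True)
```

The XML document stays valid under any transcoding, because the octets it describes travel beside it rather than inside it.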
There are many applications for this information: perhaps the most important is that it enables persistence and update of externally stored links (from other documents or in XML Linking-style independent links). External link bases can use address propagation to determine the identity of link endpoints in the current version of a document, even if it has changed. Transclusions can automatically update themselves in spite of changes that have occurred in both the source and the host documents (and reveal the fact and form of the change at a later date, on user request). Dynamic changes can be applied by knowledgeable applications even if disconnected editing operations changed data outside of their control (e.g., standing commands such as "sort paragraph" or "globally change 'x' to 'y'" could still be executed even if the underlying document has changed).
All these situations share the need to determine the "address" of one or more "relevant objects" in any affected version. Objects (e.g. characters, words, elements, subparts) may be identified by names, structural relations or offsets relative to fixed, well-known parts. We consider these all as "addresses" for that object. We are not advocating that the group determine the semantics and behaviors of all dynamic operations potentially affecting the content of a document. But sufficiently fine-grained diff formats allow knowledgeable applications to compute the current address of objects based on the history of changes these objects have been subjected to. Therefore we urge the group to consider this or any other method that would allow either client-side or server-side applications to be able to determine the history and evolution of any subpart of the resources they are interested in. Certainly, the final result should not prevent the use of such fine-grained information within the framework.
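To make address propagation concrete, here is a small sketch for the simplest kind of address, a character offset. It assumes a change history made only of insertions and deletions, each given as a position and a length relative to the document state just before that change. The tuple layout and the `propagate` function are our own illustration, not a proposed diff format.

```python
from typing import List, Optional, Tuple

# (kind, position, length); kind is "ins" or "del". Hypothetical layout.
Change = Tuple[str, int, int]


def propagate(offset: int, changes: List[Change]) -> Optional[int]:
    """Follow one character offset through a sequence of changes.

    Returns the character's offset in the final version, or None if the
    character was deleted along the way.
    """
    for kind, pos, length in changes:
        if kind == "ins":
            # Text inserted at or before the character pushes it right.
            if pos <= offset:
                offset += length
        elif kind == "del":
            if pos <= offset < pos + length:
                return None  # the character itself was removed
            if offset >= pos + length:
                offset -= length  # text removed before it pulls it left
        else:
            raise ValueError(f"unknown change kind: {kind}")
    return offset


# "hello world" -> "Oh, hello world" -> "Oh, world"
history: List[Change] = [("ins", 0, 4), ("del", 4, 6)]

w_now = propagate(6, history)  # the 'w' of "world" in the original
e_now = propagate(1, history)  # the 'e' of "hello", later deleted
```

A link anchored on the original 'w' can thus be relocated to offset 4 of "Oh, world", while a link anchored on the 'e' learns that its target no longer exists. Names and structural addresses would need richer rules, but the principle is the same.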
We are not brainwashed advocates for VTML (in any of its versions). On the other hand, VTML has addressed all the problems here except for #6 (and #6 is easy to deal with). We urge the group to look at our solutions; we believe that VTML may give some insight into ways of dealing with the many pitfalls that lie in this area.
The suggestions contained so far in the document clearly point to a specific class of solutions that we believe embody the best possible approach for providing versioning support on the Web: versioning information is provided in the form of a delta package containing finely grained differences from one version to another, stored as XML elements pointing to the relevant separate data using a Multipart MIME object.
VTML lies in this class of solutions, but it by no means comprises the whole class. On the other hand, if we also ask for a minimal set of change operations, and ask for the set of change attributes to be an open set, then the differences between VTML and any proposed alternative solution become negligible.