INS, DEL and Collaborative Document Design

Ian Graham <igraham@hprc.utoronto.ca> wrote in message
<9512050146.AA00534@www10.w3.org>:

> I am planning a project to look at collaborative HTML document 
> development via the Web. I am thinking of using the proposed 
> HTML 3 elements INS and DEL to delineate the changes associated 
> with different versions of a document, with appropriate 
> attributes to reflect authorship, version numbers and so on. 

> Has anyone looked at integrating this type of functionality
> into HTML or other markup languages?

I haven't done that, but I have experimented with different ways
of augmenting _plain text_ for similar purposes. The details
are unimportant, but the _functionality_ I found useful to
support may be of interest and could be provided by means of
HTML or SGML.

SCENARIO: A long, complicated document is developed over a long
period of time jointly by several persons, working in different
places, without daily contact. One of them has the function of
document _editor_, the others are _contributors_. (The word
"editor" is here used for a human, not for editing programs.)

MODEL: The document evolves through a number of different
versions. Each version is composed by the editor alone. These
steps are followed:

1) The process starts with a _strawman document_. This may be
   merely a plan for what should be included in the final
   version of the document.

2) If the editor finds that the current document version is good
   enough, he/she reformats it into a finished document.

3) This version of the document is distributed to the
   contributors (together with any comment compilation,
   see step 6).

4) If the document is finished: Stop.

5) During a subsequent _comment period_ the contributors are
   requested to submit _contributions_, which can be comments on
   this version, proposals for changes, and sometimes new text
   for a part that has been delegated to a specific person.

6) The editor combines all contributions with the text of the
   document to form a special working document, the _comment_
   _compilation_, where all comments can be seen in their right
   context and competing comments can be compared. (This step
   may be shortcut in some cases.)

7) The editor then creates a _new version_ of the document being
   worked out, where new text, as well as places where text has
   been removed, is marked. This makes it possible for the
   contributors to easily see what has been changed since the
   last version of the document. For particularly important
   changes, the editor may include comments explaining the
   choices he/she has made.

8) Goto step 2.
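
A minimal sketch of this cycle in Python may make the control flow
easier to follow. The function and parameter names (develop,
good_enough, collect, revise, max_rounds) are assumptions made up
for this illustration, and the comment compilation of step 6 is
reduced to a plain list of contributions.

```python
from typing import Callable, List, Tuple


def develop(strawman: str,
            good_enough: Callable[[str], bool],
            collect: Callable[[str], List[str]],
            revise: Callable[[str, List[str]], str],
            max_rounds: int = 20) -> Tuple[str, List[str]]:
    """Run the editor's cycle; return the final text and every
    version that was distributed to the contributors."""
    current = strawman                          # step 1: strawman document
    distributed: List[str] = []
    for _ in range(max_rounds):
        finished = good_enough(current)         # step 2: editor's judgement
        distributed.append(current)             # step 3: distribute version
        if finished:                            # step 4: stop when finished
            return current, distributed
        contributions = collect(current)        # step 5: comment period
        compilation = list(contributions)       # step 6: simplified compilation
        current = revise(current, compilation)  # step 7: editor's new version
        # step 8: go back to step 2
    raise RuntimeError("no finished version within max_rounds")


# Hypothetical usage: each revision appends "+" and the editor is
# satisfied after two revisions.
final, history = develop(
    "plan",
    good_enough=lambda text: text.count("+") >= 2,
    collect=lambda text: ["more detail, please"],
    revise=lambda text, comments: text + "+",
)
print(final, history)
```

The editor's decisions are passed in as callables because the model
leaves them entirely to a human; the loop only fixes the order of
the steps.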

FUNCTIONALITY: The contributions, the comment compilation, and
the document itself can use the same document type. When
implemented in HTML or SGML, it may be feasible to include both
the previous version, all contributions suggesting changes, and
the changes actually implemented by the editor in the new
version. I didn't attempt this in my augmented plain text
format, but kept the comment compilation and the revised version
as separate text files. (I'm skeptical of a possible further
generalization that would include not only two versions of the
document in one file, but all the previous versions and all
previous comments.)

Compared to a normal document, this REVITEXT document type will
need at least the following functionality:

A) For a part of the document, indication of who has
   _contributed_ it (or that it is provided by the editor).

B) A way to distinguish _metatext_ from _object text_. Metatext
   is text about the object text and can include justifications
   for proposals, discussion of alternatives, plans for the
   future work, and free comments that are not intended to end
   up in the final version of the document.

C) A mechanism to _connect_ a certain piece of metatext with the
   parts or points of the object text which it is about.

D) A way to include directly in the object text short
   _meta-descriptions_ that indicate what eventually will be
   included at this place, rather than the (not yet written)
   wording itself.

E) For the proposal of a contribution, and for the new version
   of the document, in comparison to the old version: Ways to
   indicate:
   1) _new_ text
   2) _deleted_ text
   3) text that has been _moved_
   4) the former _place_ of moved text.

F) A way to include several _alternative_ ways of changing the
   same part of the old version.

G) In the new version of the document: Visual indications by
   different kinds of change bars of:
   1) parts where the _substance_ has been changed or amended
   2) parts where only _editorial_ changes have been made.
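
To make items A-G concrete, here is one possible data model,
sketched in Python. It is only an illustration under my own
assumptions: the class and field names (Segment, Change, Weight,
about, alternative_to and so on) and the contributors "anna" and
"bo" are hypothetical and not part of any existing format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Change(Enum):
    UNCHANGED = "unchanged"
    NEW = "new"                 # E1: new text
    DELETED = "deleted"         # E2: deleted text
    MOVED_HERE = "moved-here"   # E3: moved text at its new place
    MOVED_AWAY = "moved-away"   # E4: former place of moved text


class Weight(Enum):
    SUBSTANCE = "substance"     # G1: substance changed or amended
    EDITORIAL = "editorial"     # G2: editorial change only


@dataclass
class Segment:
    ident: str
    text: str
    contributor: str = "editor"           # A: who contributed it
    metatext: bool = False                # B: metatext vs. object text
    about: Tuple[str, ...] = ()           # C: segments a comment refers to
    placeholder: bool = False             # D: meta-description of future text
    change: Change = Change.UNCHANGED     # E: kind of change
    alternative_to: Optional[str] = None  # F: competing replacement for a segment
    weight: Optional[Weight] = None       # G: which change bar to show


def comments_on(segments: List[Segment], ident: str) -> List[Segment]:
    """Metatext connected to the segment with the given identifier."""
    return [s for s in segments if s.metatext and ident in s.about]


def alternatives_for(segments: List[Segment], ident: str) -> List[Segment]:
    """All proposed alternative wordings for one segment."""
    return [s for s in segments if s.alternative_to == ident]


doc = [
    Segment("p1", "Each version is composed by the editor."),
    Segment("p1a", "Each version is composed by the editor alone.",
            contributor="anna", change=Change.NEW,
            alternative_to="p1", weight=Weight.SUBSTANCE),
    Segment("p1b", "The editor composes each version.",
            contributor="bo", change=Change.NEW,
            alternative_to="p1", weight=Weight.EDITORIAL),
    Segment("c1", "I prefer the shorter wording.",
            contributor="bo", metatext=True, about=("p1", "p1b")),
    Segment("p2", "(to be written: list of open problems)", placeholder=True),
]
print([s.ident for s in comments_on(doc, "p1")])
print([s.ident for s in alternatives_for(doc, "p1")])
```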

Authoring tools can take advantage of this to offer
several _views_ of the document:
-  the previous version of the document
-  the new version of the document
-  the new version of the document with changes indicated
   according to item G
-  the new version with new parts marked and old parts
   (that are removed) included
-  the hypothetical version corresponding to one contributor's
   contribution, together with the new version
-  the new version with removed parts, also including the
   contribution from a certain contributor
-  for a certain part of the document: the old text, comments
   from all contributors, the new text.

In all views of the document either only the object text, or
both the object text and comments on it, can be displayed.
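
As a rough sketch of how a tool could derive some of these views
from one marked-up document, consider the following Python
fragment. The Piece type, the view names "previous", "new" and
"marked", and the {+...+} / {-...-} / [...] notation are
assumptions chosen for this example only.

```python
from typing import List, NamedTuple


class Piece(NamedTuple):
    text: str
    change: str = "same"   # "same", "new" or "deleted"
    meta: bool = False     # True for comments (metatext)


def render(pieces: List[Piece], view: str, with_comments: bool = False) -> str:
    """Render one view of the document as a single line of text."""
    if view not in ("previous", "new", "marked"):
        raise ValueError("unknown view: " + view)
    out = []
    for p in pieces:
        if p.meta:
            # Comments appear only when explicitly requested.
            if with_comments:
                out.append("[" + p.text + "]")
        elif p.change == "new":
            if view == "new":
                out.append(p.text)
            elif view == "marked":
                out.append("{+" + p.text + "+}")
            # The previous version does not contain new text.
        elif p.change == "deleted":
            if view == "previous":
                out.append(p.text)
            elif view == "marked":
                out.append("{-" + p.text + "-}")
            # The new version does not contain deleted text.
        else:
            out.append(p.text)
    return " ".join(out)


doc = [
    Piece("The draft"),
    Piece("is short.", "deleted"),
    Piece("is complete.", "new"),
    Piece("section 2 was added", meta=True),
]
for view in ("previous", "new", "marked"):
    print(render(doc, view))
print(render(doc, "new", with_comments=True))
```

Views that involve a particular contributor's proposal would need
the contributor recorded on each piece as well; that is left out
here to keep the example short.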

It can be noted that functions A-C are almost all that is needed
for a general _annotation_ mechanism for documents and messages,
something that's useful in many situations other than the rather
formalized collective document development process modelled
here.

The RANGE element from the now withdrawn HTML 3.0 draft should
be usable for function C. The INS and DEL elements of that draft
could be used for function E. Several other new elements and
attributes are needed for a full implementation, I think.
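
For illustration, INS and DEL markup for items E1 and E2 could be
generated mechanically from two versions of a text. The sketch
below uses Python's standard difflib module for a word-by-word
comparison. The function name mark_changes is my own invention,
and no attributes for authorship or version are emitted, since
their names would be pure guesswork. Moved text (E3, E4) is not
detected; it simply shows up as a deletion plus an insertion.

```python
from difflib import SequenceMatcher
from html import escape


def mark_changes(old: str, new: str) -> str:
    """Return `new` with word-level changes relative to `old`
    wrapped in INS and DEL elements."""
    a, b = old.split(), new.split()
    out = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(escape(" ".join(a[i1:i2])))
        if tag in ("delete", "replace"):
            # Words present only in the old version.
            out.append("<DEL>" + escape(" ".join(a[i1:i2])) + "</DEL>")
        if tag in ("insert", "replace"):
            # Words present only in the new version.
            out.append("<INS>" + escape(" ".join(b[j1:j2])) + "</INS>")
    return " ".join(out)


result = mark_changes("The editor composes each version",
                      "The editor alone composes every version")
print(result)
```

A word-level comparison is only one choice; comparing by sentence
or paragraph would give coarser but often more readable markup.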

PROBLEMS: I haven't addressed the following problems here (in
order of difficulty):

-- A naming scheme for the different versions and contributions.

-- How contributions and new versions are communicated between
   the involved persons.

-- A possible differentiation of the contributors into a core
   team with a quicker cycle of "working versions", and a wider
   circle of "commenters", who are bothered with, and asked to
   comment on, only a smaller number of "main versions".

-- A more "democratic" process where the editor isn't a
   dictator.

-- A more "decentralized" process with more than one editor.

-- Possibilities to split the development process into several
   parallel documents, or parallel versions of a part of the
   document, which later are merged or kept as different
   results of the process.

/Olle

--
Olle Jarnefors, Royal Institute of Technology, Stockholm <ojarnef@admin.kth.se>

Received on Tuesday, 12 December 1995 14:16:57 UTC