- From: David G. Durand <david@dynamicdiagrams.com>
- Date: Mon, 1 Jun 1998 16:31:56 -0700
- To: WEBDAV WG <w3c-dist-auth@w3.org>
>From: Marcus Jager <mjager@microsoft.com>

>Hmm,
>All of these cases must be handled by the versioning system
>already.

Well, no, some of them may not be _best_ handled by the versioning system.

>It must
>be able to refer to versions that are unavailable for various reasons.

Sure... I'm with you so far. Availability is not the same as referenceability.

>The first and second examples are simply instances of version 1 that has
>branched into 1a and 1b, and version 1 has been deleted. But both branches
>are still available.

The translations may have separate revision trees. They may also have variants by format that are automatically generated on request by the server (this information should _not_ be part of the version tree). On the other hand, some variants may be hand-derived from an original (the relation between your leaf and root versions); this may start separate revision histories that diverge from the original document's history while occasionally synching up with it. For instance, when the second edition of the German original is issued, a new set of translated versions must be created (and may be separately revised further for accuracy).

The point isn't that the DAV standard must define a system that implements any particular policies for such situations, but that it should enable interoperability between arbitrary clients and arbitrary servers at the most appropriate common level. One key to doing this is to isolate and modularize concerns as much as possible. If we define the versioning stuff so that it works for providing editorial support for change tracking, that's enough.

Most software engineering systems nowadays try to separate these concerns as well, because it becomes too difficult to manage variants of a complex system as if they were simply items with a peculiar historical relationship. It's not helpful to have rev 1.3 and its successors of display.c be the Windows version, and rev 1.6 and its successors be the Motif version of display.c.
While I think that document authorship and software engineering make radically different demands on a system, this is a case where the experience transfers well. The analogy in this case is pretty close: alternative delivery formats are like object code, in that they lack their own private revision trees, and their content is not historically related to the content of the source file (original document). They are the output of an automatic process applied to the editorial object of interest (source code, or an original document). Version relationships are quite similar, except that documents often require rather different sorts of change tracking, since document formats are rather different from source code. Variant documents, in both cases, are often editorially derived from some version of an original, but once they are so derived, the variant usually takes on a life of its own and acquires its own separate revision history.

When this list first started 2 years ago, we referred to the configuration problem. This was a way to separate out the issue of managing the correlations between different versions of individual resources, grouping them into complete sets (configurations) that represent a desired global state, like a proofread web site or a working system build. This separation also makes sense in software configuration management, and also simplifies the problems by separating concerns.

As the translation examples we've been tossing around show, the total problem of change in sets of related resources is terribly complex. We can simplify it by making some orthogonal distinctions:

1. Variants represent human-derived, related but separate documents.

2. We don't have a term for variant data formats (perhaps data-variants is clear enough). These, like object files in SCM, often don't need to be tracked separately, as they can be derived from the "base" or "source" files.

3. Versions are historically related files that are under revision (by human beings), and for which the system may provide support for operations like registering new versions, applying a list of changes to a version to create a new version, merging versions, providing lists of differences between versions, etc.

4. Configurations (or configuration management functions) support sets of versions of resources that satisfy user or system requirements for consistency (these may range from provable correctness to satisfaction of workflow requirements). This typically includes the specification of cross-resource version relationships, like the fact that version 1.3.4 of the English translation corresponds to rev. 2.0 of the German original, and such like.

>I guess we need to split the question into two parts that are dealt with
>separately,

(more parts than that)

>(a) Should the relationship between variants of a document be expressed as
>parts of the version graph?

>I believe that (a) is an easy yes, because the relationships are not any
>different.

Not to be reflexively contradictory, but this is an easy no. One way to see that versioning is different from variants is to see that (as I listed in my last note) there are a lot of specific operations (like difference tracking and incremental update) that make sense for historically related documents, and do not make any sense for arbitrary variants, such as data format variants.

Variants don't sensibly accommodate the kinds of operations that reasonable version handling demands. It isn't sensible to ask for the diffs that transform a TeX file into the corresponding .ps file, because the two are not editorially related. Thus, however we represent versioning-related derived-from relationships, WebDAV should support special operations for such related documents. Conversely, it isn't sensible to support those operations for variant objects.
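
To make the four distinctions concrete, here is a rough Python sketch of how they might be kept apart in a data model. It is only an illustration under my own assumptions: the names `Document`, `Version`, `render`, `derived_from`, and the resource names `faust-de` and `faust-en` are hypothetical, and nothing here is proposed WebDAV syntax or semantics.

```python
import difflib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Version:
    """One human-made revision; `predecessor` is the historical link."""
    rev: str
    text: str
    predecessor: Optional[str] = None


@dataclass
class Document:
    """An editorial object with its own private revision history."""
    name: str
    versions: Dict[str, Version] = field(default_factory=dict)
    # Variant link (hypothetical field): the (document name, rev) this
    # document was hand-derived from. It sits beside the revision
    # history rather than inside the version graph.
    derived_from: Optional[Tuple[str, str]] = None

    def add_version(self, rev: str, text: str,
                    predecessor: Optional[str] = None) -> None:
        self.versions[rev] = Version(rev, text, predecessor)

    def diff(self, old: str, new: str) -> List[str]:
        """Diffs are offered only between versions of one document."""
        a = self.versions[old].text.splitlines()
        b = self.versions[new].text.splitlines()
        return list(difflib.unified_diff(a, b, old, new, lineterm=""))


def render(version: Version, fmt: str) -> str:
    """Data-variant: generated on request, with no revision history."""
    if fmt == "html":
        return "<p>" + version.text + "</p>"
    return version.text


# Versions: the German original has a linear history.
german = Document("faust-de")
german.add_version("1.0", "Habe nun, ach!")
german.add_version("2.0", "Habe nun, ach! Philosophie,", predecessor="1.0")

# Variant: the English translation has its own history and only
# records which revision of the original it was derived from.
english = Document("faust-en", derived_from=("faust-de", "2.0"))
english.add_version("1.3.4", "I have, alas!")

# Configuration: a consistent set of versions across resources.
release = {"faust-de": "2.0", "faust-en": "1.3.4"}
```

In this sketch `german.diff("1.0", "2.0")` is a meaningful request, while there is deliberately no operation that diffs `german` against `english` or against the output of `render`; those relationships are recorded, not computed as deltas.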
To me it seems self-evident that a protocol for the support of editorial systems will probably need special semantics for editorial processes and data relationships.

>And (b) Should that knowledge be used by the server in content negotiation?

>(b) is the hard one.

Since (a) is "no", this is actually easy to answer. Sometimes it makes sense to hide variants behind content negotiation, and sometimes they need to be user-visible. My longer examples were intended to show that both situations can occur. The point is that this is an area where WebDAV should _not_ require a particular approach, because there are several sensible policies that could serve different applications. Content negotiation and variants are sometimes intimately related, and sometimes not. It depends on the server, and on the policies that server is implemented to support.

>> I agree. I was thinking about it this weekend, and I came up some
>> scenarios where trying to express alternate versions in the revision
>> graph breaks down:
>>
>> * The French and English versions are both online, but they are
>>   translations of the German version, which is not. This may be
>>   unlikely in the case of new documents (e.g., translations of a
>>   company's Web site), but not in the case of older books being
>>   translated--for example, consider a site in Canada that wants to
>>   publish a translation of Goethe.
>> * The document was composed electronically, but the original format is
>>   not going to be put on the Web site--e.g., it was written in Word,
>>   but it will be published in HTML and PDF. Again, a "A derives from B"
>>   graph will not contain the original document.
>> * The Arabic version derives from revision 1.5 of the English version,
>>   but later revisions have not been translated (say, because they
>>   contain references to things which would get the document censored in
>>   some Muslim countries); some time later, the site manager wants to
>>   prune his revision graph, but he can't prune revision 1.5, because
>>   then the graph will not show any relation between the Arabic and
>>   English versions.
>>
>> These could all be handled by workarounds of some sort (permitting
>> revision relations for documents that don't actually exist), but I
>> believe it would be cleaner to express equivalent versions as a
>> separate equivalence relation, rather than forcing it into the graph.
>> The graph for the Goethe example would then be a forest of revision
>> trees; certain revisions in each tree would then be marked as valid
>> avatars of the document, and content negotiation would select the best
>> available avatar.
>>
>> Of course, I'm not saying that each avatar must be in a separate
>> tree--we certainly want to be able to express is-derived-from
>> relationships where they exist--but those won't always be the only
>> relationships.

------------------------------------------+----------------------------
David Durand              dgd@cs.bu.edu   | david@dynamicDiagrams.com
Boston University Computer Science        | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/  | http://dynamicDiagrams.com/
                                          | MAPA: mapping for the WWW
Received on Monday, 1 June 1998 16:42:25 UTC