Definition of Variant: Was "RE: Versioning goals doc"

Larry:

John Stracke wrote the definition of variant I quoted. 
He also wrote on 7/30/98, "The current versioning 
& variants proposal stores every
version/variant as a separate resource, 
meaning that each one can
have totally separate properties."
So, I assumed that his definition probably meant that
variants are in separate resources. But, I couldn't
tell whether or not that was the case from the definition.
Every WebDAV resource has its own set of properties,
which include the required WebDAV properties,
e.g., "resourcetype". (You get error 404, "not found", 
from PROPFIND for undefined properties when searching 
for specific properties. For "resourcetype",
you do not get error 404 -- you get the value, which
is either "DAV:collection" or empty content for 
non collection resources, and no error. So, "resourcetype" 
is defined for every resource. In contrast, 
"getcontentlength" is not defined for every resource -- 
you can get error 404 on resources without content.) My 
attempt at rewording the definition was based on the 
above assumptions.

However, you pointed out that there is another type
of "variant" -- a "variant" that doesn't have its
own properties. So, the question is, is V&V
going to deal with both, or just one of these?
Whatever the answer is, we need terminology
for both kinds of variants so that we can discuss 
them:

An ordinary WebDAV resource that has content would
map to a document in a Document Management
System (DMS). All documents (and all resources) have 
properties. If the document has multiple renderings
of the content, one of which is authored, and
the rest of which are mechanically generated
from the rendering that is authored or one of the
mechanically generated ones, then the mechanically
generated renderings are called "renditions"
in the terminology of DMA and some DMS's.

I agree with you that any content that is authored 
"should" (you say "must", below) be a separate
document (i.e., resource), and therefore have
its own properties (every WebDAV resource has
the WebDAV-required properties, e.g., "resourcetype",
etc.), be discoverable by PROPFIND
and DASL searches, etc. I can't stop people
from abusing this model and putting two
authored renderings as renditions of the same
document, so I say "should". I would like to be 
able to say "must" as you do. I have no sympathy
for people who abuse this model and then get
confused.

So, let us discuss the general case (which we can later 
reject handling -- or not): An English document has
content in the proprietary format of a word processor,
WP. It also has a rendition of the content in PDF
format.  There is another document that is a hand
translation of it into German. The German document
also has two renditions, one in WP format and one
in PDF format. Both documents have a Title property.
The contents of the Title property are in English
for the English version, and in German for the
German version, e.g., "A Little House" and "Eine
Kleine Haus". So, what terminology do we use to
talk about this situation? How about the following:

The renderings that are intrinsic to the resource
are called "internal variants". Thus, the properties 
of the resource are associated with all of the 
renderings. (The properties and the renderings taken 
together ARE the resource.) The German hand translation
is called an "external variant", because it is a
full status, separate, resource -- not an intrinsic 
part of the current resource. A "variant" is
not specific as to whether it is an external variant
or an internal variant. So, the term "variant"
covers both types.

After thinking a little longer, I don't think that
comments on the optionality of variants belong in
the definition of variants. The decision must be
made, and has to be somewhere in the draft, but
not in the definition. This leads me to the following
proposed definition:

"5. A "variant" is content, and is either an "internal 
variant" or an "external variant". A resource has zero
or more internal variants. The resource consists of
its properties and its internal variants. One of the 
internal variants is authored, and the rest are 
generated mechanically from it or from one
of the other mechanically generated internal variants.
An external variant is a separate resource,
whose content is semantically the same or very similar to
the content of the current resource, e.g., a hand translation
of the content of the current resource into a different 
natural language."

This definition is independent of versioning, as it MUST
be. Each revision is a resource. The above definition 
applies to each revision of an abstract versioned resource.

Now, it is up to the V&V effort to (1) decide whether or not
internal variants are handled as well as external
variants, (2) decide whether variants or either or both
kinds are optional or not, (3) decide how to make and break
associations between external variants for both versioned and 
non versioned resources, (4) decide how the desired 
content is called out. The WebDAV model has getcontenttype,
getcontentlength, getlanguagetype, etc. as scalars. 
Furthermore, GET only gets one of the renderings, and
PUT only puts one of them. Thus, the V&V effort must 
specify what things in the header pin down the content. 
There are two separate dimensions: 
(1) the internal dimension, and (2) the external dimension. 
For the internal dimension, the User-Agent string might be
useful in some situations, as you say below. Something
in the header is probably required. For the external 
dimension, a preferred language parameter in the header 
might be useful. The internal and external dimensions are
independent, and both must be specified for the general
case. It would seem we need defaults for both dimensions, 
because I doubt that we can require both to be specified 
in all headers.

This e-mail thread is intended for discussion of the 
DEFINTION of "variant". We should have separate e-mail threads 
for how the variants are called out, whether they should be 
optional, etc.

Alan Babich

> -----Original Message-----
> From: Larry Masinter [mailto:masinter@parc.xerox.com]
> Sent: September 30, 1998 7:33 AM
> To: Babich, Alan; WebDAV WG
> Subject: RE: Versioning goals doc
> 
> 
> > Definitions:
> > 
> > "5. A variant is a representation of a resource. A resource
> > may have one or more representations associated with it at
> > any given time. Not all versioning DAV servers need support
> > variants."
> > 
> > This definition needs work. My understanding of variants is 
> > that variants are not merely mechanically generated alternate
> > renditions of the same document that all share the 
> > properties of the original rendition of the document. My
> > understanding of variants is that they are full status
> > resources, and that they have their own properties.
> 
> In HTTP, a variant may or may not be a separate resource. However,
> for the purpose of WebDAV, I would propose that any variant than
> can be authored independently must be a separate resource.
> 
> > Furthermore, variants may exist even if versioning doesn't.
> > Variants are independent of versioning.
> > The principal use case for variants seems to be
> > language translations of documents. 
> 
> There are a variety of cases for variants. The most common case
> for variants in use in the web today seems to be to have different
> variants of the page depending on which version of browser your
> User-Agent string reports. Other variants may depend on other
> factors. Content may 'vary' on any of a variety of request headers.
> 
> > Obviously, a
> > translated version of a document has its title property
> > translated as well as its content, etc.
> 
> This isn't 'obvious' at all. 
> 
> > In other words, variants are separate documents (resources), 
> > and they have nothing directly to do with versioning.
> 
> The design of versioning needs to account for the relationship
> of versions of a resource and versions of its variants for
> all of the cases of uses of variants. Since there are several
> cases, a case analysis is necessary. You've analyzed only
> one of the cases.
> 
> > The original English document and a German variant of
> > it may both be versioned. It is possible that each
> > revision of the English document has a variant that is
> > a revision of the German variant. However, it is also
> > possible that the translator guy was busy, and so
> > didn't translate some of the revisions. So some
> > of the revisions don't have corresponding variants
> > in the German document.
> > 
> > Here is my suggested rewording:
> > 
> > "5. One resource is said to be a variant of another
> > resource when a sufficiently strong relationship exists 
> > between the content of the two resources, e.g., when each
> > resource is a language translation of the other.
> > In this regard, "resource" is, of course, understood to 
> > include the case of a resource that is a revision
> > of an abstract versioned resource."
> 
> This is one use of 'variant' but not comprehensive.
> 
> > Then, of course, we need goals for variants.
> > Otherwise, why have them at all?
> 
> Variants are defined in HTTP/1.1. Variants are used by web servers.
> WebDAV's goal is to support distributed authoring and 
> versioning for the
> Web. Thus, WebDAV needs to define distributed authoring and versioning
> for variants, since they are used in the web.
> 
> > "13. Given a resource, it must be possible to retrieve
> > all the resources that are considered to be variants of it.
> 
> This is unworkable in the case of computed variants.
> 
> > 14. Server support for variants is optional."
> 
> Having this in the 'goals' document will just give you trouble,
> since it can be interpreted in so many ways.
> 
> > Alan Babich
> > 
> > 
> 

Received on Wednesday, 30 September 1998 18:48:14 UTC