RE: [long] Re: I-D ACTION:draft-ietf-webdav-versioning-01.txt from Chris Kaler on 1999-02-02 (w3c-dist-auth@w3.org from January to March 1999)

From: Chris Kaler <ckaler@microsoft.com>
Date: Tue, 2 Feb 1999 12:10:07 -0800
To: "'Max Rible'" <max@glyphica.com>, WEBDAV WG <w3c-dist-auth@w3.org>
Message-ID: <4FD6422BE942D111908D00805F3158DF0A757C91@RED-MSG-52>
Comments below...

-----Original Message-----
From: Max Rible [mailto:max@glyphica.com]
Sent: Monday, February 01, 1999 7:55 PM
To: WEBDAV WG
Subject: [long] Re: I-D ACTION:draft-ietf-webdav-versioning-01.txt


This looks fundamentally quite good; it covers all the versioning
scenarios I've ever encountered and more.  Kudos to the authors
who must have put in some very long hours on it!

[CK] Thanks!

***

My biggest issue with the new draft of the versioning standard
is the usage of gibberish temporary URIs where a user might have
to cope with them or a system administrator might have to clean them
up.  Are there any cases where temporaries are actually
required, as opposed to a gibberish token that can be used in
relation to a comprehensible URI?

[CK] The point of using "gibberish" in the draft is to reinforce
     that the server determines the value.  A server could make it
     gibberish or something else.  One limitation we have is the
     strong reluctance in IETF to "munge" URLs.

I'm thinking usability here:  if incomprehensible, unmemorable temporary 
names are being used in WebDAV, it should be easy to make the usage 
[a] seamless from the point of view of the end user and [b] straightforward 
for the system administrator who will inevitably have to clean up
the incomprehensibly named files cluttering the temporary area.
Is the intention here that a client program would use PROPFIND
to locate all these obscurely named resources?  That all such
resources have a lifetime and will automatically be reaped?

[CK] I think it is important to remember that these are protocol-level
     resources.  I would assume that the UI on top of the protocol hides
     all this.  For example, you "checkout" and "checkin" based on the
     resource name you desire.  The working copy resource is part of the
     implementation and isn't visible to the user.  As well, there is
     always the DAV:displayname property.

[CK] That is really a per-server administrative function.  A server may
     choose, for example, to put all working copies in /tmp/wc.  As well,
     they might choose to have a lifetime, although that is dangerous
     because many systems support long-running checkouts, especially if
     the checkout doesn't involve a lock (that is, it will merge on
     check-in).

There is a certain utility to having the magic files in a magic
directory for ease of implementation-- you know you only have to
treat a file in a special way if it's in a special location.  In
my opinion, if you've already done the necessary work to make MKREF
function, the additional amount needed to support checked-out files
and configurations going anyplace should be small.

[CK] The general idea was that this is up to the server.  A "checkout"
     request returns the location/URL which is managed by the server.

***

Is there a major flaw with the notion of CHECKOUT creating a locked, 
mutable, non-autoversioned revision that the user holding the lock can 
mutate arbitrarily until a CHECKIN is performed, at which point the 
revision's name changes from a placeholder to a version number?  
For example:
CHECKOUT /foo/bar.html HTTP/1.1
Host: www.foobar.com
...

returns

HTTP/1.1 201 Created
Location: /foo/bar.html
Revision-Id: <opaquelocktoken:rejrei-43343-rereffre>
Lock-Token: <opaquelocktoken:rejrei-43343-rereffre>

and the combination of URI and Revision-Id can then be used for any
number of PUT and PROPPATCH operations.  When the user says

CHECKIN /foo/bar.html HTTP/1.1
Host: www.foobar.com
Revision-Id: <opaquelocktoken:rejrei-43343-rereffre>
Lock-Token: <opaquelocktoken:rejrei-43343-rereffre>
...

the current version is frozen, given a non-temporary name (such as "1.2.1"),
and the lock is released.  If they UNCHECKOUT that URI/revision-id
configuration, the revision quietly goes away.

[CK] I don't think so.  The idea is that a CHECKOUT creates a working copy.
     Technically this is not part of the revision graph for the resource.
     It must be mutable because clients need to be able to make multiple
     changes.  You can't version it because it isn't part of the version
     graph.  It is just a scratch space where PUTs and PROPPATCHs can be
     performed until the resource is ready.  CHECKIN then assigns a revision
     id and makes it part of the graph.  Working copies don't have revision
     ids.  Also, UNCHECKOUT cancels a CHECKOUT.  You can't issue UNCHECKOUT
     once you've issued a CHECKIN.  The draft should be clearer on this
     point.
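
The lifecycle described above can be sketched in a few lines of Python.
This is only a toy model under stated assumptions, not the draft's
mechanism: the class name, the method names, the /tmp/wc prefix, and the
integer revision ids are all hypothetical choices made for illustration.

```python
import itertools


class VersionedResource:
    """Toy model: a revision graph plus server-managed working copies."""

    def __init__(self, url, content):
        self.url = url
        self.revisions = {"1": content}  # revision id -> frozen content
        self.working_copies = {}         # working-copy URL -> mutable content
        self._wc_ids = itertools.count(1)
        self._rev_ids = itertools.count(2)

    def checkout(self, revision_id):
        # The server, not the client, picks the working-copy URL
        # (the /tmp/wc prefix is a hypothetical server policy).
        wc_url = f"/tmp/wc/{next(self._wc_ids)}"
        self.working_copies[wc_url] = self.revisions[revision_id]
        return wc_url

    def put(self, wc_url, content):
        # Working copies are plain scratch space: overwrite, keep no history.
        if wc_url not in self.working_copies:
            raise KeyError(f"no such working copy: {wc_url}")
        self.working_copies[wc_url] = content

    def checkin(self, wc_url):
        # Only now does the content get a revision id and join the graph.
        content = self.working_copies.pop(wc_url)
        revision_id = str(next(self._rev_ids))
        self.revisions[revision_id] = content
        return revision_id

    def uncheckout(self, wc_url):
        # Cancels a CHECKOUT; fails if the working copy is already gone,
        # for example because it was checked in.
        if wc_url not in self.working_copies:
            raise KeyError(f"no such working copy: {wc_url}")
        del self.working_copies[wc_url]
```

In this model any number of put() calls leave the revision graph
untouched, and uncheckout() after checkin() raises because the working
copy no longer exists.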

***

The same thing applies to configurations:  do they need to exist
in special areas?  Couldn't they be a part of a user's home
directory on a server?  Direct references would make it possible
to give the illusion of a configuration in your home directory, but
now you have issues regarding cleaning up a user's files when
moving their home directory from one machine to another or removing
it entirely.

[CK] The idea here was to put them in a unified place so that standard
     DAV discovery mechanisms can be used.  Otherwise we need to add
     new methods to discover the configurations.  These would be exactly
     the same as the normal folder discovery mechanism, just under a 
     new name.  We could do that -- we were just trying to use existing
     DAV semantics where appropriate.

***

Is a configuration so different from a collection that it should be
treated as a separate sort of entity?  It looks like a collection
that has a special sort of name (the configuration ID) and holds
nothing but MKREF-created links to particular versions of files.
(An aside:  should it be possible to use MKREF to link to a 
particular version of a file, allowing the reference to provide the 
Revision-Id or Configuration-Id header to a client who knows nothing 
of them?)  Are there any fundamental differences that would make
it difficult to consider a configuration as a collection with
some added rules and functionality?

[CK] Configurations are very similar, but also very different.  A
     configuration collection can be referenced in the Configuration-Id
     header.  That is not true of all MKREF collections.  As well, changes
     to the resources in the context of a configuration are automatically
     represented inside the configuration collection.  That is, if I
     rename foo.htm to bar.htm using MOVE in the /c/1 configuration, then
     inside /c/1 there will be a reference to bar.htm.
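
A rough Python sketch of the MOVE behaviour just described, assuming a
configuration is modelled as a mapping from member URL to revision id.
The Configuration class, its methods, and the "1.1" revision id are
hypothetical; the draft does not define this data structure.

```python
class Configuration:
    """Toy model of a configuration collection such as /c/1."""

    def __init__(self, config_id):
        self.config_id = config_id
        self.members = {}  # resource URL -> revision id it refers to

    def add(self, url, revision_id):
        self.members[url] = revision_id

    def move(self, old_url, new_url):
        # A MOVE done in the context of this configuration is reflected
        # inside the configuration collection itself.
        self.members[new_url] = self.members.pop(old_url)


c1 = Configuration("/c/1")
c1.add("foo.htm", "1.1")
c1.move("foo.htm", "bar.htm")
# c1.members now refers to bar.htm instead of foo.htm
```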

Should configurations be able to contain other configurations, or
simply references to them?  I can easily see that a configuration's
user might wish to partition it when it gets large and cluttered.

[CK] This is a really interesting question.  Conceptually, why not?
     However, that is really hard to represent in the resources.  As
     well, some of the semantics start to get really messy.  What does
     it mean to have nested configurations?  What does it mean for a
     resource to be "in" nested configurations?  We opted to say that a
     configuration can be derived from another, but there isn't a notion 
     of containment.

I'm thinking of software development solutions:  a configuration might
represent a project, with subconfigurations containing subprojects.
You'd want automatic inheritance from the core project so any time
someone else added a file to the configuration, you got a reference
to it.  A large project with a couple of dozen subprojects would
otherwise be a pain to bring into a workspace, unless you had a
development tool that dealt with all the repetitive actions for you.
(Direct references to other configurations could be used to
provide the illusion of nested configurations, but would require
a lot of transactions to make the parent configuration and then each
child configuration inheriting from the originals.)

[CK] Another way to think of this is that the collections in the
     namespace represent the project and sub-project relationships
     and that configurations represent various "releases" of those
     projects.  In this way the "V2" configuration can be derived
     from the "V1" configuration.
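
One possible reading of "derived from" without containment, sketched in
Python.  It assumes derivation copies the parent's membership at that
moment and that the two then evolve independently; that snapshot
behaviour, the class, and the revision ids are assumptions for
illustration only.

```python
class Configuration:
    """Toy model: a named set of (resource URL -> revision id) pairs."""

    def __init__(self, name, members=None, derived_from=None):
        self.name = name
        self.members = dict(members or {})
        # Records lineage only; the parent does not contain the child.
        self.derived_from = derived_from

    def derive(self, name):
        # Copy the membership so later changes on either side stay separate.
        return Configuration(name, self.members, derived_from=self.name)


v1 = Configuration("V1", {"/foo/bar.htm": "1.1", "/foo/baz.htm": "1.3"})
v2 = v1.derive("V2")
v2.members["/foo/bar.htm"] = "1.2"  # V2 picks up a newer revision
```

Here V1 still points at revision 1.1 of /foo/bar.htm after V2 moves on,
which matches the "releases" view rather than automatic inheritance.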

Might there occasionally be call for having non-reference members of
a configuration?  I could easily see a checkin set that has no other
reason for existence than its membership in a workspace.

[CK] The idea here was that configurations are a way of grouping resources
     that exist in the DAV namespace.  Our thoughts were that you have
     resources and make them part of configurations.  That is /foo/bar.htm
     is a resource whose revisions might be part of zero or more
     configurations.  But /foo/bar.htm exists to DAV on its own and isn't
     a sub-element of a configuration.  Not to say your idea isn't good --
     I'm just trying to explain our thinking here.

***

Regarding the specification of the BRANCH command:

Why was the decision made to use 

       BRANCH VER:FHHR4959 HTTP/1.1
       Host: www.foobar.com
       Content-Type: text/html
       Content-Length: xxxx

instead of 

       BRANCH /foo/bar.html HTTP/1.1
       Revision-Id: VER:FHHR4959
       Host: www.foobar.com
       Content-Type: text/html
       Content-Length: xxxx

?

The current method makes for a great deal of extra work, and seems
inconsistent with the other usages in the specification.  Without
this, revision IDs need only be unique for a given resource; now, they
must be unique across an entire server.  I was originally expecting
that a lot of "Revision-Id" headers would be exactly the same as the
version number in your source control system; requiring revision IDs
to be unique across all files would make the raw data less readable
to human beings.  (This should also be called out explicitly in
2.10.1, which only specifies that it must identify a specific
revision of a given resource.)

This also creates a complication for developers attempting to
implement the WebDAV standard.  If the WebDAV functionality is provided
as a plug-in module for an existing web server, that server might 
wind up doing some extra work in attempting to resolve a rather
mysterious URI-- lacking a leading slash, it might reject it 
entirely.  (Netscape Enterprise Server 3.5.1, for instance, will
immediately return 404 Not Found without even talking to plugins
registered for the method used.)  Granted, this isn't illegal by
the HTTP 1.1 standard, but it could certainly delay the progress
of WebDAV until the server technology has a chance to catch up.
If there's no strong reason to go with BRANCH revision-id, I
suggest switching to BRANCH uri.

[CK] This is something I should have made clearer.  Both are valid.
     The VER:... notion is used to refer to a specific revision via
     a URI.  You can also specify the URL and the Revision-Id header.
     Chalk this up to a bad example.
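
A small Python sketch of how a server might accept both forms.  It
assumes the server keeps an index from VER: URIs to resource URLs; the
function name, the ver_index mapping, and the error handling are
hypothetical and not taken from the draft.

```python
def resolve_revision(request_target, headers, ver_index):
    """Return (resource_url, revision_id) for either addressing form.

    ver_index is a hypothetical server-side mapping from a server-wide
    revision URI (e.g. "VER:FHHR4959") to the resource URL it belongs to.
    """
    if request_target.startswith("VER:"):
        # Form 1: the request target is itself the revision URI.
        return ver_index[request_target], request_target
    # Form 2: ordinary URL plus a Revision-Id header.
    revision_id = headers.get("Revision-Id")
    if revision_id is None:
        raise ValueError("Revision-Id header required with a plain URL")
    return request_target, revision_id


index = {"VER:FHHR4959": "/foo/bar.html"}
by_uri = resolve_revision("VER:FHHR4959", {}, index)
by_header = resolve_revision(
    "/foo/bar.html", {"Revision-Id": "VER:FHHR4959"}, index
)
# Both calls identify the same resource and revision.
```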

***

There has also been a massive growth in the number of available DAV
properties.  PROPFIND allprop operations may lead to very large
responses even with Depth: 1, which would slow down performance
for users due to network speeds.  It might be worthwhile to add this
facet to the open issue ALLPROP_AND_COMPUTED.

[CK] This is an excellent point.  It seems reasonable to use properties,
     but live vs. dead selection is interesting and DAV:allprop is
     becoming a very expensive operation.
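
To make the cost difference concrete, here is a hedged Python sketch
that builds a PROPFIND request body either with DAV:allprop or with an
explicit DAV:prop list.  The helper name is hypothetical, and the two
properties requested are only an illustrative choice.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"


def propfind_body(prop_names=None):
    """Build a PROPFIND body: allprop if no names are given, else a
    DAV:prop element naming only the properties the client wants."""
    ET.register_namespace("D", DAV_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    if prop_names is None:
        ET.SubElement(root, f"{{{DAV_NS}}}allprop")
    else:
        prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
        for name in prop_names:
            ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="unicode")


everything = propfind_body()
selected = propfind_body(["displayname", "getcontentlength"])
```

A client that names the properties it needs keeps the response size
independent of how many properties the versioning draft adds.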

***

I presume that the results of asking for DAV:defaulthistory,
DAV:activecheckouts, DAV:directlineage, and DAV:fulllineage
should *not* be included in a PROPFIND allprop response?  Also, don't
the new PROPPATCH parameters violate WebDAV spec 12.9.1, since an
identical DAV:href is turning up in multiple DAV:response
tags under the same DAV:multistatus?  (It's probably worth just
making an addendum to the WebDAV spec in the grammar section.)

[CK] The idea here is that you want a separate DAV:response for
     each DAV:href.  If you place DAV:defaulthistory inside of 
     the DAV:prop, you'd get a different response.  I'll take a
     look at the DAV spec again.

***

Regarding SETDEFAULT:  why is it specified as sending an XML body?
It seems that
	SETDEFAULT uri HTTP/1.1
	Revision-Id: DAV:none
is equivalent to the request with a body and consistent with
other usages in the specification.  Are there other data that
may be used with SETDEFAULT at some point?

[CK] The idea was to allow the user to specify additional information
     as well as allow the DAV:none to cancel.  We could do this all 
     through headers, but XML seemed appropriate.

-- 
%% Max Rible %% max@glyphica.com %% http://www.amurgsval.org/~slothman/ %%
%% "Before enlightenment:  sharpen claws, catch mice.                   %%
%%  After enlightenment:  sharpen claws, catch mice."            - me   %%

[CK] Thanks for the feedback.  We really appreciate you taking the 
     time to read and comment.  We have a working group meeting next
     week and I'll make sure we talk about the issues you have raised
     here.  Thanks again!
Received on Tuesday, 2 February 1999 15:10:12 UTC