Re: CHECKIN/CHECKOUT - URNs and Destroying Immutable Resources

Yaron Goland (yarong@microsoft.com)
Tue, 19 Jan 1999 19:46:25 -0800


Message-ID: <3FF8121C9B6DD111812100805F31FC0D08792D4D@RED-MSG-59>
From: Yaron Goland <yarong@microsoft.com>
To: "'Geoffrey M. Clemm'" <gclemm@tantalum.atria.com>,
Date: Tue, 19 Jan 1999 19:46:25 -0800
Subject: RE: CHECKIN/CHECKOUT - URNs and Destroying Immutable Resources

Reviewing early drafts of anything is always difficult. By their very nature
early drafts tend to have lots of unresolved issues. As a reviewer you want
to get clarification of those issues but as a working group member you don't
want to discourage authors.

Then again, speaking from long painful experience, the WebDAV WG's normal
review process involves taking the authors out back and beating them
senseless.

I've tried not to cling too closely to tradition in my own review.

I have also decided to break my review into separate posts so as to
facilitate conversation.

			Yaron

So Spoke Geoffrey:

> Putting a Resource under Version Control
> 
> When a resource is put under version control, it becomes unwriteable.
> In order to modify a resource, it must first be checked out, then can
> be modified one or more times, and then checked back in to indicate
> you are done modifying it.  If your CHECKOUT fails, it means someone
> else is currently modifying the document, so you should only do a GET
> with the understanding that the results are only temporarily valid.
> 

1) First you say that a resource under version control is unwriteable and
then you explain how to modify a resource. I'm confused. I suspect you need
to discuss your model a bit. One can infer a lot about the model by reading
the rest of the paper but I dislike having to infer, because I tend to infer
incorrectly.

2) CHECKOUTs can fail for many reasons wholly unrelated to current use. But
the statement does lead one to infer that the proposed versioning system
cannot support multiple simultaneous checkouts. Is this true?
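
To make the question concrete, here is a minimal sketch of the
single-checkout reading of the quoted text. It is an illustration only, not
the draft's model: the class and method names are hypothetical, and the
one-checkout-at-a-time rule is exactly the inference being questioned above.

```python
class CheckoutError(Exception):
    """Raised when an operation conflicts with the resource's checkout state."""


class VersionedResource:
    """Hypothetical resource under version control (names are assumptions)."""

    def __init__(self, content):
        self.content = content
        self.revisions = [content]   # checked-in history, oldest first
        self.checked_out_by = None   # None means checked in, so unwriteable

    def checkout(self, user):
        # Assumption: at most one checkout at a time, as the quoted text implies.
        if self.checked_out_by is not None:
            raise CheckoutError("already checked out by " + self.checked_out_by)
        self.checked_out_by = user

    def write(self, user, content):
        # Writes are refused unless this user holds the checkout.
        if self.checked_out_by != user:
            raise CheckoutError("resource is not checked out by " + user)
        self.content = content

    def checkin(self, user):
        # Checking in records the current content as a new revision.
        if self.checked_out_by != user:
            raise CheckoutError("resource is not checked out by " + user)
        self.revisions.append(self.content)
        self.checked_out_by = None
```

Under this reading a second CHECKOUT fails while the first is outstanding. A
system that allowed simultaneous checkouts would need to track a set of
holders and decide how their checkins relate to one another.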

> 
> Checkout vs. Lock
> 
> Note the distinction between a (write) LOCK and CHECKOUT.  The LOCK
> takes a resource that is writeable by everyone and temporarily makes
> it unwriteable by everyone except the lock holder (until it is
> UNLOCK'ed).  A CHECKOUT takes a resource that is unwriteable by
> everyone, and temporarily makes it writeable (until it is CHECKIN'd).
> It is reasonable to apply a LOCK to a checked-out resource, but is
> not required.  In particular, many systems will decide the LOCK
> is irrelevant, since a "friendly" client will delay writing until
> it can perform a CHECKOUT, and an "unfriendly" client can just wait
> until the UNLOCK and then trash the resource contents at will.
> 

The distinction between shared and exclusive locks should be pointed out.

I will defer my points regarding mutable resources to another post.

> 
> Immutable-Revisions
> 
> An immutable-revision is a revision whose contents (and immutable
> properties) cannot be changed.  More precisely, an attempt to retrieve
> the contents or immutable properties of an immutable-revision will
> always return the same contents or will fail.  Therefore a server can
> delete the contents or properties of an immutable-revision (resulting
> in a failure when an attempt is made to retrieve those contents or
> properties), but can never delete the immutable-revision itself.
> 

If I understand what you mean by "never delete the immutable-revision
itself", you are implying that a server could nuke all the state associated
with the immutable-revision but not a note specifying that once upon a time
such a revision did exist and did hold a certain position in the version
tree. However, the reality is that people will want to destroy even notices
of the existence of a revision for any number of reasons, some more
nefarious than others. I suspect it is unrealistic of us to expect the
protocol to be able to prevent this.

There is the additional problem of what to do if the resource is destroyed
and its HTTP URL gets re-used. Who will return the "this resource has been
nuked" notice? The way the language is currently written it would seem that
once you assign an HTTP URL to a version of a resource, even if you destroy
the resource, you are still required to reserve the HTTP URL so it can
return the "this resource doesn't exist anymore" error. I suspect we will
find significant opposition to this idea. People tend to get touchy about
their HTTP URL namespaces.

One alternative is to require that a note be dropped into the version
history specifying that there did once exist a version with a set of
particular characteristics but its resource has since been destroyed. I
don't think this is a good idea because it means that we need to refer to a
resource (even one which doesn't currently exist) without the use of a URI.
This is likely to muck up the protocol in all sorts of unhappy ways. What we
need is a URI which refers to a resource independently of the HTTP URL used
to actually retrieve the resource.

Which brings us to URNs. I don't propose we actually use URNs; I don't like
them very much. But the underlying concept is sound. We should require that
all resources have a URI associated with them that meets the same uniqueness
requirements we place on lock token URIs. The URIs DO NOT HAVE TO BE
RESOLVABLE. If they are, bonus points, but it is not necessary for the
protocol to work properly.

When a resource is created it must be assigned one of these universally
unique URIs. The URI can then be used with the IF header on any requests to
an HTTP URL so as to ensure that the request will only succeed if the
resource is the same resource as the one specified by the URI.
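
A rough sketch of that check follows, under stated assumptions. The function
names and the in-memory `bindings` table (HTTP URL to unique URI) are
hypothetical, `urn:uuid:` identifiers are used only as a convenient way to
get uniqueness and are not a proposal to adopt URNs, and only the simplest
form of the If header value, a single state token in parentheses, is handled.

```python
import re
import uuid


def new_resource_id():
    # Assumption: any globally unique, not necessarily resolvable URI would do.
    return uuid.uuid4().urn


def build_if_value(resource_id):
    # Simplest If header value: one untagged list holding one state token.
    return "(<%s>)" % resource_id


def precondition_holds(bindings, url, if_value):
    """Return True only if `url` is currently bound to the URI in `if_value`."""
    match = re.fullmatch(r"\(<([^>]+)>\)", if_value.strip())
    if match is None:
        return False
    # An unbound URL yields None, which never equals a token string.
    return bindings.get(url) == match.group(1)
```

If the HTTP URL has been re-used for a different resource, the stored URI no
longer matches the one the client sent, and the request is refused instead of
being applied to the wrong resource.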

The history graph is then free to refer to both the URI and any known URLs
that the resource is available under. If the HTTP URL is changed or the
resource is destroyed then the graph will only refer to the universally
unique URI. This allows the version to still be referred to in various
operations (such as creating a child) even though it doesn't exist.
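
As an illustration only, a history graph keyed by the unique URI might look
like the sketch below. The class, its methods, and the `urn:example:`
identifiers in the usage lines are all hypothetical; the point is just that
destroying a resource drops its URLs while the node stays addressable.

```python
class HistoryGraph:
    """Version history keyed by unique resource URI, not by HTTP URL."""

    def __init__(self):
        self.parent = {}  # unique URI -> parent's unique URI (None for a root)
        self.urls = {}    # unique URI -> set of HTTP URLs currently known

    def add_version(self, uri, parent=None, url=None):
        # The parent only has to be known to the graph; it may be destroyed.
        if parent is not None and parent not in self.parent:
            raise KeyError("unknown parent: " + parent)
        self.parent[uri] = parent
        self.urls[uri] = {url} if url is not None else set()

    def destroy(self, uri):
        # The resource and its URLs go away; the node and its URI remain.
        self.urls[uri].clear()

    def children(self, uri):
        return sorted(u for u, p in self.parent.items() if p == uri)


graph = HistoryGraph()
graph.add_version("urn:example:v1", url="/doc/v1")
graph.destroy("urn:example:v1")
# A child can still name the destroyed version as its parent.
graph.add_version("urn:example:v2", parent="urn:example:v1", url="/doc/v2")
```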