Re: UPDATE method and forking from Werner Donné on 2006-03-17 (w3c-dist-auth@w3.org from January to March 2006)

From: Werner Donné <werner.donne@re.be>
Date: Fri, 17 Mar 2006 09:41:06 +0100
To: Manfred Baedke <manfred.baedke@greenbytes.de>
Cc: w3c-dist-auth@w3.org, geoffrey.clemm@us.ibm.com
Message-ID: <441A7622.3010508@re.be>
Hi Manfred,

Manfred Baedke wrote:
> Hi Werner,
> 
> comments inline:
> 
>>> Werner Donné <werner.donne@re.be> wrote on 03/16/2006 08:01:57 AM:
>>>
>>>> Geoffrey M Clemm wrote:
>>>> >
>>>> > (1) Why does this scenario have to be in one transaction?  In
>>>> > particular, why isn't it sufficient just to wrap the sequence of
>>>> > operations in a LOCK/UNLOCK?
>>>> Because anything in between can go wrong, leaving the system in an
>>>> inconsistent state. It can't be the responsibility of the client to
>>>> keep the state consistent. That is a co-operative and hence unreliable
>>>> design.
>>>
>>> The only thing that can go wrong is that the client neglects to perform
>>> an UNLOCK, which is why there are timeout and administrative override
>>> mechanisms in the locking protocol.  But I agree that many clients would
>>> rather not deal with locks, which is why we introduced the workspace
>>> feature.  So let's focus on whether or not the workspace feature provides
>>> what you need.
>>
>> Between each step in the sequence the dialogue can be interrupted. Not
>> only would the lock not be released, but the checked-in version of the
>> VCR could also be on another version than it was originally. This is an
>> unexpected global state change for all clients, which is not acceptable
>> in a concurrent system.
>>
> 
> So the checked-in version of the VCR changed. Every client (at least
> those who do not hold a lock) must be aware of this possibility at any
> time. Why should this be unexpected?

Several clients can be executing such scenarios at the same time with the
same VCR. In order to avoid sudden changes of the checked-in version they
all have to use locking. Locking, however, is not mandatory, so it is up
to the clients to make sure there are no races. This is a co-operative
process. It is like saying that multiple clients can update tables in a
database, but they have to manage table or record locking themselves. It
was like that decades ago.

The forking and update mechanisms require isolation, and locking is not
the right instrument for that. Workspaces achieve concurrency in a natural
way without imposing co-operative scenarios with locking. This is both
safer and simpler.
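
A minimal sketch of the isolation argument above, as a toy in-memory
model rather than anything taken from RFC 3253 or a real server. The
class names VersionHistory and VCR, the update method and the demo
function are all hypothetical. It contrasts two clients sharing one VCR
with two clients that each have their own VCR over the same history.

```python
class VersionHistory:
    """Append-only list of version labels shared by all VCRs (toy model)."""

    def __init__(self, first):
        self.versions = [first]

    def add(self, label):
        self.versions.append(label)
        return label


class VCR:
    """Version-controlled resource: a pointer to its checked-in version."""

    def __init__(self, history, checked_in):
        self.history = history
        self.checked_in = checked_in

    def update(self, version):
        # Models UPDATE: repoint the checked-in version at an existing version.
        if version not in self.history.versions:
            raise ValueError("unknown version: " + version)
        self.checked_in = version


def demo():
    history = VersionHistory("V1")
    history.add("V2")

    # Shared VCR: client B's update is visible to client A mid-scenario.
    shared = VCR(history, "V2")
    seen_by_a_before = shared.checked_in
    shared.update("V1")  # client B
    seen_by_a_after = shared.checked_in

    # One VCR per workspace: B's update does not touch A's VCR.
    ws_a = VCR(history, "V2")
    ws_b = VCR(history, "V2")
    ws_b.update("V1")  # client B, in its own workspace
    return seen_by_a_before, seen_by_a_after, ws_a.checked_in, ws_b.checked_in
```

In the shared case client A sees the checked-in version move underneath
it unless every client locks; with one VCR per workspace the change
stays local to client B.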

> 
>> Regarding the lock release, the timeout mechanism is not good enough if
>> locking is used like that, because the VCR is blocked too long. This
>> impacts concurrency severely. When a transaction fails, for example,
>> everything is rolled back and released immediately.
>>
> 
> Yes, long lasting exclusive locks are evil. Again, why not use
> workspaces or working resources?

They are not evil if the repository supports branching. It is common to
reserve a branch, because branches are a tool to let people work together
on the same topic. Long-lasting locks should not be used on the integration
branch in which all work will be merged.

> 
>> Again, I'm not discussing what I need. I'm not a user.
>>
>>>
>>>> > (2) I agree that the locking mechanism is not designed to implement
>>>> > transactional isolation.  That is what workspaces and working
>>>> > resources are designed for.  If (for unexplained reasons :-) you
>>>> > don't want to use the versioning features that were designed for
>>>> > transactional isolation, it should come as no surprise that you
>>>> > have difficulty achieving transactional isolation.  Some clients
>>>> > don't want/need transactional isolation (they just want history to
>>>> > be kept, for example), which is why workspaces and working
>>>> > resources are optional features (we define these two different ways
>>>> > of achieving transactional isolation because some repositories can
>>>> > only support one, while other repositories can only support the
>>>> > other, and we could find no way of unifying these two different
>>>> > ways under a single feature).
>>>>
>>>> This is obviously not at all about what I want or don't want to use.
>>>> I'm reasoning about the WebDAV model. I haven't said I was against
>>>> workspaces.
>>>
>>> Well, to be honest, it is not at all clear to me what you do or do not
>>> want to do (:-).  The purpose of the WebDAV model is to provide a uniform
>>> protocol for accessing the various authoring features provided by
>>> a wide variety of repositories.  So the interesting questions are
>>> "how do I as a client use the protocol to achieve this interesting
>>> use case for a user" and "how do I as a repository use the protocol
>>> to expose this particular feature of my repository to a client".
>>> Without one of those questions in hand, one cannot have a productive
>>> discussion about the WebDAV model.
>>
>> I'm assessing the model, because I want to implement it. It is important
>> that a model is consistent and stable, otherwise its implementation will
>> be expensive. In my opinion exposing forking is not consistent. The fact
>> that another feature does the trick doesn't mean that a particular
>> feature shouldn't be scrutinised.
>>
> 
> Please explain in detail where you see inconsistencies. I can only see a
> non-linear version history so far, which is perfectly consistent.

You have two alternatives for working in parallel, where one suffices. One
method, working resources, exposes physical aspects such as forking and
puts the responsibility for integrity on the client. The other, workspaces,
is abstract. It doesn't expose implementation details, because each
"branch" has its own VCR and the checked-in version of each VCR can be left
to be the most recent version. The integrity responsibility lies with the
server.
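
To make concrete what "exposing forking" means on the wire, here is a
hedged sketch that builds the XML bodies a client might send. The
element names (DAV:checkout, DAV:fork-ok, DAV:update, DAV:version,
DAV:href) follow my reading of RFC 3253. The helper names checkout_body
and update_body and the example href are hypothetical, and no HTTP
request is actually made.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
ET.register_namespace("D", DAV)  # serialise with the conventional D: prefix


def checkout_body(fork_ok=False):
    """Body for a CHECKOUT request; fork_ok adds the DAV:fork-ok element."""
    root = ET.Element("{%s}checkout" % DAV)
    if fork_ok:
        # The client itself has to state that a fork is acceptable.
        ET.SubElement(root, "{%s}fork-ok" % DAV)
    return ET.tostring(root, encoding="unicode")


def update_body(version_href):
    """Body for an UPDATE request that selects a version by its URL."""
    root = ET.Element("{%s}update" % DAV)
    version = ET.SubElement(root, "{%s}version" % DAV)
    href = ET.SubElement(version, "{%s}href" % DAV)
    href.text = version_href
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(checkout_body(fork_ok=True))
    print(update_body("http://repo.example.org/his/23/ver/33"))
```

The point of the sketch is only that the fork decision appears in the
client's request, whereas in the workspace model the client addresses
its own VCR and need not mention forks at all.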

The only functional difference that is supposed to exist between workspaces
and working resources is that the latter work in their own configuration,
a term that, by the way, is not defined anywhere before section 9 of
RFC 3253.

That still makes it possible to use workspaces in a client that would
otherwise have to resort to working resources. Workspaces are now
positioned as a pure server functionality. They are, however, an abstract
mechanism that can perfectly well be implemented differently in a real
server and in a stand-alone client. For the latter the implementation may
be lighter because there is perhaps no concurrency, making the preservation
of integrity easier. It is as if such a client has its own built-in server.

> 
>> Inconsistent parts in a model have a life of their own and can have an
>> impact on other areas of the model during evolution, just for the sake
>> of compatibility.
>>
>>>
>>>> > As for why we support forking, it is an (undesirable, but
>>>> > inevitable) side-effect of parallel development.  So we don't
>>>> > define it as an independent feature, but we "deal with it" for
>>>> > those features that support parallel development (i.e., workspaces
>>>> > and working resources).  In particular, it is what tells you that
>>>> > parallel development has occurred on a particular resource, and
>>>> > that therefore a merge is required (the need for a merge when
>>>> > forking occurs is why it is "undesirable").
>>>> >
>>>> > So creating forks in a non-parallel development context makes no
>>>> > sense ... you are just making trouble for yourself.  Which is why
>>>> > forking only occurs as a side effect of parallel development, and
>>>> > not as a feature that you explicitly invoke.
>>>>
>>>> Which is what I reckoned and it is an encapsulation flaw. Clearly,
>>>> forking on its own is useless, but it is exposed through the WebDAV
>>>> interface with elements such as "checkin-fork", "checkout-fork",
>>>> "fork-ok". There is no reason for this, nor for the UPDATE method to
>>>> be allowed to change the checked-in version, which is the source of
>>>> the problem.
>>>
>>> The way WebDAV (and HTTP) achieves interoperability is by having a
>>> client use a single uniform protocol against a wide variety of servers.
>>> So the client always uses the UPDATE method to "restore" a version from
>>> history, whether it is working with a server that supports working
>>> resources, supports workspaces, supports both, or supports neither.
>>> Changing the checked-in version is important when the server supports
>>> either workspaces or working resources, so you have clients uniformly
>>> use the UPDATE method.  This means that a client is aware of
>>> checkin-fork, checkout-fork, and fork-ok behavior, so that it behaves
>>> properly when it is applied to a server that supports either workspaces
>>> or working resources.
>>
>> It strikes me that workspaces don't expose forking, while working
>> resources do. Workspaces prove it is perfectly possible to offer
>> branching and merging without exposing an implementation artifact such
>> as forking. Note that they also make it possible to work transactionally
>> within the repository without exposing transaction demarcation.
>>
>> Workspaces are well designed, while working resources are not. Their
>> difference should only be about configurations, not in the way they
>> offer branching. That should be exactly the same. The two things are
>> orthogonal.
>>
> 
> The main difference between workspaces and working resources is, IMHO,
> whether the client controls the namespace or not. In both cases, you get
> multiple VCRs sharing a common version history. I really do not see a
> relevant difference in fork handling. What am I missing?

See above.

> 
> 
> Regards,
> Manfred
> 

Regards,

Werner.
-- 
Werner Donné  --  Re
Engelbeekstraat 8
B-3300 Tienen
tel: (+32) 486 425803	e-mail: werner.donne@re.be
Received on Friday, 17 March 2006 08:41:05 UTC