Re: BATCH operation [was Re: Comments on draft-ietf-deltav-versioning-08]

From: Ross Wetmore (rwetmore@verticalsky.com)
Date: Mon, Sep 25 2000


    Message-ID: <39CFCCCE.30FF03D4@verticalsky.com>
    Date: Mon, 25 Sep 2000 18:08:14 -0400
    From: Ross Wetmore <rwetmore@verticalsky.com>
    To: ietf-dav-versioning@w3.org
    Subject: Re: BATCH operation [was Re: Comments on draft-ietf-deltav-versioning-08]
    
    
    
    "Geoffrey M. Clemm" wrote:
    > 
    >    From: Ross Wetmore <rwetmore@verticalsky.com>
    > 
    >    I do not believe a BATCHed atomic operation is any more limited to
    >    sophisticated database systems than the current set of operations
    >    are, or at least that was not the connotation I had intended, i.e. the
    >    extreme case. I would not like to see this used as a cop-out to avoid
    >    the issue of how to deal with compound operations on the server as
    >    opposed to from the client.
    > 
    > It's important to identify what kind of batching operation
    > you have in mind.  Is it an "atomic" batching operation
    > (that guarantees all operations will be performed atomically, or not
    > at all), an "exclusive" batching operation (that guarantees that
    > no other client will change the state in the middle of the batch of
    > operations), both, or neither?
    > 
    > If it's an atomic batching operation, that's where you need a
    > database system (file system implementations, for example, will
    > not provide atomic batching behavior).
    > 
    > If it's an exclusive batching operation, then it is very hard to
    > avoid denial of service attacks (everyone else is locked out while
    > your long compound operation is being performed).
    
      The current set of primitives for checkin of a version selector, or
    of a working resource and update of the version selector, illustrates the
    sort of problems that a particular choice or flavour of the standard will
    cause, i.e. the implications of the changes just made to the draft.
    
      Even the single checkin of a version selector is a sequence of server
    operations to update the version history, update version selector 
    properties etc. A poorly written server will exhibit all sorts of 
    dynamic interactions from nearly simultaneous requests. But at least in
    this case the server code can be designed to handle the situation in a
    well-defined way to "restore the server state preceding the request". 
    This probably has to be at least some sort of "exclusive" batching as 
    suggested above, but this is server implementation and not material to 
    the discussion of the split between client and server execution. 
    
      In the case of the working resource checkin, the same net result is
    now required to be done by two separate client requests. The semantics
    of the mandated rollbacks are split into rollbacks for two sub-operations 
    and the number of possible outcomes and error states has increased 
    significantly. You have exported a significant amount of the server load
    in the previous case back to the client, and added significant network
    overhead and failure modes to the problem. Moreover, there are few ways
    for the client to tighten up the situation.
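
      As a rough illustration of how the outcomes multiply, the sketch below
    counts what a client can observe when one request is split into two. The
    three outcome labels and the two step names are assumptions made for this
    illustration only; they are not taken from the draft.

```python
# Hypothetical outcome labels for a single request, as seen by the client.
# "no_response" stands for a network failure where the server state is unknown.
OUTCOMES = ("ok", "failed_rolled_back", "no_response")


def client_visible_outcomes(steps):
    """List every outcome sequence for requests sent one after another.

    The client is assumed to send the next request only if the previous
    one returned "ok".
    """
    if not steps:
        return [()]
    first, rest = steps[0], steps[1:]
    sequences = []
    for outcome in OUTCOMES:
        if outcome == "ok":
            # Success: carry on with the remaining requests.
            for tail in client_visible_outcomes(rest):
                sequences.append(((first, outcome),) + tail)
        else:
            # Failure or silence: the sequence ends here.
            sequences.append(((first, outcome),))
    return sequences


# Hypothetical step names for the single-request and two-request cases.
single = client_visible_outcomes(["checkin_version_selector"])
split = client_visible_outcomes(
    ["checkin_working_resource", "update_version_selector"]
)
```

      Under these assumptions the single request has three outcomes and the
    split version has five, two of which leave the first step applied while
    the second is failed or unknown.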
    
      The ability to batch or script some of this to put it back on the server 
    would do much to reduce the impact of choices made by the standard, at 
    least on servers that supported this capability.
    
      This doesn't imply any more transaction or exclusive semantics than are
    currently defined or implied by the rollback conditions in the current 
    draft, right? If you agree, then we can relegate those concerns to an
    orthogonal discussion, and just deal with batching.
    
    > If it is neither, then it certainly can be defined, but you do
    > have to decide what the behavior is if any of the intermediate steps
    > fail.  (Keep going, stop, some kind of conditional behavior?).  You
    > then need to pass back some idea of what steps succeeded, and if you
    > keep going over failure, what the status was of each of the steps.
    > This is all doable, but you need to avoid designing a whole programming
    > language in the process (:-).
    
      Yes! We are in full agreement. The ultimate extreme is to provide the 
    full server programming environment. The test is to find the minimalist
    solution that accomplishes the same or most of the same goals :-) :-)
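
      For concreteness, a non-atomic, non-exclusive batch of the kind quoted
    above might look like the sketch below. The names run_batch, StepResult
    and the "stop"/"continue" policy values are hypothetical; nothing here
    is defined by WebDAV or the versioning draft.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    status: str       # "ok", "failed" or "skipped"
    detail: str = ""  # error text for failed steps


def run_batch(steps: List[Tuple[str, Callable[[], None]]],
              on_failure: str = "stop") -> List[StepResult]:
    """Run steps in order and report the status of every step.

    on_failure="stop" skips all steps after the first failure;
    on_failure="continue" keeps going and records each failure.
    No atomicity or exclusivity is attempted.
    """
    if on_failure not in ("stop", "continue"):
        raise ValueError("on_failure must be 'stop' or 'continue'")
    results = []
    halted = False
    for name, action in steps:
        if halted:
            results.append(StepResult(name, "skipped"))
            continue
        try:
            action()
            results.append(StepResult(name, "ok"))
        except Exception as exc:
            results.append(StepResult(name, "failed", str(exc)))
            if on_failure == "stop":
                halted = True
    return results


def failing_step():
    """Stand-in for a primitive that fails (illustration only)."""
    raise RuntimeError("conflict")
```

      The policy is a single fixed choice per batch rather than anything
    conditional, which is one way of staying clear of a programming language.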
    
    >    The point I wished to make is that WebDAV versioning extensions have
    >    broken down execution into what is perceived to be an elementary set
    >    of primitive operations. But most versioning systems implement
    >    user operations that are a combination of several primitives.
    > 
    > To get interoperable atomic behavior, since most versioning systems
    > don't support user-defined atomic batching behavior, you are pretty
    > much forced to stay at the primitive level.  In particular, if one
    > system allows you to combine the operations X and Y into an atomic
    > compound operation X-Y and another system allows you to combine the
    > operations Y and Z into an atomic compound operation Y-Z, then the
    > only atomic behavior a client can use for both systems is X, Y, and Z,
    > as separate atomic operations.
    
      Yes! The first step is breaking the problem down into manageable
    common elements. 
    
      But the second step is to enable those systems to put themselves back
    together as compound operations equivalent to their original behaviour. A
    client in an X-Y environment will behave the same whether it talks to the
    original X-Y operation or to a standards-conforming implementation. The
    same client could also use Y-Z semantics to get Y-Z behaviour, and not
    some flaky partial implementation.
    
      What I am trying to say in my admittedly poor or naive way is that the
    second step appears to be missing. After breaking the problem down and
    moving significant parts of the application out to the client, some means
    needs to be provided to allow the client to put it back onto the server
    to fully complete the process.
    
      [...]
    
    > A simple non-atomic, non-exclusive batching operation could certainly
    > be defined (I believe), but it is orthogonal to "versioning", so I
    > wouldn't want to delay the versioning protocol to define the
    > batching protocol.  Experience from the batching discussions from
    > the WebDAV protocol indicates that this is not something that is
    > easy to get closure on.
    
      I don't agree that this is orthogonal to "versioning" because choosing
    different ways of partitioning the work between client and server is what the
    whole standard is about, right? Providing the most flexible and neutral
    way to do this would seem to be the best way to make the standard universally
    acceptable.
    
    >    In its simplest form this might simply be the ability to nest successive
    >    operations. An error return from any nested operation would be treated
    >    as an error in the post-condition of the preceding operation, and cause
    >    whatever rollback was appropriate for that operation.
    > 
    > It's the "cause whatever rollback was appropriate" that would be
    > problematic.  For example, many versioning systems do not allow you to
    > delete versions once they have been created.  In addition, unless you
    > reserved the versioning repository for the duration of the compound
    > request (which doesn't scale to supporting dozens or hundreds of users
    > trying to access a shared repository), or unless you have a
    > sophisticated database repository that allows rollback of compound
    > operations, other requests will have been executed that took
    > advantage of your partial results, and those couldn't be rolled back.
    
      I think this was dealt with above. The current standard mandates rollbacks.
    I tried to be as careful as possible to state no more than what was already
    mandated. In case it wasn't clear, a nested operation was assumed to take
    place just before the previous operation completed, and to add an extra
    condition to whether or not that operation was successful. If there was a
    failure, the rollbacks would take place up the stack as currently defined,
    and as one is still in the context of the executing operation, the code
    should already be in place to do this. I admit this glosses over a lot of
    the "details" though.
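
      A minimal sketch of this nesting idea, under stated assumptions: each
    operation carries its own rollback, a nested operation runs just before
    its parent completes, and a nested failure triggers the parent's existing
    rollback. The Operation class and its fields are hypothetical names used
    only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Operation:
    name: str
    do: Callable[[], None]    # performs the primitive
    undo: Callable[[], None]  # the rollback already required for it
    nested: Optional["Operation"] = None

    def execute(self, log: List[str]) -> None:
        # If do() itself raises, the primitive is assumed to have
        # restored its own state, as it must today.
        self.do()
        log.append("do " + self.name)
        if self.nested is None:
            return
        try:
            # The nested operation runs before this one completes.
            self.nested.execute(log)
        except Exception:
            # A nested failure counts as a failed post-condition here,
            # so this operation's own rollback runs, then the error
            # propagates up the stack.
            self.undo()
            log.append("undo " + self.name)
            raise


def always_fails():
    """Stand-in for a primitive that fails (illustration only)."""
    raise RuntimeError("nested step failed")
```

      Nesting X around Y around a failing Z would undo Y and then X, in that
    order. Whether undo is actually possible for every primitive (for example
    where versions cannot be deleted) is exactly the detail left open above.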
    
      This is perhaps the most minimalist implementation I could come up with on
    the spur of the moment that might have the desired effect. I am sure there are
    better ones if there is some sort of consensus that this needs to be considered
    further. I am curious if there are any other opinions on some aspect of this?
    I hope we have at least got the basic elements vs side issues flushed out by
    the last couple of exchanges!
    
    > Cheers,
    > Geoff
    
    Cheers,
    RossW
    =====