RE: Proposed text on reliability in the web services architecture from Assaf Arkin on 2003-01-10 (www-ws-arch@w3.org from January 2003)

From: Assaf Arkin <arkin@intalio.com>
Date: Fri, 10 Jan 2003 11:52:33 -0800
To: "Walden Mathews" <waldenm@optonline.net>
Cc: <www-ws-arch@w3.org>
Message-ID: <IGEJLEPAJBPHKACOOKHNGEFIDAAA.arkin@intalio.com>
> -----Original Message-----
> From: Walden Mathews [mailto:waldenm@optonline.net]
> Sent: Friday, January 10, 2003 6:36 AM
> To: Assaf Arkin
> Cc: www-ws-arch@w3.org
> Subject: Re: Proposed text on reliability in the web services
> architecture
>
>
> arkin,
>
> > > > state, and a write operation to change it to a new state, with
> guarantee
> > > > that the same write given the same initial state would always result
> in
> > > the
> > > > same terminal state (in other words, all actions are idempotent).
> > >
> > > That may be a deterministic finite state machine, but it's not what
> > > idempotent means.  You should check your definition.  For this
> discussion,
> > > it obviously matter... a lot.
> >
> > Walden,
> >
> > Perhaps I don't understand correctly. The definition of idempotent is
> > "repeated applications have the same effect as one", or "Acting
> as if used
> > only once, even if used multiple times." [1]
>
> Ok. "Repeated applications" doesn't mean you reset initial state
> prior to each one, though.

True. But I want's saying I reset initial state prior to each one. States
are never reset, states are always progressing.


> > Given the same input the action I describe above would produce the same
> > output. And you can continually feed it the same input and it will
> > invariably produce the same output (in their terminology decide on the
> same
> > value).
>
> Where I come from, that's called a "function".  If "outcome" space is
> limited to one result, it's "deterministic".  Right?

It's a function meaning it maps an input space to an output space. Two
different inputs may result in two different outputs. The same input will
result in the same output.


> > That's not an option. For the action to be used in such a system it
> > needs to observe these restrictions otherwise everything comes tumbling
> > down.
>
> Right: indeterminism and reliability are a bad mix.  That's intuitive.

That's not intuitive. Failure may occur, so you actually start with an
indeterministic system. You don't know which process would fail or when or
for how long. And you don't try to prevent indeterminism, you try to provide
determinism in the face of failure. The protocol deals with this by
identifying units of work that are deterministic and building a protocol
that provides determinisim out of that.


> > So if you do it once, twice or n times, you get the same
> result. Which is
> > idempotent as far as I understand it.
>
> That's right, as long as each iteration begins with the post state of the
> prior one.  That's what you were missing above when you said "the same
> write given the same initial state would always result in the
> same terminal
> state".

And if you perform the operation at the wrong state it would report that so
it also produces invariably the same result. So regardless of how many times
you perform it (since you may not know if it completed) you would always get
the same result.


> Compare these two formulas
>
> (1) f(x) = f(x) = f(x) ... = f(x)
> (2) f(x) = f(f(x) = f(f(f(x) ...
>
> The first illustrates "function".  The second illustrates "idempotence".

I am quite familiar with that. That's the mathematical definition of
idempotence. It actually works as follows (and I'm basically repeating
exactly what you said, just trying to clarify how it applies to my case):

f(x)=y, f(y)=y -> f(f(x))=f(x)

in other words, if x is the initial state, the function would change it to
y. If y is the initial state the function would remain at y and keep
returning y. f(y) is also idempotent, but observe that f() doesn't do
anything, so by itself f(y) is a no-op. Furthermore, f(x) may result in z,
as long as:

f(y)=y and f(z)=z then f(x)->z or y observes f(f(x))=f(x)

The action I am describing observes that. Given f(x) it would produce either
z or y invariably. The action is not f. The function is a composition of the
action and any inputs specific to the action. In other words:

a1+v1 = f1
a1+v2 = f2
a2+v3 = f3
a3+v4 = f4

and so forth. That's how invocations of software methods are defined as
idempotent. That is, you look at the invocation (method+inputs) as the
function. That's in procedural. In object-oriented you would also add
object.

But there's a twist. In order to define the algorithm, the action is defined
as taken a set of inputs that contains the initial state identifier (x).
Now, the function is still the action + all inputs but the state. What the
action does is compare the known state (x) to the current state, to arrive
at two results:

- Returns an output based on the new state (y, z, etc) regardless of what
the current state of the system is (thus, being idempotent)

- Returns an indication that the current state of the system is neither x
nor y, z, etc performing a no-op

That doesn't break the idempotence rule, since the output of the function is
not the output of the action. The output of the function is the new state.
So multiple performance of the function still results in the same state.
However, the action gives an indication as to whether that happened or not.
Mathematically, this would be written as:

f(f(x))=x

f'(x)=w | 0

where w in z,y, etc

where w is a variable of a state (and 0 is null) but f' only reads the state
and does not modify it. The action is a composition of f(x) and f'(x), thus
idempotent as far as f(x) is defined and also informative by encapsulating
f'(x) and returning the result of f'(x) instead of the state. This recast of
the function (return value or z) simplifies the algorithm and improves the
timing by reducing message count without breaking the idempotence.

It's simply a way to recast a mathematical function into a useful action
that supports the requirements of the algorithm (reducing message count
being a key factor).

arkin

> [Now that I've had coffee and am feeling pedantic, here's another
> formula.  Guess what REST method property it illustrates
>
>     (3) f(x) = x]
>
> Does the above clarify?  If initial state is called x, and the
> post-state of
> the first application of the operation "f" is x', then the key to
> understanding
> the difference between "function" and "idempotent" is that the
> latter is saying f(x') = f(x).  No matter that x' and x may be the same
> values.
>
> >
> > What you pass in the input is the state (identifier) you start with and
> any
> > thing that has to happen to that state (that's how these
> protocols work).
> > Obviously it will only work if the action is still at the same
> state, but
> if
> > it's in a different state it would inform you without doing
> anything. And
> > you can repreat sending the wrong state identifier and continue getting
> the
> > same result.
>
> Not sure about "obvious".  You have to design it that way.  The system
> "works" when an application from any state from x' on is a no-op, with the
> understanding that those states are all identical, but (usually) distinct
> from
> initial state x.
>
> ["safe"] - did you guess?
>
> Walden
>
Received on Friday, 10 January 2003 14:54:07 UTC