RE: Proposed text on reliability in the web services architecture from Assaf Arkin on 2003-01-19 (www-ws-arch@w3.org from January 2003)

From: Assaf Arkin <arkin@intalio.com>
Date: Sat, 18 Jan 2003 21:52:33 -0800
To: "Walden Mathews" <waldenm@optonline.net>, "Peter Furniss" <peter.furniss@choreology.com>, "Champion, Mike" <Mike.Champion@SoftwareAG-USA.com>, <www-ws-arch@w3.org>
Message-ID: <IGEJLEPAJBPHKACOOKHNAEPGDAAA.arkin@intalio.com>
> > I've done stock trading applications, and I can certainly tell you that
> > both client and server need to keep a ledger, and the server ledger is 10x
> > more complex. So maybe the client can send 10% of the information to the
> > server, but that's still part of the server-side state change.
>
> No clue what this means.  Let me restate.  Having a client set
> the state of
> an identified deposit is not "shifting the problem away" from
> anything; it's
> something the client understands and means to do.  RM can do something
> similar with a message, but an application level error can cause the
> application to need to patch the deposit amount, and when it does so,
> it "sweeps" RM, in a manner of speaking.  Please don't speak badly about
> an application taking care of application needs.

I think we've already established that RM tries to solve a very specific
problem. I don't want to keep discussing RM in every possible scenario,
because I don't have any evidence you will need to use it in every possible
scenario. I haven't used RM in every single application, only in those
applications that need it. I don't want it to sound like I'm preaching what
I don't practice.

And since we both agree RM doesn't make all your problems go away, I don't
want to keep discussing it in the context of "the one and only solution".
I know some people would like to think they have "the one and only
solution". Experience has taught me that for every problem there is a
solution, and even if you can find the one and only solution for a specific
problem, it's still not "the one and only solution".

So I want to refocus the discussion with the following premise:

- Consider the use of RM for applications that need RM and ignore
applications that do not need RM
- Consider the use of RM to solve the reliable messaging problem and don't
discuss the use of RM to solve any other kind of problem


> > Yep. That's why I said there are multiple strategies, it all depends on
> > what you want to achieve. You may ask for best-effort, which means you
> > either receive it or not. You may want to always get the message, so loss
> > requires a resend. You may even depend (actually most applications do) on
> > strict ordering, so if X was sent before Y you need X before you can
> > process Y.
>
> What if every client defines its own unique strategy?

The client can ask for as many strategies as it wants.

An RM would give you three strategies:

- No ordering with possible loss
- Some ordering (m' follows m) accounting for loss of m
- Strict ordering (i.e. you can't skip any messages in a sequence)

From my experience and that of others these three strategies are generic
enough and common enough that there is benefit in standardizing them. There
might be other things you want to do, which we can discuss. But these three
strategies solve the reliable messaging issue in point-to-point
communication, so they define the scope of what an RM solution needs to
offer.
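
To make the difference between the three concrete, here is a rough sketch
in Python. It is only an illustration: the names (Policy, deliver) are made
up, it assumes messages carry sequence numbers starting at 1, and reading
"some ordering" as "never deliver an older message after a newer one" is one
interpretation, not something any spec says.

```python
from enum import Enum


class Policy(Enum):
    UNORDERED = 1     # no ordering, loss possible
    ORDERED_GAPS = 2  # m' follows m, but a lost m may be skipped
    STRICT = 3        # no message in the sequence may be skipped


def deliver(arrivals, policy):
    """Return the sequence numbers handed to the application, in order."""
    if policy is Policy.UNORDERED:
        # Hand over whatever arrives, as it arrives.
        return list(arrivals)

    delivered = []
    if policy is Policy.ORDERED_GAPS:
        last = 0
        for seq in arrivals:
            if seq > last:  # late arrivals are dropped, gaps are allowed
                delivered.append(seq)
                last = seq
        return delivered

    # STRICT: hold a message back until all of its predecessors are delivered.
    held = set()
    next_seq = 1
    for seq in arrivals:
        if seq >= next_seq:
            held.add(seq)
        while next_seq in held:
            held.remove(next_seq)
            delivered.append(next_seq)
            next_seq += 1
    return delivered


arrivals = [1, 3, 2, 5]  # message 4 was lost, 2 arrived late
print(deliver(arrivals, Policy.UNORDERED))     # [1, 3, 2, 5]
print(deliver(arrivals, Policy.ORDERED_GAPS))  # [1, 3, 5]
print(deliver(arrivals, Policy.STRICT))        # [1, 2, 3]
```

Under the strict policy message 5 stays held until 4 is retransmitted, which
is exactly the cost that makes strict ordering too heavy for some uses.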


> > Now, if I send/receive multiple messages then TCP solves the packet
> problem
> > by making sure each message is either full or ignored, but it doesn't
> solve
> > the multi-sequence message problem. So you need something on top of TCP
> > whenever multiple messages are involved.
>
> What is the multi-sequence message problem?

I send a service two messages m and m', where m' can only be processed if m
has been processed. I am using an asynchronous means for delivery, which
means message loss is possible. (If I'm not using asynchronous delivery I'm
not using RM and so I don't really care what an RM would or would not do)

From the perspective of the service it is futile to process m' before it
processes m. The message m' will simply be rejected. But, it is also futile
to reject m' if m could arrive after m'. (Which is possible if we counter
message loss with retransmission). So, if the messaging layer receives m'
followed by m, then delivers m followed by m', the service can process both
messages.
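
A minimal sketch of that hold-back behaviour, in Python. The class name and
the sequence numbering (starting at 1) are assumptions made for the example,
not part of any RM specification; it also ignores the duplicates that
retransmission can produce.

```python
class Resequencer:
    """Holds a message back until every earlier one has been delivered."""

    def __init__(self, deliver):
        self.deliver = deliver  # callback into the service
        self.next_seq = 1       # assumed: numbering starts at 1
        self.held = {}          # seq -> message waiting for a predecessor

    def receive(self, seq, message):
        if seq < self.next_seq or seq in self.held:
            return  # duplicate caused by a retransmission; ignore it
        self.held[seq] = message
        # Release every message whose predecessors have all been delivered.
        while self.next_seq in self.held:
            self.deliver(self.held.pop(self.next_seq))
            self.next_seq += 1


processed = []
layer = Resequencer(processed.append)
layer.receive(2, "m'")  # arrives first, so it is held back
layer.receive(1, "m")   # releases m, then m'
layer.receive(2, "m'")  # retransmitted duplicate, dropped
print(processed)        # ['m', "m'"]
```

The service only ever sees m followed by m', regardless of arrival order.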

Incidentally, TCP does just that. That's why when you are downloading you may
see 1KB, 2KB, pause, 5KB: packets 4-5 arrived before packet 3, but once
packet 3 has arrived the browser gets all three packets at once. It
reduces network congestion. But TCP works at the packet level and here we
are talking about messages.


> > The definition of reliable is (very loosely and not precise so don't kill
> > me on the wording): you deliver the message exactly once (deliver to the
> > application, you can send/receive it multiple times though), you deliver a
> > message only if the message was actually sent (no spurious messages), you
> > deliver the message in some designated order.
>
> Is the ordering constraint optional?

The ordering constraint should be imposed by the application based on the
application needs. Otherwise you get an RM that is really good for some
things, but a bit too heavy for others. That's actually a point I raised
regarding WS-Reliability, I think it offers strict ordering and that's too
heavy for some applications.


> In the various frameworks I've written for real-time market data,
> implementing them directly in TCP or UDP, I'd agree that certain
> applications benefit from further support in the stack.  For example,
> a popular pattern is for a TCP wrapper to keep connections alive
> and re-establish connections, so as to allow the application to view
> it as a persistent pipe to somewhere.
>
> Concerning the above pattern, while it's helpful to remove the
> tcp connect logic from the application, it can be equally confusing
> for a client when there is something about the network that
> obstructs connection, because the application can't communicate,
> but it may not know it.  So ultimately, no you cannot abstract
> away a real network if you are also going to deal with the stuff
> that does go wrong.

Let's say you have a pipeline where multiple threads can send/receive
messages over a single TCP connection (e.g. proxies do that a lot). So each
client application thinks it owns the connection, but in reality the stack
makes all clients share the same pool of connections (e.g. in some cases 10
clients could experience no latency using a pool of 2 connections).

Of course the client still needs to be notified of errors, but if the
pipelined connection drops the stack could automatically create a new
connection, so the client experiences a short delay but goes about its
merry business.

This doesn't protect the client from addressing failure. But, it also means
that the same API you use to have one connection per client could be used to
have a pipeline because the API is abstracted from the protocol. You can
actually do that with HTTP, there are several client libraries out there
that do pipelining and it looks the same to the client application.

But to get that service the client can't get into the details of the
protocol. All it could know is how to open a connection, how to close the
connection, how to determine if the connection is open, and receive errors
when the connection dies in the middle of sending a message (unless the
stack recovers so there's no application error).
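
Roughly, in Python, and purely as a sketch: every name below is invented for
the example, and FakeTransport is an in-memory stand-in for a real socket so
the code runs without a network. Resending after a reconnect is a
simplification; a real stack would have to know whether the message was
partially sent.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """All the client is allowed to know about the protocol."""

    @abstractmethod
    def open(self): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def is_open(self): ...

    @abstractmethod
    def send(self, message): ...


class FakeTransport:
    """In-memory stand-in for a TCP connection (assumption for the demo)."""

    def __init__(self):
        self.alive = True
        self.sent = []

    def write(self, message):
        if not self.alive:
            raise ConnectionError("transport dropped")
        self.sent.append(message)


class SharedPipeline:
    """One transport shared by all clients; reconnects once on failure."""

    def __init__(self, connect):
        self.connect = connect  # factory returning a fresh transport
        self.transport = connect()
        self.reconnects = 0

    def write(self, message):
        try:
            self.transport.write(message)
        except ConnectionError:
            # The stack recovers, so the client sees a delay, not an error.
            self.transport = self.connect()
            self.reconnects += 1
            self.transport.write(message)  # a second failure reaches the client


class PipelinedConnection(Connection):
    """Looks like a private connection, but writes to the shared pipeline."""

    def __init__(self, pipeline):
        self.pipeline = pipeline
        self._open = False

    def open(self):
        self._open = True

    def close(self):
        self._open = False

    def is_open(self):
        return self._open

    def send(self, message):
        if not self._open:
            raise ConnectionError("connection is closed")
        self.pipeline.write(message)


transports = []

def connect():
    transports.append(FakeTransport())
    return transports[-1]

pipeline = SharedPipeline(connect)
a, b = PipelinedConnection(pipeline), PipelinedConnection(pipeline)
a.open()
b.open()
a.send("a1")
transports[0].alive = False  # the shared connection drops
b.send("b1")                 # recovered by the stack, no error for b
print(pipeline.reconnects)   # 1
```

A one-connection-per-client class could implement the same Connection
interface, and the client code would not change.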


> However, in the one project I've done that I'd call a "web service",
> the application protocol was HTTP, and there was no evident
> advantage to having an RM in the loop.  It wasn't about messaging
> reliability; it was about resource state reliability.  Note that while
> TCP was in the stack, I don't call that a "web services RM".

Again we are talking apples vs oranges. You tell me that your application
does not do asynchronous delivery of messages so it does not need an RM, but
I never said it does. And you tell me that TCP does excellent RM for
packets, which I know for a fact, but that's one RM and not the WS RM we are
discussing.

Let me put it like this. WS RM solves reliable delivery of messages. TCP
solves reliable delivery of packets. You don't need RM if all you need is
reliable delivery of packets. That's established. You won't get reliable
delivery of messages if you don't have reliable delivery of packets.

There are a lot of MOMs in the market that use IP multicast. So they
actually have two RM mechanisms, one for dealing with packets and one for
dealing with messages, and even though the logic is similar, you use one to
build the other, since you deliver packets to the MOM and messages to the
app.


> > I also think that abstracting is a good thing because I can get that layer
> > componentized, I can reuse it across multiple applications, or even buy it
> > from a 3rd party, or download it as open source. But to be abstract we need
> > some generalized API (or even two or three).
>
> Or even twenty or thirty.  I've developed lots of frameworks, probably
> so many because of the fact that one size did not fit all. 8-(

Ever heard of one-size baseball hats?

Think of RM as a baseball hat. It protects you from the sun, so you probably
want to wear it at a sunny afternoon baseball game, but you don't really
need it at night or when you're sitting at home watching TV. And if you wear
just a baseball hat you won't be allowed into any baseball stadium, in fact
you will be arrested for indecent exposure.

It's probably not a good allegory, but I'm trying to look at different ways
to express the fact that an RM is a solution not "the solution", so judging
it as "the solution" is not a good approach to evaluating its benefits.


> I don't fault you for your optimism.  I've heard lots of promises from
> developers about "totally generic software" that didn't pan out that way
> because irritating little application requirements got in the way.

And I perfectly agree with you. If you position RM as the one solution for
all your application problems, you're going to crash and burn.

There are three ways to prove that a solution is worthless:

1. Look at the problem and discover that the solution does not solve it
2. Look at a different problem and discover that the solution does not solve
it
3. Look at a bigger problem and discover that the solution does not solve
all of it

I would like to refrain from doing 2 or 3. It seems to me to be totally
futile, wouldn't you agree? I'd like to keep focusing on 1.

So whenever you say "this is not an RM problem" or "this is more than the RM
problem", I would agree with you, and let's just acknowledge it's not an RM
problem or it's more than an RM problem and figure out how to solve it using
the most appropriate solution for that particular problem.

arkin

>
> But...damn the torpedos!
>
> Walden
Received on Sunday, 19 January 2003 00:54:38 UTC