- From: Assaf Arkin <arkin@intalio.com>
- Date: Sat, 18 Jan 2003 21:52:33 -0800
- To: "Walden Mathews" <waldenm@optonline.net>, "Peter Furniss" <peter.furniss@choreology.com>, "Champion, Mike" <Mike.Champion@SoftwareAG-USA.com>, <www-ws-arch@w3.org>
> > I've done stock trading applications I can certainly tell you that both
> > client and server need to keep a ledger and the server ledger is 10x more
> > complex. So maybe the client can send 10% of the information to the server,
> > but that's still part of the server-side state change.
>
> No clue what this means. Let me restate. Having a client set the state of
> an identified deposit is not "shifting the problem away" from anything; it's
> something the client understands and means to do. RM can do something
> similar with a message, but an application level error can cause the
> application to need to patch the deposit amount, and when it does so,
> it "sweeps" RM, in a manner of speaking. Please don't speak badly about
> an application taking care of application needs.

I think we've already established that RM tries to solve a very specific problem.

I don't want to keep discussing RM in every possible scenario, because I don't have any evidence you will need to use it in every possible scenario. I haven't used RM in every single application, only in those applications that need it. I don't want it to sound like I'm preaching what I don't practice.

And since we both agree RM doesn't make all your problems go away, I don't want to keep discussing it in the context of "the one and only solution". I know some people would like to think they have "the one and only solution". Experience has taught me that for every problem there is a solution, and even if you can find the one and only solution for a specific problem, it's still not "the one and only solution".

So I want to refocus the discussion with the following premise:

- Consider the use of RM for applications that need RM and ignore applications that do not need RM
- Consider the use of RM to solve the reliable messaging problem and don't discuss the use of RM to solve any other kind of problem

> > Yep.
> > That's why I said there are multiple strategies, it all depends on what
> > you want to achieve. You may ask for best-effort, which means you either
> > receive it or not. You may want to always get the message so loss requires
> > resend. You may even depend (actually most applications do) on the strict
> > ordering, so if X was sent before Y you need X before you can process Y.
>
> What if every client defines its own unique strategy?

The client can ask for as many strategies as it wants. An RM would give you three strategies:

- No ordering with possible loss
- Some ordering (m' follows m) accounting for loss of m
- Strict ordering (i.e. you can't skip any messages in a sequence)

From my experience and that of others, these three strategies are generic enough and common enough that there is benefit in standardizing them. There might be other things you want to do, which we can discuss. But these three strategies solve the reliable messaging issue in point-to-point communication, so they define the scope of what an RM solution needs to offer.

> > Now, if I send/receive multiple messages then TCP solves the packet problem
> > by making sure each message is either full or ignored, but it doesn't solve
> > the multi-sequence message problem. So you need something on top of TCP
> > whenever multiple messages are involved.
>
> What is the multi-sequence message problem?

I send a service two messages m and m', where m' can only be processed if m has been processed. I am using an asynchronous means for delivery, which means message loss is possible. (If I'm not using asynchronous delivery I'm not using RM and so I don't really care what an RM would or would not do.)

From the perspective of the service it is futile to process m' before it processes m. The message m' will simply be rejected. But it is also futile to reject m' if m could arrive after m' (which is possible if we counter message loss with retransmission).
So, if the messaging layer receives m' followed by m, then delivers m followed by m', the service can process both messages.

Incidentally, TCP does just that. That's why when you are downloading you may see 1KB, 2KB, pause, 5KB. That's because packets 4-5 arrived before packet 3, but once packet 3 has arrived the browser gets all three packets at once. It reduces network congestion. But TCP works at the packet level and here we are talking about messages.

> > The definition of reliable is (very loosely and not precise so don't kill me
> > on the wording): you deliver the message exactly once (deliver to the
> > application, you can send/receive it multiple times though), you deliver a
> > message only if the message was actually sent (no spurious messages), you
> > deliver the message in some designated order.
>
> Is the ordering constraint optional?

The ordering constraint should be imposed by the application based on the application needs. Otherwise you get an RM that is really good for some things, but a bit too heavy for others. That's actually a point I raised regarding WS-Reliability: I think it offers strict ordering and that's too heavy for some applications.

> In the various frameworks I've written for real-time market data,
> implementing them directly in TCP or UDP, I'd agree that certain
> applications benefit from further support in the stack. For example,
> a popular pattern is for a TCP wrapper to keep connections alive
> and re-establish connections, so as to allow the application to view
> it as a persistent pipe to somewhere.
>
> Concerning the above pattern, while it's helpful to remove the
> tcp connect logic from the application, it can be equally confusing
> for a client when there is something about the network that
> obstructs connection, because the application can't communicate,
> but it may not know it. So ultimately, no you cannot abstract
> away a real network if you are also going to deal with the stuff
> that does go wrong.
Let's say you have a pipeline where multiple threads can send/receive messages over a single TCP connection (e.g. proxies do that a lot). So each client application thinks it owns the connection, but in reality the stack makes all clients share the same pool of connections (e.g. in some cases 10 clients could experience no latency using a pool of 2 connections). Of course the client still needs to be notified of errors, but if the pipelined connection drops the stack could automatically create a new connection, so the client experiences a short delay but goes about its merry business.

This doesn't spare the client from having to address failure. But it also means that the same API you use to have one connection per client could be used to have a pipeline, because the API is abstracted from the protocol. You can actually do that with HTTP: there are several client libraries out there that do pipelining and it looks the same to the client application.

But to get that service the client can't get into the details of the protocol. All it could know is how to open a connection, how to close the connection, how to determine if the connection is open, and how to receive errors when the connection dies in the middle of sending a message (unless the stack recovers so there's no application error).

> However, in the one project I've done that I'd call a "web service",
> the application protocol was HTTP, and there was no evident
> advantage to having an RM in the loop. It wasn't about messaging
> reliability; it was about resource state reliability. Note that while
> TCP was in the stack, I don't call that a "web services RM".

Again we are talking apples vs oranges. You tell me that your application does not do asynchronous delivery of messages so it does not need an RM, but I never said it does. And you tell me that TCP does excellent RM for packets, which I know for a fact, but that's one RM and not the WS RM we are discussing.

Let me put it like that.
WS RM solves reliable delivery of messages. TCP solves reliable delivery of packets. You don't need RM if all you need is reliable delivery of packets. That's established. You won't get reliable delivery of messages if you don't have reliable delivery of packets.

There are a lot of MOMs in the market that use IP multicast. So they actually have two RM mechanisms, one for dealing with packets and one for dealing with messages, and even though the logic is similar, you use one to build the other, since you deliver packets to the MOM and messages to the app.

> > I also think that abstracting is a good thing because I can get that layer
> > componentized, I can reuse it across multiple applications, or even buy it
> > from a 3rd party, or download it as open source. But to be abstract we need
> > some generalized API (or even two or three).
>
> Or even twenty or thirty. I've developed lots of frameworks, probably
> so many because of the fact that one size did not fit all. 8-(

Ever heard of one-size baseball hats? Think of RM as a baseball hat. It protects you from the sun, so you probably want to wear it at a sunny afternoon baseball game, but you don't really need it at night or when you're sitting at home watching TV. And if you wear just a baseball hat you won't be allowed into any baseball stadium; in fact you will be arrested for indecent exposure.

It's probably not a good allegory, but I'm trying to look at different ways to express the fact that an RM is a solution, not "the solution", so judging it as "the solution" is not a good approach to evaluating its benefits.

> I don't fault you for your optimism. I've heard lots of promises from
> developers about "totally generic software" that didn't pan out that way
> because irritating little application requirements got in the way.

And I perfectly agree with you. If you position RM as the one solution for all your application problems, you're going to crash and burn.
There are three ways to prove that a solution is worthless:

1. Look at the problem and discover that the solution does not solve it
2. Look at a different problem and discover that the solution does not solve it
3. Look at a bigger problem and discover that the solution does not solve all of it

I would like to refrain from doing 2 or 3. It seems to me to be totally futile, won't you agree? I'd like to keep focusing on 1.

So whenever you say "this is not an RM problem" or "this is more than the RM problem", I would agree with you, and let's just acknowledge it's not an RM problem or it's more than an RM problem and figure out how to solve it using the most appropriate solution for that particular problem.

arkin

> But...damn the torpedoes!
>
> Walden
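
For readers following the m / m' discussion above, here is a minimal sketch of the strict-ordering strategy: a messaging layer that holds m' until m has arrived and drops retransmitted duplicates, so the application sees each message exactly once and in order. The class name `Resequencer`, the `receive` method and the assumption that sequence numbers start at 1 are all hypothetical and purely illustrative; nothing here is taken from WS-Reliability or any other RM specification.

```python
from typing import Callable, Dict


class Resequencer:
    """Hypothetical messaging layer: in-order, exactly-once delivery to the app."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self._deliver = deliver              # callback into the application
        self._next_seq = 1                   # assumption: sequences start at 1
        self._pending: Dict[int, str] = {}   # messages that arrived early

    def receive(self, seq: int, payload: str) -> None:
        # A retransmission may produce duplicates: drop anything already
        # delivered or already waiting in the buffer.
        if seq < self._next_seq or seq in self._pending:
            return
        self._pending[seq] = payload
        # Release every message that is now contiguous with what was delivered.
        while self._next_seq in self._pending:
            self._deliver(self._pending.pop(self._next_seq))
            self._next_seq += 1


delivered = []
rm = Resequencer(delivered.append)
rm.receive(2, "m'")   # arrives first, so it is held back
rm.receive(1, "m")    # gap filled: m is delivered, then m'
rm.receive(1, "m")    # retransmitted duplicate, ignored
```

The other two strategies would relax the buffering step: with no ordering, each new message is handed to the application as soon as it arrives; with partial ordering, only messages that declare a dependency on an earlier one are held back.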
Received on Sunday, 19 January 2003 00:54:38 UTC