- From: Ricky Ho <riho@cisco.com>
- Date: Sat, 14 Dec 2002 17:09:08 -0800
- To: "Assaf Arkin" <arkin@intalio.com>, <www-ws-arch@w3.org>
- Message-Id: <4.3.2.7.2.20021214170633.0269bb60@franklin.cisco.com>
What you say is correct! But only at the TCP packet level, not the message
level. To some degree, I feel our previous RM handshaking discussion is
re-implementing the TCP handshaking at the message level.

Rgds, Ricky

At 11:58 AM 12/14/2002 -0800, Assaf Arkin wrote:
>Ricky,
>
>TCP takes care of that.
>
>IP is a basic packet routing protocol that sends individual packets from
>one machine to another. IP has message loss. A message may not arrive at
>its destination. At the IP level the sender does not know whether the
>message has arrived, and the receiver doesn't know a message was sent, so
>there's no corrective action that will be taken.
>
>TCP is an elaborate protocol on top of IP that provides connection-based
>messaging. TCP uses IP which means packets sent from A to B may be lost,
>may be received out of order, and may be received multiple times. TCP does
>the ordering of the packets, retransmission, acks, etc.
>
>So it goes something along these lines (not exactly, but it's been a while
>since I read the TCP spec):
>
>Node A opens connection to Node B.
>Node A starts sending a message to Node B.
>Node A identifies each packet by its order in the message.
>Node A identifies the last packet.
>If Node B does not receive a packet it asks for retransmission.
>If Node B does receive the packet it lets Node A know (this is only
>critical for the last packet)
>
>Keep in mind that Node A and Node B keep communicating with each other all
>the time, sending "is alive" messages back and forth to determine if the
>connection is still open. So even if there's no application traffic
>between A and B, there's a lot of chatter going over the wire. If A
>doesn't hear from B after a while, then A assumes the connection is down
>(and vice versa).
>
>The TCP/IP stack can use the negative acks (retransmit request) in
>combination with the is-alive chatter (positive acks) to tell the
>application whether the message has been received or not.
>
>arkin
>
>Arkin, can you elaborate your point that "using a sync transport protocol,
>there will be no possibility of message loss"? Here is an example.
>
>Node A sends a message to node B using a sync transport protocol, HTTP POST.
>
>A opens a TCP connection to B successfully.
>A sends a stream of request data (in the HTTP format) to B.
>Suddenly, the TCP connection drops.
>
>How does A know if B has received the request message or not?
>
>Best regards,
>Ricky
>
>At 08:03 PM 12/13/2002 -0800, Assaf Arkin wrote:
>>The two army problem is concerned with the possibility of message loss.
>>Message loss could occur when you are using an asynchronous transport
>>protocol, though in most literature the term would be medium, where
>>protocol is a more generic term that would even cover a choreography.
>>
>>Although you can have an asynchronous API for performing an operation,
>>that API is between you and a messaging engine and typically you would
>>use in-process calls or some synchronous transport, so there's no
>>possibility of message loss. You can tell without a doubt whether the
>>messaging engine is going to send the message or not.
>>
>>Even if the operation you are doing is asynchronous, you can use a
>>synchronous protocol such as HTTP POST to deliver the message in which
>>case there is no possibility for message loss. But you can also use an
>>asynchronous protocol such as SMTP or UDP, in which case the message
>>could be lost on the way to its destination. Lost has a loose definition:
>>a message that gets garbled, delayed or routed to the wrong place is
>>considered lost.
>>
>>Addressing message loss is therefore a problem of the protocol you use
>>and not the operation you perform. So in my opinion that is outside the
>>scope of WSDL abstract operation definition, but in the scope of specific
>>protocol bindings, and it would definitely help if the protocol layer
>>(XMLP) could address that, relieving us of the need to define ack operations.
>>
>>arkin
>>-----Original Message-----
>>From: www-ws-arch-request@w3.org [mailto:www-ws-arch-request@w3.org] On
>>Behalf Of Cutler, Roger (RogerCutler)
>>Sent: Friday, December 13, 2002 1:28 PM
>>To: Assaf Arkin; www-ws-arch@w3.org
>>Subject: RE: Reliable Messaging - Summary of Threads
>>
>>Thanks for the support.
>>One thing this note reminded me of -- I have seen a number of different
>>definitions of "synchronous" floating around this group. In fact, if my
>>memory serves, there are three major ones. One concentrates on the idea
>>that a call "blocks" if it is synchronous, another has a complicated
>>logic that I cannot recall and the third (contained in one of the
>>references on the two army problem) concentrates on the length of time it
>>takes for a message to arrive. The formality of all of these definitions
>>indicates to me that all have had considerable thought put into them and
>>that all are, in their context, "correct". They are, however, also
>>different.
>>-----Original Message-----
>>From: Assaf Arkin [mailto:arkin@intalio.com]
>>Sent: Friday, December 13, 2002 2:27 PM
>>To: Cutler, Roger (RogerCutler); www-ws-arch@w3.org
>>Subject: RE: Reliable Messaging - Summary of Threads
>>
>>3 - There is concern about the "two army" problem, which essentially says
>>that it is not possible, given certain assumptions about the types of
>>interactions, for all parties in the communication to reliably reach
>>consensus about what has happened. I have been trying to encourage the
>>objective of documenting the scenarios that can come up in and their
>>relative importance and possibly mitigating factors or strategies. I
>>haven't seen people violently disagreeing but I wouldn't call this a
>>consensus point of view. I consider the ebXML spec as weak in discussing
>>the two-army problem.
>>The two army problem assumes you are using a non-reliable medium for all
>>your communication and proves that it is impossible for the sender to
>>reach confidence that the message has arrived and is processed in 100% of
>>cases.
>>You can increase your level of confidence by using message + ack and
>>being able to resend a message and receive a duplicate ack. That gets
>>you close to 100% but not quite there, but it means that in most cases
>>the efficient solution (using asynchronous messaging) would work, and so
>>presents a viable option.
>>In my opinion it is sufficient for a low level protocol to give you that
>>level of reliability. And that capability is generic enough that we would
>>want to address it at the protocol level in a consistent manner, so we
>>reduce at least one level of complexity for the service developer. It is
>>also supported by a variety of transport protocols and mediums.
>>This still doesn't mean you can get two distributed services to properly
>>communicate with each other in all cases. A problem arises if either the
>>message was not received (and is not processed), a message was received
>>but no ack received (and is processed) or a message was received and an
>>ack was received but the message is still not processed.
>>That problem is not unique to asynchronous messaging, in fact it also
>>presents itself when synchronous messaging is used. With synchronous
>>messaging you have 100% confidence that a message was received, but no
>>confidence that it will be processed. Furthermore, you may fail before
>>you are able to persist that information, in which case your confidence
>>is lost.
>>If you do not depend on the result of the message being processed then
>>you would simply regard each message that is sent as being potentially
>>processed. You use the ack/resend mechanism as a way to increase the
>>probability that the message indeed reaches its destination, so a
>>majority of your messages will be received and processed.
>>I argue that using ack/resend you could reach the same level of
>>confidence that the message will be processed as if you were using a
>>synchronous protocol, but could do so more efficiently.
>>If you do depend on the message being processed, then you are in a
>>different class of problem, and simply having a reliable protocol is not
>>sufficient since it does not address the possibility that the message was
>>received, acked but not processed. It in fact presents the same problem
>>that would arise when synchronous protocols are used.
>>This is best solved at a higher layer. There are two possible solutions,
>>both of which are based on the need to reach a consensus between two
>>systems. One solution is based on a two-phase commit protocol, which
>>could be extended to use asynchronous patterns. A more efficient solution
>>in terms of message passing would be to use state transitions that
>>coordinate through the exchange of well defined messages. This could be
>>modeled using a choreography language.
>>Since this is outside the scope of this discussion I will not go into
>>details, but if anyone is interested I would recommend looking at
>>protocols for handling failures in distributed systems (in particular
>>Paxos). In my understanding these protocols are applicable for modeling
>>in a choreography language and are more efficient than using
>>transactional protocols and two-phase commit.
>>My only point here was to highlight that a solution involving ack/resend
>>is sufficient to give you the same level of confidence that a message
>>would be processed as if you were using a synchronous operation, and that
>>solutions for achieving 100% confidence are required whether you are
>>using asynchronous or synchronous messaging.
>>This is in support of Roger's recommendation for adding ack support to XMLP.
>> regards,
>> arkin
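
For readers following the thread, the ack/resend idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything proposed in the messages themselves: the names `Receiver`, `ScriptedChannel`, and `send_with_retry` are hypothetical, the unreliable medium is driven by a fixed script instead of real network loss, and the receiver is assumed to suppress duplicates by message id.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver: acks every copy, processes each message id once."""
    processed: dict = field(default_factory=dict)
    deliveries: int = 0

    def receive(self, msg_id, payload):
        self.deliveries += 1
        if msg_id not in self.processed:   # suppress duplicates from resends
            self.processed[msg_id] = payload
        return ("ack", msg_id)             # duplicates are acked again


class ScriptedChannel:
    """Unreliable medium driven by a fixed script so runs are repeatable.

    Each script entry decides the fate of one send attempt:
    "ok", "lose_message", or "lose_ack". An empty script means "ok".
    """

    def __init__(self, receiver, script):
        self.receiver = receiver
        self.script = list(script)

    def send(self, msg_id, payload):
        fate = self.script.pop(0) if self.script else "ok"
        if fate == "lose_message":
            return None                    # the receiver never sees it
        ack = self.receiver.receive(msg_id, payload)
        if fate == "lose_ack":
            return None                    # received, but the sender cannot tell
        return ack


def send_with_retry(channel, msg_id, payload, max_attempts=5):
    """Resend until an ack arrives; return attempts used, or None on giving up."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(msg_id, payload) == ("ack", msg_id):
            return attempt
    return None


# Case 1: one lost message, one lost ack, then success on the third attempt.
receiver = Receiver()
channel = ScriptedChannel(receiver, ["lose_message", "lose_ack", "ok"])
attempts = send_with_retry(channel, "m1", "hello")

# Case 2: every ack is lost, so the sender gives up although the
# receiver did get the message. This is the residual uncertainty
# that the two army problem describes.
receiver2 = Receiver()
channel2 = ScriptedChannel(receiver2, ["lose_ack", "lose_ack"])
gave_up = send_with_retry(channel2, "m2", "world", max_attempts=2)
```

In the first case the receiver sees two copies of the message but records it once, which is why resending is safe only when the receiver can recognise duplicates. In the second case the sender ends with no ack even though the message arrived, so resending raises confidence without ever reaching 100%.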
Received on Saturday, 14 December 2002 20:09:50 UTC