RE: Different Levels of Reliable Messaging from Ricky Ho on 2002-12-17 (www-ws-arch@w3.org from December 2002)

From: Ricky Ho <riho@cisco.com>
Date: Mon, 16 Dec 2002 16:37:33 -0800
To: "Burdett, David" <david.burdett@commerceone.com>, www-ws-arch@w3.org
Message-Id: <4.3.2.7.2.20021216162751.02bbcfc0@franklin.cisco.com>
There are two phases.

Phase - 1) The sender sends its first message and keeps resending it until an
ACK is received, or until the timeout is reached and the sender gives up
resending.

By the end of this phase, the sender can draw only one of the following
conclusions:
1a) The message is "delivered" -- when an ACK is received
1b) The message is "in-doubt" -- when no ACK is received

Note: I don't think the sender can conclude that the message is
undelivered within this phase.
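
Phase 1 can be sketched in Python as follows. This is only an illustration
under stated assumptions: `send_until_ack`, `send_once`, `timeout_s`, and the
injectable `clock` and `sleep` parameters are hypothetical names, not part of
any RM specification. `send_once` is assumed to transmit the message once and
return True only if an ACK came back. Note that the outcome type has no
"undelivered" member, which mirrors the note above.

```python
import enum
import time
from typing import Callable


class Phase1Outcome(enum.Enum):
    DELIVERED = "delivered"  # 1a: an ACK was received
    IN_DOUBT = "in-doubt"    # 1b: no ACK arrived before the sender gave up
    # Deliberately no UNDELIVERED member: a missing ACK does not prove
    # that the receiver never got the message.


def send_until_ack(
    send_once: Callable[[], bool],
    timeout_s: float,
    retry_interval_s: float = 0.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Phase1Outcome:
    """Resend until an ACK is received or the timeout is reached."""
    deadline = clock() + timeout_s
    while True:
        if send_once():  # hypothetical: True means an ACK was received
            return Phase1Outcome.DELIVERED
        if clock() >= deadline:
            return Phase1Outcome.IN_DOUBT  # gave up; status unknown
        sleep(retry_interval_s)
```

A caller would pass its own transport function as `send_once`; the `clock`
and `sleep` parameters exist only so the loop can be exercised without real
waiting.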

Phase - 2) The sender has already given up resending.  Now the sender wants to
find out the delivery status.  The sender keeps sending a query until the
receiver answers "YES" or "NO".

By the end of this phase, the sender can draw only one of the following
conclusions:
2a) The message is "delivered" -- when a "YES" is received
2b) The message is "undelivered" -- when a "NO" is received
2c) The message is "in-doubt" -- when no response to the query is received
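
Phase 2 can be sketched the same way in Python. Again this is a sketch under
assumptions rather than a defined protocol: `query_delivery_status`,
`query_once`, and `max_queries` are hypothetical names. `query_once` is
assumed to return True for "YES", False for "NO", and None when no response
arrives. The `max_queries` bound is an assumption added so that case 2c can
actually be reached.

```python
import enum
from typing import Callable, Optional


class Phase2Outcome(enum.Enum):
    DELIVERED = "delivered"      # 2a: the receiver answered "YES"
    UNDELIVERED = "undelivered"  # 2b: the receiver answered "NO"
    IN_DOUBT = "in-doubt"        # 2c: no response to any query


def query_delivery_status(
    query_once: Callable[[], Optional[bool]],
    max_queries: int,
) -> Phase2Outcome:
    """Ask the receiver about delivery status until it answers."""
    for _ in range(max_queries):
        answer = query_once()  # hypothetical: True, False, or None
        if answer is True:
            return Phase2Outcome.DELIVERED
        if answer is False:
            return Phase2Outcome.UNDELIVERED
        # None: no response, so ask again
    return Phase2Outcome.IN_DOUBT
```

Only an explicit "NO" produces the "undelivered" result here; silence from
the receiver leaves the sender in doubt, just as at the end of Phase 1.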

I still fail to see how the time expiry plays a role here in making it easier
to draw the conclusion.

Thanks and regards,
Ricky

At 10:53 AM 12/16/2002 -0800, Burdett, David wrote:
>Ricky
>
><Ricky>
>This is what our disagreement is.  I think the sender should report to the 
>application that delivery is "in-doubt".  (not "failed")
></Ricky>
>
>Not always. You actually have two use cases:
>1. The message **cannot** be sent, e.g. because the outbound network 
>connection is down. In this case the message delivery failed because it 
>could not even be sent.
>2. The message was sent but no acknowledgement message was received - this 
>is, as you say, "in-doubt", although the probability of success, in this 
>case, should be low.
>
><Ricky>
>I used to call this "Reliable 1-to-M messaging with atomicity". ...
>...
></Ricky>
>
>I agree, but I think this is something that should be layered on top.
>
>David
>-----Original Message-----
>From: Ricky Ho [mailto:riho@cisco.com]
>Sent: Saturday, December 14, 2002 4:36 PM
>To: Burdett, David; www-ws-arch@w3.org
>Subject: RE: Different Levels of Reliable Messaging
>
>Response embedded in <Ricky/>
>
>
>
>><DB2>I think the expiration time is important for RM, and here's why.
>>
>>All RM is based on waiting for an ack and repeatedly resending the original 
>>message if you don't get one. The problem is when you stop resending 
>>and give up. At some point you have to stop, but when? There are two ways 
>>of doing this.
>>1. Stop after a fixed number of retries (which is what ebXML MS does), or
>>2. Stop only after the message can no longer validly be processed.
>>
>>The problem with the first approach is that if you say stop sending after 
>>3 retries (i.e. 4 sends of the message in total), then it is still quite 
>>possible that, if the destination system was down, and then came back up 
>>it could pick up the message and process it - this is quite normal 
>>behavior. You could then get into the situation where:
>>1. The sender sent the message but gave up resending after, say, 20 minutes 
>>as no reply was received, then
>>2. The sender reports to the application that delivery failed
><Ricky>
>This is what our disagreement is.  I think the sender should report to the 
>application that delivery is "in-doubt".  (not "failed")
></Ricky>
>
>>3. The destination restarts, finds the message, sends the ack and starts 
>>processing it.
>>4. The sender receives the ack and has to tell the application that the 
>>message for which it had just reported a delivery failure had actually 
>>been received and was processed - not a desirable outcome.
>>Alternatively, by specifying a timeout and basing the retries on that, you 
>>know, with a high degree of certainty, that even if the message is picked 
>>up, it won't be processed, and therefore you are much less likely to have 
>>to report a delivery failure and then have to reverse it.
><Ricky>
>Let's say the message has been picked up before the time expired, and the 
>receiver site goes down before the ACK message is sent.
>Now even though the sender keeps resending the message until the time 
>expires, he still cannot conclude that delivery failed.
></Ricky>
>
>><DB2>What you describe in this example is a Business Process. It is NOT, 
>>in my opinion, reliable messaging as you make the return of one ack 
>>dependent on the receipt of two other acks. The bottom line is that you 
>>can only do transaction processing if you KNOW that complete rollback of 
>>the state at the sender and receiver is possible. Sometimes it is, and 
>>sometimes it isn't, which is why you have to determine how you do the 
>>recovery at the application level.</DB2>
><Ricky>
>I used to call this "Reliable 1-to-M messaging with atomicity".  If 
>successful, all of the destinations will get the message.  If failed, none 
>of the destinations will get the message, and the failure will be reported 
>to the sender.
>
>This has nothing to do with the receiving application.  The receiving 
>application doesn't need to roll back (it actually hasn't processed the 
>message).  The rollback concept is at the RM level.
></Ricky>
>
>
>
>>By the way, you have raised some very good points, David!
>>
>>Best regards,
>>Ricky
Received on Monday, 16 December 2002 19:38:01 UTC