RE: Different Levels of Reliable Messaging

Response embedded in <Ricky/>


><DB2>I think the expiration time is important for RM, and here's why.
>
>All RM is based on waiting for an ack and repeatedly resending the original 
>message if you don't get one. The problem is deciding when to stop resending 
>and give up. At some point you have to stop, but when? There are two ways 
>of doing this.
>
>1. Stop after a fixed number of retries (which is what ebXML MS does), or
>2. Stop only after the message can no longer validly be processed.
>
>The problem with the first approach is that if you say stop sending after 
>3 retries (i.e. 4 sends of the message in total), then it is still quite 
>possible that, if the destination system was down and then came back up, 
>it could pick up the message and process it - this is quite normal 
>behavior. You could then get into the situation where:
>
>1. The sender sent the message but gave up resending after, say, 20 minutes 
>as no reply was received, then
>2. The sender reports to the application that delivery failed

<Ricky>
This is where we disagree.  I think the sender should report to the 
application that delivery is "in-doubt", not "failed".
</Ricky>

>3. The destination restarts, finds the message, sends the ack and starts 
>processing it.
>4. The sender receives the ack and has to tell the application that the 
>message for which it had just reported a delivery failure had actually 
>been received and was processed - not a desirable outcome
>
>Alternatively, by specifying a timeout and basing the retries on that, you 
>know, with a high degree of certainty, that even if the message is picked 
>up, it won't be processed, and therefore you are much less likely to have 
>to report a delivery failure and then have to reverse it.

<Ricky>
Let's say the message has been picked up before the time expired, and the 
receiver site goes down before the ACK message is sent.
Now, even though the sender keeps resending the message until the time 
expires, it still cannot conclude that delivery failed.
</Ricky>
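
To make the two stopping rules and the "in-doubt" outcome concrete, here is a 
minimal sketch.  It is not taken from ebXML MS or any other spec; the names 
(send_with_retries, send_once, expires_at, max_sends) are made up for 
illustration.  send_once is assumed to send the message once and return True 
only if an ack arrives within the retry interval.

```python
from enum import Enum


class Outcome(Enum):
    ACKED = "acked"
    # No ack was seen. The receiver may or may not have picked the message up,
    # so the sender cannot honestly report "failed".
    IN_DOUBT = "in-doubt"


def send_with_retries(send_once, now, expires_at, max_sends=None):
    """Resend until an ack arrives, the message expires, or max_sends is hit.

    send_once  -- hypothetical callable: sends once, returns True if acked
    now        -- callable returning the current time
    expires_at -- time after which the message may no longer be processed
    max_sends  -- optional fixed limit on sends (approach 1); None means
                  "stop only at expiry" (approach 2)
    """
    sends = 0
    while now() < expires_at and (max_sends is None or sends < max_sends):
        sends += 1
        if send_once():
            return Outcome.ACKED, sends
    return Outcome.IN_DOUBT, sends
```

With max_sends=4 this behaves like "3 retries"; with max_sends=None the 
sender stops only at expiry.  Either way, the sketch reports IN_DOUBT rather 
than failure when no ack arrives, since a lost ack looks the same to the 
sender as a lost message.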

><DB2>What you describe in this example is a Business Process. It is NOT, 
>in my opinion, reliable messaging as you make the return of one ack 
>dependent on the receipt of two other acks. The bottom line is that you 
>can only do transaction processing if you KNOW that complete rollback of 
>the state at the sender and receiver is possible. Sometimes it is, and 
>sometimes it isn't, which is why you have to determine how you do the 
>recovery at the application level.</DB2>

<Ricky>
I used to call this "Reliable 1-to-M messaging with atomicity".  If it 
succeeds, all of the destinations get the message.  If it fails, none of 
the destinations gets the message, and the failure is reported to the 
sender.

This has nothing to do with the receiving application.  The receiving 
application doesn't need to roll back (it hasn't actually processed the 
message).  The rollback concept is at the RM level.
</Ricky>
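
One way to picture rollback at the RM level is the sketch below.  It assumes 
a hold-then-release scheme, which is only one possible mechanism and is not 
described in this thread: the RM layer at each destination holds the message 
and hands it to the application only after every destination has acked.  The 
Destination class and send_atomic function are hypothetical.

```python
class Destination:
    """RM layer at one receiver (hypothetical, in-memory only)."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.held = {}       # msg_id -> payload, acked but not yet released
        self.delivered = []  # payloads handed to the receiving application

    def prepare(self, msg_id, payload):
        # Accept and hold the message; returning True stands in for the ack.
        if not self.reachable:
            return False
        self.held[msg_id] = payload
        return True

    def commit(self, msg_id):
        # Release the held message to the application.
        self.delivered.append(self.held.pop(msg_id))

    def abort(self, msg_id):
        # RM-level rollback: the application never saw the message.
        self.held.pop(msg_id, None)


def send_atomic(msg_id, payload, destinations):
    """Deliver to all destinations or to none; returns True on success."""
    acked = []
    for dest in destinations:
        if dest.prepare(msg_id, payload):
            acked.append(dest)
        else:
            for done in acked:
                done.abort(msg_id)
            return False
    for dest in destinations:
        dest.commit(msg_id)
    return True
```

The sketch deliberately ignores a destination going down between prepare and 
commit, which would bring back the same "in-doubt" question discussed above.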


>By the way, you have raised some very good points, David!
>
>Best regards,
>Ricky

Received on Saturday, 14 December 2002 20:02:39 UTC