RE: Different Levels of Reliable Messaging

Thanks David, see my follow-up questions (embedded).


>The "ack" doesn't need to be per-message.  I can send an ack for a
>bunch of messages (of course, sequence numbers are used).
><DB>Agreed, but now you are adding in an extra level of complexity (sequence
>number) that often won't be needed. What I would suggest is that you split
>this into another two levels:
>1. Sequencing Support. This is a protocol, built on top of reliable
>messaging that ensures that messages arrive in the sequence they were sent.
>2. Reduced Frequency Acknowledgement Messages. You could then vary the
>reliable messaging protocol so that a request for an acknowledgement is
>every so many messages and if it is not received, then corrective action is
>taken.
></DB>
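
A minimal sketch of the reduced-frequency acknowledgement idea above, assuming sequence numbers start at 1 and the receiver acknowledges the highest contiguous sequence number once every few messages. The class name, the `ack_every` parameter and the `unacked` helper are hypothetical and not taken from any specification.

```python
class CumulativeAckReceiver:
    """Receiver that sends one cumulative ack per `ack_every` messages."""

    def __init__(self, ack_every=3):
        self.ack_every = ack_every
        self.received = set()
        self.highest_contiguous = 0  # assumption: sequence numbers start at 1
        self.since_last_ack = 0

    def on_message(self, seq):
        """Record message `seq`; return a cumulative ack number or None."""
        self.received.add(seq)
        # Advance past every sequence number received without a gap.
        while self.highest_contiguous + 1 in self.received:
            self.highest_contiguous += 1
        self.since_last_ack += 1
        if self.since_last_ack >= self.ack_every:
            self.since_last_ack = 0
            return self.highest_contiguous
        return None


def unacked(sent, ack):
    """Sequence numbers the sender still has to retransmit after `ack`."""
    return [s for s in sent if s > ack]
```

If message 3 is lost, the ack that follows messages 1, 2 and 4 carries the value 2, so the sender knows that 3 and 4 are still unconfirmed and can take corrective action.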

<Ricky>
I was presuming sequence ordering to be part of reliable
messaging.  It seems that you consider it a separate layer.
</Ricky>


>The "time expiry" is unreliable because clocks may be out of sync.
><DB>Absolutely right.
>
>The "cheap", but as you say inaccurate way to do this is to set and compare
>"expires at" using a local system clock. The fact that it is an
>approximation to the true time is often not a big issue especially if you
>are doing end-to-end acks where the time between sending a message and when
>it expires is long compared to the clock accuracy (e.g. a day). Even so, it
>is probably good practice that Reliable Messaging solutions take this
>uncertainty in the accuracy of the time into account and extend the "expires
>at" to some time beyond the nominal expiry time.
>
>If time accuracy is *fairly* critical, then the sender and receiver of a
>message SHOULD agree to keep their clocks accurate using, for example,
>protocols such as the Network Time Protocol. If accuracy is *really*
>critical then you can include in the message the accuracy to which the
>system at the destination MUST keep its clocks. If the system does not keep
>its clocks accurate or cannot keep them accurate enough, then the
>destination should reject the message and not process it.</DB>
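
A minimal sketch of the two checks described above, assuming times are compared as timezone-aware UTC values. The function names and the `clock_uncertainty`, `required_accuracy` and `local_accuracy` parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_expired(expires_at, now, clock_uncertainty):
    """Treat a message as expired only once the local clock has passed
    the nominal expiry time plus the assumed clock uncertainty."""
    return now > expires_at + clock_uncertainty


def accept_message(required_accuracy, local_accuracy):
    """Accept only if the local clock is at least as accurate as the
    accuracy the sender demands in the message."""
    return local_accuracy <= required_accuracy
```

With a nominal expiry at noon and an assumed uncertainty of five minutes, a message arriving at 12:03 by the local clock is still processed, while one arriving at 12:06 is not.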

<Ricky>
Maybe I misunderstood the purpose of the expiration time.  I guess the
purpose of your time expiry mechanism is to reduce the "in-doubt"
condition.  Say A sends B a message which is valid for T minutes, and A
doesn't receive an ACK from B.  A keeps resending but still hasn't got the
ACK after (T+10) minutes.  Can A at this point simply give up and conclude
that the message is undelivered?  All I am trying to say is that A cannot
draw that conclusion, because the message may have been delivered and only
the ACKs lost.  Sorry, I agree this is unrelated to the clock sync problem.
So I see the expiration time as purely an application-level semantic (e.g.
you send a bid response which is valid for one day).  I don't see what
role the expiration time plays at the RM level.  I must be missing
something here.
</Ricky>


>I don't think there should be a step 4 in LEVEL 3.  Step 3 should say "Have
>you received the message?  If not, forget the message afterwards"
><DB>I don't think you can always say this. For example if you want to place
>an order and there is only one supplier, then even if your message failed,
>you might want to resend it if the connection became available later. In
>this case, the content/payload/body of the message might be identical but in
>other ways it was a completely new message.</DB>

<Ricky>
What I'm trying to prevent is the situation where the request message
arrives at the receiver after the query (so the receiver responds "I
haven't got it") but before the "forget" message gets there.  In this
case, the message has been delivered, but the sender thinks it hasn't.

Going back to your example, you should send a query to the supplier: "Have
you received my purchase order with message id=12345?  If you haven't,
ignore that message if it arrives later."
If you get back the answer "NO", resend the same purchase order with a new
message id=98765.

However, if you send a separate "forget" message after you receive a
"NO", then it is possible that the receiver gets 2 purchase orders (one
with message id=12345 and the other with id=98765).
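
A minimal single-threaded sketch of the combined "query and forget" step described above, assuming the receiver keeps a record of delivered and cancelled message ids. The `Receiver` class and its method names are hypothetical; a real receiver would also have to make the check and the cancellation atomic with respect to concurrent deliveries.

```python
class Receiver:
    """Receiver that answers a status query and cancels in one step."""

    def __init__(self):
        self.delivered = set()
        self.cancelled = set()

    def deliver(self, msg_id):
        """Accept a message unless its id was cancelled earlier."""
        if msg_id in self.cancelled:
            return False
        self.delivered.add(msg_id)
        return True

    def query_and_cancel(self, msg_id):
        """Answer "have you received msg_id?"; on "no", mark the id as
        cancelled within the same call, so a late copy is ignored."""
        if msg_id in self.delivered:
            return True
        self.cancelled.add(msg_id)
        return False
```

Because the "NO" answer and the cancellation happen together, a late copy of message 12345 is dropped and only the resend with id 98765 is processed.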
</Ricky>


>I think LEVEL 5 should be done at the transaction layer, below
>choreography, but above reliable messaging.  Using some 2-phase-interaction
>style like BTP.
><DB>Quite possibly. The problem with two phase commit is the action you take
>when you get a failure (i.e. a rollback) may not always be the right one and
>often it can be impossible to do. For example, if you want to roll back a
>payment, but the payment has already gone to the bank, then it's too late. You
>have to do a reversal, or refund instead. Both of these would leave a trace
>in the records of what happened.</DB>

<Ricky>
Of course, you can always handle exceptions at the application level, which
can recover from a partial failure in a very application-specific
manner.  However, this can complicate the application flow because it
mixes the normal flow with exception handling logic for different failure
scenarios.

The beauty of transaction processing is that the application can
encapsulate multiple activities within a transaction block and safely
assume that everything will automatically be undone if any of them
fails.  In other words, the application doesn't need to worry about every
failure situation.

Let's look at a simple case where A sends a "money transfer request" to
B, which sends a "money deposit request" to C as well as a "money
withdrawal request" to D.  Let me illustrate the flow based on a 2-phase
handshake.

1) A sends "transfer" to B, and waits for "Prepared-ACK-transfer" from B
2) B sends "deposit" to C, and waits for "Prepared-ACK-deposit" from C
3) B sends "withdrawal" to D, and waits for "Prepared-ACK-withdrawal" from D
4) After B gets back all the "Prepared-ACK"s from C and D, it sends back the
"Prepared-ACK-transfer" to A

5) A sends "commit" to B, and waits for "Committed-ACK-transfer" from B
6) B sends "commit" to C, and waits for "Committed-ACK-deposit" from C
7) B sends "commit" to D, and waits for "Committed-ACK-withdrawal" from D
8) After B gets back all the "Committed-ACK"s from C and D, it sends back the
"Committed-ACK-transfer" to A

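The eight steps above can be sketched as an in-process simulation, assuming synchronous calls stand in for messages and a return value stands in for each ACK. The `Participant` class, the `can_prepare` flag and `run_transfer` are hypothetical names; this is not BTP itself, and it ignores timeouts and lost messages.

```python
class Participant:
    """A node that prepares and commits itself and its children."""

    def __init__(self, name, children=(), can_prepare=True):
        self.name = name
        self.children = list(children)
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self, log):
        """Phase 1: prepared only if this node and all children are."""
        log.append(f"prepare {self.name}")
        ok = self.can_prepare and all(c.prepare(log) for c in self.children)
        self.state = "prepared" if ok else "aborted"
        return ok

    def commit(self, log):
        """Phase 2: commit this node, then each child in order."""
        log.append(f"commit {self.name}")
        for c in self.children:
            c.commit(log)
        self.state = "committed"

    def rollback(self, log):
        """Undo this node and every child after a failed prepare."""
        log.append(f"rollback {self.name}")
        for c in self.children:
            c.rollback(log)
        self.state = "rolled_back"


def run_transfer(root):
    """Play the role of A: prepare via B, then commit or roll back."""
    log = []
    if root.prepare(log):
        root.commit(log)
        return True, log
    root.rollback(log)
    return False, log
```

Here B would be built as `Participant("B", [Participant("C"), Participant("D")])` and passed to `run_transfer`. If D cannot prepare, no commit is ever sent and every participant is rolled back.
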
</Ricky>

By the way, you have raised some very good points, David!

Best regards,
Ricky

Received on Saturday, 14 December 2002 02:55:28 UTC