Different Levels of Reliable Messaging from Burdett, David on 2002-12-12 (www-ws-arch@w3.org from December 2002)

From: Burdett, David <david.burdett@commerceone.com>
Date: Thu, 12 Dec 2002 14:48:31 -0800
To: www-ws-arch@w3.org
Message-ID: <C1E0143CD365A445A4417083BF6F42CC053D1529@C1plenaexm07.commerceone.com>
I've been following the Reliable Messaging thread with interest and offer,
for the purposes of discussion, the following six levels of Reliable
Messaging starting from a simple "Acknowledgement Only" and ending with
"Reliable Processing" where each level offers gradually increasing "degrees
of reliability" ...

LEVEL 0 - Acknowledgment only
-----------------------------
Upon request, an acknowledgment message is returned as a response to the
sending of a message. The minimum semantic of the acknowledgement message is
that the original message has been received and persisted, so that, barring
catastrophes, it should not be lost and will be processed.
The acknowledgement message can *optionally* return the following additional
status information:
a) The message structure is valid (in a SOAP context this could be split
into validation of the envelope, header, body and/or any attachments)
b) All the checks in a) plus checking that the content of the message is
valid, e.g. data, codes and identifiers in the message have been checked for
validity against their datatypes and/or reference information - e.g.
databases
c) Either a) or b) above and the fact that the message has been passed on
for processing - e.g. to the application
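
As an illustration only, the acknowledgement and its optional status could
be modelled as below. This is a minimal sketch, not part of the proposal:
the names AckStatus, Acknowledgement and build_ack are hypothetical, and
treating b) as implying a) follows the wording of option b) above.

```python
from dataclasses import dataclass
from enum import IntEnum


class AckStatus(IntEnum):
    """Hypothetical status codes for the optional information a) and b)."""
    PERSISTED = 0         # minimum semantic: received and persisted
    STRUCTURE_VALID = 1   # option a): message structure is valid
    CONTENT_VALID = 2     # option b): all checks in a) plus content checks


@dataclass(frozen=True)
class Acknowledgement:
    message_id: str
    status: AckStatus = AckStatus.PERSISTED
    passed_to_application: bool = False   # option c)


def build_ack(message_id, structure_checked=False, content_checked=False,
              passed_on=False):
    """Build an acknowledgement carrying the strongest status that applies."""
    status = AckStatus.PERSISTED
    if structure_checked:
        status = AckStatus.STRUCTURE_VALID
    if content_checked:
        status = AckStatus.CONTENT_VALID   # b) includes the checks in a)
    if passed_on and status is AckStatus.PERSISTED:
        # Option c) is defined as "either a) or b)" plus the hand-off.
        raise ValueError("option c) requires the checks in a) or b)")
    return Acknowledgement(message_id, status, passed_on)
```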

LEVEL 1 - Simple Reliable Messaging
-----------------------------------
This is based on Level 0 (Acknowledgment Only) with the following
extensions:
1. Each "original" message that is sent contains an "expires at" time which
indicates to the destination that, if they receive the message after this
point in time, they MUST NOT process it.
2. If the acknowledgement message is not received by the sender after some
period of time, then the original message is resent.
3. Step 2 is repeated as required until an acknowledgement has been received
or the "expires at" time has passed. If no acknowledgment was received, the
sender gives up and *presumes* that the message was not delivered.
4. The receiver of the message looks for duplicate messages and, if one is
found, does not "process" it but, instead, resends the acknowledgement
message
5. If the destination receives a message they have not seen before where the
"expires at" time has passed, then they reject the message with an error.

Note that this does not solve the "two army" problem that has been discussed
earlier in this thread - but see Level 3 (Reliable Messaging with Recovery)
below.
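
The five Level 1 rules above can be sketched roughly as follows. This is an
illustration under stated assumptions, not a definitive implementation: the
names Receiver and send_reliably, the string replies, and the injected
clock, sleep and send callables are all hypothetical, and a real transport
would signal a missing acknowledgement by a timeout rather than by
returning None.

```python
import time


class Receiver:
    """Destination side: duplicate detection (rule 4), expiry check (rule 5)."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.seen = set()     # ids of messages already accepted
        self.processed = []   # stand-in for handing off to the application

    def receive(self, message_id, expires_at, payload):
        if message_id in self.seen:
            return "ack"               # duplicate: resend the ack, do not process
        if self.clock() > expires_at:
            return "error: expired"    # not seen before and already expired
        self.seen.add(message_id)
        self.processed.append(payload)
        return "ack"


def send_reliably(send, message_id, payload, expires_at,
                  retry_interval=1.0, clock=time.time, sleep=time.sleep):
    """Sender side: resend until acknowledged or expired (rules 1 to 3).

    `send` returns the reply, or None when no acknowledgement arrived.
    Returns True if acknowledged, False if the sender gave up.
    """
    while clock() <= expires_at:
        reply = send(message_id, expires_at, payload)
        if reply == "ack":
            return True
        if reply is not None:
            return False    # explicit rejection; retrying would not help
        sleep(retry_interval)
    return False            # gave up: *presume* the message was not delivered
```

Note that a False result only means the sender does not know the outcome;
the message may still have been received and processed.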

LEVEL 2 - Connection based Reliable Messaging
---------------------------------------------
This is based on either Level 0 (Acknowledgement Only) or 1 (Simple Reliable
Messaging) and involves the sending of an inquiry to the destination of the
message **before sending the actual message** to determine the availability
of the service that is accepting messages at the destination, i.e. whether
it is running and able to accept messages.

The idea is that if you do a successful inquiry and immediately follow it by
sending the actual message then effectively you have "set up a connection"
and so you are very likely to realize success. This could also be very
useful if you are sending a "large" message.
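
As a rough sketch of the idea (the function name and the two callables are
hypothetical stand-ins for whatever inquiry and send operations a real
binding would define):

```python
def send_with_availability_check(is_available, send, message):
    """Level 2 sketch: inquire first, send only if the service answers.

    Returns the reply from `send`, or None if the service reported that it
    was unavailable, in which case the message was never sent.
    """
    if not is_available():
        return None          # avoid transferring a possibly large message
    # Send immediately, while the service is very likely still available.
    return send(message)
```

A successful inquiry makes success likely, not certain, so the reply from
send still has to be checked as in Level 0 or Level 1.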

LEVEL 3 - Reliable Messaging with Recovery
------------------------------------------
This is based on Level 1 (Simple Reliable Messaging), reuses the service
availability inquiry from Level 2 (Connection Based Reliable Messaging) and
adds an inquiry on Message Status. It works as follows.
1. Firstly the sender of the original message must have "given up" (see
Level 1), then, some time later,
2. The sender optionally uses the service availability inquiry from Level 2
to inquire on the current status of the service that was the destination of
the original message
3. The sender then determines the status of the original message that was
sent by doing a Message Status Inquiry targeted at the destination. In
return the destination sends another message that indicates:
  a) There was no record of the original message, or
  b) The original message was received and so resends the acknowledgement,
together with a status that indicates that processing is either: not
started, in progress, complete or not known
4. Depending on the response, the sender can take one of the following
actions:
  a) Resend the original message (or perhaps a new version of it as the
original might have expired),
  b) Cancel the original message - i.e. do not process it, or
  c) Wait for the response to the message to arrive (see also Level 5 below)

Note that a status on the inquiry response of "not known" is valid since the
solution at the destination that is providing reliable messaging support may
have no way of determining the status of the processing of the message as
the processing is being carried out by another piece of software that cannot
provide that information.
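
One possible way for the sender to act on the Message Status Inquiry
response is sketched below. The proposal only lists the actions that are
available; the mapping from each response to an action, and the names
ProcessingStatus, Action, choose_action and cancel_if_not_started, are
assumptions made for illustration.

```python
from enum import Enum


class ProcessingStatus(Enum):
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    COMPLETE = "complete"
    NOT_KNOWN = "not known"


class Action(Enum):
    RESEND = "resend the original message"
    CANCEL = "cancel the original message"
    WAIT = "wait for the response"


def choose_action(status, cancel_if_not_started=False):
    """Pick a recovery action from the inquiry response.

    `status` is None when the destination has no record of the message,
    otherwise a ProcessingStatus. The policy here is an assumed example.
    """
    if status is None:
        return Action.RESEND     # no record of the original message
    if status is ProcessingStatus.NOT_STARTED and cancel_if_not_started:
        return Action.CANCEL     # cancelling is only safe before processing
    return Action.WAIT           # in progress, complete or not known
```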

LEVEL 4 - Connection based Reliable Messaging with Recovery
-----------------------------------------------------------
This is a simple combination of levels 2 and 3 where a query on the
availability of the service is done first as in Level 2, but, if the
acknowledgement was not received and the sender "gave up", then a Level 3
Recovery is attempted as well.

LEVEL 5 - Reliable Processing
-----------------------------
Personally I don't think that this level should be part of Reliable
Messaging and should be part of Choreography instead. I am including it in
this email for completeness and so that we can determine that it is out of
scope. Anyway, here's the description ...

All the previous "reliable messaging" approaches are concerned with the
delivery of a single message. However, often a message is sent as part of a
larger (and longer) sequence of exchanges (i.e. a choreography). An example
use case could be where a buyer sends an Order to a Seller. Later, the
Seller should return an Order Response which indicates the extent to which
the seller can (or cannot) satisfy the order.

Now the Order could be sent reliably using one of the levels described
above. Similarly the Order Response could be sent reliably. But suppose the
Order Response does not come when expected - none of the earlier "reliable
messaging levels" help. This will most likely be due to some processing
error at the Seller where the original message was lost.

To handle this you could take a further, final step to support "Reliable
Processing" which includes the following additional steps:
1. The sender of the original message determines when the response message
(e.g. the Order Response and NOT the RM Acknowledgement) should be received.
2. If the response message is not received by the anticipated time:
  a) The sender inquires on the status of the *processing* of the original
message (i.e. not just its delivery). The response to the inquiry will
indicate either:
    - the message was never received (even though it might have been sent
reliably!)
    - the message was received and its processing is either: not started, in
progress, or complete
  b) Depending on the response the sender can either:
    - cancel the original message (e.g. if processing had not been started)
    - wait for processing to complete, or
    - request the response to the message to be resent.
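
The steps above might look roughly like this from the sender's side. It is
a sketch under assumptions: the names Processing, Step and next_step are
hypothetical, the mapping from status to action is one plausible policy
rather than something the steps prescribe, and no action is listed above
for the "never received" case, so the sketch only flags it.

```python
from enum import Enum


class Processing(Enum):
    NEVER_RECEIVED = "never received"
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    COMPLETE = "complete"


class Step(Enum):
    NOTHING_YET = "response not overdue"
    CANCEL = "cancel the original message"
    WAIT = "wait for processing to complete"
    REQUEST_RESEND = "request the response to be resent"
    NOT_LISTED = "no action listed for this case"


def next_step(now, response_due_at, response_received, inquire_processing):
    """Decide what the sender does about a business-level response.

    `inquire_processing` is a callable returning a Processing value; it is
    only called once the response is overdue.
    """
    if response_received or now < response_due_at:
        return Step.NOTHING_YET
    status = inquire_processing()
    if status is Processing.COMPLETE:
        return Step.REQUEST_RESEND   # response exists but did not arrive
    if status is Processing.IN_PROGRESS:
        return Step.WAIT
    if status is Processing.NOT_STARTED:
        return Step.CANCEL           # assumed policy; waiting is also valid
    return Step.NOT_LISTED           # never received: left to the sender
```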

Thoughts?

David
--
Director, Product Management, Web Services
Commerce One
4440 Rosewood Drive, Pleasanton, CA 94588, USA
Tel/VMail: +1 (925) 520 4422; Cell: +1 (925) 216 7704
mailto:david.burdett@commerceone.com; Web: http://www.commerceone.com
Received on Thursday, 12 December 2002 17:48:23 UTC