Goal AC017 of the requirements states that the architecture must satisfy the requirements of enterprises wishing to transition from traditional EDI and, more specifically, that the architecture must support reliable messaging. Subsequent discussions have revealed that reliability involves a hierarchy of concepts, and that considerable confusion can result from mixing considerations appropriate to different levels.

The highest level at which one can view reliability is the system level, that is, the reliability of the Web service as a whole. This general view of reliability focuses on the predictability of the quality of service. In other words, does the system predictably do what it is supposed to do? This consideration is generally separate from concerns of fault tolerance, availability or security, but there may of course be overlapping issues.

There are a number of simple things that can be observed about system reliability. First, reliability may have various aspects and considerations that depend on context. For example, in some contexts a car may be considered reliable on the basis of metrics such as the requirement that turning the key will, with high probability of success, start the engine in a few seconds. But if the context is an icy mountain road the metrics for reliability may be of an entirely different nature. In all cases, however, understanding reliability depends crucially on defining metrics for the quality of service.

Reliability may not be related simply to the end result of the operation but also to factors like visibility. For example, two applications may have the same probability of producing a particular successful result, but if one of them displays the inner workings of the process in such a way that makes it possible to anticipate and avoid developing problems or to analyze and correct errors after the fact, that application may reasonably be considered more reliable than one which acts as a black box.

In practice, many of the aspects of system reliability are handled under the umbrella of the management function. However, there is a subset of system reliability that relates to messaging that is commonly considered separately and referred to as reliable messaging. In general, this refers to a predictable quality of service related to the delivery of the messages involved with the Web service. In more detail, the sender of the message would like to be able to determine whether a given message has been received by its intended receiver so that it is possible to take compensating action in the event the message has not been received. On the other hand, the intended receiver of the message may wish to be assured that it has received and processed a message once and only once. The general goal of reliable messaging is to define mechanisms that make it possible to achieve these objectives with a high probability of success in the face of inevitable but unpredictable network, system and software failures.

The goals of reliable messaging can be made more explicit by considering the issues related to multiple receptions of a message and message intermediaries. If there is an intermediary, does the sender want to know whether the message got to the intermediary or to the intended end recipient? Does the receiver care whether it receives a message more than once? The following classification of reliable messaging expectations is adapted from the ebXML Messaging Services Specification, Version 2.0 (http://www.ebxml.org/specs/ebMS2.pdf).

#  Duplicate    Ack Requested      Ack Requested       Comment
   Elimination  From End Receiver  From Next Receiver
1  Y            Y                  Y                   Once-And-Only-Once end-to-end, At-Least-Once to intermediate
2  Y            Y                  N                   Once-And-Only-Once end-to-end based on end-to-end retransmission
3  Y            N                  Y                   At-Least-Once at the intermediate level, Once-And-Only-Once end-to-end if all the intermediaries are reliable, no end-to-end notification
4  Y            N                  N                   At-Most-Once end-to-end, no retries at the intermediate level
5  N            Y                  Y                   At-Least-Once with duplicates possible both end-to-end and at intermediate level
6  N            Y                  N                   At-Least-Once with duplicates possible both end-to-end and at intermediate level
7  N            N                  Y                   At-Least-Once to the intermediaries and the end, no end-to-end notification
8  N            N                  N                   Best Effort
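
As an illustration only, the classification above can be read as a lookup from three yes/no expectations to a delivery semantic. The following Python sketch transcribes the eight rows directly; the names Expectation, DELIVERY_SEMANTICS and describe are hypothetical and do not come from the ebXML specification.

```python
from typing import NamedTuple


class Expectation(NamedTuple):
    """The three yes/no choices from the classification table (hypothetical names)."""
    duplicate_elimination: bool
    ack_from_end_receiver: bool
    ack_from_next_receiver: bool


# One entry per table row, keyed by the three flags in column order.
DELIVERY_SEMANTICS = {
    Expectation(True, True, True): "Once-And-Only-Once end-to-end, At-Least-Once to intermediate",
    Expectation(True, True, False): "Once-And-Only-Once end-to-end based on end-to-end retransmission",
    Expectation(True, False, True): "At-Least-Once at the intermediate level, Once-And-Only-Once "
                                    "end-to-end if all the intermediaries are reliable, "
                                    "no end-to-end notification",
    Expectation(True, False, False): "At-Most-Once end-to-end, no retries at the intermediate level",
    Expectation(False, True, True): "At-Least-Once with duplicates possible both end-to-end "
                                    "and at intermediate level",
    Expectation(False, True, False): "At-Least-Once with duplicates possible both end-to-end "
                                     "and at intermediate level",
    Expectation(False, False, True): "At-Least-Once to the intermediaries and the end, "
                                     "no end-to-end notification",
    Expectation(False, False, False): "Best Effort",
}


def describe(duplicate_elimination: bool, end_ack: bool, next_ack: bool) -> str:
    """Return the table's comment for a given combination of expectations."""
    return DELIVERY_SEMANTICS[Expectation(duplicate_elimination, end_ack, next_ack)]
```

For example, describe(True, False, False) returns the comment for row 4, where duplicates are eliminated but no acknowledgement is requested and therefore nothing is retried.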

The goals of reliable messaging may also be examined with respect to whether one wishes to confirm only the receipt of a message, or perhaps also to confirm the validity of that message. Three questions may be asked about message validity:

1 - Was the message received the same as the one sent? This may be determined by such techniques as byte counts, checksums or digital signatures.

2 - Does the message conform to the formats specified by the agreed-upon protocol for the message? This is typically determined automatically using syntax constraints (e.g. XML well-formedness) and structural constraints (validation against one or more XML schemas or WSDL message definitions).

3 - Does the message conform to the business rules expected by the receiver? For this purpose, additional constraints and validity checks related to the business process are typically applied by application logic and/or human process managers.
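
The three questions can be pictured as successive checks applied by a receiver. The Python sketch below is a minimal illustration under several assumptions: a SHA-256 digest stands in for the checksum of question 1, XML well-formedness alone stands in for question 2 (the standard library does not validate against XML schemas or WSDL), and the rule that an order quantity must be positive is an invented business rule for question 3. All function names and the message layout are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def check_integrity(payload: bytes, expected_sha256: str) -> bool:
    """Question 1: is the received message the same as the one sent?"""
    return hashlib.sha256(payload).hexdigest() == expected_sha256


def parse_if_well_formed(payload: bytes):
    """Question 2 (syntax part only): return the parsed root, or None if not well-formed."""
    try:
        return ET.fromstring(payload)
    except ET.ParseError:
        return None


def check_business_rules(order) -> bool:
    """Question 3: an invented rule, the order quantity must be a positive integer."""
    quantity = order.findtext("quantity")
    return quantity is not None and quantity.isdigit() and int(quantity) > 0


def validate(payload: bytes, expected_sha256: str) -> str:
    """Apply the three checks in order and report the first one that fails."""
    if not check_integrity(payload, expected_sha256):
        return "corrupted"
    root = parse_if_well_formed(payload)
    if root is None:
        return "malformed"
    if not check_business_rules(root):
        return "rejected"
    return "accepted"
```

In this sketch the first two checks need no knowledge of the application, whereas the third depends entirely on it, which mirrors the scoping question discussed next.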

One may ask which of these message validity issues belong in the technical infrastructure represented by the Web services architecture. The usual answer is that the first is definitely in-scope, the last is definitely not, and the second may or may not be depending on the system being considered. [Note: it seems to me that the WG may wish to decide that 2 is definitely in or definitely out of scope].

The final layer of this conceptual hierarchy is the actual mechanism used to implement the reliable messaging goals. This is most often achieved via an acknowledgement infrastructure, which is a set of rules defining how the parties to a message should communicate with each other concerning the receipt of that message and its validity. WS-Reliability (www.oasis-open.org/committees/download.php/1461/WS-ReliabilityV1.0Public.zip) is an example of a specification for an acknowledgement infrastructure, as is WS-ReliableMessaging (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnglobspec/html/ws-reliablemessaging.asp).
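
To make the idea of an acknowledgement infrastructure concrete, the following Python sketch shows one possible arrangement: the sender retransmits until it receives an acknowledgement, and the receiver eliminates duplicates by message identifier. It is a toy model and does not reproduce the wire formats or rules of WS-Reliability or WS-ReliableMessaging; the classes Receiver and LossyChannel, the function send_reliably, and the scripted message loss are all assumptions made for illustration.

```python
class Receiver:
    """Processes each message identifier once and acknowledges every reception."""

    def __init__(self):
        self.seen_ids = set()
        self.processed = []

    def receive(self, message_id: str, body: str) -> str:
        if message_id not in self.seen_ids:  # duplicate elimination
            self.seen_ids.add(message_id)
            self.processed.append(body)
        return message_id  # the acknowledgement echoes the identifier


class LossyChannel:
    """Loses transmissions according to a fixed script so the demo is reproducible."""

    def __init__(self, receiver: Receiver, drop_script):
        self.receiver = receiver
        self.drop_script = list(drop_script)  # entries: "request", "ack" or None

    def send(self, message_id: str, body: str):
        fate = self.drop_script.pop(0) if self.drop_script else None
        if fate == "request":
            return None  # message lost before reaching the receiver
        ack = self.receiver.receive(message_id, body)
        if fate == "ack":
            return None  # message delivered, acknowledgement lost
        return ack


def send_reliably(channel: LossyChannel, message_id: str, body: str, max_attempts: int = 3):
    """Retransmit until acknowledged; return the attempt count, or None on failure."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(message_id, body) == message_id:
            return attempt
    return None  # the sender can now take compensating action
```

With a script of ["request", "ack"], the first attempt never arrives, the second is processed but its acknowledgement is lost, and the third is recognised as a duplicate and merely acknowledged again. The sender thus learns of the delivery while the receiver processes the message only once.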