Correlation and the need for causality

I wanted to raise an issue related to the content of correlation elements
in a message and thought it probably belonged in a separate thread from the
discussions on the header/body and CDL/technology-binding issues associated
with correlation.

One of the key characteristics of a choreography (for me) is that its
behaviour is intrinsically distributed.  In any distributed application,
causality or a good approximation of causality must be the basis of any
correlation decisions; otherwise you have the potential for anomalous and
often incorrect behaviour.  This is because messages can arrive at a
destination out of causal order, even if point-to-point ordering is
guaranteed.  For example, if a purchase order (PO) from a new customer must
be accompanied by a guarantee of credit-worthiness from a trusted third
party, you could have something like:

1. Customer sends PO to supplier
2. Customer asks third party to validate credit-worthiness to supplier
3. Third party sends credit validation directly to supplier
4. Supplier receives credit validation from third party
5. Supplier receives PO from customer
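
A rough sketch of how this ordering can come about, assuming made-up
latencies for each channel (the LATENCY values and the simulate() helper
are purely illustrative).  Each channel is FIFO, yet the supplier still
sees the credit validation before the PO that caused it:

```python
import heapq

# Hypothetical one-way latencies per channel, in arbitrary time units.
LATENCY = {
    ("customer", "supplier"): 5,
    ("customer", "third_party"): 1,
    ("third_party", "supplier"): 1,
}


def simulate():
    """Replay the five steps and return what the supplier receives, in order."""
    events = []  # heap of (arrival_time, seq, sender, receiver, message)
    seq = 0      # tie-breaker so the heap never compares strings

    def send(now, sender, receiver, message):
        nonlocal seq
        arrival = now + LATENCY[(sender, receiver)]
        heapq.heappush(events, (arrival, seq, sender, receiver, message))
        seq += 1

    send(0, "customer", "supplier", "PO")                    # step 1
    send(1, "customer", "third_party", "validate-request")   # step 2

    supplier_log = []
    while events:
        now, _, sender, receiver, message = heapq.heappop(events)
        if receiver == "third_party":
            # step 3: the third party reacts as soon as the request arrives
            send(now, "third_party", "supplier", "credit-validation")
        else:
            # steps 4 and 5: record arrivals at the supplier
            supplier_log.append((now, sender, message))
    return supplier_log


if __name__ == "__main__":
    for arrival, sender, message in simulate():
        print(arrival, sender, message)
```

With these latencies the validation reaches the supplier at time 3 and the
PO at time 5, even though no individual channel reorders anything.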

This is a simple case, but a "dumb" supplier would probably discard the
credit-worthiness message from the third party because it doesn't know
anything about the customer, yet the causality is quite clear and implicit
in the interactions.  It gets more complex when you add the possibilities
associated with multiple distinct POs.  Here are two ways of solving this
problem:

1. Modify the supplier to hold credit validations until a PO arrives.
    Problem is, how do you match them when a PO does arrive?  You could
    use a PO identifier that is unique to the customer.  But how do we
    specify correlation with application-specific information?  Does it
    belong in the choreography?  If so, then the CDL needs semantics to
    specify these correlation rules and the choreography "engine" must be
    able to interpret them.  If not, the correlation must be done by the
    application, so why bother putting it in the CDL at all?

    So, now that we've solved this particular problem, how do we solve the
    next, different correlation problem?  We need another specific
    solution with the same problem of representing the correlation logic
    in the CDL.

2. Have a CDL semantics with an explicit notion of causality.  Specify
    that the receipt of an order must be accompanied by a
    causally-succeeding credit validation in the CDL.  Have an
    implementation engine or infrastructure that can interpret and/or
    enforce such constraints.
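
To make the first approach concrete, here is a minimal sketch of a supplier
that holds whichever message arrives first.  The HoldingSupplier class, its
method names and the (customer_id, po_id) key are all assumptions made up
for illustration; the point is that the matching rule is hard-wired to this
one application:

```python
class HoldingSupplier:
    """Holds a credit validation or a PO until its counterpart arrives."""

    def __init__(self):
        self.pending_validations = set()  # keys validated but not yet ordered
        self.pending_orders = {}          # key -> order awaiting validation
        self.accepted = []                # orders with a matching validation

    def on_credit_validation(self, customer_id, po_id):
        key = (customer_id, po_id)  # application-specific correlation key
        if key in self.pending_orders:
            self.accepted.append(self.pending_orders.pop(key))
        else:
            self.pending_validations.add(key)

    def on_purchase_order(self, customer_id, po_id, order):
        key = (customer_id, po_id)
        if key in self.pending_validations:
            self.pending_validations.remove(key)
            self.accepted.append(order)
        else:
            self.pending_orders[key] = order


if __name__ == "__main__":
    supplier = HoldingSupplier()
    supplier.on_credit_validation("acme", "PO-1")   # arrives first
    supplier.on_purchase_order("acme", "PO-1", {"item": "widget"})
    print(supplier.accepted)
```

Either arrival order ends with the order accepted, but a different
correlation problem would need a different key and different handlers,
which is exactly the portability concern with this approach.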

There may be other ways, but they all eventually fall back to the
fundamental issue of identifying the actual and required causal
relationships between messages/events.  The first solution above is doing
that in an application-specific manner and is not particularly friendly or
portable.  I would argue, therefore, that the CDL must have a notion of
causal relationships and the ability to specify required causal
relationships.  In an implementation, a representation of causality will be
the content of any correlation header or body element.  There is a
considerable body of research in this area, but at the end of the day you
require vector clocks or a reasonable approximation to get usable
semantics.
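
As a rough illustration of what such a correlation element could carry,
here is a small vector clock sketch applied to the PO example.  The helper
names and the process indices are my own assumptions, not anything proposed
for the CDL:

```python
from typing import Tuple

Clock = Tuple[int, ...]

CUSTOMER, THIRD_PARTY, SUPPLIER = 0, 1, 2  # hypothetical process indices


def tick(clock: Clock, i: int) -> Clock:
    """Advance process i's own component for a local, send or receive event."""
    return tuple(c + 1 if j == i else c for j, c in enumerate(clock))


def merge(a: Clock, b: Clock) -> Clock:
    """Component-wise maximum, applied when a message is received."""
    return tuple(max(x, y) for x, y in zip(a, b))


def happened_before(a: Clock, b: Clock) -> bool:
    """True if the event stamped a causally precedes the event stamped b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither event causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


customer: Clock = (0, 0, 0)
third_party: Clock = (0, 0, 0)

customer = tick(customer, CUSTOMER)
po_stamp = customer                      # customer sends PO

customer = tick(customer, CUSTOMER)
request_stamp = customer                 # customer sends validation request

third_party = tick(merge(third_party, request_stamp), THIRD_PARTY)  # receive
third_party = tick(third_party, THIRD_PARTY)
validation_stamp = third_party           # third party sends validation

customer = tick(customer, CUSTOMER)
second_po_stamp = customer               # a later, unrelated PO

if __name__ == "__main__":
    print(happened_before(po_stamp, validation_stamp))
    print(concurrent(second_po_stamp, validation_stamp))
```

Whatever order the messages arrive in, the stamps show that the validation
causally follows the first PO, while the second PO is concurrent with it
and so is not covered by that validation.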

Other thoughts?

AndyB

Received on Monday, 8 September 2003 10:47:45 UTC