Re: Internal processes and/or external choreographies (was RE: Ev ents and States ... from Assaf Arkin on 2003-04-14 (public-ws-chor@w3.org from April 2003)

From: Assaf Arkin <arkin@intalio.com>
Date: Sun, 13 Apr 2003 18:06:51 -0700
To: "Burdett, David" <david.burdett@commerceone.com>
CC: "'Martin Chapman'" <martin.chapman@oracle.com>, "'Cummins, Fred A'" <fred.cummins@eds.com>, jdart@tibco.com, public-ws-chor@w3.org
Message-ID: <3E9A09AB.3000507@intalio.com>
David,

Before answering your question we need to decide on what path to follow.

The example I have given is loosely based on the pi-calculus model but 
using a more friendly syntax. Inventing yet another process calculus 
language is not interesting and not what we're here to do. We want to 
define a language for WS choreography and that means we need to look at 
more specific issues regarding Web services and how we plan to use them.

Having said that, we need to decide where to take that discussion. Let's 
assume for a second that pi-c is an interesting model we want to use. So 
the next question is: how do we make a language out of it that deals 
with WS. We will definitely need a construct called choreography that is 
different from a construct expressing a role's play in the choreography 
(both of which are called a process in pi-c). We will need to talk about 
WSDL operations and services (end-points) instead of just channels.

But we didn't make such a decision yet. So we need to decide whether we 
want to explore what the pi-c model looks like before working on a WS 
choreography language. To do that we'll need to adopt some pi-c terms, 
e.g. accept that we're talking about channels before exploring what it 
means in terms of services, and accepting that we're talking about all 
kind of processes, some of which we'll end up using to depict 
choreographies, but they're still all processes in the pi-c sense.

>QUESTIONS/COMMENTS
>1. I don't think I would call it "process buyerSeller" as buyer and seller
>are roles and they can have more than one choreography between them. I also
>like the word Choreography rather than process (as you describe), so perhaps
>a better name  would be something like "Choreography orderManagement".
>  
>
>2. Following on in the same theme, using "process seller" and "process
>buyer" is ambiguous as you will have more than one process at the buyer and
>seller. So how about "process acceptOrder" and "process placeOrder" where
>each has a property that identifies the role which performs the process
>giving you: "process acceptOrder, role seller" and "proccess placeOrder,
>role buyer".
>
>  
>
This is merely a matter of naming and a name is just a name. But I 
perfecly agree with you on the choice of names, so let's use acceptOrder 
and placeOrder instead.

As far as the model is concerned, a parallel composition is a type of 
process. I've specifically used the term 'process' to make my example 
consistent with pi-c. I wasn't trying to propose what the language would 
end up looking like in its syntax, but rather show an example from the 
perspective of the model.

When it comes down to the language we'll definitely want a more 
appropriate construct name to distinguish between different type of 
processes. But as far as the model is concerned, a composition of 
concurrent interacting processes is just another process.

>3. The two statements "receive order" and "send order" are not targeted at
>any particular role. If you have more than two roles (I know this example
>hasn't) then where a message is sent to or comes from is not clear. So would
>it make sense to have, instead, "send order to seller" and "receive order
>from buyer"? Alternatively would it make even more sense to target the
>process at the buyer or seller as in "send order to seller:acceptOrder" and
>"receive order from buyer:placeOrder"?
>
Let's look at the problem we're trying to solve. Let's say we have two 
processes (I want to avoid the term role for now and get back to it 
later on), one if placeOrder, one is acceptOrder and one we haven't 
discussed yet is arrangeShipping. When you send a message you want to 
know who is going to receive it.

There are many ways you can do that. But at this point we're discussing 
the pi-c model, so we want to see how pi-c solves that problem. In pi-c 
all communication is based on channels that are known to all the 
communicating processes. So what you want to know is who is listening to 
a particular channel and who is sending to that particular channel.

Channels are typed, so you send different messages to different 
channels. For example, you send an order request on one channel and you 
send an error or another channel. You may have a family of order 
messages, and you may have different channels that you can use to send 
orders, e.g. one channel is used when you submit an order, another is 
used when you change an order, yet another is used when you recieve an 
order in response to a query.

So going back to this example, let's say the seller is listening on the 
orderSubmit channel in the same overall state when the buyer is sending 
on the orderSubmit channel. So it's clear that the buyer is talking to 
the seller. At the same time the shipper may send or receive on other 
channels, but using distinct channels removes all confusion.

You can think of channels like URLs. If X sends to one URL and Y sends 
to another URL and Z listens to both URL, it's clear to Z who sent what 
message because these are distinct URLs. This is a simplification. In 
practice channels are often multiplexed, which means you may send on 
different channels to the same end-point. But there's always some 
mechanism to identify which channel is used even as it's multiplexed. 
For example, different EDI transactions are considered different 
channels. Or different WSDL operations, or different message types in a 
queue, etc.

>4. In the current definitions, the sending and receiving of the order is
>defined separately for each role/process. There is nothing that separately
>identifies the sending of a message. Therefore there is no single place to
>define its semantics. Would it make sense to separately define:
>  a) The Message first (strictly I mean the Message Family),
>  b) The Interactions, i.e the sending of a message in a message family
>(this would let you distinguish between instances when the same message is
>sent more than once), this could also then define the from and to roles /
>processes.
>  c) The use of the interactions in the individual processes (e.g. process
>AcceptOrder)
>
Going back to sending/receiving over channels, two processes communicate 
when they exchange messages over the same channel. For example:

process X
  send abc

process Y
  receive abc

X will send some message (call it message x from family X) over channel 
abc. Y will recieve that message because it's listening to channel abd. 
In contrast:

process X
  send abc

process Y
  receive def

X may send some message x, but Y will not receive it because it's 
listening to some other channel. X may send a message x (order) from 
family X (orderFamily) on channel abc (orderRequest) but at this point 
in time Y is expecting message from family X (orderFamily) on channel 
def (orderChange). So even though the message family is the same, they 
are not talking to each other.

A channel can be as simple as a WSDL operation. So in a WS-based 
language I would say something like:

send operation=xyz
  ... construct message here ...

receive operation=xyz
  .. retrieve message contents here ...

A channel can also be specific to a given service, which is the real 
case - you want to allow multiple services to be used subject to the 
same choreography definition. So you would also add some way to 
reference services, e.g.:

send operation=xyz
  service=myBuyer
  ... construct message here...

receive operation=xyz
  ... retrieve message contents here...

Note that the receiver doesn't have to identify anything. The sender 
uses some property myBuyer that identifies the receiver service (e.g. 
it's end-point). That allows it to bind to that port in order to send 
the message. The receiver is already bound to that port by act of 
listening to incoming messages on its end-point. So it will receive any 
message send to that service (itself) and will not receive messages sent 
to other services.

This is a bit more complicated so we may want to back track for a second 
and go back to discussing the basic, like assuming at this point that 
there is only one buyer and one seller.

>5. For each choreography to be correct, each send of a message of a
>particular type, must have a corresponding receive and vice versa. How do
>you check the individual processes to make sure that this is correct? Also,
>how do you make sure that the individual processes exchange messages in the
>correct sequence? For example, you could have the following incorrect
>choreography ...
>
>process buyer (placeOrder)
>  receive orderResponse
>  ...
>  send order
>  ...
>  receive errorResponse
>
>process seller (acceptOrder)
>  receive order
>  ...
>  send orderResponse
>  
>
This is where we get into reduction. Reduction is what enables you to 
determine that these two processes interact with each other so they both 
end up in the same end-state and don't deadlock or just end up stuck. 
Think for example of using reduction in a mathematical formula:

  x + 4 = 2*x + 2
reduce one x from each side
  4 = x + 2
reduce 2 from each side
  2 = x
decide that x=2
  0 = 0
QED

Let's rewrite the correct placeOrder process in short form:

  placeOrder = send:order.recieve:errorResponse.0

This process can only be reduced to recieve:errorResponse.0 but cannot 
be reduced to 0, so it will never complete. You'll see why in a second.

Another process is acceptOrder which can be written:

  acceptOrder = receive:order.send:errorResponse.0

This process cannot be reduced at all.

Now let's add a parallel composition process called orderPlacement and 
see if it can be reduced:

  orderPlacement = placeOrder | acceptOrder

First step, write it fully by expanding the definitions given above:

  orderPlacement = ( send:order.recieve:errorResponse.0 ) | ( 
receive:order.send:errorResponse.0 )

Now, the left process can be reduced by sending the message over the 
channel order. This means that the right process can receive that 
message (since it uses the same channel) and so can be reduced as well. 
We now get to:

  orderPlacement = ( receive:errorResponse.0 ) | ( send:errorResponse.0 )

Now the left process cannot be reduced on its own. But the right process 
can be reduced by sending on channel errorResponse, and this allows the 
left process to be reduced as well by receiving from that channel. So 
now we get:

  orderPlacement  = 0 | 0

And since 0 = 0 | 0 we get

  orderPlacement  = 0

QED

We have just proved that the process orderPlacement can indeed terminate 
by reducing it to 0. This is basically what it's all about.

If you try to do the same with the example you gave above you will see 
that it cannot be reduced. In other words, that choreography will never 
terminate. And that's something that can easily be determined.

So what you're testing for is the ability to reduce a process to 0. If 
you can reduce it to 0 you know it will terminate. If you cannot reduce 
it to 0 you get stuck at some point and you can look at why it got stuck.*


I didn't comment on the example you gave. It's correct, but at this 
point I'm trying to explain the pi-c model. If we decide not to go with 
this model we can begin to explore other possibilities and the one you 
gave is definitely an interesting one. But as far as the example goes, I 
want to illustrate that the problem simply doesn't exist.

6. I also have some questions about composing one choreography out of two or
more others, but think we should save that for later 

The model defines that that Q | R = (Q) | (R). This is not saying much 
but it means that the following is also true:

Q | R | S | T = ( Q | R ) | ( S | T )

If Q | R is one choreography, and S | T is another choreography, you can 
combine the two to create yet another Q | R | S | T choreography. So 
choreographies can be recursively composed ;-)

arkin

* As a side note, this brings us to consensus algorithms. If you can 
reduce to 0 then you terminate and in other words reach consensus. If 
you cannot reduce to 0 you cannot reach consensus. This is important 
when discussing time-out. If you add time-outs you introduce additional 
ways to reduce that bring about consensus by fact that no message was 
received!
Received on Sunday, 13 April 2003 21:10:14 UTC