RE: New Paper available for PDF download: Workflow is just a Pi process (or WFM is not BPM) from Jean-Jacques Dubray on 2003-11-17 (public-ws-chor@w3.org from November 2003)

From: Jean-Jacques Dubray <jeanjadu@Attachmate.com>
Date: Mon, 17 Nov 2003 09:00:42 -0800
To: 'Greg Meredith' <gregmer@microsoft.com>, Andrew Berry <andyb@whyanbeel.net>
Cc: public-ws-chor@w3.org
Message-ID: <D15269CBED76D51185280008C73323FA023E9486@exch-bel6.attachmate.com>
>It is very difficult to model this without mobility. But, this scenario is
all over the place in workflow. It is >especially prevalent in situations
involving a broker-- which is one of the most dominant patterns to be found
in >the domain. 

It is the most prevalent pattern, simply because the technology that is
being sold today cannot support efficiently another pattern. The challenge
of SOA is precisely to invent a technology where P2P is the prevalent
pattern rather than P2B2P. The role of a broker in SOA is merely reduced to
the one of a registry. Nothing says that I want Amazon to find a shipper for
me. What if I don't like this shipper, what if I use another shipper which
is more efficient for my needs? 

The Broker model, like in EAI brings some economies of scale, but solves the
problem by creating other problems, in particular a central focal point.
This is precisely what SOA projects like ebXML have not built in their
architecture. Sure you could be a broker with an ebXML architecture, that
would work, but nothing in the architecture would require to have a broker.
Brokers are useful, but like every good things, we have to use them
reasonably.

So creating a technology that re-enforces this business model (e.g.
Orchestration as a broker not as a peer) is a non-goal for me.


>1. completeness -- i.e. Turing complete
>2. compositionality -- the model is an algebra, the practical advantage of
which is that large(r) programs are 
>built from small(er) ones 3. concurrency -- the model has an explicit
account of autonomous execution 4. cost -- >the model has an explicit
account of resources like time and space

>Turing machines, for example, fail on features 2 and 3.
>Lambda calculus fails on 3 and 4.
>Petri nets fail on 2.
>CCS, CSP fail on 4.

This argumentation is very seducing (I am seduced), but IFAIK, windows
operting systems and C# for instance can support all 4 points. Does it mean
that windows and C# were build on a pi-foundation? (or maybe I am not
understanding what the 4 points are about). I am not questioning the
validity of your claims about formal models, I am more questioning the
validity of a formal model to get a certain job done. What would pi-calculus
give us that a well designed metamodel executed by existing thenology
(windows and C#, or JVM/Java) would not give us? What would be the impact of
having to use pi-calculus if we establish that this is the only formalism
that we can use: do we have to go back and change microprocessor
architectures, operating systems, languages? Or would pi-calculus give us
hints on how to design the best metamodel of a process?

Cheers,

Jean-Jacques Dubray
Attachmate,
tel: 425-649-6584
Cell: 508-333-7634

-----Original Message-----
From: Greg Meredith [mailto:gregmer@microsoft.com] 
Sent: Monday, November 17, 2003 8:28 AM
To: Andrew Berry; Howard N Smith
Cc: public-ws-chor@w3.org; W.M.P.v.d.Aalst@tm.tue.nl
Subject: RE: New Paper available for PDF download: Workflow is just a Pi
process (or WFM is not BPM)


Andrew,

You raise important concerns for workflow. i completely agree with you that
a decent account of workflow must address locality/distribution and partial
state. But, i must beg to differ on your analysis of the pi-calculus with
regards to partial state.

First, the notion of state must be identified with process in the
pi-calculus. Intuitively, a state is represented by what the process can do
based on what it "knows", i.e. what actions it is willing to engage in,
given what names are in scope. A really good example to consider is modeling
a cell where you can store a value. (See,
http://www.lfcs.informatics.ed.ac.uk/reports/91/ECS-LFCS-91-180/ECS-LFCS
-91-180.ps, page 35.) 

Consider a collection, P_i, of processes. Since each process represents a
state, then an aggregate, or partitioned state may be represented by the
parallel composition of the P_i's, P = P_0 | P_1 | ... | P_N.

Notice that in any standard reduction rules for pi-calculus, the rule for
reduction in the parallel composition context will allow these processes to
reduce independently. Thus,

P_0 | P_1 | ... | P_N ->* P_0 | ... | P_j' | ... | P_k' | ... | P_N.

State change has not happened all at once for all of P. Bit's and pieces of
it have updated, but not the whole thing. You would have to introduce a
protocol, e.g., 2PCPA, amongst the participants of P to get certain kinds of
atomicity and isolation guarantees regarding the visibility of state change.
Fortunately, 2PCPA *is* a protocol and as such can be well described in pi
(see Berger and Honda's paper for a treatment of this,
ftp://ftp.dcs.qmw.ac.uk/lfp/kohei/express00.ps.gz). Therefore, the agents
providing this protocol can be composed with the agents of P to give the
overall semantics desired. 

Note that, since this introduces a coding overhead, various researchers in
the process algebra community have added primitives to the calculus to
abstract this coding. This foreshadows a more general point i want to make
that can be illustrated by considering the issue of modeling
locality/distribution.

i completely agree that locality and distribution are notions almost
completely lacking in plain vanilla pi-calculus. Unfortunately, i think that
a terrible type/token confusion takes over in these discussions. It should
be plainly obvious that barebones, plain pi-calculus cannot be used for
serious applications like workflow without considerable enrichment. For
example, 

1. real workflow applications will describe message flows branching on
numeric computation; the pi-calculus doesn't have a useable theory of
numeric computation; and the encodings of numbers to be found-- though quite
intriguing-- would simply be too arduous with which to code; 2. real
workflow applications will describe message flows with complex message
structure, e.g. messages with structure like XML documents; neither monadic
nor polyadic pi-calculus is up to this task; 3. real workflow applications
require that there is not a global name manager; plain vanilla pi-calculus
requires that there *is* one; 4. real workflow applications are probably not
going to require a heavy-weight protocol to ensure-- in a distributed
setting-- the summation semantics the pi-calculus delineates.

That said, the pi-calculus provides a *framework* in which to develop the
appropriate formalism. This framework is objectively and demonstrably
different from the other models of computation put forward.
And, it is better suited to the modeling of domains like workflow than any
other model put forward so far. i will return to this point in a moment.

So, as long as we recognize that the pi-calculus is really a stand in for
the class of mobile process algebras, then we are much more likely to
achieve an understanding of how the pi-calculus can genuinely help model
scenarios in the workflow domain. With respect to distribution and locality,
there are several very variations of the pi-calculus that provide very
useful accounts of these notions. For example, Vasconselos, et al, recently
developed lsd-pi which addresses distribution in a typed setting
(http://www.di.fc.ul.pt/~vv/papers/02-4.pdf). Another approach to these
problems is found in the join-calculus of Fournet
(http://www.cs.unibo.it/~laneve/papers/bisim.ps), et al. Another approach is
found in the work of Wischik, et al, on explicit fusions
(http://www.cs.unibo.it/~laneve/papers/fm-eabs-concur02.ps). 

Just as you will have to adapt the framework to provide a variant that deals
with complex message structure, you will have to adapt the framework to
provide a variant that deals with distribution. There are several flavors.
Try a few on a few problems and see which one is better suited. If none are
suited, that's wonderful, we have discovered something!

Now, as for the suitability of the framework to this domain, it turns out
that the mobile process algebras are the first model of computation to
simultaneously enjoy four features

1. completeness -- i.e. Turing complete
2. compositionality -- the model is an algebra, the practical advantage of
which is that large(r) programs are built from small(er) ones 3. concurrency
-- the model has an explicit account of autonomous execution 4. cost -- the
model has an explicit account of resources like time and space

Turing machines, for example, fail on features 2 and 3.
Lambda calculus fails on 3 and 4.
Petri nets fail on 2.
CCS, CSP fail on 4.

And, of course, each one of these also has the very same issue in that they
are abstractions, frameworks, not ready-made models, and will have to be
adapted to fit the domain. For example, it would be much too onerous to use
Church numerals (ala lambda calculus) to do the arithmetic calculations on
which to make workflow decisions.

Noting that the pre-mobile process algebras only lack a notion of cost, it
is most instructive to see how the introduction of mobility simultaneously
provides many important features of both practical and theoretical import.
For example, an account of space consumption of a program is available in pi
(and its variants): count the fresh names generated by a computation. It is
also quite necessary as a practical feature in workflow. Consider the
following scenario.

Consumer goes to a well known port of Provider (www.amazon.com) and emits a
message containing a port (consumer@msn.com) at which she would like to be
contacted for further interaction. Provider processes consumers message,
contacts Shipper and emits a messages to Consumer with, among other things,
the port (www.ups.com/tracking) where Consumer may see the status of her
purchase. 

It is very difficult to model this without mobility. But, this scenario is
all over the place in workflow. It is especially prevalent in situations
involving a broker-- which is one of the most dominant patterns to be found
in the domain.

In my brief experience with the domain i have found that the four features
outlined above constitute a bare minimum of requirements of the
computational model necessary to model workflow without imposing undue labor
on the part of the modeler. The mobile process algebras are objectively, the
first models of computation to enjoy these properties simultaneously. 

Very likely, now that we have examples of models that enjoy these properties
together we will come up with new and better ones. But, the only way i know
how to do that is to go about the job of modeling real application scenarios
with the best technology available and seeing where the technology falls
short, and then, seeing what it takes (from minor tweak to paradigm shift)
to account for what's actually happening or needs to happen in the
application.

Best wishes,

L.G. Meredith

P.S. There is a coda to this discussion regarding the difference between
modeling workflow and providing *public descriptions* of a flow. A model may
be quite detailed and provide information about implementation and strategy
that a business is not interested in revealing to its customers or
competitors. A public description has one primary function -- to facilitate
search and discovery. Given this distinction, the language in which public
descriptions are expressed should *not* be complete.

Fortunately, in this connection, the mobile process algebras present another
distinguishing characteristic. Over the past decade, a notion of behavioral
typing has emerged and been effected in the mobile process algebra setting.
The languages for these types have exactly the right properties to be used
as the basis for public descriptions of processes.
See my recent paper in the ACM for a more detailed discussion of these
points.
(http://portal.acm.org/citation.cfm?id=944217.944236&coll=portal&dl=ACM&
idx=J79&part=magazine&WantType=Magazines&title=CACM)

-----Original Message-----
From: public-ws-chor-request@w3.org
[mailto:public-ws-chor-request@w3.org] On Behalf Of Andrew Berry
Sent: Monday, November 17, 2003 2:57 AM
To: Howard N Smith
Cc: public-ws-chor@w3.org; W.M.P.v.d.Aalst@tm.tue.nl
Subject: Re: New Paper available for PDF download: Workflow is just a Pi
process (or WFM is not BPM)


Howard,

You have a fundamental problem with the choice of Pi Calculus: there is no
concept of locality or partial state. In choreography and web services in
general, you can guarantee that participants (processes) are physically
distributed and need to make choices based on a partial view of state.  To
successfully model, program and reason about these processes, you need to be
able to identify and reason about partial states.

Consider your deferred choice semantics.  If the processes identified as
choices are physically distributed, you *cannot* make a choice without
synchronisation of processes because distinct choices can be made in a truly
concurrent fashion.  Pi Calculus has no way of identifying this issue, let
alone reasoning about it.  Explicit synchronisation processes, while solving
the problem for a given process, require that the programmer reason about
distribution and locality outside the bounds of the Pi Calculus semantics.
I would therefore argue that a worflow and in particular a choreography is
not a Pi Process.

Ciao,

AndyB


On Wednesday, November 12, 2003, at 03:00  AM, Howard N Smith wrote:

>
> Choreography pioneers,
>
> Following a short conversation with Steve R-T, he agreed for me to 
> send you this paper.
> It is intended as a draft for discussion.
>
> The paper is new information. It shows how, based on BPML, it is 
> possible to model all of the advanced workflow patterns identified by 
> workflow theorists, whereas most workflow engines only support approx 
> 50% of patterns directly and very few of the advanced patterns.
> In addition, it gives insights into the BPML implementation inherent 
> to a BPMS, and how a BPMS is able to support many process models not 
> supported by workflow technology.
> Screenshots from Intalio|n3 BPMS are given as examples. Further, the 
> workflow engine itself can be modelled in BPML, as reusable processes 
> for use in end-to-end processes. The paper was written to more fully 
> explain the work of BPMI.org and its direction in creating BPMS 
> foundation technologies.
>
> Peter Fingar and I have taken great care with this paper, and do hope 
> it adds to the understanding of BPML/BPMI/BPMS direction. While the 
> paper cannot present proof of these claims, you can consider it a 
> report on the work so far.
>
> The paper can be downloaded from:
>
> http://www.bpm3.com/picalculus/workflow-is-just-a-pi-process.pdf
>
> Regards,
>
> Howard
>
>
> ---
>
> New Book - Business Process Management: The Third Wave www.bpm3.com
>
> Howard Smith/CSC/BPMI.org
> cell             +44 7711 594 494 (worldwide)
> home office +44 20 8660 1963
>
Received on Monday, 17 November 2003 12:11:49 UTC