Re: Formal semantics paper - from Marco Carbone et al

Hi Greg, and all,

I write this mail as a reply to your mail on "search", but in fact it is to
clarify what we did in our work. On the topic of search, Gary has already
commented, and I have little to add, except to note that it does look like
an interesting subject. How we search interactions --- what a fascinating
topic it is! In what structure, in what format, by what features! Anyway,
what we did in our work is much simpler: its aim (I hope it has been
achieved to some degree) is to establish the basic status of one useful way
to describe interactions, which we found to be different from what we had
been studying so far.

While I gather from what you wrote that you fully understand this point and
are raising an issue that interests you, I feel this can be a good occasion
to clarify the positioning of this work for those on this mailing list.
Besides, our paper got rather long, so a summary may turn out to be
convenient.

What we did in our work is indeed very simple. We related a language for
the global description of interaction (a distilled version of WS-CDL) to
the pi-calculus. But what is a global description? Many researchers have
been using the following notation for describing cryptographic protocols:

(*)
Alice --> Trent:  Alice, Bob, Nonce
Trent --> Alice:  {KAB, Nonce}_KA, {KAB}_KB
Alice --> Bob:    {KAB, Nonce2}_KB
Bob   --> Alice:  {Nonce2}_KAB
(and Alice and Bob start communication)

We can write this as three pi-calculus processes.

(**)
Alice [ send to Trent <...>. receive from Trent <...>. send to Bob <...> ... ]
Trent [ receive from Alice <...>. send to Alice <...> ... ]
Bob   [ receive from Alice <...>. send to Alice <...> ... ]

which does describe the target interaction (*) completely: there is no loss
of information. But (*) is what people use, not (**). So a global
description looks convenient. In fact a lot of similar practice exists
already: UML sequence diagrams, message sequence charts, etc. These
predecessors of WS-CDL show how global descriptions are useful in
engineering practice. WS-CDL is an instance of such a method, with all the
basic constructs for describing channel-based interaction. Meeting this
language, we, the invited scientists, asked: what is this? What is its
theoretical status?
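
To make the comparison concrete, here is a tiny Python sketch (entirely illustrative; the representation and names are mine, not the paper's formalism): it writes (*) as plain data and derives the endpoint views (**) mechanically, showing that the local views carry the same information while hiding the global flow.

```python
from collections import defaultdict

# The protocol (*) above, as (sender, receiver, message) triples.
global_description = [
    ("Alice", "Trent", "Alice, Bob, Nonce"),
    ("Trent", "Alice", "{KAB, Nonce}_KA, {KAB}_KB"),
    ("Alice", "Bob",   "{KAB, Nonce2}_KB"),
    ("Bob",   "Alice", "{Nonce2}_KAB"),
]

def project(description):
    """Split the global description into per-endpoint action lists,
    as in (**): each party sees only its own sends and receives."""
    endpoints = defaultdict(list)
    for sender, receiver, msg in description:
        endpoints[sender].append(("send", receiver, msg))
        endpoints[receiver].append(("recv", sender, msg))
    return dict(endpoints)

eps = project(global_description)
# Alice's local view, in order: send to Trent, recv from Trent,
# send to Bob, recv from Bob. No information is lost, but the
# global flow is no longer visible at a glance.
print(eps["Alice"])
```

This is the sense in which (**) "describes (*) completely" while being much less readable than (*) itself.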

First of all, we may start from why this descriptive method is useful in
general. Many answers are possible, from social to mathematical, but a
simple and direct answer is that, in many applications, it is this global
scenario itself, i.e. how interaction among multiple parties would proceed,
which is the central concern of the application (I remember, in the early
days of our participation, how Nick emphasised the importance of having an
explicit presentation of mutual interaction as such).

This is especially true of business protocols, which may involve consumers,
since a business protocol is inherently:

--- inter-domain (in particular inter-organisational);
--- often regulation-bound;
--- demanding of a clear shared understanding among participants (we do not
     wish to have disputes about what each party has agreed to do in a
     protocol, after we decide on a shared protocol).

So we need a tool, a language, which can describe such interactions, and
whose meaning should at least be clearly (and preferably formally)
understood by all the participants. In business protocols, these
participants may well have conflicts of interest among them: all the more
reason it is important that they share a clear understanding of how each of
them is to behave and what it means for one of them to, for example,
*violate* the protocol.

This is why a global description is useful, or rather why it is needed. We
need a description, or else we cannot even agree or disagree on how we
interact in the first place. Understanding this point is essential for
understanding the use and status of WS-CDL as an engineering tool.

WS-CDL puts forward not only the notion of a global description of
interaction but, driven by the need to describe business protocols
precisely, equips it with fully expressive constructs for describing
communications, including sum, parallel composition and recursion (loops),
as well as a predicate-based invocation mechanism --- what we call the
"when" construct (I am glad to see the influence of process calculi in the
basic structuring constructs of WS-CDL, conceived even before our
participation, which we found to work very well in the theory of endpoint
projection). This is an original contribution of WS-CDL, and it will be
recorded in history as such.

* * *

As I already wrote, we started by asking what this language is, precisely.
Now that we understood why we need it in the first place, we asked a
natural question as pi-calculus experts: what is its relationship to the
pi-calculus? Can we relate it precisely to this basic calculus of
communicating processes? And if we do so, what does it give us? Our efforts
to answer this question have led to this work, and also to a further
clarification of the engineering status of WS-CDL.

We started from a general encoding scheme which can in fact encode almost
any WS-CDL description (choreography) into pi-calculus processes. It needs
a lot of synchronisations, and loops demand a complex encoding, but our
scheme looks pretty comprehensive, realising sequencing through one or more
control nodes which play the role of a synchroniser/conductor.

In spite of its generality, however, we found our first encoding not
entirely satisfactory, for several reasons.

(1) The encoding introduces many extra interactions not found in the
original descriptions, which permit many different realisations. This is
confusing from an engineering viewpoint and may not be great for, e.g.,
standardisation.

(2) Theoretically, the proof that the resulting endpoint description is
faithful to the original description is very complicated, in a
not-too-illuminating way. Practically, the proof looks to scale very poorly
as more language constructs are added.

(3) The encoding has a (few) single point(s) of control, hence of failure:
so it is more like an orchestration than a choreography, in which we demand
that

     dancers dance following a global scenario without a conductor,
     i.e. without a single point of control.

(I note that a different and interesting aspect is addressed by the work of
Claudio, Nadia and others, which Claudio posted a few days ago, in which a
local part of a choreography is realised by adjoining several participants;
this is an interesting way to relate choreography and orchestration,
consistent with the above notion of choreography.)

All in all, our initial encoding looked intractable from an engineering
viewpoint and theoretically inelegant (even though we learned a lot from it
and it holds technical interest for us). It is also against the spirit of
choreography. And it is at this point that we began to see what we really
wished to achieve (it is often the case that we know the real target only
after working on a problem for some time...):

     Write down a global scenario: can we project this scenario to endpoint
     processes so that, when these endpoints interact with each other,
     their interactions *precisely* follow the original global scenario,
     with no more and no fewer interactions than in the original?

I have emphasised "precisely" since this is important for engineering (or
business) purposes: there should be no ambiguity as to how participants
interact with one another, following the original global description. If
not, we may not dare to start business protocols, even once we have decided
on one: a loss of time, a loss of a lot of things. Or just think of
generating an endpoint monitor based on a global scenario: what is the
status of a monitor if we are not sure it precisely reflects what we wrote
in our original global description?

From our first trial, we knew this is far from simple: a global language in
its full expressiveness, such as WS-CDL, allows very complex descriptions
whose endpoint realisations are not at all trivial. And what we found,
which is the core part of our work, is the following fact:

    A global description should be "well-structured" in a certain
    technical sense if it is to be projected cleanly, and in a scalable
    manner, to endpoint descriptions.

In other words, there are "good" disciplines for writing down global
descriptions which do not make you lose expressiveness and which lead to
very simple and precise endpoint projections. It is like structured
programming in standard programming languages such as Java, C# and ML:
certainly we lose the freedom of assembler, but we obtain many engineering
merits while retaining expressiveness.

We have identified three basic principles, built on session types. I will
not detail them here: suffice it to say that they are not intended to be
the ultimate such conditions (in fact we discuss several possible
relaxations) but are simple and have a clear status, giving us:

 (i) A very simple and transparent endpoint projection; and
(ii) A perfect match between what you expect from a global description
     as behaviour and how the endpoint processes interact. All behaviours
     in a global description are realised as endpoint behaviours, and
     nothing else, with an exact step-by-step correspondence.

(i) and (ii) may be understood as a sort of WYSIWYG principle in this
context. (ii) is established as a theorem, but (i) is just as important: if
we write down a global scenario, we wish it to communicate clearly how the
resulting interactions will unfold.
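
The exact correspondence in (ii) can be caricatured in a few lines of Python (an illustrative toy of my own, not the paper's formalism): for a purely sequential scenario, synchronising the projected endpoints step by step reproduces precisely the original global trace, with nothing added and nothing missing.

```python
from collections import defaultdict, deque

def project(description):
    """Naive projection: each participant keeps its own sends/receives,
    in the order the global description lists them."""
    endpoints = defaultdict(deque)
    for s, r, m in description:
        endpoints[s].append(("send", r, m))
        endpoints[r].append(("recv", s, m))
    return dict(endpoints)

def replay(endpoints):
    """Synchronise matching send/recv pairs one at a time; the induced
    trace is the global scenario the endpoints actually enact."""
    trace = []
    while any(endpoints.values()):
        for s, acts in endpoints.items():
            if acts and acts[0][0] == "send":
                _, r, m = acts[0]
                if endpoints[r] and endpoints[r][0] == ("recv", s, m):
                    acts.popleft()
                    endpoints[r].popleft()
                    trace.append((s, r, m))
                    break
        else:  # no matching send/recv pair: the endpoints are stuck
            raise RuntimeError("endpoints deadlocked")
    return trace

# A hypothetical buyer/seller scenario (invented for illustration).
g = [("Buyer", "Seller", "quoteRequest"),
     ("Seller", "Buyer", "quote"),
     ("Buyer", "Seller", "accept")]

assert replay(project(g)) == g  # the replayed trace matches the scenario
```

For branching, parallel and recursive descriptions this naive replay is far from enough; that gap is exactly what the well-structuredness conditions and the EPP theorem address.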

By (ii) we obtain a rigorous one-to-one correspondence between the two
forms of description: with (i) and (ii), we are now sure what a global
description means as real interactions. There is no obscurity; EPP is now a
white magic which all good people can use (I am not saying it used to be a
black magic, by the way).

Another thing we did is to show that our translation (EPP) is *type
preserving*. That is, once we can check:

      My global description X is well-typed (type checking is the business
      of an instant, if not a nanosecond)

then we can say:

      All endpoint behaviours coming from X are well-typed (which in
      particular means no communication error takes place)

without ever checking the projected code. Since designers/analysts will
primarily be working with global descriptions, it is better to perform
various validations there: well-typedness of global descriptions offers a
basic form of such validation. We may extend this further to advanced
properties such as deadlock/livelock prevention and conformance to existing
code/services/protocols/regulations. This is a very rich field, whose study
will bring direct merits to the users of WS-CDL, and where we can make the
most of the rich body of work on process calculi and the pi-calculus.
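
As a caricature only (a drastic simplification of session typing; the signature, messages and helper names are all hypothetical), the "check once globally" point can be rendered as: validate the global description against a message signature, and every projected endpoint mentions only messages that already passed that check, so no per-endpoint check is needed.

```python
# Hypothetical message signature: each message name declares a payload type.
signature = {"quoteRequest": str, "quote": int, "accept": str}

def well_typed(description, payloads):
    """Check the *global* description once: every message must carry a
    payload of its declared type (a toy stand-in for session typing)."""
    return all(isinstance(payloads[m], signature[m])
               for _sender, _receiver, m in description)

def project(description, participant):
    """A participant's local actions. Each mentions only messages that
    occur in the global description, so well-typedness is inherited."""
    return [("send" if s == participant else "recv", m)
            for s, r, m in description if participant in (s, r)]

g = [("Buyer", "Seller", "quoteRequest"),
     ("Seller", "Buyer", "quote"),
     ("Buyer", "Seller", "accept")]
payloads = {"quoteRequest": "laptop", "quote": 900, "accept": "ok"}

print(well_typed(g, payloads))  # True: one global check covers all endpoints
```

The real theorem is of course about session types and behaviour, not payload tags, but the engineering shape is the same: one validation at the global level, inherited by all projections.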

Once we have a clear notion of EPP, we can do many things, for example:

(1) We can generate a complete distributed application from
a global description, if we are willing to specify detailed
business logic in the global language and apply EPP to it
--- perhaps producing endpoints written in different languages.

(2) We can generate prototype code from a global
description, say in Java, C# or BPEL, and elaborate it
at each endpoint. This is already realised in the pi4soa tool.

(3) We can generate endpoint monitors, dynamically
checking the conversation. This can be further enhanced
with multiparty monitors, and can also be used for
testing and debugging (again done in pi4soa).

(4) Suppose a team of programmers is working on a
distributed application. They first stipulate a global
interaction description, project it to each endpoint, and
check now and again whether each member's code conforms
to that projection.

(5) If we are going to use an existing service or existing
code as part of a business protocol, we check conformance
of that service or code against the EPP of the protocol
(the existing artifact can itself also be a global
description, which can be handled similarly). This
conformance check uses basic techniques from process
calculi, such as (bi)simulations.

(6) One may take the log of conversations at each
endpoint, and check whether it conforms to the global
protocol via the latter's EPP.

....
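
Use (6), for instance, can be sketched as follows (an illustrative toy with invented names; a real conformance check would be far richer): compare an endpoint's observed log against its projection from the global protocol.

```python
def local_projection(description, participant):
    """The actions the global description prescribes for one participant."""
    actions = []
    for s, r, m in description:
        if s == participant:
            actions.append(("send", r, m))
        elif r == participant:
            actions.append(("recv", s, m))
    return actions

def conforms(log, description, participant):
    """A (possibly partial) run conforms if the observed log is a prefix
    of the participant's projected behaviour."""
    expected = local_projection(description, participant)
    return log == expected[:len(log)]

# A hypothetical protocol and two observed logs.
g = [("Buyer", "Seller", "quoteRequest"),
     ("Seller", "Buyer", "quote")]

print(conforms([("send", "Seller", "quoteRequest")], g, "Buyer"))  # True
print(conforms([("send", "Seller", "bogus")], g, "Buyer"))         # False
```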

The list goes on and on (these are only elementary examples). But
above all we now have a basic framework, and a non-trivial one
at that, for relating WS-CDL and other global languages to the
pi-calculus. The latter gives us a theory, centring on the notion
of behaviour. This is a marriage of an original, great engineering
medium and basic theory, and I am happy and thankful that I could
participate in this project so far, with Marco and Nobuko, on
the basis of many people's efforts, including several brilliant
presentations by Greg himself, and in collaboration with the W3C
WS-CDL WG members.

And I know this is just the beginning. As a theory and in its use,
what we did is just a modest starting point. I believe we have managed
to reach a simple and robust theoretical basis upon which much
past and coming work on the pi-calculus, process algebras and
programming language theory can be exploited. It is time to test
this assertion, by meeting the real demands of users. And it may
also be time for concrete infrastructural planning, even though
pi4soa is already an amazing tool.

* * *

I hope the readers of this list will forgive me for this long post.

Best wishes,

kohei


L.G. Meredith wrote:

> Kohei,
>
> Many thanks for the edifying remarks. i hope you will allow me to 
> probe on a related front.
>
> Suppose we took a different approach to the architecture of WS-CDL in 
> which what is published is something like (collections of) Kobayashi's 
> usages, (an appropriately XML-ified version of) which could be seen as 
> a simple and direct extension of WSDL, leaving more detailed 
> descriptions of service logic (in which resides potentially 
> proprietary business value) behind the firewall, so to speak.
>
> i emphasize publish in this context because i want to call attention 
> to what's of general public interest in a service description. i argue 
> that the primary function of a public service description is search. i 
> submit that in a world where SOA is the norm there will be billions of 
> services, and a service consumer will need to find services that
>
>     * do what they need them to do (a semantic question)
>     * do it in a way that is compatible with their own needs and
>       practices
>
> Further, i argue that the former is far more important than the 
> latter. Consider the case where there is only one provider. The 
> consumer will adapt to any incompatibility. (Note that just because 
> there is only one such provider does not alleviate the search burden. 
> The consumer may not know of the provider, even when there is only one.)
>
> i want to argue that a great deal of semantic information can be
> gleaned from information about behavior. i can provide some
> interesting and illustrative examples, if you want them. One question,
> however, is how much information needs to be public to help the consumer
> address their needs, and what needs to remain behind the firewall to
> protect hard-won business value?
>
> Now, to bring this line of reasoning back to my question above. Does 
> the market need a standard to capture full-blown choreographies? Or, 
> is there another niche that is likely to see better adoption? For 
> example, consider a small extension to WSDL, like Kobayashi's usages, 
> which can -- in Marco's words, more powerfully and more generally -- 
> both ensure compatibility (like eliminating deadlock) and 
> increase the information available by which to conduct semantic-based 
> search.
>
> Best wishes,
>
> --greg
>
> On 8/2/06, *Kohei Honda* <kohei@dcs.qmul.ac.uk> wrote:
>
>     Hi Greg,
>
>     It is great to get reactions so quickly.
>
>     L.G. Meredith wrote:
>     > Marco, Kohei,
>     >
>     > Thanks for clarifying. i was referring to the EPP theorem on page 85,
>     > which i took as the central result of the work. Sorry if i was a bit
>     > elliptical. My thought was that the end point calculus is actually a
>     > type system and end point projection was like the calculation of a
>     > kind of minimal type. Then, you can see the EPP theorem in the light
>     > of subject reduction.
>
>     I may rather say that the global calculus and the end point calculus
>     are two different description languages, each with its own typing
>     system, with respective notions of minimum typing.
>
>     The type structures are common, but the way descriptions are typed is
>     quite different.
>
>     The idea is to project each "well-formed" global description to a
>     collection of endpoint processes, which are the code for the
>     participants involved.
>
>     What the EPP theorem asks is:
>
>         Do these projected participants interact following the scenario
>         (choreography) which the original global description has laid out?
>
>     It turns out that, if a global description satisfies certain
>     well-structuredness, or "healthiness", conditions, then
>
>     (1) there is a very simple endpoint projection, and
>     (2) the correspondence in behaviour is as exact as can be.
>
>     There are three well-structuredness conditions we have identified. We
>     believe they offer a natural way to write a well-structured global
>     description.
>
>     I will post a brief discussion on the engineering meaning of this
>     result, but for the time being let us say the EPP theorem offers a way
>     to relate the global description languages and the process calculi.
>     The latter give rigorous theories of behaviours, their properties and
>     composition, while the former offer a useful engineering medium.
>     >
>     > i know that you have a separate notion of typing laid out in the
>     > paper, but i tend to think -- much in the way Kohei laid out -- of
>     > towers of typing of increasing strength. Abramsky gives a good
>     > example of such in his Marktoberdorf lecture with Simon Gay and
>     > Raja Nagarajan on types for concurrency.
>
>     On this point I firmly agree: types and various analyses including
>     process/program logics are great tools, especially when we know how
>     to integrate them consistently.
>
>
>     >
>     > That said, i was trying to draw an analogy between the EPP and
>     > Kobayashi's usages. More specifically, in my mind there is a
>     > connection between the fact that EPP is a function and Kobayashi has
>     > a type inference algorithm -- apart from the practical implication
>     > that the programmer doesn't have to write the type.
>
>     So the EPP theorem is not so much about type discipline or program
>     analysis, but rather about a basic way to relate two distinct ways of
>     describing interactions. It is like a result on encoding some calculus
>     into the pi-calculus, saying the encoding fully respects the original
>     dynamics, under certain conditions.
>
>     These "certain conditions" are the well-structuredness (healthiness)
>     conditions.
>
>     I will discuss the general picture further in my coming post.
>
>     Best wishes,
>
>     kohei
>
>
>
>     >
>     > Best wishes,
>     >
>     > --greg
>     >
>     > On 8/2/06, *Marco Carbone* <carbonem@dcs.qmul.ac.uk> wrote:
>     >
>     >>     This looks like a lot of work. i may be misreading, because i
>     >>     have only skimmed, but it looks as though the main theorem
>     is a
>     >>     subject reduction-like theorem in which the subject
>     reduction is
>     >>     simulation-style (ala Kobayashi's type systems) as opposed to a
>     >>     static subject reduction (ala Honda, et al's type systems).
>     Have
>     >>     you seen, therefore, Kobayashi's 2006 types for concurrency
>     paper (
>     >
>     >
>     >     Just to clarify, the paper has three main theorems:
>     >
>     >     1) Subject Reduction for the global calculus type system
>     >     (session types), i.e. when the system evolves it is still well
>     >     typed (e.g. if I evolves to I' and I is well typed, then I' is
>     >     also well typed).
>     >
>     >     2) Subject Reduction for the end-point calculus (similar to
>     >     point 1).
>     >
>     >     3) EPP Theorem, i.e.
>     >     a) Type Preservation, i.e. a well typed global description is
>     >     projected to a well typed end-point description.
>     >     b) Completeness, i.e. if a global interaction I evolves to I',
>     >     then its projection evolves to the "projection" of I' (EPP(I')).
>     >     c) Soundness, i.e. if a projection EPP(I) evolves to N, then I
>     >     can evolve to I' and the projection of I' is "similar" to N.
>     >
>     >     1) and 2) guarantee that working with typed programs is safe
>     >     (you never evolve to untyped, i.e. unwanted, things).
>     >
>     >     3) shows that, given a global description (e.g. a WS-CDL
>     >     choreography), its end-point projection is good and respects
>     >     what the programmer wanted to specify in the choreography.
>     >
>     >     Hope this clarifies the key points of the paper.
>     >
>     >     Best,
>     >     Marco
>     >
>     >     P.S.
>     >     I didn't read the paper you linked, but I believe it is related
>     >     to another of his works, i.e. having CCS processes as types.
>     >     Session types are related to this, but Kobayashi's are much more
>     >     powerful and loose.
>     >
>     >
>     >
>     >
>     >>     http://www.kb.ecei.tohoku.ac.jp/~koba/papers/concur2006-full.pdf )?
>     >>
>     >>     Best wishes,
>     >>
>     >>     --greg
>     >>
>     >>     On 8/1/06, *Steve Ross-Talbot* <steve@pi4tech.com> wrote:
>     >>
>     >>
>     >>         Is at:
>     >>
>     >>         http://lists.w3.org/Archives/Public/www-archive/2006Aug/att-0000/workingNote.pdf
>     >>
>     >>         Please read and comment as soon as possible.
>     >>
>     >>         Cheers
>     >>
>     >>         Steve T
>     >>
>     >>
>     >>
>     >>
>     >>     --
>     >>     L.G. Meredith
>     >>     Partner
>     >>     Biosimilarity LLC
>     >>     505 N 72nd St
>     >>     Seattle, WA 98103
>     >>
>     >>     +1 206.650.3740
>     >
>     >     ---------------------------------------------------------
>     >     Marco Carbone
>     >
>     >     Dept. of Computer Science
>     >     Queen Mary University of London
>     >     Mile End Road
>     >     E1 4NS London
>     >     United Kingdom
>     >
>     >     Phone: +44 (0) 207 882 3659
>     >     Fax:      +44 (0) 208 980 6533
>     >     email:   carbonem@dcs.qmul.ac.uk
>     >     home:    http://www.dcs.qmul.ac.uk/~carbonem
>     >     ---------------------------------------------------------
>     >
>     >
>     >
>     >
>     > --
>     > L.G. Meredith
>     > Partner
>     > Biosimilarity LLC
>     > 505 N 72nd St
>     > Seattle, WA 98103
>     >
>     > +1 206.650.3740
>
>
>
>
> -- 
> L.G. Meredith
> Partner
> Biosimilarity LLC
> 505 N 72nd St
> Seattle, WA 98103
>
> +1 206.650.3740 

Received on Wednesday, 9 August 2006 03:39:47 UTC