RE: BurdettML from Burdett, David on 2003-09-12 (public-ws-chor@w3.org from September 2003)

From: Burdett, David <david.burdett@commerceone.com>
Date: Fri, 12 Sep 2003 12:43:47 -0700
To: "'ygoland@bea.com'" <ygoland@bea.com>, "WS Chor Public (E-mail)" <public-ws-chor@w3.org>
Message-ID: <99F57F955F3EEF4DABA7C88CFA7EB45A0C0C88F7@c1plenaexm04-b.commerceone.com>
Yaron

Firstly, thanks for the kind words ... ;)

Secondly, detailed comments in line ...........

David
PS Apologies for the length of this reply ;)

-----Original Message-----
From: Yaron Goland [mailto:ygoland@bea.com]
Sent: Wednesday, September 10, 2003 9:34 PM
To: WS Chor Public (E-mail)
Subject: BurdettML



Executive Summary - I think the group owes a debt of gratitude to David
Burdett for his excellent spec. I think his spec should become the core of
what will eventually be the WS-Chor spec.
	Most of my comments below have to do with removing features. I see that as
a positive thing. Pruning is a healthy activity for a spec. It's when you
find yourself having to re-engineer that you need to worry. Thankfully, that
isn't the case here.
	I do think we need to add a few things to David's spec, including but not
limited to:
		* Making signals visible in the choreography,
		* Figuring out how to do runtime versus static role binding,
		* Adding a scope/function call functionality (maybe
		  ExtendsChoreography/DependsOnChoreography achieves this goal, I'm
		  not sure) and
		* Adding a true composition/segmentation mechanism (although I think
		  some of the hooks we need for this are already in the spec).

	But mostly, I think features need to be removed. Such as:
		* Remove all the choreography progress checking features mentioned in
		  section 2.2.5,
		* Collapse name and URN attributes into a single attribute,
		* Get rid of ConditionalEnd, Precondition & Process,
		* Make InteractionDef, MessageFamily and Role much easier to use in the
		  'easy' case and
		* Simplify start and end states

Long Winded Version:

Section 2.2.3 - Interactions, Reliable Messaging and Signals - My
understanding of the group consensus is that we do need to surface signal
information in a choreography. For example, a Buyer, upon receiving a
message processing complete signal might then send a message to an 'invoice
tracking' service telling it about the order and asking that the order be
tracked. The challenge before the group is to figure out how we can surface
signal based information in a manner that does not require us to adopt any
particular signaling mechanism. This is especially important given that
there is no standard Web Service signaling mechanism nor does one appear to
be on the horizon. So users of WS-Chor are most likely going to have to
contend with non-standard solutions for the foreseeable future.
<DB>I agree. The way I would be inclined to tackle this is to concentrate on
the useful "semantics", i.e. "what" signals are required rather than "how"
those signals are communicated. Secondly I would concentrate on the signals
that are likely to be "useful", i.e. they are likely to affect the flow of
the choreography. As a starter I think that the following are useful:
- Message received - it's got there, but has not been checked
- Message valid - it's also been checked and found OK so it *will* be
processed
- Message invalid - it's also been checked but was found invalid and
therefore won't be processed
- Message processing started - i.e. it was valid and processing of the
message has started, but not yet finished; this implies that state
changes at the destination have occurred that may not be completely
reversible
- Message processing completed OK - i.e. the message was processed
successfully
- Message processing failed - i.e. the message was not processed
successfully.

I am *not* suggesting for one minute that all choreographies would use any
or all of these signals, however I do believe that they are likely to be
common to many.

Once you've identified these signals, then you can simply ;) extend the end
states for the "from" and "to" in any interaction to include these signals.
For example you could have ...
<InteractionDef name="SendOrder" fromRole="Buyer" toRole="Seller"
messageFamily="Order">
  <InteractionEndStates
      fromState="OrderSent, MsgReceived, MsgProcessStarted"
      toState="OrderReceived, MsgReceived, MsgProcessStarted"/>
</InteractionDef>

You could then write a "binding" when the choreography was implemented, that
mapped these states to signals that communicated these states. Note that a)
you don't have to include all the possible intermediate states, and b) the
choreography designer could create additional states if they are relevant to
the particular choreography, for example (in caps) ...
  <InteractionEndStates
      fromState="OrderSent, MsgReceived, MsgProcessStarted, BUYERCREDITBLOWN"
      toState="OrderReceived, MsgReceived, MsgProcessStarted, BUYERCREDITBLOWN"/>
</DB>
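
The "binding" idea above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the abstract state names come from the
InteractionEndStates example, while the signal names and both bindings
(BINDING_A, BINDING_B) are invented here and do not come from the spec or from
any real signaling mechanism.

```python
from typing import Dict, Optional

# Abstract "to" end states taken from the SendOrder example above.
TO_STATES = ["OrderReceived", "MsgReceived", "MsgProcessStarted"]

# Two hypothetical bindings. Each maps a concrete signal name, as some
# messaging layer might emit it, to one of the abstract states.
BINDING_A: Dict[str, str] = {
    "Acknowledged": "MsgReceived",
    "ProcessingStarted": "MsgProcessStarted",
}
BINDING_B: Dict[str, str] = {
    "ack": "MsgReceived",
    "work-begun": "MsgProcessStarted",
}


def state_for_signal(binding: Dict[str, str], signal: str) -> Optional[str]:
    """Translate a concrete signal into an abstract choreography state.

    Returns None when the signal is unknown to the binding or maps to a
    state the interaction does not declare.
    """
    state = binding.get(signal)
    return state if state in TO_STATES else None
```

Under these assumptions the choreography only ever sees "MsgReceived" or
"MsgProcessStarted"; which mechanism produced the signal is confined to the
binding, so either binding could be swapped in without touching the
choreography definition.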

Section 2.2.5 - Checking Choreography Progress - Per our last phone
conference I do not believe that we need to define a standard way to specify
what choreography/state a message belongs to at this time. I think this
feature fails the 'the spec is done when there is nothing left to cut' test.
After all, one can clearly implement a choreography without this feature and
even better yet it is easy to add this feature later in a manner that is
backwards compatible with existing WS-Chor systems.
	Although we didn't discuss this issue on the last phone call I would like
to signal a formal objection to introducing a mechanism for one role to
query another role's choreography state for the same reasons that I object
to putting choreography information into messages. The existence proof for
the ability to survive without this feature is self-evident and, again, we
can always add it later.
<DB>I don't completely agree.
Firstly I think you have to accept that the participants in a choreography
will very often want to check that the choreography is being followed
correctly. If these checks are not made and an incorrect message is
received, the result will be unpredictable. What is more, this lack of
predictability will, for many implementers, be unacceptable.

Secondly, if you accept that checking that the choreography is being
followed will often be required, then we should at least consider how it
could work. I see two main options:
a) the application or web service that is handling the message does the
checks, or
b) there is some middleware - in front of the web service - that does the
checks.
The application/web service doing the checks is OK, but it is complicated by
the fact that the *same* web service could take part in many different
choreographies and it will need to know, for each time it is called, which
one it is.

This brings me to the third part of the problem which is how does an
application/web service *know* which choreography it is taking part in. This
time I think there are three main options:
a) Force each web service to take part in *only* one choreography by
specifying a separate URL for each choreography it can take part in, for
example http://www.bigco.com/salesorderws/choreography1
b) Hold the information as, for example, a SOAP header - my original
suggestion, or 
c) Use data in the message content/payload to determine the choreography
being followed.
I don't like option 'a' as it is transport dependent. I also don't like
option 'c' as it could often require the extension of an existing
business document definition, for example <Order> ...
<Chor>Choreography1</Chor> ... </Order> which is artificial. This leaves
option 'b' as the one I would prefer. If you do follow option 'b' then it
also makes it much easier for middleware - in front of the web service - to
check that the choreography is being followed as it does not need to know
anything about the structure of the message. It also makes it easier to
include existing, non-choreography-aware, web services into a choreography
as they would not need to be altered in any way when checking that a
choreography is being followed correctly.
</DB>
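
A minimal sketch of option 'b', assuming a SOAP 1.1 envelope. The header
element name (Choreography) and its namespace (urn:example:chor) are
hypothetical and are not defined by the spec or by SOAP. The point being
illustrated is that the check reads only the header and never needs to
understand the body.

```python
import xml.etree.ElementTree as ET
from typing import Optional

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
CHOR_NS = "urn:example:chor"  # hypothetical namespace, for illustration only


def choreography_of(envelope_xml: str) -> Optional[str]:
    """Return the choreography named in the hypothetical header, if present."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return None
    chor = header.find(f"{{{CHOR_NS}}}Choreography")
    if chor is None or not chor.text:
        return None
    return chor.text.strip()


# The body is an unmodified business document; only the header names the
# choreography being followed.
EXAMPLE = """<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:c="urn:example:chor">
  <soap:Header><c:Choreography>Choreography1</c:Choreography></soap:Header>
  <soap:Body><Order>...</Order></soap:Body>
</soap:Envelope>"""

NO_HEADER = """<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body><Order>...</Order></soap:Body>
</soap:Envelope>"""
```

Middleware sitting in front of the web service could call
choreography_of(EXAMPLE) and route or reject the message on the result,
leaving the Order payload and the service behind it unchanged.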


Section 3 - Scopes/Functions - At some point we will end up having to create
the equivalent of a scope/function call. This will need to be able to define
things like local roles.
<DB>Totally agree. This is also tied into the need to be able to compose one
"new" choreography out of two or more existing ones.</DB>

Section 5 - Organization - While organizing the element definitions
alphabetically makes it easier to find things in the document it makes it
much harder to understand how everything works together. It would perhaps be
easier on the reader if the elements were organized with a focus on
pedagogy.
<DB>I don't have strong views on this and would accept the team
consensus.</DB>

Section 5.2.1 & 5.2.2 - Name and URN - I'm vaguely worried about the idea
that a choreography has two names. The 'name' attribute that pretty much can
only be used inside of a single choreography definition file and a 'URN'
that can be used universally. Having two ways of doing the same thing in a
standard is dangerous, it leads to mistakes. Can we just reduce this to only
have the URN? The time honored hack around having to type in the whole URN
is to define the URN name using XML namespace notation, e.g. X:foo. In that
case so long as the default namespace is X then you can just say 'foo'
within the choreography definition and in more general circumstances give
the full name.
<DB>Sounds OK, although I think we should include some wording around this
in the spec. I am also not sure how this would work when you want to compose
choreographies.</DB>

Section 5.3 - ConditionalEnd - It seems to me that explicitly identifying
ConditionalEnd states isn't necessary. This is something that can always be
determined by examining the choreography definition.
<DB>I'm not sure about this. If this is "always" true, then I agree that we
should simplify the spec as then you would only need to identify the "start
state".</DB>

Section 5.4 - DependsOnChoreography - I agree with the intent of the
'DependsOnChoreography' feature but I think we will need to accomplish it in
another way. For example, does 'following' another choreography mean that
reaching any end state, even one reached as a consequence of a fault,
constitutes 'successful completion'? I suspect this feature needs to be
subsumed into a more generic segmentation/composition feature.
<DB>+1</DB>

Section 5.5.2 - Description@Ref - In general I counsel against the use of
attributes
(http://lists.w3.org/Archives/Public/w3c-dist-auth/1998JulSep/0084.html) and
this is a good example of why. One can quite possibly want to put multiple
Refs on a description. The first reason is that there may be multiple
documents one wants to point to. The second reason is that there may be
multiple alternate mirrors for the document one wants to provide. Trying to
express all of this in an attribute is a nightmare. Why not just switch to
using elements which can handle both scenarios easily?
<DB>Case proved ;)</DB>

Section 5.8 - Import - One of the things that is really annoying about WSDL
is that it only allows a WSDL definition to contain definitions for a single
namespace. This is also a really annoying problem with XML schema. There is
no inherent reason to have this limitation. So I would suggest that the
import element should not specify a namespace for the items to be imported.
Instead the imported items themselves should specify whatever namespace(s)
they are using (which the suggestion for sections 5.2.1/5.2.2 would provide
for). What I would like (and what I suspect was actually intended) is the
ability to provide a unique identifier (in the form of a URL) of the
document to be imported along with the location hint(s).
<DB>Again +1</DB>


Section 5.10 - InteractionDef - I'm a big believer in the idea that easy
things should be easy and hard things should be possible. WSDL fails this
test because even easy things are hard. Part of its complexity is derived
from its insistence on forcing every aspect of a port definition (operation,
portType, Service, Port, etc.) to have its own name even when a definition
is only used in a single context and so could be defined implicitly by
containment rather than by an explicit naming. The result is that people
trying to use WSDL end up coming up with tons of names that make the file
really hard to read.
	I fear that InteractionDef is leading us down a similar road. I agree that
having re-usable interaction definitions is a good thing. One can easily
imagine extremely complex interaction definitions with lots of configuration
information that one would rather re-use than have to constantly re-define.
So I like the feature.
	But I think we need to make easy things easy by providing a way to define
an interaction 'inline' so that it doesn't need its own name and is
referenced 'by value' (by virtue of its location) rather than 'by
reference' as the spec would currently require.
	I realize the road I'm walking down will lead to some non-trivial changes
in the syntax (but not semantics) of the specification but I think it's
worth going down that road in order to make it easy to do easy things.
<DB>This is definitely an idea worth pursuing. The real use for a separate
definition of an interaction is our experience of the 14 different ways of
placing an order. All these different variations have different
choreographies, but they do share many of the same interactions. On the
other hand, if this is not an issue then having a simple way to do the same
thing is a good idea even though I never like having two ways of doing the
same thing ...</DB>


Section 5.12 - MessageFamily - I support the concept of a Message Family.
Minimally one could imagine using Message Families to allow a choreography
definition to be used with either WSDL 1.1 or WSDL 1.2 (remember, our
charter mandates support for 1.2 but it doesn't prevent us from supporting
1.1). The fact that Message Families also provide the potential to support
other messaging systems like ebXML is a major bonus.
	But I do think we need to tweak it a bit.
	First, I think it should be possible to directly reference a WSDL 1.2
operation if one wants to in an interaction definition without having to use
the abstraction layer offered by MessageFamily. This is another example of
the 'easy things should be easy' issue discussed in section 5.10. If all someone
wants to do is just hardcode in a WSDL 1.2 operation then they should be
free to do so. We shouldn't force extra layers of abstraction on those who
don't need them.
<DB>Although I understand where you are going, I really don't like this as
it means that there is something in the spec that says 'if you want to do
"WSDL 1.2" this is what you do'. My concern is that this would
then almost certainly force a revision of the WS-Chor spec when WSDL 1.3
comes along, which means there would be a very tight binding between the two,
which is undesirable. So, unless you could find a way around this problem,
I would counsel against this.</DB>
	Second, I think that Message Families should be seen not as a way to refer
to multiple messages that 'could' meet the definition but rather as a true
placeholder in the same sense that a WSDL portType is a place holder. A WSDL
portType doesn't say 'here are the bindings I could be used with', it just
says 'I need to be bound'. I think in the same way an instance of a
choreography is defined by a binding that binds a Message Family to a
particular message. This means that when one creates a Message Family there
is no need to say anything about what messages it 'could' be bound to in the
same way that a WSDL portType doesn't say 'well I could be bound to SOAP'. I
think this actually calls for us to change the name of MessageFamily to
something a bit more singular and generic like MessagePlaceHolder or
MessageType.
<DB>My intent was to make Message Family a place holder in the same way as
Port Type. We even thought of calling it Message Type but rejected it as it
was too confusing. To me, examples of a "Message Type" would be a RosettaNet
Order, or a UBL Order or an OAG Order - i.e. the specific representation of
a message. The definition of a message family, on the other hand, is that it
identifies "a set of messages that serve the same purpose", for example a
RosettaNet Order, UBL Order, etc all serve the same purpose and contain very
similar information with very similar semantics. The main difference is in
their representations. So perhaps something like Message Class or Message
Category might work if we don't like Message Family ... but NOT Message
Type.</DB>

Section 5.13 - Precondition - Providing for preconditions is fundamentally
dangerous and actively harmful to the interests of interoperability. Please
see http://lists.w3.org/Archives/Public/public-ws-chor/2003Jul/0135.html for
my full argument. Therefore I would like to see this feature removed.
<DB>Firstly, although you mentioned it on the archive, and I replied at
http://lists.w3.org/Archives/Public/public-ws-chor/2003Jul/0141.html, the
topic was never fully discussed. The sole reason for having a precondition
is to define the pre-existing state of a choreography which causes the next
step in the choreography to occur. In the spec this is defined as a boolean
combination of choreography states. What the spec does not do is define how
those states are determined since this is dependent on the structure of the
actual messages as well as additional "internal" logic which is not part of
the choreography - which I think is your main concern. This means that
at some point you need to have a mapping of a state to actual conditions, as
in BPEL, which is implementation dependent. A second point is that simply
removing conditions would mean that you remove the ability to show the
dependency of one step in the choreography on another which would make the
spec completely broken unless replaced by something else. </DB>
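
To make "a boolean combination of choreography states" concrete, here is a
small sketch. It is an assumption-laden illustration, not the spec's syntax:
the helper names (state, all_of, any_of, not_) are invented, and of the state
names only OrderReceived appears in this thread; OrderErrorSent is
hypothetical. How a state comes to be "reached" is deliberately left out,
since that mapping is implementation dependent.

```python
from typing import Callable, FrozenSet

# A condition is a predicate over the set of states reached so far.
Condition = Callable[[FrozenSet[str]], bool]


def state(name: str) -> Condition:
    """True when the named choreography state has been reached."""
    return lambda reached: name in reached


def all_of(*conds: Condition) -> Condition:
    """Logical AND of the given conditions."""
    return lambda reached: all(c(reached) for c in conds)


def any_of(*conds: Condition) -> Condition:
    """Logical OR of the given conditions."""
    return lambda reached: any(c(reached) for c in conds)


def not_(cond: Condition) -> Condition:
    """Logical NOT of the given condition."""
    return lambda reached: not cond(reached)


# Hypothetical precondition: the next step may occur once the order has been
# received and no error has been sent for it.
next_step_allowed = all_of(state("OrderReceived"), not_(state("OrderErrorSent")))
```

The precondition refers only to named states, so it expresses the dependency
of one step on another without saying anything about message structure or
internal logic.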

Section 5.14 - Process - I don't think we need processes. A process is a
strictly internal action that does not need to be visible in a choreography.
So Figure 1 could look like:


   Buyer                           Seller

[Order Sent]--->Send Order--->[Order Received]--->(End Choreography)
                                   |
                                   |
[Order Error <--Send Order Error----
 Received]

	The choreography definition still provides a complete definition of all
shared behavior, thus illustrating why processes are not necessary.
	Another way to think of this is that WS-Choreography should only express
those parts of the choreography which are externally 'knowable'. If you can
change a piece of information in the choreography definition in a way that
an external observer could never discover then that information does not
need to be in the choreography definition. In this case one could change the
processes and 'internal states' within the Seller and so long as the
response to a Send Order is either nothing or a Send Order Error then the
Buyer (an external observer) would be none the wiser. This is why processes
can be safely dropped.
<DB>Yaron, basically I agree and said so at the last F2F which you missed. I
also think that we can do away with process.</DB>

	A nice side benefit of dropping processes is that we can get rid of section
2.2.4 which will prevent us from having to have a really nasty argument
about what exactly a 'domain of control' is.
<DB>I don't quite see the logic in this. The *only* reason you need a
choreography definition is because you have two (or more) independent
processes which need to co-ordinate their activities. In this context each
independent process is in a "domain of control".</DB>

Section 5.16 - Role - This is another 'easy things should be easy' argument.
I can see the potential benefit of having roles defined separately from
states but this requires a whole bunch of linking. If someone just wants to
throw together states with hard coded roles they should be free to do so.
They should then have the option, much like I discuss in my comments about
MessageFamily, to put in a layer of abstraction and allow for late binding
of roles. But they shouldn't be required to late bind in all cases.
<DB>I have a number of comments on this at a number of different levels.
Firstly, there's the principle of should you define things first then use
them (as in Java) which allows better validation, or define things on the
fly (as in Basic). I think I prefer the former.
Second, the reason for including roles explicitly is so that you can extract
from a definition, what each role has to do. This is important as in an
instance, an individual process will only take part in a single role in a
choreography. If roles are "hard coded" - whatever that actually means, then
it might work if there are only two roles.
Also I am not sure that having a simple way when there are only two roles is
actually worth it. I would need to see an example of how you could include
both ways.
Finally, when you said "Message Family" did you mean "Interaction Def"?
</DB>

Section 5.18 - StartEndStates - Why are start and end states defined
separately from the states themselves? I would prefer having explicit
'start' and 'end' states (essentially reserved state names) and then use
interactions to hook other states into those explicit 'start' and 'end'
states rather than using the StartEndStates declaration to turn an existing
state into a hybrid that is both an existing state and a start/end state.
BTW, I realize this issue is dangerously close to 'tastes great - less
filling'.
<DB>The reason for defining start/end states separately is because:
1. You can have the same "states" being reused in multiple different
choreographies - look at appendix B in the spec for an example.
2. "Start/end" is actually an attribute of a state rather than a state in
its own right. Specifically it is used to identify whether a state, when
used in a Choreography:
  a) causes a choreography to start
  b) is an "end state" i.e. no more state transitions can occur
  c) is a "conditional end state" i.e. more state transitions could occur
but need not, or
  d) is a state which should be succeeded by some different state (the
default).
3. Depending on the choreography in which a state is being used it might be
either an end state, a conditional end state or even a start state.
4. If you include identification of the start/end states at the time you define
them, then it can prevent those states being used in multiple
choreographies.
5. Bottom line, start/end states are an attribute of the usage of a state in
a choreography and therefore need to be defined in the choreography.
</DB>
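
The five points above can be sketched as a small data model. This is a rough
illustration only: the class and enum names are invented, as are the two
choreographies and the InvoiceReceived state; it is not the spec's
StartEndStates syntax. The state names are plain strings shared between
choreographies, and each choreography records how it uses them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Usage(Enum):
    START = "start"                      # causes the choreography to start
    END = "end"                          # no more state transitions can occur
    CONDITIONAL_END = "conditional end"  # more transitions could occur but need not
    INTERMEDIATE = "intermediate"        # default: must be succeeded by another state


@dataclass
class Choreography:
    name: str
    # Start/end usage is recorded per choreography, not on the state itself.
    usage: Dict[str, Usage] = field(default_factory=dict)

    def usage_of(self, state: str) -> Usage:
        """Return how this choreography uses the state (intermediate by default)."""
        return self.usage.get(state, Usage.INTERMEDIATE)


# The same state, OrderReceived, is an end state in one hypothetical
# choreography and an ordinary intermediate state in another.
simple_order = Choreography(
    "SimpleOrder",
    {"OrderSent": Usage.START, "OrderReceived": Usage.END},
)
order_with_invoice = Choreography(
    "OrderWithInvoice",
    {"OrderSent": Usage.START, "InvoiceReceived": Usage.END},
)
```

Because the usage lives in the choreography, defining OrderReceived once does
not prevent it from being reused with a different start/end role elsewhere.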
Received on Friday, 12 September 2003 15:41:02 UTC