- From: Burdett, David <david.burdett@commerceone.com>
- Date: Fri, 12 Sep 2003 12:43:47 -0700
- To: "'ygoland@bea.com'" <ygoland@bea.com>, "WS Chor Public (E-mail)" <public-ws-chor@w3.org>
Yaron

Firstly, thanks for the kind words ... ;)

Secondly, detailed comments in line ...........

David

PS Apologies for the length of this reply ;)

-----Original Message-----
From: Yaron Goland [mailto:ygoland@bea.com]
Sent: Wednesday, September 10, 2003 9:34 PM
To: WS Chor Public (E-mail)
Subject: BurdettML

Executive Summary - I think the group owes a debt of gratitude to David Burdett for his excellent spec. I think his spec should become the core of what will eventually be the WS-Chor spec. Most of my comments below have to do with removing features. I see that as a positive thing. Pruning is a healthy activity for a spec. It's when you find yourself having to re-engineer that you need to worry. Thankfully, that isn't the case here.

I do think we need to add a few things to David's spec, including but not limited to:
* Make signals visible in the choreography,
* Figure out how to do runtime versus static role binding,
* Adding a scope/function call functionality (maybe ExtendsChoreography/DependsOnChoreography achieves this goal, I'm not sure) and
* Adding a true composition/segmentation mechanism (although I think some of the hooks we need for this are already in the spec).

But mostly, I think features need to be removed. Such as:
* Remove all the choreography progress checking features mentioned in section 2.2.5,
* Collapse name and URN attributes into a single attribute,
* Get rid of ConditionalEnd, Precondition & Process,
* Make InteractionDef, MessageFamily and Role much easier to use in the 'easy' case and
* Simplify start and end states

Long Winded Version:

Section 2.2.3 - Interactions, Reliable Messaging and Signals - My understanding of the group consensus is that we do need to surface signal information in a choreography. For example, a Buyer, upon receiving a message processing complete signal might then send a message to an 'invoice tracking' service telling it about the order and asking that the order be tracked.
The challenge before the group is to figure out how we can surface signal based information in a manner that does not require us to adopt any particular signaling mechanism. This is especially important given that there is no standard Web Service signaling mechanism nor does one appear to be on the horizon. So users of WS-Chor are most likely going to have to contend with non-standard solutions for the foreseeable future.

<DB>I agree. The way I would be inclined to tackle this is to concentrate on the useful "semantics", i.e. "what" signals are required rather than "how" those signals are communicated. Secondly I would concentrate on the signals that are likely to be "useful", i.e. they are likely to affect the flow of the choreography. As a starter I think that the following are useful:
- Message received - it's got there, but has not been checked
- Message valid - it's also been checked and found OK so it *will* be processed
- Message invalid - it's also been checked but was found invalid and therefore won't be processed
- Message processing started - i.e. it was valid and processing of the message has started, but not yet finished; this implies that state changes at the destination have occurred that may not be completely reversible
- Message processing completed OK - i.e. the message was processed successfully
- Message processing failed - i.e. the message was not processed successfully.

I am *not* suggesting for one minute that all choreographies would use any or all of these signals, however I do believe that they are likely to be common to many. Once you've identified these signals, then you can simply ;) extend the end states for the "from" and "to" in any interaction to include these signals. For example you could have ...
<InteractionDef name="SendOrder" fromRole="Buyer" toRole="Seller" messageFamily="Order">
  <InteractionEndStates fromState="OrderSent, MsgReceived, MsgProcessStarted" toState="OrderReceived, MsgReceived, MsgProcessStarted"/>
</InteractionDef>

You could then write a "binding", when the choreography was implemented, that mapped these states to the signals that communicate them. Note that a) you don't have to include all the possible intermediate states, and b) the choreography designer could create additional states if they are relevant to the particular choreography, for example (in caps) ...

<InteractionEndStates fromState="OrderSent, MsgReceived, MsgProcessStarted, BUYERCREDITBLOWN" toState="OrderReceived, MsgReceived, MsgProcessStarted, BUYERCREDITBLOWN"/>
</DB>

Section 2.2.5 - Checking Choreography Progress - Per our last phone conference I do not believe that we need to define a standard way to specify what choreography/state a message belongs to at this time. I think this feature fails the 'the spec is done when there is nothing left to cut' test. After all, one can clearly implement a choreography without this feature and even better yet it is easy to add this feature later in a manner that is backwards compatible with existing WS-Chor systems.

Although we didn't discuss this issue on the last phone call I would like to signal a formal objection to introducing a mechanism for one role to query another role's choreography state for the same reasons that I object to putting choreography information into messages. The existence proof for the ability to survive without this feature is self-evident and, again, we can always add it later.

<DB>I don't completely agree. Firstly I think you have to accept that the participants in a choreography will very often want to check that the choreography is being followed correctly. If these checks are not made and an incorrect message is received, the result will be unpredictable.
What is more, this lack of predictability will, for many implementers, be unacceptable.

Secondly, if you accept that checking that the choreography is being followed will often be required, then we should at least consider how it could work. I see two main options:
a) the application or web service that is handling the message does the checks, or
b) there is some middleware - in front of the web service - that does the checks.

The application/web service doing the checks is OK, but it is complicated by the fact that the *same* web service could take part in many different choreographies and it will need to know, each time it is called, which one it is taking part in.

This brings me to the third part of the problem, which is how does an application/web service *know* which choreography it is taking part in. This time I think there are three main options:
a) Force each web service to take part in *only* one choreography by specifying a separate URL for each choreography it can take part in, for example http://www.bigco.com/salesorderws/choreography1
b) Hold the information as, for example, a SOAP header - my original suggestion, or
c) Use data in the message content/payload to determine the choreography being followed.

I don't like option 'a' as it is transport dependent. I also don't like option 'c' as it could often require the extension of an existing business document definition, for example <Order> ... <Chor>Choreography1</Chor> ... </Order> which is artificial. This leaves option 'b' as the one I would prefer.

If you do follow option 'b' then it also makes it much easier for middleware - in front of the web service - to check that the choreography is being followed as it does not need to know anything about the structure of the message. It also makes it easier to include existing, non-choreography-aware, web services into a choreography as they would not need to be altered in any way when checking that a choreography is being followed correctly.
</DB>

Section 3 - Scopes/Functions - At some point we will end up having to create the equivalent of a scope/function call. This will need to be able to define things like local roles.

<DB>Totally agree. This is also tied into the need to be able to compose one "new" choreography out of two or more existing ones.</DB>

Section 5 - Organization - While organizing the element definitions alphabetically makes it easier to find things in the document it makes it much harder to understand how everything works together. It would perhaps be easier on the reader if the elements were organized with a focus on pedagogy.

<DB>I don't have strong views on this and would accept the team consensus.</DB>

Section 5.2.1 & 5.2.2 - Name and URN - I'm vaguely worried about the idea that a choreography has two names. The 'name' attribute that pretty much can only be used inside of a single choreography definition file and a 'URN' that can be used universally. Having two ways of doing the same thing in a standard is dangerous, it leads to mistakes. Can we just reduce this to only have the URN? The time honored hack around having to type in the whole URN is to define the URN name using XML namespace notation, e.g. X:foo. In that case so long as the default namespace is X then you can just say 'foo' within the choreography definition and in more general circumstances give the full name.

<DB>Sounds OK, although I think we should include some wording around this in the spec. I am also not sure how this would work when you want to compose choreographies.</DB>

Section 5.3 - ConditionalEnd - It seems to me that explicitly identifying ConditionalEnd states isn't necessary. This is something that can always be determined by examining the choreography definition.

<DB>I'm not sure about this.
If this is "always" true, then I agree that we should simplify the spec as then you would only need to identify the "start state".</DB>

Section 5.4 - DependsOnChoreography - I agree with the intent of the 'DependsOnChoreography' feature but I think we will need to accomplish it in another way. For example, does 'following' another choreography mean that reaching any end state, even one reached as a consequence of a fault, constitutes 'successful completion'? I suspect this feature needs to be subsumed into a more generic segmentation/composition feature.

<DB>+1</DB>

Section 5.5.2 - Description@Ref - In general I counsel against the use of attributes (http://lists.w3.org/Archives/Public/w3c-dist-auth/1998JulSep/0084.html) and this is a good example of why. One can quite possibly want to put multiple Refs on a description. The first reason is that there may be multiple documents one wants to point to. The second reason is that there may be multiple alternate mirrors for the document one wants to provide. Trying to express all of this in an attribute is a nightmare. Why not just switch to using elements which can handle both scenarios easily?

<DB>Case proved ;)</DB>

Section 5.8 - Import - One of the things that is really annoying about WSDL is that it only allows a WSDL definition to contain definitions for a single namespace. This is also a really annoying problem with XML schema. There is no inherent reason to have this limitation. So I would suggest that the import element should not specify a namespace for the items to be imported. Instead the imported items themselves should specify whatever namespace(s) they are using (which the suggestion for sections 5.2.1/5.2.2 would provide for). What I would like (and what I suspect was actually intended) is the ability to provide a unique identifier (in the form of a URL) of the document to be imported along with the location hint(s).
<DB>Again +1</DB>

Section 5.10 - InteractionDef - I'm a big believer in the idea that easy things should be easy and hard things should be possible. WSDL fails this test because even easy things are hard. Part of its complexity is derived from its insistence on forcing every aspect of a port definition (operation, portType, Service, Port, etc.) to have its own name even when a definition is only used in a single context and so could be defined implicitly by containment rather than by an explicit naming. The result is that people trying to use WSDL end up coming up with tons of names that make the file really hard to read. I fear that InteractionDef is leading us down a similar road.

I agree that having re-usable interaction definitions is a good thing. One can easily imagine extremely complex interaction definitions with lots of configuration information that one would rather re-use than have to constantly re-define. So I like the feature. But I think we need to make easy things easy by providing a way to define an interaction 'inline' so that it doesn't need its own name and is referenced 'by value' (by virtue of its location) rather than 'by reference' as the spec would currently require.

I realize the road I'm walking down will lead to some non-trivial changes in the syntax (but not semantics) of the specification but I think it's worth going down that road in order to make it easy to do easy things.

<DB>This is definitely an idea worth pursuing. The real motivation for a separate definition of an interaction comes from our experience of the 14 different ways of placing an order. All these different variations have different choreographies, but they do share many of the same interactions. On the other hand, if this is not an issue then having a simple way to do the same thing is a good idea even though I never like having two ways of doing the same thing ...</DB>

Section 5.12 - MessageFamily - I support the concept of a Message Family.
Minimally one could imagine using Message Families to allow a choreography definition to be used with either WSDL 1.1 or WSDL 1.2 (remember, our charter mandates support for 1.2 but it doesn't prevent us from supporting 1.1). The fact that Message Families also provide the potential to support other messaging systems like ebXML is a major bonus. But I do think we need to tweak it a bit.

First, I think it should be possible to directly reference a WSDL 1.2 operation in an interaction definition, if one wants to, without having to use the abstraction layer offered by MessageFamily. This is another example of the 'easy things should be easy' issue discussed in section 5.10. If all someone wants to do is just hardcode in a WSDL 1.2 operation then they should be free to do so. We shouldn't force extra layers of abstraction on those who don't need them.

<DB>Although I understand where you are going, I really don't like this as it means that there is something in the spec that says, if you want to do "WSDL 1.2", this is what you do. My concern is that this would then almost certainly force a revision of the WS-Chor spec when WSDL 1.3 comes along. That means there would be a very tight binding between the two, which is undesirable. So, unless you could find a way around this problem, I would counsel against this.</DB>

Second, I think that Message Families should be seen not as a way to refer to multiple messages that 'could' meet the definition but rather as a true placeholder in the same sense that a WSDL portType is a place holder. A WSDL portType doesn't say 'here are the bindings I could be used with', it just says 'I need to be bound'. I think in the same way an instance of a choreography is defined by a binding that binds a Message Family to a particular message.
This means that when one creates a Message Family there is no need to say anything about what messages it 'could' be bound to in the same way that a WSDL portType doesn't say 'well I could be bound to SOAP'. I think this actually calls for us to change the name of MessageFamily to something a bit more singular and generic like MessagePlaceHolder or MessageType.

<DB>My intent was to make Message Family a place holder in the same way as Port Type. We even thought of calling it Message Type but rejected it as it was too confusing. To me, examples of a "Message Type" would be a RosettaNet Order, or a UBL Order or an OAG Order - i.e. the specific representation of a message. The definition of a message family, on the other hand, is that it identifies "a set of messages that serve the same purpose", for example a RosettaNet Order, UBL Order, etc. all serve the same purpose and contain very similar information with very similar semantics. The main difference is in their representations. So perhaps something like Message Class or Message Category might work if we don't like Message Family ... but NOT Message Type.</DB>

Section 5.13 - Precondition - Providing for preconditions is fundamentally dangerous and actively harmful to the interests of interoperability. Please see http://lists.w3.org/Archives/Public/public-ws-chor/2003Jul/0135.html for my full argument. Therefore I would like to see this feature removed.

<DB>Firstly, although you mentioned it on the archive, and I replied at http://lists.w3.org/Archives/Public/public-ws-chor/2003Jul/0141.html, the topic was never fully discussed. The sole reason for having a precondition is to define the pre-existing state of a choreography which causes the next step in the choreography to occur. In the spec this is defined as a boolean combination of choreography states.
What the spec does not do is define how those states are determined, since this is dependent on the structure of the actual messages as well as additional "internal" logic which is not part of the choreography - which I think is your main concern. This means that at some point you need to have a mapping of a state to actual conditions, as in BPEL, which is implementation dependent.

A second point is that simply removing conditions would mean that you remove the ability to show the dependency of one step in the choreography on another, which would make the spec completely broken unless replaced by something else.
</DB>

Section 5.14 - Process - I don't think we need processes. A process is a strictly internal action that does not need to be visible in a choreography. So Figure 1 could look like:

    Buyer                         Seller

    [Order Sent]--->Send Order--->[Order Received]--->(End Choreography)
                                        |
                                        |
    [Order Error <--Send Order Error----
     Received]

The choreography definition still provides a complete definition of all shared behavior, thus illustrating why processes are not necessary. Another way to think of this is that WS-Choreography should only express those parts of the choreography which are externally 'knowable'. If you can change a piece of information in the choreography definition in a way that an external observer could never discover then that information does not need to be in the choreography definition. In this case one could change the processes and 'internal states' within the Seller and so long as the response to a Send Order is either nothing or a Send Order Error then the Buyer (an external observer) would be none the wiser. This is why processes can be safely dropped.

<DB>Yaron, basically I agree and said so at the last F2F which you missed.
I also think that we can do away with process.</DB>

A nice side benefit of dropping processes is that we can get rid of section 2.2.4 which will prevent us from having to have a really nasty argument about what exactly a 'domain of control' is.

<DB>I don't quite see the logic in this. The *only* reason you need a choreography definition is because you have two (or more) independent processes which need to co-ordinate their activities. In this context each independent process is in a "domain of control".</DB>

Section 5.16 - Role - This is another 'easy things should be easy' argument. I can see the potential benefit of having roles defined separately from states but this requires a whole bunch of linking. If someone just wants to throw together states with hard coded roles they should be free to do so. They should then have the option, much like I discuss in my comments about MessageFamily, to put in a layer of abstraction and allow for late binding of roles. But they shouldn't be required to late bind in all cases.

<DB>I have a number of comments on this at a number of different levels.

Firstly, there's the principle of whether you should define things first and then use them (as in Java), which allows better validation, or define things on the fly (as in Basic). I think I prefer the former.

Second, the reason for including roles explicitly is so that you can extract from a definition what each role has to do. This is important because, in an instance, an individual process will only take part in a single role in a choreography. If roles are "hard coded" - whatever that actually means - then it might work if there are only two roles. Also I am not sure that having a simple way when there are only two roles is actually worth it. I would need to see an example of how you could include both ways.

Finally, when you said "Message Family" did you mean "Interaction Def"?
</DB>

Section 5.18 - StartEndStates - Why are start and end states defined separately from the states themselves?
I would prefer having explicit 'start' and 'end' states (essentially reserved state names) and then use interactions to hook other states into those explicit 'start' and 'end' states rather than using the StartEndStates declaration to turn an existing state into a hybrid that is both an existing state and a start/end state. BTW, I realize this issue is dangerously close to 'tastes great - less filling'.

<DB>The reason for defining start/end states separately is because:
1. You can have the same "states" being reused in multiple different choreographies - look at appendix B in the spec for an example.
2. "Start/end" is actually an attribute of a state rather than a state in its own right. Specifically it is used to identify whether a state, when used in a Choreography:
   a) causes a choreography to start
   b) is an "end state" i.e. no more state transitions can occur
   c) is a "conditional end state" i.e. more state transitions could occur but need not, or
   d) is a state which should be succeeded by some different state (the default).
3. Depending on the choreography in which a state is being used it might be either an end state, a conditional end state or even a start state.
4. If you identify the start/end states at the time you define the states, then it can prevent those states being used in multiple choreographies.
5. Bottom line, start/end is an attribute of the usage of a state in a choreography and therefore needs to be defined in the choreography.
</DB>
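
A minimal sketch of point 5 above, written in Python rather than in the spec's XML, may help make the distinction concrete. States are plain shared names, and each choreography records how it uses them. The `Usage` and `Choreography` types and the two choreography names are hypothetical and appear nowhere in the spec; the state names only echo the examples earlier in this thread.

```python
from dataclasses import dataclass, field
from enum import Enum


class Usage(Enum):
    """How a state is used inside one particular choreography (hypothetical)."""
    START = "start"
    END = "end"
    CONDITIONAL_END = "conditional end"
    INTERMEDIATE = "intermediate"  # the default: another state must follow


@dataclass
class Choreography:
    """Hypothetical container: usage is stored here, not on the state itself."""
    name: str
    usage: dict[str, Usage] = field(default_factory=dict)

    def usage_of(self, state: str) -> Usage:
        # States with no declared usage fall back to the default.
        return self.usage.get(state, Usage.INTERMEDIATE)


# Shared state names, defined once and reused by both choreographies.
ORDER_SENT = "OrderSent"
ORDER_RECEIVED = "OrderReceived"
ORDER_ERROR_RECEIVED = "OrderErrorReceived"

simple_order = Choreography(
    "SimpleOrder",
    {ORDER_SENT: Usage.START, ORDER_RECEIVED: Usage.END},
)
order_with_error = Choreography(
    "OrderWithError",
    {
        ORDER_SENT: Usage.START,
        ORDER_RECEIVED: Usage.CONDITIONAL_END,
        ORDER_ERROR_RECEIVED: Usage.END,
    },
)
```

Because usage lives on the choreography, `OrderReceived` can be a plain end state in one definition and a conditional end state in another without the state being redefined.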
Received on Friday, 12 September 2003 15:41:02 UTC