[OWL-S]: simple process coordination

OK, let's press forward on this issue of representing process 
coordination.  Let's concretize the discussion using a very simple 
interaction (part of Monika's scenario; see "Model of Concurrency in 
DAML-S"):

Suppose we want to specify 2 composite processes, A and B.  Let's say 
they start at about the same time and execute concurrently for a while, 
and then at some point A needs to get some information from B.  I'll 
assume that this is done by having A invoke some atomic subprocess of B 
which I'll call, say, B2.  My goal here is just to get clear about how 
we want to specify this basic interaction between A and B.

Let's also stipulate that B2 has an input called B2-in1, and an output 
called B2-out1, which are the I/O that A cares about.  (But these aren't 
necessarily the only inputs/outputs that B2 has.)

Currently, in DAML-S/OWL-S, we are relying on our dataflow constructs to 
indicate the relationship between (some part of) A and B2.  My immediate 
goal is to get clear about the details of what appears in the spec of 
process A with respect to this relationship.  I can imagine 2 answers: 
in one, B2 is actually mentioned in A; in the other, it isn't.  Let's 
look at each in turn.

Note: for simplicity, let's assume that composite process A will run in 
a single enactment engine, and composite process B will also run in a 
(different) single enactment engine.  Let's also assume that process A 
is described in namespaceA, and process B in namespaceB.

Answer 1
--------
     The spec of process A mentions B2 (let's say, as part of a 
sequence). (I'll show this very informally, and I will not bother to 
show the definitions of the atomic processes.)
     Process A:
     <Sequence>
         <AtomicProcess rdf:about="#A1"/>
         <AtomicProcess rdf:about="namespaceB:B2"/>
         <AtomicProcess rdf:about="#A3"/>
     </Sequence>

Then, we use a dataflow declaration to indicate that some output of A1 
flows into B2-in1, and B2-out1 flows into some input of A3:

     namespaceA:A1-out1 => namespaceB:B2-in1
     namespaceB:B2-out1 => namespaceA:A3-in1

(I'm just giving an indication of the dataflow here, not trying to 
reproduce all the details of our current approach.)

This seems clear enough, I guess.  The idea is that these dataflow 
relationships imply the obvious things about timing: B2 must wait 
until it receives its input (from A1) before it executes, and 
similarly, A3 must wait until B2 has completed.
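
To make the "dataflow implies timing" reading concrete, here is a rough 
sketch (not part of any DAML-S/OWL-S spec) that treats each dataflow 
declaration as a "must finish before" edge and derives an execution order 
from it.  The edge list and the function name implied_order are my own 
illustrative assumptions; graphlib needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Dataflow edges from the example, reduced to (producer process, consumer process).
# Assumption: parameter names (A1-out1, B2-in1, ...) are dropped, since only
# "who feeds whom" matters for timing.
DATAFLOW = [
    ("namespaceA:A1", "namespaceB:B2"),  # A1-out1 => B2-in1
    ("namespaceB:B2", "namespaceA:A3"),  # B2-out1 => A3-in1
]


def implied_order(dataflow):
    """Return one execution order consistent with the dataflow edges."""
    sorter = TopologicalSorter()
    for producer, consumer in dataflow:
        # add(node, *predecessors): the consumer must wait for the producer.
        sorter.add(consumer, producer)
    return list(sorter.static_order())


order = implied_order(DATAFLOW)
print(order)
```

Note that this only recovers ordering; it says nothing about which 
enactment engine actually runs B2.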

I assume that B2 will also appear in the process spec of B, something 
like this:
     Process B:
     <Sequence>
         <AtomicProcess rdf:about="#B1"/>
         <AtomicProcess rdf:about="#B2"/>
         <AtomicProcess rdf:about="#B3"/>
     </Sequence>
(Let's ignore for the moment what's going on with the inputs and outputs 
of B1 and B3.)

So what bothers me about this?  Well, there are several things, but 
here's what bothers me the most:

Complaint 1. It seems ambiguous as to whether B2 is supposed to be 
executed once or twice - even after looking at the dataflow.  What I 
have in mind is that it will only execute once as part of the execution 
of process B.  But couldn't one also interpret this as running process 
B2 *both* as part of process A and as part of process B?
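
A toy illustration of the two readings, under my own simplifying 
assumptions: each spec is flattened to a list of qualified names, and 
"ownership" is guessed from the namespace prefix.  Neither function 
reflects actual OWL-S semantics; they just show that both counts are 
defensible given only the two specs.

```python
from collections import Counter

# Answer 1 specs, flattened to sequences of fully qualified atomic process names.
PROCESS_A = ["namespaceA:A1", "namespaceB:B2", "namespaceA:A3"]
PROCESS_B = ["namespaceB:B1", "namespaceB:B2", "namespaceB:B3"]


def runs_if_each_mention_executes(*specs):
    """Reading 1: every mention in every spec is a separate execution."""
    return Counter(step for spec in specs for step in spec)


def runs_if_only_owner_executes(*specs):
    """Reading 2: a step executes only inside the spec that shares its namespace."""
    counts = Counter()
    for spec in specs:
        # Assumption: a spec's own namespace is the namespace of its first step.
        owner = spec[0].split(":")[0]
        counts.update(step for step in spec if step.split(":")[0] == owner)
    return counts


naive = runs_if_each_mention_executes(PROCESS_A, PROCESS_B)
intended = runs_if_only_owner_executes(PROCESS_A, PROCESS_B)
print(naive["namespaceB:B2"], intended["namespaceB:B2"])
```

Under the first reading B2 runs twice; under the second, once.  Nothing 
in the specs themselves picks between the two.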

The (hand-waving) answer I think I've been hearing the most is that the 
grounding makes this sort of thing clear.  OK, I *think* we could get 
that to work out, by grounding B2 to a solicit/response operation in 
process A, and to a request/response operation in process B - but then 
we are really talking about 2 *different* atomic processes, so they'd 
have to be named and declared differently.  From the WSDL 1.1 point of 
view, I suppose, that's fine, since it maps nicely - but from my OWL-S 
point of view, it seems really unfortunate to have to declare a distinct 
atomic process to represent what is easily thought of as an "invocation 
statement" within Process A.  Another issue, about which I'm not clear, 
is: can we still adopt this approach with WSDL 1.2?  And, I suspect 
there are other gaps/problems that none of us has yet grappled with.

Note that even if we ignore the "ambiguity" concern, or find a different 
solution for it, we still have to come to grips with the grounding issues.

Answer 2
--------
     The spec of process A doesn't mention B2:
     Process A:
     <Sequence>
         <AtomicProcess rdf:about="#A1"/>
         <AtomicProcess rdf:about="#A3"/>
     </Sequence>

and, again, we use a dataflow declaration, just as above, to indicate 
that some output of A1 flows into B2-in1, and B2-out1 flows into some 
input of A3:

     namespaceA:A1-out1 => namespaceB:B2-in1
     namespaceB:B2-out1 => namespaceA:A3-in1

In this approach, the process spec of B remains the same as above (in 
Answer 1).

OK, this solves the "ambiguity" concern I mentioned above.  And, 
offhand, I can't see that there are any representational gaps here.  So 
why do I hate it so much???

Complaint 2. It must be because it (the control flow) is so ridiculously 
- and unnecessarily - hard to read and think about!
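
To show what a reader (or a tool) has to do under Answer 2, here is a 
small sketch that reconstructs the "missing" step by joining A's 
sequence against the dataflow declarations.  The helper hidden_steps 
and the flattened data are hypothetical, purely for illustration.

```python
# Answer 2: process A lists only its own steps.
PROCESS_A = ["namespaceA:A1", "namespaceA:A3"]

# Dataflow reduced to (producer process, consumer process) pairs.
DATAFLOW = [
    ("namespaceA:A1", "namespaceB:B2"),  # A1-out1 => B2-in1
    ("namespaceB:B2", "namespaceA:A3"),  # B2-out1 => A3-in1
]


def hidden_steps(sequence, dataflow):
    """Map each adjacent pair in the sequence to unlisted processes the data passes through."""
    listed = set(sequence)
    hidden = {}
    for before, after in zip(sequence, sequence[1:]):
        fed_by_before = {dst for src, dst in dataflow if src == before}
        feeding_after = {src for src, dst in dataflow if dst == after}
        between = sorted((fed_by_before & feeding_after) - listed)
        if between:
            hidden[(before, after)] = between
    return hidden


gaps = hidden_steps(PROCESS_A, DATAFLOW)
print(gaps)
```

The information is all there, but you only find B2 by cross-referencing 
two separate declarations, which is exactly the readability problem.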

Conclusion
----------

These are the kinds of considerations that lead me to want to have 
constructs for "invoke" and (let's call it) "accept".  (In the example, 
process A "invokes B2", and process B "accepts" the invocation, and that 
could easily be made explicit.)  Why shouldn't these distinctions be 
captured in control flow, as well as in dataflow?
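
Just to give a flavor of what I mean, here is a purely hypothetical 
sketch; the class names Perform, Invoke and Accept and the two helper 
functions are invented for illustration and are not a proposal for 
actual OWL-S syntax.  The point is only that, once the distinction is 
explicit in the control flow, "how many times does B2 run?" has a 
mechanical answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perform:
    """Run the named process locally, as an ordinary step."""
    process: str


@dataclass(frozen=True)
class Invoke:
    """Ask another composite process to run the named step; does not run it here."""
    process: str


@dataclass(frozen=True)
class Accept:
    """Run the named step in response to an invocation from elsewhere."""
    process: str


PROCESS_A = [Perform("namespaceA:A1"), Invoke("namespaceB:B2"), Perform("namespaceA:A3")]
PROCESS_B = [Perform("namespaceB:B1"), Accept("namespaceB:B2"), Perform("namespaceB:B3")]


def execution_count(process, *specs):
    """Count executions: Perform and Accept run the step, Invoke only requests it."""
    return sum(
        1
        for spec in specs
        for step in spec
        if step.process == process and not isinstance(step, Invoke)
    )


def unmatched_invocations(caller, callee):
    """List invoked processes that the callee never accepts."""
    accepted = {step.process for step in callee if isinstance(step, Accept)}
    return [
        step.process
        for step in caller
        if isinstance(step, Invoke) and step.process not in accepted
    ]


b2_runs = execution_count("namespaceB:B2", PROCESS_A, PROCESS_B)
dangling = unmatched_invocations(PROCESS_A, PROCESS_B)
print(b2_runs, dangling)
```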

I think I know the answer that may be forthcoming: because we want an 
abstract process specification (for some purposes at least) which 
doesn't actually commit to whether B2 is being invoked, or accepted, or 
just run as a subprogram, or whatever.  So we just mention B2, as 
something that needs to get done at this point in the composite process, 
and the "details" become clear via dataflow and grounding. My rejoinder 
is twofold:

First, someone who advocates this answer needs to show how my complaints 
about either Answer 1 or Answer 2 can be addressed (or argue that 
they aren't important complaints), or put forward some other approach 
that addresses them, and I don't think that's yet been done.

Second, once you've specified your dataflow, you've already committed 
to how B2 is being used (invoked, accepted, run as a subprogram, or 
whatever).  At that point, at least, why not make the control flow 
more comprehensible?

With constructs for "invoke" and "accept", it seems to me, things may 
become much nicer.  But I need to substantiate this claim, and will try 
to do so in another message.

Cheers,
David

Received on Tuesday, 23 September 2003 21:11:43 UTC