Data Flow from Massimo Paolucci on 2003-10-10 (www-ws@w3.org from October 2003)

From: Massimo Paolucci <paolucci@cs.cmu.edu>
Date: Thu, 09 Oct 2003 20:50:33 -0400
To: www-ws@w3.org
Message-ID: <3F860259.3030208@cs.cmu.edu>
The data-flow mechanism for OWL-S 1.0 is still somewhat in flux.  I will
try to make a proposal here that seems to be a compromise between all
the proposals that have been put forward so far (at least as far as I
can remember).

Drew proposed to define a new type of object that he called "dataLink"
in a first incarnation and "channel" in the more recent surface language
proposal.  Inputs and outputs of processes are represented as
properties to the channel or from the channel.

An alternative proposal, sometimes labelled "Martin&Paolucci", assumes
that there are input and output parameters: each process has an input
relation with its input parameters and an output relation with its
output parameters.  This latter viewpoint has also been adopted by
Martha's proposal for IOPEs.

While the two proposals seem to be quite distant, there is a simple
way to combine them using the owl:sameAs construct.

From the OWL reference document:

     The built-in OWL property owl:sameAs links an individual to an
     individual. Such an owl:sameAs statement indicates that two URI
     references actually refer to the same thing: the individuals have
     the same "identity".

We could then say that there are input and output parameters, and
that the data flow is represented by asserting a sameAs relation
between an output and an input.  Effectively that output and input
become the very same object in the context of the process model, which
looks very similar to Drew's channel proposal.
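
As a rough sketch of what such an assertion could look like, assuming
Output and Input are classes in the default namespace and using two
made-up parameter names (LookupPrice_price and ChargeCard_amount, not
part of any proposal):

<!-- hypothetical parameters; the names are illustrative only -->
<Input rdf:ID="ChargeCard_amount"/>

<Output rdf:ID="LookupPrice_price">
  <!-- the output of one process is the same individual as the input of another -->
  <owl:sameAs rdf:resource="#ChargeCard_amount"/>
</Output>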

A somewhat better way to do the same thing is to define a new property
"dataLink" that is a subproperty of sameAs but is restricted to relate
outputs to inputs.  This property is defined as follows.


<owl:ObjectProperty rdf:ID="dataLink">
  <rdfs:subPropertyOf rdf:resource="&owl;#sameAs"/>
  <rdfs:domain rdf:resource="#Output"/>
  <rdfs:range rdf:resource="#Input"/>
</owl:ObjectProperty>


The advantage is that we introduce some sort of directionality to data
links, and we allow for some control on the validity of process
models: any process model that defines a dataLink departing from an
input would be invalid and we can detect that.
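
A minimal sketch of a well-formed link and a malformed one, assuming
the dataLink property defined above and two hypothetical parameters
(LookupPrice_price and ChargeCard_amount are invented for
illustration):

<!-- well formed: the link departs from an Output and arrives at an Input -->
<Output rdf:about="#LookupPrice_price">
  <dataLink rdf:resource="#ChargeCard_amount"/>
</Output>

<!-- malformed: the link departs from an Input -->
<Input rdf:about="#ChargeCard_amount">
  <dataLink rdf:resource="#LookupPrice_price"/>
</Input>

Strictly, rdfs:domain and rdfs:range drive inference rather than
validation, so a reasoner reports a clash on the second statement only
if Input and Output are declared disjoint; that disjointness is an
assumption of this sketch.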

We could also gain more control on the description of the process
model by adding the following two statements which constrain the
values of any input to be set by an output.  Note that there is no
restriction on outputs, so any output can set many inputs.

<owl:ObjectProperty rdf:ID="dataFrom">
  <owl:inverseOf rdf:resource="#dataLink"/>
</owl:ObjectProperty>

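<owl:Class rdf:about="#Input">
  <!-- every Input takes its value from exactly one Output -->
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#dataFrom"/>
      <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>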

Also, this solution does not make process models more wordy: we just
define the data links, while the inverse relation and the restriction
on Input are taken care of automatically by the processors.
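
To illustrate, here is a sketch of what an author would write and what
a processor would be expected to derive, assuming the dataLink and
dataFrom definitions above.  The three parameter names are
hypothetical, and one output feeds two inputs, which the restriction
allows:

<!-- written by the author of the process model -->
<Output rdf:about="#LookupPrice_price">
  <dataLink rdf:resource="#ChargeCard_amount"/>
  <dataLink rdf:resource="#PrintReceipt_total"/>
</Output>

<!-- derived through owl:inverseOf; never written by hand -->
<Input rdf:about="#ChargeCard_amount">
  <dataFrom rdf:resource="#LookupPrice_price"/>
</Input>
<Input rdf:about="#PrintReceipt_total">
  <dataFrom rdf:resource="#LookupPrice_price"/>
</Input>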

My vote as usual goes for the highest level of control so I would add
to the Process Model the properties dataLink and dataFrom, and the
restriction on Inputs.

--- Massimo
Received on Thursday, 9 October 2003 20:51:18 UTC