Roles (was: Testing the ontology for expressing workflow provenance)

On Fri, Sep 9, 2011 at 11:19, Paolo Missier <Paolo.Missier@ncl.ac.uk> wrote:

> Minor question: what happened to the port names? e.g.  string1, string2,...
> those used to be carried over to the provenance trace as role names. I guess
> they can still be added?

Yes, they need to come over as attributes of the roles.  Roles are not
yet linkable in the ontology, because wasGeneratedBy and used point
directly to the entity instead of going via an intermediate "Generation"
or "Use" - see the "IsUSedBy" discussion at
http://www.w3.org/2011/prov/wiki/PIL_OWL_Ontology#Initial_comments.2Fsuggestions_about_the_ontology


I would assume we could include informational bits from the workflow definition:

<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/out/value>
scufl2:name "value" .

If we want to suggest that Roles have 'names' then perhaps we should
suggest a prov:roleName or simply use dc:title. (Note that in our
workflow definition, a port name is not so much a dc:title as a
variable or parameter name: it can't be localised and must be
unique within that processor's input ports or output ports.) Would a
prov:roleName also have to be unique within a process execution?

That's why I'm tempted to just let Roles stay as resources which
you can subclass and describe however you want.
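
A rough sketch of the two naming options, with prefix declarations left
out as in the other snippets in this mail. Note that prov:roleName is
only my suggestion and is not defined in the ontology, and the role URI
is a made-up example:

```turtle
# Option 1: hypothetical prov:roleName property (not in the ontology)
<http://example.org/role/string1>
    a prov:Role ;
    prov:roleName "string1" .

# Option 2: reuse dc:title for the same purpose
<http://example.org/role/string1>
    a prov:Role ;
    dc:title "string1" .
```

Either way the role stays a resource, so anyone can add further
descriptions to it.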


My gut feeling is to let the 'role' refer directly to the actual port
definition URI - but I'm still not sure if this makes sense. Is the
role a 'general' thing, an instance of a role, or just a plain
string? I.e., would two runs of the same workflow definition have
the same role?




http://www.w3.org/2011/prov/wiki/PIL_OWL_Ontology#Dealing_with_the_issues_of_.22uses.22_relationship
suggests that I should subclass Role and then instantiate it for the
particular role used/generated at a particular time. So that could
serialise as:


 <http://ns.taverna.org.uk/2011/run/2613aab1-dfe9-4a17-a4be-7589f5d388d6/>
    a  prov:ProcessExecution;
         prov:used [
             rdf:type
<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/out/value>
;
             prov:assumedBy
<http://ns.taverna.org.uk/2011/data/2613aab1-dfe9-4a17-a4be-7589f5d388d6/ref/153277f1-5e4f-43fc-968d-ab3a8b038676>
;
             prov:time "2011-09-12T15:02:24Z"^^xsd:dateTime
         ] .

<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/out/value>
a scufl2:OutputProcessorPort, owl:Class ;
  rdfs:subClassOf prov:Role ;
  scufl2:name "value" ;
  dc:description "A user provided description for this port in this workflow"@en .

<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/>
a scufl2:Processor ;
   scufl2:outputProcessorPort
<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/out/value>
.


but the ontology has not yet been updated for this suggestion and only
allows prov:used to point directly to an Entity - and I guess going via
the indirection of a Role should be optional (by making the role an
entity?). It would also be a bit strange to call it a 'Role' when
you just want to say the time something was used or generated, but
don't know in exactly which role. A better name might be something like
"Usage" - but I guess we want this symmetric for both prov:used and
prov:wasGeneratedBy.



I'm uncertain what the implications are, OWL-wise, if I make my port
instance also be an owl:Class (in order to subclass prov:Role) - as
then scufl2:outputProcessorPort is an ObjectProperty pointing to a
class. Perhaps I could do this by indirection through another property
instead of subclassing the actual port:


 <http://ns.taverna.org.uk/2011/run/2613aab1-dfe9-4a17-a4be-7589f5d388d6/>
    a  prov:ProcessExecution;
         prov:used [
             rdf:type scufl2:Role ;
             prov:assumedBy
<http://ns.taverna.org.uk/2011/data/2613aab1-dfe9-4a17-a4be-7589f5d388d6/ref/153277f1-5e4f-43fc-968d-ab3a8b038676>
;
             prov:time "2011-09-12T15:02:24Z"^^xsd:dateTime ;
             scufl2:rolePort
<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/out/value>
         ] .

scufl2:Role a owl:Class ;
  rdfs:subClassOf prov:Role.

scufl2:rolePort a owl:ObjectProperty ;
  rdfs:subPropertyOf scufl2:port .


but then again this boils down to 'what is a role' -
http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html#expression-Role
suggests the role is a plain literal - but that makes it difficult to
do any of the above. For instance, in our workflows users might have
provided descriptions of what a particular output 'means' within
that workflow, which is certainly useful information about a role.
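
To illustrate the difference (a sketch only, reusing the port URI and
properties from the snippets above): if the role is a resource, the
workflow definition can describe it, whereas a plain literal gives
nothing to attach the description to, since an RDF literal cannot be
the subject of a triple.

```turtle
# Role as a resource: the user's description hangs off the port URI
<http://ns.taverna.org.uk/2010/workflow/ea4168eb-67ea-440f-ab73-818da5efc998/processor/String_constant/out/value>
    scufl2:name "value" ;
    dc:description "A user provided description for this port in this workflow"@en .

# Role as the plain literal "value": there is no subject for
# dc:description, so the description above cannot be expressed.
```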


-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Friday, 9 September 2011 11:36:31 UTC