A proposal for modeling agents from Yolanda Gil on 2011-11-10 (public-prov-wg@w3.org from November 2011)

From: Yolanda Gil <gil@isi.edu>
Date: Thu, 10 Nov 2011 09:32:28 -0800
To: Provenance Working Group WG <public-prov-wg@w3.org>
Cc: Paolo Missier <paolo.missier@newcastle.ac.uk>, Paul Groth <p.t.groth@vu.nl>, Luc Moreau <L.Moreau@ecs.soton.ac.uk>, "Reza B'Far (Oracle)" <reza.bfar@oracle.com>, Ryan Golden <ryan.golden@oracle.com>
Message-Id: <CDFD3D2D-6354-4618-BB05-B541B84DC5EB@ISI.EDU>
All,

Prompted by Luc and Paul, a small group of us including Reza and Paolo  
have been debating the agent-related aspects of the PROV model.  We  
reviewed issues and discussions that have been brought up by the  
larger group, and we have a proposal to include in the model 5 basic  
statements about agents, which are summarized at the bottom of this  
message.  Before giving the 5 statements, we explain the rationale.

As a first step, we agree to identify "agent" as a special type of  
"entity".  Modeling agents as entities enables expressing the  
provenance of agents.  Note that many entities will not be agents.  A  
data file or a document that is input to an activity will not be  
considered an agent in our model.  A person's record that is input to an  
activity will not be considered an agent either if it is just being  
fed to the activity as data.

Second, we would need to define what an agent is.  We propose to  
define agent as a type of entity that takes an active role in an  
activity such that it can be assigned some degree of responsibility  
for the activity taking place.  We want to avoid using concepts such  
as enabling, causing, initiating, or affecting in this definition,  
because any entity can also enable, cause, initiate, or affect an  
activity in some way.  So the notion of having some  
degree of responsibility is really what makes an agent.  Even software  
agents can be assigned some responsibility for the effects they have  
in the world.  For example, if one is using Word and one's laptop  
crashes, one would say that Word was responsible for crashing the  
laptop.  If one invokes a service to buy a book, that service can be  
considered responsible for drawing funds from one's bank to make the  
purchase (the company that runs the service and the web site would  
also be responsible, but the point here is that we assign some measure  
of responsibility to software as well).  So when someone models  
software as an agent for an activity in our model, they mean that the  
agent has some responsibility for that activity.

Third, many agents can have an association with a given activity.  An  
agent may do the ordering of the activity, another agent may do its  
design, another agent may push the button to start it, another agent  
may run it, etc.  One can mention as many agents as one wishes in the  
provenance record, if it is important to indicate that they were  
associated with the activity.  Other terms that have been circulated  
to refer to this include "involved", "used", "participated", and  
"hadRoleIn".
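
As a rough illustration only (the proposal does not prescribe any data
structure), the association of many agents with one activity could be
recorded as a list of (agent, role) pairs per activity.  The
identifiers, role labels, and helper names below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical store: activity id -> list of (agent id, role) pairs.
associations = defaultdict(list)


def associate(activity, agent, role):
    """Record that an agent was associated with an activity in some role."""
    associations[activity].append((agent, role))


def agents_of(activity):
    """Return the agents associated with an activity, in insertion order."""
    return [agent for agent, _role in associations.get(activity, [])]


# Several agents can be associated with the same activity.
associate("ex:experiment-1", "ex:alice", "ordered")
associate("ex:experiment-1", "ex:bob", "designed")
associate("ex:experiment-1", "ex:carol", "pushed-start-button")
associate("ex:experiment-1", "ex:runner-service", "ran")

print(agents_of("ex:experiment-1"))
```

Keeping the role as free text is a deliberate simplification here; a
real vocabulary would likely constrain it.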

Fourth, we believe that it would be useful to define some basic  
categories of agents.  Defining some standard classes of agents will  
improve the use of provenance records by applications.  There should  
be very few of these basic categories to keep the model simple and  
accessible.  Also, a simpler model will be more extensible, as we  
anticipate that agency may be defined very differently in different  
domains and applications so the model should be able to accommodate  
that.  We propose to have three classes of agents in the model:  
"foaf:Person" (sometimes referred to as "human agent" but that may be  
too AI-ish), "foaf:Organization", and "software agent".  These classes  
should be mutually exclusive, though they do not cover all subclasses  
of agent.  We could also add "foaf:Group".
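
One possible way to picture this hierarchy, purely as a sketch, is a
small set of Python classes in which an agent is a kind of entity and
the three basic categories are subclasses of agent.  The class and
function names are assumptions made for illustration and are not part
of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Anything whose provenance can be described."""
    identifier: str


@dataclass(frozen=True)
class Agent(Entity):
    """An entity that can bear some responsibility for an activity."""


@dataclass(frozen=True)
class Person(Agent):
    """Illustrative stand-in for the person category."""


@dataclass(frozen=True)
class Organization(Agent):
    """Illustrative stand-in for the organization category."""


@dataclass(frozen=True)
class SoftwareAgent(Agent):
    """Illustrative stand-in for the software agent category."""


BASIC_CATEGORIES = (Person, Organization, SoftwareAgent)


def basic_category(agent):
    """Return the single basic category of an agent, or None if it has none.

    The categories are meant to be mutually exclusive but not exhaustive,
    so zero matches is allowed and more than one match is an error.
    """
    matches = [c for c in BASIC_CATEGORIES if isinstance(agent, c)]
    if len(matches) > 1:
        raise ValueError(f"{agent.identifier} is in more than one basic category")
    return matches[0] if matches else None


word = SoftwareAgent("ex:word")
data_file = Entity("ex:input-file")      # an entity that is not an agent
other = Agent("ex:uncategorized-agent")  # an agent outside the three classes
```

Here `word` is both an agent and an entity, `data_file` is an entity
only, and `other` shows that the three classes need not cover every
agent.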

Fifth, the notion of responsibility needs to be pinned down and this  
is challenging.  It is important to reflect that there are degrees of  
responsibility among agents, and that is a major reason for  
distinguishing among all the agents that have some association with an  
activity and determining which ones are really the originators of the  
entity.  For example, a programmer and a researcher could both be  
associated with running a workflow, but it may not matter which  
programmer clicked the button to start the workflow, while it would  
matter a lot which researcher told the programmer to do so.  Another  
example: a student publishing a web page describing an academic  
department could result in both the student and the department being  
agents associated with the activity, and it may not matter which  
student published the web page, but it matters a lot that the department  
told the student to put up the web page.  So there is some notion of  
responsibility that needs to be captured.  The same holds for intent.  
These notions are hard to define, and it would be even harder to make  
them easy enough to use that people applying our model would be  
comfortable assigning and stating responsibility.  We would like to suggest a much  
milder version of responsibility.  We propose to represent when an  
agent acted on another agent's behalf.  So in the example of someone  
running a mail program, the program is an agent of that activity and  
the person is also an agent of the activity, but we would also add  
that the mail software agent is running on the person's behalf.  In  
the other example, the student acted on behalf of his supervisor, who  
acted on behalf of the department chair, who acts on behalf of the  
university, and all those agents are responsible in some way for the  
activity taking place, but we don't say explicitly who bears  
responsibility and to what degree.  We could also say that an agent  
can act on behalf of several other agents (a group of agents).  This  
would also make it possible to indirectly reflect chains of  
responsibility.  It also indirectly reflects control without  
requiring that control be explicitly indicated.  In some contexts  
there will be a need to represent responsibility explicitly, for  
example to indicate legal responsibility, and that could be added as  
an extension to this core model.  Similarly with control, since in  
particular contexts there might be a need to define specific aspects  
of control that various agents exert over a given activity.
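
To make the idea of a chain concrete, here is a minimal sketch that
follows "on behalf of" links from one agent to all the agents it
directly or indirectly acted for.  The mapping, the identifiers, and
the function name are hypothetical; the proposal only says that such
chains can be reflected indirectly, not how they would be computed.

```python
from collections import deque

# Hypothetical on-behalf-of links: agent -> agents it acted on behalf of.
# An agent may act on behalf of several other agents.
on_behalf_of = {
    "ex:student": ["ex:supervisor"],
    "ex:supervisor": ["ex:department-chair"],
    "ex:department-chair": ["ex:university"],
    "ex:mail-program": ["ex:person"],
}


def responsibility_chain(agent, links):
    """Return every agent reachable from `agent` via on-behalf-of links.

    Breadth-first traversal; already visited agents are skipped so that
    a cycle in the links cannot cause an endless loop.
    """
    chain = []
    queue = deque(links.get(agent, []))
    while queue:
        current = queue.popleft()
        if current == agent or current in chain:
            continue
        chain.append(current)
        queue.extend(links.get(current, []))
    return chain


print(responsibility_chain("ex:student", on_behalf_of))
```

Note that the result lists who an agent acted for without assigning
any degree of responsibility, which matches the milder notion
described above.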

Finally, we want to provide temporal ordering among activities, and we  
can define that by expressing that agents start and end activities.   
We could use wasStartedBy to represent that an activity was started by  
an entity, and wasEndedBy to represent that an activity was ended by  
an entity.  For wasStartedBy, the entity must exist before the start of  
the activity.  For wasEndedBy, the entity must exist before the end of  
the activity.  It would then be possible to assert either start/end for  
an activity, or state relative temporal orderings between activities.
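
The two existence constraints can be sketched as simple checks on
timestamps.  This assumes, for illustration only, that each activity
has known start and end times and that we know from when an entity
exists; it also reads "before" as strictly earlier, which the proposal
does not specify.  All names below are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Activity:
    identifier: str
    started_at: datetime
    ended_at: datetime


def may_have_started(entity_exists_from, activity):
    """wasStartedBy constraint: the entity exists before the activity starts."""
    return entity_exists_from < activity.started_at


def may_have_ended(entity_exists_from, activity):
    """wasEndedBy constraint: the entity exists before the activity ends."""
    return entity_exists_from < activity.ended_at


run = Activity(
    "ex:workflow-run",
    started_at=datetime(2011, 11, 10, 9, 0),
    ended_at=datetime(2011, 11, 10, 10, 0),
)
researcher_exists_from = datetime(2011, 11, 1)
stop_signal_exists_from = datetime(2011, 11, 10, 9, 30)

print(may_have_started(researcher_exists_from, run))
print(may_have_started(stop_signal_exists_from, run))
print(may_have_ended(stop_signal_exists_from, run))
```

In this example the stop signal comes into existence while the run is
under way, so it could have ended the run but could not have started
it.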

So, to summarize, we propose the following statements for a minimal  
model:

1) An "agent" is a type of entity that takes an active role in an  
activity such that it can be assigned some degree of responsibility  
for the activity taking place.

2) Many agents can be "associatedWith" a given activity.

3) Subclasses of agent are "foaf:Person", "foaf:Organization", and  
"software agent".

4) Agents can run activities on behalf of other agents, indicated by  
"runOnBehalfOf".

5) Agents can be responsible for starting and ending activities,  
indicated as "wasStartedBy" and "wasEndedBy".

We look forward to everyone's comments on this!

Yolanda




Yolanda Gil
Director of Knowledge Technologies, USC/ISI
Associate Director for Research, Intelligent Systems Division, USC/ISI
Research Professor of Computer Science
Information Sciences Institute
University of Southern California
4676 Admiralty Way, Suite 1001
Marina del Rey, CA 90292 (USA)
Phone: +1-310-448-8794
Fax: +1-310-822-0751
http://www.isi.edu/~gil
Received on Thursday, 10 November 2011 17:34:01 UTC