- From: Yolanda Gil <gil@isi.edu>
- Date: Thu, 10 Nov 2011 09:32:28 -0800
- To: Provenance Working Group WG <public-prov-wg@w3.org>
- Cc: Paolo Missier <paolo.missier@newcastle.ac.uk>, Paul Groth <p.t.groth@vu.nl>, Luc Moreau <L.Moreau@ecs.soton.ac.uk>, "Reza B'Far (Oracle)" <reza.bfar@oracle.com>, Ryan Golden <ryan.golden@oracle.com>
All,

Prompted by Luc and Paul, a small group of us including Reza and Paolo have been debating the agent-related aspects of the PROV model. We reviewed issues and discussions that have been brought up by the larger group, and we have a proposal to include in the model 5 basic statements about agents, which are summarized at the bottom of this message. Before giving the 5 statements, we explain the rationale.

As a first step, we agree to identify "agent" as a special type of "entity". Modeling agents as entities enables expressing the provenance of agents. Note that many entities will not be agents. A data file or a document that is input to an activity will not be considered an agent in our model. A person's record that is input to an activity will not be considered an agent either if it is just being fed to the activity as data.

Second, we would need to define what an agent is. We propose to define agent as a type of entity that takes an active role in an activity such that it can be assigned some degree of responsibility for the activity taking place. We want to stay away from using in this definition concepts such as enabling, causing, initiating, affecting, etc., because any entity may also enable, cause, initiate, or affect the activities in some way. So the notion of having some degree of responsibility is really what makes an agent. Even software agents can be assigned some responsibility for the effects they have in the world, so for example if one is using Word and one's laptop crashes then one would say that Word was responsible for crashing the laptop. If one invokes a service to buy a book, that service can be considered responsible for drawing funds from one's bank to make the purchase (the company that runs the service and the web site would also be responsible, but the point here is that we assign some measure of responsibility to software as well).
So when someone models software as an agent for an activity in our model they mean the agent has some responsibility for that activity.

Third, many agents can have an association with a given activity. An agent may do the ordering of the activity, another agent may do its design, another agent may push the button to start it, another agent may run it, etc. One can mention as many agents as one wishes in the provenance record if it is important to indicate that they were associated with the activity. Other terms that have been circulated to refer to this include "involved", "used", "participated", and "hadRoleIn".

Fourth, we believe that it would be useful to define some basic categories of agents. Defining some standard classes of agents will improve the use of provenance records by applications. There should be very few of these basic categories to keep the model simple and accessible. Also, a simpler model will be more extensible, as we anticipate that agency may be defined very differently in different domains and applications so the model should be able to accommodate that. We propose to have three classes of agents in the model: "foaf:Person" (sometimes referred to as "human agent" but that may be too AI-ish), "foaf:Organization", and "software agent". These classes should be mutually exclusive, though they do not cover all subclasses of agent. We could also add "foaf:Group".

Fifth, the notion of responsibility needs to be pinned down and this is challenging. It is important to reflect that there is a degree in the responsibility of agents, and that is a major reason for distinguishing among all the agents that have some association with an activity and determining which ones are really the originators of the entity. For example, a programmer and a researcher could both be associated with running a workflow, but it may not matter what programmer clicked the button to start the workflow while it would matter a lot what researcher told the programmer to do so.
Another example: a student publishing a web page describing an academic department could result in both the student and the department being agents associated with the activity, and it may not matter what student published a web page but it matters a lot that the department told the student to put up the web page. So there is some notion of responsibility that needs to be captured. Similarly with intent. These notions are hard to define, but it would be even harder to make them easy for people using our model so they would be comfortable assigning and stating responsibility.

We would like to suggest a much milder version of responsibility. We propose to represent when an agent acted on another agent's behalf. So in the example of someone running a mail program, the program is an agent of that activity and the person is also an agent of the activity, but we would also add that the mail software agent is running on the person's behalf. In the other example, the student acted on behalf of his supervisor, who acted on behalf of the department chair, who acts on behalf of the university, and all those agents are responsible in some way for the activity taking place but we don't say explicitly who bears responsibility and to what degree. We could also say that an agent can act on behalf of several other agents (a group of agents). This would also make it possible to indirectly reflect chains of responsibility. This also indirectly reflects control without requiring that control be explicitly indicated.

In some contexts there will be a need to represent responsibility explicitly, for example to indicate legal responsibility, and that could be added as an extension to this core model. Similarly with control, since in particular contexts there might be a need to define specific aspects of control that various agents exert over a given activity.
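
As an illustration only, the chain of agents acting on one another's behalf described above could be sketched in Python as follows. The mapping name `on_behalf_of`, the function `responsibility_chain`, and the agent identifiers are hypothetical names chosen for this sketch; they are not part of the proposed model, and the sketch deliberately says nothing about the degree of responsibility each agent bears.

```python
# Hypothetical illustration: each agent maps to the agents it acted on behalf of.
# A list is used because an agent may act on behalf of several other agents.
on_behalf_of = {
    "student": ["supervisor"],
    "supervisor": ["department_chair"],
    "department_chair": ["university"],
    "mail_program": ["person"],
}


def responsibility_chain(agent, relation):
    """Return every agent reachable from `agent` by following on-behalf-of links.

    Agents are listed in the order they are first reached (breadth-first).
    """
    seen = []
    queue = list(relation.get(agent, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # guard against cycles and duplicate paths
        seen.append(current)
        queue.extend(relation.get(current, []))
    return seen


print(responsibility_chain("student", on_behalf_of))
# prints ['supervisor', 'department_chair', 'university']
```

Only the direct on-behalf-of links are recorded; the longer chain is derived by following them, which is one way to reflect chains of responsibility indirectly.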
Finally, we want to provide temporal ordering among activities, and we can define that by expressing that agents start and end activities. We could use wasStartedBy to represent that an activity was started by an entity, and wasEndedBy to represent that an activity was ended by an entity. For wasStartedBy, the entity must exist before the start of the activity. For wasEndedBy, the entity must exist before the end of the activity. It would then be possible to assert either start/end for an activity, or state relative temporal orderings between activities.

So, to summarize, we propose the following statements for a minimal model:

1) An "agent" is a type of entity that takes an active role in an activity such that it can be assigned some degree of responsibility for the activity taking place.
2) Many agents can be "associatedWith" a given activity.
3) Subclasses of agent are "foaf:Person", "foaf:Organization", and "software agent".
4) Agents can run activities on behalf of other agents, indicated by "runOnBehalfOf".
5) Agents can be responsible for starting and ending activities, indicated as "wasStartedBy" and "wasEndedBy".

We look forward to everyone's comments on this!

Yolanda

Yolanda Gil
Director of Knowledge Technologies, USC/ISI
Associate Director for Research, Intelligent Systems Division, USC/ISI
Research Professor of Computer Science
Information Sciences Institute
University of Southern California
4676 Admiralty Way, Suite 1001
Marina del Rey, CA 90292 (USA)
Phone: +1-310-448-8794
Fax: +1-310-822-0751
http://www.isi.edu/~gil
Received on Thursday, 10 November 2011 17:34:01 UTC