Re: PROV-ISSUE-4: agent subtypes? from Reza B'Far on 2011-07-16 (public-prov-wg@w3.org from July 2011)

From: Reza B'Far <reza.bfar@oracle.com>
Date: Sat, 16 Jul 2011 10:03:17 -0700
To: Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
CC: public-prov-wg@w3.org
Message-ID: <4E21C455.9050407@oracle.com>
Khalid -

I had told folks that I'd drop the thread, but since you're new on the thread, 
here is the concrete use-case:

A file is created and then modified.  The modifications can be made by a bunch 
of scripts, automated programs, etc. such as back-up programs, automated merge 
programs, etc., or they can be made by a human being.  In practical use-cases that 
at least I'm concerned about, the automated mods are treated completely 
differently than those made by a human being.  There are practical concerns 
(such as scalability, usability, etc.) where you want to fundamentally 
differentiate between the types.  Now, let's assume we stay generic.  The issue 
with that is that an import/export becomes lossy.  In other words, the gap 
in the definition of agent is so large that I can't export my data and import it 
back in without losing a bunch of semantics (unless I hack it up with a bunch of 
extensions).  My contention on this thread was that this use-case is fundamental, 
but, as previously stated, other folks seem to disagree.
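
To make the round-trip problem concrete, here is a minimal sketch in Python.  
The names HumanAgent and SystemAgent come from the types suggested earlier in 
this thread; the JSON layout, the "type" field, and the export/import helper 
names are assumptions made up purely for illustration, not anything the WG has 
defined.

```python
import json
from dataclasses import dataclass


@dataclass
class Agent:
    """Generic agent: the only type a generic-only model would define."""
    name: str


@dataclass
class HumanAgent(Agent):
    """Hypothetical subtype: a change made directly by a person."""


@dataclass
class SystemAgent(Agent):
    """Hypothetical subtype: a change made by a script or automated program."""


# Subtypes that a shared model would have to agree on (assumed set).
AGENT_TYPES = {cls.__name__: cls for cls in (Agent, HumanAgent, SystemAgent)}


def export_generic(agent: Agent) -> str:
    # A generic-only format has no slot for the subtype, so it is dropped.
    return json.dumps({"name": agent.name})


def import_generic(payload: str) -> Agent:
    return Agent(**json.loads(payload))


def export_typed(agent: Agent) -> str:
    # A typed format records which agreed-upon subtype the agent has.
    return json.dumps({"type": type(agent).__name__, "name": agent.name})


def import_typed(payload: str) -> Agent:
    record = json.loads(payload)
    return AGENT_TYPES[record["type"]](name=record["name"])


editor = HumanAgent(name="alice")
backup = SystemAgent(name="nightly-backup")

lossy = import_generic(export_generic(editor))    # comes back as a plain Agent
lossless = import_typed(export_typed(backup))     # comes back as a SystemAgent
```

With the generic format, the human/system distinction is gone after one 
export and import, so whoever receives the data cannot treat the two kinds of 
modification differently.  The typed format only works across implementers if 
everyone shares the same set of subtype names, which is the point of asking 
for them in the model.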

Best.

On 7/16/11 4:17 AM, Khalid Belhajjame wrote:
> Hi Reza,
>
> I believe you have reasons why we should specify the sub-types of agents, but 
> I must admit I don't see them yet and would like to understand them.
> In particular, I am struggling to see what agent subtyping will add from a 
> provenance point of view. For example, are there cases in which provenance 
> queries will be treated differently depending on whether the agent is a human 
> or software?
>
> Thanks, khalid
>
>>
>> From an implementation perspective, and regardless of domain, unless what 
>> you're saying is to completely rule out commercial software products (again, 
>> I state, regardless of domain) which require core features such as import and 
>> export without data loss, then you have to provide some strong typing.  And 
>> you can't just create some generic entity and ask people to go implement 
>> their own specific types as you will create (I would claim force) 
>> incompatibilities between different implementers which will make the standard 
>> completely useless.
>>
>> Having said that, I'm not even sure how the domain comes into play here.  
>> Please read what I said previously, I'm going to restate it at the end of my 
>> email by re-pasting since this is something that any product implementer will 
>> feel strongly about.  Please note that the concepts of "Human agent", "System 
>> agent", "Trusted agent", "Untrusted agent", etc. have NOTHING to do with 
>> domain.  I can post a half-a-dozen IEEE papers here about agents.
>>
>> In one sentence summarized - The request to consider here is to either reduce 
>> the scope of agent to something like User-Agent which is used in many other 
>> W3C standards such as HTML or to accommodate stronger types as mentioned here 
>> (and hopefully we have enough domain participants here that we can create a 
>> domain-independent sub-typed system by consensus)
>>
>> Please read re-post below
>> ---------------------------------
>>    1. The distinction between the direct intervention of a human being
>>       affecting the state of data versus an indirect intervention is
>>       absolutely crucial.  Without this, establishing "trust" (I mean
>>       this from a formal perspective - something like PACE[1]) is not
>>       possible.
>>    2. I personally would lean towards one of the following options -
>>           * Strong Typing of the Agent to multiple types and specifying
>>             exactly what we mean by the types.  For example, /Human
>>             Agent, System Agent/, etc.  I've mentioned this in a
>>             previous thread.  Within all practical usages of provenance
>>             that at least I'm concerned with, there are completely
>>             different treatments of a "snapshot" (or whatever you want
>>             to call it) of the state of an entity (which would be
>>             considered something that is included in provenance) based
>>             on whether or not there is direct human intervention (or
>>             alternatively, far more specification and strong typing) of
>>             the changes.  "Agent" is way too generic to be useful
>>             practically.
>>           * Reducing the use-cases of Agent to just User-Agent which is
>>             the approach that is used in some of the other W3C standards
>>             and is woven into the fabric of the web as we know it today.  This
>>             would reduce the scope of what an "Agent" is.  We may
>>             possibly be able to leverage work of UAProf[2] and even if
>>             not, we can learn from UAProf and CC/PP as examples.
>>    3. The key of both (1) and (2) above is that, in order to have a
>>       practical implementation, it is highly desirable to have some very
>>       exact meaning for what "Agent" is, what it does, what the boundary
>>       conditions are, etc.  I also highly encourage that we do NOT
>>       include concepts that start going into RBAC and other security
>>       related standards such as Role.  IMO, we need to reuse concepts
>>       from these standards.
>>
>> I'm relatively new to the group, but have spent a lot of time reading the 
>> archives.  From an implementation perspective, I caution that if things are 
>> too generic and there is not enough specification (typing) and exactness in 
>> order to accommodate a larger tent, there may be long term implementation 
>> hurdles that are presented in terms of practical implementation.  In terms of 
>> a specific example, I think "Agent" above is one.  It's far too generically 
>> defined at this point, IMO.
>>
>> Please see references below.
>>
>> [1] - PACE - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.8965, 
>> http://www.mendeley.com/research/architectural-support-trust-models-decentralized-applications/ 
>>
>> [2] - UAProf - http://en.wikipedia.org/wiki/UAProf
>> [3] - CC/PP - http://www.w3.org/Mobile/CCPP/
>>
>> On 7/14/11 9:27 AM, Satya Sahoo wrote:
>>> Hi,
>>> I agree with Lena here. Subtypes of agents are domain dependent and I don't 
>>> think we should define them in WG provenance model.
>>>
>>> Regarding Reza's point, our current "definition" of agent (by direct 
>>> assertion or involved in process execution) does not (seem to) prevent 
>>> particular domain users/developers from creating more specific sub-types of 
>>> agent (s/w agent, enzymes, anti-bodies, sensors, researcher, legal analyst etc.)
>>>
>>> Thanks.
>>>
>>> Best,
>>> Satya
>>>
>>> On Thu, Jul 14, 2011 at 12:08 PM, Deus, Helena <helena.deus@deri.org> wrote:
>>>
>>>     Hi Luc,
>>>
>>>     I agree, agent subtypes are important. But are they not also domain
>>>     dependent?
>>>
>>>     Yes, in regard to your question – a catalyst, an enzyme are in fact agents.
>>>
>>>
>>>     Best
>>>
>>>     Lena
>>>
>>>     *From:* public-prov-wg-request@w3.org
>>>     [mailto:public-prov-wg-request@w3.org] *On Behalf Of* Luc Moreau
>>>     *Sent:* 14 July 2011 16:21
>>>     *To:* public-prov-wg@w3.org
>>>     *Subject:* PROV-ISSUE-4: agent subtypes?
>>>
>>>     Hi Reza,
>>>
>>>     Yes, it's a good idea to discuss agent subtypes as a separate thread.
>>>
>>>     From my point of view, I want to be sure that we don't disallow some
>>>     kind of agents, simply
>>>     because we had not thought about them.
>>>
>>>     I believe that from a biology/chemistry point of view, a catalyst could
>>>     be seen as an agent.
>>>
>>>     Views on this?
>>>
>>>     Regards,
>>>     Luc
>>>
>>>
>>>     On 07/13/2011 07:41 PM, Reza B'Far wrote:
>>>
>>>     Graham -
>>>
>>>     Thank you for your thorough response. Please note the following:
>>>
>>>      1. I'm completely fine with sub-typing.  As long as the more concrete
>>>         types (some more exact definitions of agent) are available, I'm fine
>>>         with them "inheriting" from more generic types.  My chief concern as
>>>         an implementer is to make sure that there is enough "typing"
>>>         available so that there is no loss of data in the export/import
>>>         process that can be avoided. *_So, is the next step creation of a
>>>         new email thread for sub-typing Agent?_*
>>>
>>>
>>>     On 7/12/11 11:48 PM, Graham Klyne wrote:
>>>
>>>     Reza,
>>>
>>>     I have two main responses to your comments:
>>>
>>>     (1) your description of "Agent" here seems to me to be closer to what
>>>     the provenance work has envisaged than that described in ws-arch
>>>     document mentioned by Ryan.
>>>
>>>     (2) I fully accept your need for volitional vs computational agent
>>>     distinction for establishing certain kinds of trust in data.  But I
>>>     still think that a generic agent class would keep things simpler for
>>>     developers who are not so concerned with specific legislative or similar
>>>     frameworks - I think it's easier to subclass a generic class as needed
>>>     than to unite distinct classes.
>>>
>>>     Given that yours is a concrete use-case addressing a real and immediate
>>>     implementation need (I understand from comments by you and your
>>>     colleague) I think it may be appropriate to include this
>>>     person-vs-program distinction of agents in an initial model, but also
>>>     providing a generic agent superclass for implementations that don't care
>>>     or don't know what kind of agent is involved.
>>>
>>>     ...
>>>
>>>     Also, I note that even in my revised understanding per your comments,
>>>     the provenance notion of "process execution" still isn't covered by the
>>>     ws-arch terminology relating to agency.
>>>
>>>     ...
>>>
>>>     You mentioned PACE.  The matter of the relationship between work in
>>>     provenance and work in trusted systems came up in the telecon to review
>>>     work of the provenance incubator group, led by Yolanda Gil.  The point
>>>     she made there was that [while these are clearly interconnected] the
>>>     trust work has focused on trust in *systems*, where the provenance work
>>>     is concerned with establishing credibility in specific datasets.  To
>>>     this extent, I think we need to be cautious about over-extending the
>>>     provenance model to also include concepts that would properly belong in
>>>     a model for trusted systems.
>>>
>>>     #g
>>>     -- 
>>>
>>>
>>>     Reza B'Far wrote:
>>>
>>>        Folks -
>>>
>>>     To add to Ryan's comments, I had put in a comment previously regarding
>>>     using stronger types for agents.  From a practical implementation
>>>     perspective, a subset of which Ryan mentions to be "audit" trail, etc.,
>>>     please note the following -
>>>
>>>        1. The distinction between the direct intervention of a human being
>>>           affecting the state of data versus an indirect intervention is
>>>           absolutely crucial.  Without this, establishing "trust" (I mean
>>>           this from a formal perspective - something like PACE[1]) is not
>>>           possible.
>>>        2. I personally would lean towards one of the following options -
>>>               * Strong Typing of the Agent to multiple types and specifying
>>>                 exactly what we mean by the types.  For example, /Human
>>>                 Agent, System Agent/, etc.  I've mentioned this in a
>>>                 previous thread.  Within all practical usages of provenance
>>>                 that at least I'm concerned with, there are completely
>>>                 different treatments of a "snapshot" (or whatever you want
>>>                 to call it) of the state of an entity (which would be
>>>                 considered something that is included in provenance) based
>>>                 on whether or not there is direct human intervention (or
>>>                 alternatively, far more specification and strong typing) of
>>>                 the changes.  "Agent" is way too generic to be useful
>>>                 practically.
>>>               * Reducing the use-cases of Agent to just User-Agent which is
>>>                 the approach that is used in some of the other W3C standards
>>>                 and is woven into the fabric of the web as we know it today.  This
>>>                 would reduce the scope of what an "Agent" is.  We may
>>>                 possibly be able to leverage work of UAProf[2] and even if
>>>                 not, we can learn from UAProf and CC/PP as examples.
>>>        3. The key of both (1) and (2) above is that, in order to have a
>>>           practical implementation, it is highly desirable to have some very
>>>           exact meaning for what "Agent" is, what it does, what the boundary
>>>           conditions are, etc.  I also highly encourage that we do NOT
>>>           include concepts that start going into RBAC and other security
>>>           related standards such as Role.  IMO, we need to reuse concepts
>>>           from these standards.
>>>
>>>     I'm relatively new to the group, but have spent a lot of time reading
>>>     the archives.  From an implementation perspective, I caution that if
>>>     things are too generic and there is not enough specification (typing)
>>>     and exactness in order to accommodate a larger tent, there may be long
>>>     term implementation hurdles that are presented in terms of practical
>>>     implementation.  In terms of a specific example, I think "Agent" above
>>>     is one.  It's far too generically defined at this point, IMO.
>>>
>>>     Please see references below.
>>>
>>>     [1] - PACE -
>>>     http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.8965,
>>>     http://www.mendeley.com/research/architectural-support-trust-models-decentralized-applications/
>>>
>>>     [2] - UAProf - http://en.wikipedia.org/wiki/UAProf
>>>     [3] - CC/PP - http://www.w3.org/Mobile/CCPP/
>>>
>>>     On 7/12/11 12:17 PM, Graham Klyne wrote:
>>>
>>>     Ryan,
>>>
>>>     I think the important element that is missing is that provenance as
>>>     understood so far by this group is intended to capture actual rather
>>>     than potential or unrealized processes.  This is the idea that "Process
>>>     execution" aims to capture.  The notion of "Agent" as described by the
>>>     ws-arch spec is, to my mind, very much concerned with the potential
>>>     rather than the realized computation.
>>>
>>>     Although I'm not a long-time expert in this field, I think this is quite
>>>     central to the notion of provenance we're trying to articulate and
>>>     record, so it's an area where the terminology needs to be quite distinct
>>>     from other usages.  Your usage of "invocation" comes closer, I think, but
>>>     I'm not convinced that yet another new term (it's not covered in ws-arch
>>>     as I recall) is helpful at this stage.
>>>
>>>     Because of the focus on actual computations, there's correspondingly
>>>     less need (or so it seems so far based on the use-cases considered) to
>>>     consider subtleties of potential processes ("Recipes", "Roles", etc.).  I
>>>     remain open on this, but I would avoid adding concepts for which there
>>>     is not demonstrated need within the goals of provenance modelling and
>>>     recording.
>>>
>>>     #g
>>>     -- 
>>>
>>>
>>>     Ryan Golden wrote:
>>>
>>>     Thanks for taking a look at this, Graham, and I'd be interested to hear
>>>     more feedback from others.  To address a couple of your comments:
>>>
>>>     My intent with Agent was that it closely resemble the concept of
>>>     Invocation, as you say.  I suppose the language "is a computational
>>>     entity" does not effectively convey the intention.  I think Invocation
>>>     necessarily implies an Invoker, so I chose a similar but broader concept
>>>     of Realization.  How does this strike you as a replacement for
>>>     Process Execution?
>>>
>>>         "An Agent realizes zero or more Roles on behalf of zero or more
>>>     Persons or Organizations."
>>>
>>>     My intention with Role is to broaden the idea of Recipe to include more
>>>     abstract functions and purposes, but also to add a subtle implication
>>>     (though not requirement) that it is something to be realized on behalf
>>>     of a person or organization.
>>>
>>>     In associating Person or Organization to the concepts of Agent and Role,
>>>     the model comes closer to something that would be useful in representing
>>>     audit trails or in establishing the trustworthiness of provenance
>>>     assertions.
>>>
>>>     --Ryan
>>>
>>>     On 7/12/2011 10:00 AM, Graham Klyne wrote:
>>>
>>>     (ref. W3C Web Services Architecture Note <http://www.w3.org/TR/ws-arch>)
>>>
>>>     Notwithstanding the slightly divergent usage in the provenance research
>>>     community, I think there is value in using terms already adopted in the
>>>     web services community where they align - I think that would help to
>>>     make our outputs be more readily accepted, hence more relevant.  Thus, I
>>>     think "Person or Organization" is a reasonable term, replacing (as I
>>>     understand) what provenance efforts have described as "Agent".
>>>
>>>     But my understanding is that "Process execution" is *not* the same as
>>>     ws-arch:"Agent", being intended to reflect a specific invocation of the
>>>     programme or service.  I think the term ws-arch:"Agent" would more
>>>     closely replace "Recipe".
>>>
>>>     I'm not sure "Role" (ws-arch:"Service Role") has a direct correspondence
>>>     in the terms we've discussed to date, though there is a notion of
>>>     something like role in OPM.  Similarly for "Realizes" and "Acts on
>>>     Behalf of".
>>>
>>>     #g
>>>     -- 
>>>
>>>     Ryan Golden wrote:
>>>
>>>        I'd like to bring a proposal up for discussion regarding Process
>>>     Execution and its related concepts.  Although at the F2F1 there wasn't
>>>     much discussion over "Process Execution," "Generates," "Uses," and
>>>     "Agent," I believe more clarification and discussion is needed in these
>>>     areas.
>>>
>>>     High Level Proposal
>>>     ----------------------------
>>>     a) Rename the concept of "Process Execution" to "Agent,"
>>>     adjusting/adding a few properties
>>>     b) Rename the concept of "Process/Recipe" to "Role," adjusting/adding a
>>>     few properties
>>>     c) Add the concept of "Person or Organization"
>>>     d) Add the concept of "Realizes"
>>>     e) Add the concept of "Acts on Behalf of"
>>>
>>>     More Detailed Proposal
>>>     ---------------------------------
>>>     a) Concept: Agent
>>>         - is a computational entity (narrowed from "piece of work")
>>>         - may use zero or more Entity States (Bobs)
>>>         - may generate zero or more Entity States  (Bobs)
>>>         - may realize zero or more Roles
>>>         - may have a duration
>>>         - may act on behalf of a "Person or Organization"
>>>         Discussion:
>>>             Agent is a relatively well-defined industry term for a program
>>>     acting on a user's behalf.   I propose it as a replacement for "Process
>>>     Execution," which has the overloaded (and thus undesirable) term
>>>     "process" in it, and does not necessarily imply that it is acting on
>>>     behalf of any one person or organization.  In scenarios involving trust,
>>>     audit, or change tracking, the ability to identify the "who" is crucial,
>>>     and so the relation between Agent and Person or Organization is
>>>     introduced.  "Person or Organization" is discussed further
>>>     below.
>>>             Some other common variations are "software agent," or
>>>     "user agent."  One notable difference between this concept and other
>>>     agent concepts is that our Agent may have a duration.  I'm still
>>>     undecided on the utility of the duration.
>>>             There will be some discussion here about non-computational
>>>     agents.  I would question the utility of being able to assert relations
>>>     involving Entity States (Bobs) and non-computational agents, and would
>>>     ask you to first consider whether the same semantics could be better
>>>     represented by a Role instead [see next].
>>>
>>>     b) Concept: Role
>>>         - is an abstract set of tasks which pertain to a job function
>>>         - may have semantics beyond the scope of the WG model (e.g., as
>>>     described in the RBAC reference model)
>>>         - may be realized by zero or more Agents
>>>         Discussion:
>>>             Replaces the somewhat confused notions of "Agent" (as it was
>>>     discussed at F2F1), "Process," and "Recipe".  Note that multiple Roles
>>>     can be realized by a single Agent.
>>>
>>>     c) Concept: Person or Organization
>>>         - is a real-world person or organization that an Agent acts on
>>>     behalf of
>>>
>>>     d) Concept: Realizes
>>>         [see Agent and Role]
>>>
>>>     e) Concept: Acts on Behalf of
>>>         [see Agent and Person or Organization]
>>>
>>>     References:
>>>     I have adapted some of this proposal from concepts in the W3C Web
>>>     Services Architecture Note <http://www.w3.org/TR/ws-arch>, a document
>>>     that I don't entirely agree with, but which has some useful models in
>>>     it. I also referred to the NIST RBAC reference model.
>>>
>>>
>>>
>>>     -- 
>>>
>>>     Professor Luc Moreau
>>>     Electronics and Computer Science   tel: +44 23 8059 4487
>>>     University of Southampton          fax: +44 23 8059 2865
>>>     Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>     United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>
>>>
>
Received on Saturday, 16 July 2011 17:04:24 UTC