
Re: PROV-ISSUE-105: 5.3.1 Generation (current version of the conceptual model document) [Conceptual Model]

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Mon, 16 Jan 2012 23:21:40 +0000
Message-ID: <EMEW3|938acd1f0610d456351eedb31b852296o0FNLo08L.Moreau|ecs.soton.ac.uk|4F14B104.8030706@ecs.soton.ac.uk>
To: public-prov-wg@w3.org

Hi Satya,
Responses interleaved.

On 08/12/11 01:59, Satya Sahoo wrote:
> Hi Luc,
> Again apologies about the delayed reply. My responses are interleaved:
>         5.3.1 Generation
>         =====
>         1. In PROV-DM, a generation expression is a representation of
>         a world event, the creation of a new characterized thing by an
>         activity. This characterized thing did not exist before creation.
>         Issue: Is the "characterized thing" in the above statements an
>         Entity or some other resource?
>     Now, we have defined entity as an identifiable characterized
>     thing. So, the statement has become:
>     In PROV-DM, a generation record is a representation of a world
>     event, the creation of a new entity by an activity. This entity
>     did not exist before creation. The representation of this event
>     encompasses a description of the modalities of generation of this
>     entity by this activity.
> Ok. I have raised the issue of activity vs. event separately 
> (generation record as representation of a world event).
>         2. contains a generationQualifier q that describes the
>         modalities of generation of this thing by this activity
>         Issue: How is this qualifier distinct from specialization of
>         the generation property?
>     I think the work on prov-o now answers this question.
> Ok, I believe we have covered this through introduction of 
> qualifiedInvolvement in prov-o, which is different from specialization.
>         3. The first one is available as the first value on port p1,
>         whereas the other is the second value on port p1.
>         Issue: As we discussed during the telcon on Sept 15 [1] and
>         in the email thread (Subject: Roles, initiated by Paolo on Sept
>         15), the "qualifiers", if any, are on the entity and PE and not
>         on the relation. In the above statement, port p1 is a qualifier
>         for either the entities e1, e2 (they were generated on that
>         particular port) or the PE pe1 (it was using that port for
>         listening/responding). Hence, the qualifiers are on the
>         "class" and not the "relation".
>         [1] http://www.w3.org/2011/prov/meeting/2011-09-15
>     I think the work on prov-o also answers this comment.
> Ok, as above the qualifiedInvolvement work in prov-o covers this now.
>         4. If two process executions sequentially set different values
>         to some attribute by means of two different generate events,
>         then they generate distinct entities.
>         Issue: This is an incorrect statement. Setting values of an
>         entity at different points of time cannot be equated to
>         generating new entities. For example, we don't generate a new
>         human being every time a PE changes the value of their age. If
>         pe1 sets Person X age = 5 years in 2005 and pe2 sets Person X
>         age = 10 years in 2010, then they are not generating a new
>         person (within an account or across accounts).
>     Remember that an entity is a perspective on a thing.
>     So, here, we can have multiple perspectives:
>     e1 Luc
>     e2 Luc at age=5
>     e3 Luc at age=10
>     e3 and e2 have the same attribute name, age, but different values.
>     So they must be different entities, i.e. perspectives, over human
>     being e1.
> As I have discussed in other mails, interpreting entity to be a
> perspective on a thing does not work in an information system, where
> everything is a representation of a thing in the world and is
> interpreted as a thing in the information system. A thing never enters
> any information system. Hence, the distinction between a representation
> of a thing and the thing cannot be maintained in any information system.
> In addition, e1 is Luc, not human being in general. There are 7 billion
> human beings, and when we make assertions about a person we use an
> identifier to refer to a specific human being. So, the assertions of
> age=5 and age=10 are being made about Luc and not human being per se.
>         5. Alternatively, for two process executions to generate an
>         entity simultaneously, they would require some synchronization
>         by which they agree the entity is released for use; the end of
>         this synchronization would constitute the actual generation of
>         the entity, but is performed by a single process
>         execution. Given an entity expression denoted by e, two process
>         execution expressions denoted by pe1 and pe2, and two
>         qualifiers q1 and q2, if the expressions
>         wasGeneratedBy(e,pe1,q1) and wasGeneratedBy(e,pe2,q2) exist in
>         the scope of a given account, then pe1=pe2 and q1=q2.
>         Issue: If two sculptors collaborate on creating a human
>         figurine statue entity e1: sculptor A by PE pe1 creates the
>         arms and legs of e1 and sculptor B by PE pe2 creates the head
>         and upper-body part of e1 then both pe1 and pe2 create e1.
>         They may or may not be synchronized. How can we infer that pe1
>         = pe2 (whether in one account or across accounts)?
>     I think you've articulated well the case that A and B create
>     different parts.  If they do this at different times, you will have
>     a statue without head, a statue with head without leg, and a statue
>     with head with leg.
>     The generation-unicity constraint with accounts enforces
>     some structure in the provenance records, so that if really
>     pe1<>pe2, then
>     they should generate the statue in different records.
> But the two of them together generated the statue, which has leg + head.
> So, just because two or more distinct activities led to the creation
> of a single entity does not mean that, for the sake of the above
> constraint, the entity has to be "broken down" and referred to by
> distinct identifiers. I am afraid any user or provenance application
> will find the constraint unnecessary, as it does not reflect scores of
> real-world scenarios.

I have explicitly encoded this example. See:
>     I am proposing, in the end, to follow Simon's proposal, and move
>     this in an entirely different section.
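To make the generation-unicity constraint under discussion concrete, here is a minimal Python sketch. It assumes a hypothetical in-memory representation in which each wasGeneratedBy(e,pe,q) record is stored together with the account it belongs to. The names Generation and unicity_violations, and the qualifier strings, are illustrative only and are not part of PROV-DM.

```python
from collections import defaultdict
from typing import NamedTuple


class Generation(NamedTuple):
    """Hypothetical stand-in for a wasGeneratedBy(e, pe, q) record in an account."""
    account: str
    entity: str
    activity: str
    qualifier: str


def unicity_violations(records):
    """Return the (account, entity) pairs that have more than one distinct (pe, q)."""
    seen = defaultdict(set)
    for r in records:
        seen[(r.account, r.entity)].add((r.activity, r.qualifier))
    return sorted(key for key, gens in seen.items() if len(gens) > 1)


# The two-sculptor example: asserted in one account, then split over two accounts.
records = [
    Generation("acc1", "statue", "pe1", "arms-and-legs"),
    Generation("acc1", "statue", "pe2", "head-and-torso"),
    Generation("acc2", "statue", "pe1", "arms-and-legs"),
    Generation("acc3", "statue", "pe2", "head-and-torso"),
]
print(unicity_violations(records))
```

Under these assumptions only the pair (acc1, statue) is reported: the same two generation records are accepted once they sit in separate accounts, which is the structure the constraint is meant to enforce.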
>         6. Given an identifier pe for a process execution expression,
>         an identifier e for an entity expression, qualifier q, and
>         optional time t, if the assertion wasGeneratedBy(e,pe,r) or
>         wasGeneratedBy(e,pe,r,t) holds, then the values of some of e's
>         attributes are determined by the activity represented by
>         process execution expression identified by pe and the entity
>         expressions used by pe. Only some (possibly none) of the
>         attributes values may be determined since, in an open world,
>         not all used entity expressions may have been asserted.
>         [PROV:0002]
>         Issue: This constraint is confusing (maybe even contradictory)
>         - some or none of the attributes may be determined? Further,
>         there is no specification or mechanism defined to identify which
>         attributes were determined by the PE. The constraint does not
>         provide any new information (even as a constraint) regarding
>         generation.
>     We have decided to drop this constraint at the last teleconference.
> Ok.
>         7. If an assertion wasGeneratedBy(x,pe,r) or
>         wasGeneratedBy(x,pe,r,t), then generation of the thing denoted
>         by x precedes the end of pe and follows the beginning of pe.
>         Issue: Suggest rewording this: given the assertion that "an
>         Entity e1 was generated by a PE pe1" then "the Entity e1 did
>         not exist before start of PE pe1".
>     This would be an entirely different meaning that is not the same
>     as the one intended.
> Exactly. I am not sure why it is necessary for generation of x to
> precede the end of pe, since they can share the same event or time value.
> For example, it is fairly common to state "the car production ended
> with the production of car c1 at 10:00am on Dec 7."

The constraint is just stating that generation occurs during the
duration of the activity. I don't see how it can occur before or after
the activity.
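
As a rough illustration of this reading, the following Python sketch checks that a generation time lies within the activity's duration. It assumes that "precedes" and "follows" are read non-strictly, so that generation may coincide with the end of the activity, as in the car example. That reading is an assumption of the sketch, not something the constraint text settles. The function name and the timestamps (including the year and the start time) are made up for illustration.

```python
from datetime import datetime


def generation_within_activity(start, generated, end):
    """Check start <= generated <= end (boundaries included by assumption)."""
    return start <= generated <= end


# Illustrative timestamps only; the year and the start time are made up.
start = datetime(2011, 12, 7, 8, 0)
end = datetime(2011, 12, 7, 10, 0)

# Car c1 is generated at the very instant production ends.
print(generation_within_activity(start, end, end))
# A generation time after the activity has ended violates the constraint.
print(generation_within_activity(start, datetime(2011, 12, 7, 11, 0), end))
```

With the non-strict reading, the first check succeeds and the second fails; with a strict reading the comparison operators would become < and the car example would be rejected.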


> Thanks.
> Best,
> Satya
>     Cheers,
>     Luc
>     -- 
>     Professor Luc Moreau
>     Electronics and Computer Science   tel: +44 23 8059 4487
>     University of Southampton          fax: +44 23 8059 2865
>     Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>     United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Monday, 16 January 2012 23:22:28 UTC
