Re: ISSUE-385: hasProvenanceIn: finding a solution from Luc Moreau on 2012-06-04 (public-prov-wg@w3.org from June 2012)

From: Luc Moreau <l.moreau@ecs.soton.ac.uk>
Date: Mon, 04 Jun 2012 10:16:39 +0100
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>, W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <EMEW3|daf8352a11a3f67adfa31c532e4c6186o53AHx08l.moreau|ecs.soton.ac.uk|4FCC7CF7>
Hi all,

During this Diamond Jubilee weekend, I had the opportunity to think about Tim
and Simon's long emails.

I agree with them that we have concepts of alternate and specialisation, 
and we want to reuse them.

I also came to the conclusion that behind the hasProvenanceIn relation, 
what I really wanted was a form of alternate. But not what Tim or Simon 
are suggesting.

The PROV data model has a shortcoming: the inability to identify 
something in some context. That's what I am trying to address here.

Coming back to my rating example, consider the producers of bundles 
ex:run1 and ex:run2 had written them as follows.

    bundle ex:run1
      activity(ex:a1, 2011-11-16T16:00:00, 2011-11-16T17:00:00)  // duration: 1 hour
      wasAssociatedWith(ex:a1, ex:Bob1, [prov:role="controller"])
      specialisation(ex:Bob1, ex:Bob)  //****
    endBundle

    bundle ex:run2
      activity(ex:a2, 2011-11-17T10:00:00, 2011-11-17T17:00:00)  // duration: 7 hours
      wasAssociatedWith(ex:a2, ex:Bob2, [prov:role="controller"])
      specialisation(ex:Bob2, ex:Bob)  //****
    endBundle

By identifying ex:Bob1 and ex:Bob2, it then becomes easy to write
their ratings.

    bundle tool:analysis01
      agent(tool:Bob1, [perf:rating="good"])
      alternate(tool:Bob1, ex:Bob1)

      agent(tool:Bob2, [perf:rating="bad"])
      alternate(tool:Bob2, ex:Bob2)
    endBundle

But my example was not like that: ex:Bob was used in both run1 and run2.

    bundle ex:run1
      activity(ex:a1, 2011-11-16T16:00:00, 2011-11-16T17:00:00)  // duration: 1 hour
      wasAssociatedWith(ex:a1, ex:Bob, [prov:role="controller"])
    endBundle

    bundle ex:run2
      activity(ex:a2, 2011-11-17T10:00:00, 2011-11-17T17:00:00)  // duration: 7 hours
      wasAssociatedWith(ex:a2, ex:Bob, [prov:role="controller"])
    endBundle

What we need is a mechanism to identify things in some context,
for instance ex:Bob in run1, as if the expressions marked (****)
had been asserted in ex:run1.

    bundle tool:analysis01
      agent(tool:Bob1, [perf:rating="good"])
      alternate(tool:Bob1, ex:Bob, ex:run1)

      agent(tool:Bob2, [perf:rating="bad"])
      alternate(tool:Bob2, ex:Bob, ex:run2)
    endBundle

The interpretation of
        alternate(tool:Bob2, ex:Bob, ex:run2)
is that tool:Bob2 is the entity that shares aspects of ex:Bob as
described by ex:run2. *Conceptually*, this could be done by replacing
ex:Bob with tool:Bob2 in ex:run2.
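
A minimal Python sketch of this conceptual substitution may help. It assumes
a bundle is modelled as a plain list of (relation, arguments) tuples and
leaves out the role attribute; the names RUN2 and substitute are
illustrative only and are not part of PROV.

```python
# Hypothetical model: a bundle is a list of (relation, args) tuples.
RUN2 = [
    ("activity", ("ex:a2", "2011-11-17T10:00:00", "2011-11-17T17:00:00")),
    ("wasAssociatedWith", ("ex:a2", "ex:Bob")),
]


def substitute(bundle, old, new):
    """Return a copy of the bundle with every occurrence of `old` replaced by `new`."""
    return [
        (relation, tuple(new if arg == old else arg for arg in args))
        for relation, args in bundle
    ]


# alternate(tool:Bob2, ex:Bob, ex:run2): read ex:run2 as if it described tool:Bob2.
view = substitute(RUN2, "ex:Bob", "tool:Bob2")
print(view[1])  # ('wasAssociatedWith', ('ex:a2', 'tool:Bob2'))
```

The original bundle is left untouched: the substitution only yields a view
of ex:run2 in which tool:Bob2 plays the part of ex:Bob.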

I appreciate that what I am describing here is not too distant from 
http://www.w3.org/TR/2011/WD-prov-dm-20111215/#record-complement-of, 
which had optional account, and was not received with enthusiasm, to say 
the least.

Coincidentally, Paul shared this paper
http://ceur-ws.org/Vol-614/owled2010_submission_29.pdf which introduces
rules of the kind
/X counts as Y in context C/
which bear some resemblance to what I am trying to argue for.

So, my proposal is:
- drop hasProvenanceIn
- drop isTopicIn
- allow for the ternary form of alternate

Tim and Simon's approach, which uses two binary relations, does not offer
the same level of expressivity.
It also has a technological bias: it requires a querying/reasoning
facility. Therefore, their suggestion is not suitable for a data model
that is supposed to be technology-neutral.
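
As a rough illustration of the expressivity point, the Python sketch below
compares the two forms. It uses a toy tuple encoding assumed here for
illustration (not any PROV API), and the binary statements are the six
listed in my earlier message quoted below. The function names are
hypothetical.

```python
# Toy encoding (assumption): statements are plain tuples, not a real PROV library.
ternary = [
    ("alternate", "tool:Bob1", "ex:Bob", "ex:run1"),
    ("alternate", "tool:Bob2", "ex:Bob", "ex:run2"),
]

binary = [
    ("isTopicIn", "tool:Bob1", "ex:run1"),
    ("specialization", "tool:Bob1", "ex:Bob"),
    ("isTopicIn", "tool:Bob2", "ex:run2"),
    ("specialization", "tool:Bob2", "ex:Bob"),
    ("isTopicIn", "ex:Bob", "ex:run1"),
    ("isTopicIn", "ex:Bob", "ex:run2"),
]


def contexts_ternary(agent):
    """Bundles named directly by the ternary alternate statements."""
    return {s[3] for s in ternary if s[0] == "alternate" and s[1] == agent}


def contexts_via_general(agent):
    """Bundles reached by following specialization, then isTopicIn."""
    general = {s[2] for s in binary if s[0] == "specialization" and s[1] == agent}
    return {s[2] for s in binary if s[0] == "isTopicIn" and s[1] in general}


print(contexts_ternary("tool:Bob1"))              # {'ex:run1'}
print(sorted(contexts_via_general("tool:Bob1")))  # ['ex:run1', 'ex:run2']
```

With the ternary form the context is stated in a single expression; with the
binary relations, a consumer that walks from tool:Bob1 to ex:Bob also
reaches ex:run2.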

Luc

On 31/05/2012 22:54, Luc Moreau wrote:
> All,
>
> To try and converge towards a solution, I am
> circulating an example using a ternary hasProvenanceIn.
> I would like to understand if and how we can make it work with
> a simpler relation.
>
>
> Two bundles ex:run1 and ex:run2 describe Bob's role as a controller
> of two activities.  Same Bob, two different bundles.
>
>     bundle ex:run1
>       activity(ex:a1, 2011-11-16T16:00:00, 2011-11-16T17:00:00)  // duration: 1 hour
>       wasAssociatedWith(ex:a1, ex:Bob, [prov:role="controller"])
>     endBundle
>
>     bundle ex:run2
>       activity(ex:a2, 2011-11-17T10:00:00, 2011-11-17T17:00:00)  // duration: 7 hours
>       wasAssociatedWith(ex:a2, ex:Bob, [prov:role="controller"])
>     endBundle
>
>
> A performance analysis tool rates the performance of agents (this could be
> used to dispatch further work to performant agents, or congratulate them,
> etc).
>
>
>     bundle tool:analysis01
>
>       agent(tool:Bob1, [perf:rating="good"])
>       hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)  // Bob's performance in ex:run1 is good
>
>       agent(tool:Bob2, [perf:rating="bad"])
>       hasProvenanceIn(tool:Bob2, ex:run2, ex:Bob)  // Bob's performance in ex:run2 is bad
>
>     endBundle
>
> The performance analysis tool has to rate two involvements of ex:Bob
> in two separate activities.
> Two specialized versions of ex:Bob are defined: tool:Bob1 and
> tool:Bob2, with ratings good and bad respectively.
>
> tool:Bob1 is linked to ex:Bob in run1, and tool:Bob2 is linked to 
> ex:Bob in run2, with the following
>
>       hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)
>       hasProvenanceIn(tool:Bob2, ex:run2, ex:Bob)
>
> Nothing is expressed about ex:Bob in bundle tool:analysis01 (except 
> that this is an alias
> for tool:Bob1 and tool:Bob2).
>
> It is suggested that the ternary relation could be replaced by
> isTopicIn(tool:Bob1, ex:run1)
> and
> specialization(tool:Bob1, ex:Bob).
>
> I don't understand the point of
>   isTopicIn(tool:Bob1, ex:run1)
> since tool:Bob1 is not a topic in ex:run1.
>
> Also, we now seem to have made ex:Bob a topic of tool:analysis01, because
> of the following expression:
> specialization(tool:Bob1, ex:Bob).
>
> From tool:analysis01, where do I find provenance about ex:Bob?
> It looks like this has become a dead end in this graph.
>
> Do I need to introduce:
>   isTopicIn(ex:Bob, ex:run1)
>   isTopicIn(ex:Bob, ex:run2)?
>
>
> So now we would have:
> isTopicIn(tool:Bob1, ex:run1)
> specialization(tool:Bob1, ex:Bob)
> isTopicIn(tool:Bob2, ex:run2)
> specialization(tool:Bob2, ex:Bob)
> isTopicIn(ex:Bob, ex:run1)
> isTopicIn(ex:Bob, ex:run2)
>
> Which means that:
>
> specialization(tool:Bob1, ex:Bob)
> isTopicIn(ex:Bob, ex:run2)
>
> ... would lead us to believe that the good rating is due to slow performance.
>
> Can the proposer of the separate binary relations explain how this 
> example can work?
>
> Thanks,
> Luc
Received on Monday, 4 June 2012 09:20:46 UTC