Re: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Wed, 30 May 2012 21:18:19 +0100
Message-ID: <EMEW3|15cb875a069ae970d7240b792629c701o4YLIR08L.Moreau|ecs.soton.ac.uk|4FC6808B.5020103@ecs.soton.ac.uk>
To: Timothy Lebo <lebot@rpi.edu>
CC: public-prov-wg@w3.org

Hi Tim,

You seem to have dropped the target argument: this solution exhibits the problems
I tried to describe in my previous emails to Simon, Jim, and Graham.

Luc

On 30/05/12 14:04, Timothy Lebo wrote:
> Graham and Luc,
>
> I haven't been able to attend to this thread as much as I'd like, but I'd hope that the "navigation" objectives that Luc offers can be addressed by PROV and are not "scoped out" as Graham seems to be proposing.
>
> In the world of distributed representations, how can the provenance be useful if we can't use it to find other predecessor provenance?
>
>
> Rereading [1]  and going back to Graham's original proposal on this "discussion placeholder" issue (disclaimer: it's been a while since I've read PAQ):
> ======
> Instead of:
>
>   hasProvenanceIn(id, subject, bundle, target, attrs)
>
> Why not just:
>
>   hasProvenanceIn(subject, bundle)
> ======
>
>
> Some comments:
>
> 1)
> I agree that DM should avoid falling into "operational details", as in:
> "Given that bundles may be retrieved separately [PROV-AQ], it is not obvious for a provenance consumer to navigate descriptions across bundles."
> However, the material presented is fully capable of standing on its own without "operational details" - it just needs a slight rewrite.
>
>
> 2)
> Example 45 would meet a critical need in a Semantic Web world of SPARQL endpoints:
>   hasProvenanceIn(ex:report1, bob:bundle1, -, [ prov:service-uri="http://example.com/service" %% xsd:anyURI ])
>
> Example 46 works only when both bundles are within the same "location scope". We need to go further and permit hasProvenanceIn links across location scopes.
>
> 3)
> I'd like the discussion / definition of hasProvenanceIn to situate itself among the "prior art", as a subproperty of:
>
> http://dublincore.org/documents/dcmi-terms/#terms-isReferencedBy
> (the inverse of) http://rdfs.org/sioc/spec/#term_topic
> (a less primary form of) http://xmlns.com/foaf/spec/#term_isPrimaryTopicOf
>
>
> 3.B) Perhaps rename hasProvenanceIn to prov:isTopicOf (with range prov:Bundle)
>
>
> 4) Perhaps hasProvenanceIn can be decomposed into two aspects:
>
> * prov:isTopicOf and
> * prov:atLocation
>
> so example 45
>   hasProvenanceIn(ex:report1, bob:bundle1, -, [ prov:service-uri="http://example.com/service" %% xsd:anyURI ])
> would become:
>
> ex:report1
>     prov:isTopicOf bob:bundle1;
> .
>
> bob:bundle1
>     a prov:Bundle;
>     prov:atLocation <http://example.com/service>;
> .
>
> <http://example.com/service> a :Service .
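
A minimal sketch of how a consumer might navigate the decomposed form of example 45, assuming the descriptions are held as plain (subject, predicate, object) tuples rather than in an RDF store. prov:isTopicOf and prov:atLocation are used only as proposed in point 4; the helper names `objects` and `locate_provenance` are hypothetical.

```python
# Triples from the decomposed form of example 45 (names as proposed in point 4).
TRIPLES = {
    ("ex:report1", "prov:isTopicOf", "bob:bundle1"),
    ("bob:bundle1", "rdf:type", "prov:Bundle"),
    ("bob:bundle1", "prov:atLocation", "http://example.com/service"),
}


def objects(triples, subject, predicate):
    """Return every o such that (subject, predicate, o) is present, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def locate_provenance(triples, subject):
    """Hypothetical helper: map each bundle about `subject` to its known locations."""
    return {
        bundle: objects(triples, bundle, "prov:atLocation")
        for bundle in objects(triples, subject, "prov:isTopicOf")
    }


print(locate_provenance(TRIPLES, "ex:report1"))
# {'bob:bundle1': ['http://example.com/service']}
```

In this form the location is a property of the bundle, so it is stated once per bundle rather than once per hasProvenanceIn statement.
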
>
> Thanks,
> Tim
>
> [1] http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120525/prov-dm.html#component5
>
> On May 30, 2012, at 3:08 AM, Luc Moreau wrote:
>
>    
>> Hi Simon
>>
>> The second use case I have is the following.
>> Same bundles ex:run1 and ex:run2
>>
>>      bundle ex:run1
>>       agent(ex:Bob)  //bob-in-run1
>>       activity(ex:a1, 2011-11-16T16:00:00,2011-11-16T17:00:00)  //duration: 1hour
>>       wasAssociatedWith(ex:a1,ex:Bob,[prov:role="controller"])
>>      endBundle
>>
>>      bundle ex:run2
>>       agent(ex:Bob)
>>       activity(ex:a2, 2011-11-17T10:00:00,2011-11-17T17:00:00)  //duration: 7hours
>>       wasAssociatedWith(ex:a2,ex:Bob,[prov:role="controller"])
>>      endBundle
>>
>>
>> Now, a visualization tool renders each bundle separately.
>> For the first bundle, the agent Bob is displayed on the left, and for the second
>> it is on the right.
>>
>>
>>      bundle viz:run1
>>        entity(ex:Bob, [screen:position="left"])            //bob-in-viz-run1
>>        hasProvenanceIn(ex:Bob, ex:run1, ex:Bob)     // equivalent to hasProvenanceIn(ex:Bob, ex:run1, -)
>>      endBundle
>>
>>
>>      bundle viz:run2
>>        entity(ex:Bob, [screen:position="right"])
>>        hasProvenanceIn(ex:Bob, ex:run2, ex:Bob)
>>      endBundle
>>
>> Here the tool reused identifier ex:Bob (this is compatible with our approach of reusing existing identifiers).
>>
>> What I am trying to say here is that
>> bob-in-viz-run1 is a specialization of bob-in-run1, seen as an entity with its own fixed facets.
>>
>> But obviously
>> specializationOf (ex:Bob, ex:Bob) ... does not capture that.
>>
>> Even if we had used a different identifier for Bob:
>>
>>      bundle viz:run1
>>        entity(viz:Bob, [screen:position="left"])            //bob-in-viz-run1
>>        hasProvenanceIn(viz:Bob, ex:run1, ex:Bob)
>>      endBundle
>>
>> I am not trying to say that viz:Bob specializes ex:Bob in all his occurrences, but only in the context of ex:run1.
>>
>> Luc
>>
>>
>>
>> On 05/29/2012 10:01 AM, Luc Moreau wrote:
>>      
>>> Hi Simon and Jim,
>>>
>>> Whether we have an id and/or attributes is a secondary question. What we need
>>> to clarify is which concepts are involved in this relation.
>>>
>>> In principle, I am in agreement with you. In practice, I don't think we can do
>>> it with the current alternate relation.
>>>
>>> Here is my use case:
>>>
>>>
>>> bundle ex:run1
>>>   activity(ex:a1, 2011-11-16T16:00:00,2011-11-16T17:00:00)  //duration: 1hour
>>>   wasAssociatedWith(ex:a1,ex:Bob,[prov:role="controller"])
>>> endBundle
>>>
>>> bundle ex:run2
>>>   activity(ex:a2, 2011-11-17T10:00:00,2011-11-17T17:00:00)  //duration: 7hours
>>>   wasAssociatedWith(ex:a2,ex:Bob,[prov:role="controller"])
>>> endBundle
>>>
>>>
>>> bundle tool:analysis01
>>>
>>>    entity(tool:Bob1, [perf:rating="good"])
>>>    hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)  // Bob performance in ex:run1 is good
>>>
>>>    entity(tool:Bob2, [perf:rating="bad"])
>>>    hasProvenanceIn(tool:Bob2, ex:run2, ex:Bob)  // Bob performance in ex:run2 is bad
>>>
>>> endBundle
>>>
>>>
>>> In the bundle tool:analysis01, a tool rates the performance of agents in other bundles.
>>> ex:Bob performance in ex:run1 is good, and bad in ex:run2.
>>>
>>> If, as you suggest, I use
>>>   alternate(tool:Bob1,ex:Bob)
>>> instead of
>>>    hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)
>>> we do not make explicit the context in which ex:Bob is rated.
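
The loss of context described here can be sketched in a few lines, assuming the relations are modelled as plain Python tuples. The `RATING` table and the two function names are illustrative assumptions, not part of PROV-DM; only the identifiers are taken from the use case above.

```python
# Ternary form from the use case: (subject, bundle, target).
HAS_PROVENANCE_IN = {
    ("tool:Bob1", "ex:run1", "ex:Bob"),
    ("tool:Bob2", "ex:run2", "ex:Bob"),
}

# Binary alternative: alternate(subject, target), with the bundle dropped.
ALTERNATE = {(subject, target) for subject, _bundle, target in HAS_PROVENANCE_IN}

# perf:rating attributes of the entities in tool:analysis01.
RATING = {"tool:Bob1": "good", "tool:Bob2": "bad"}


def ratings_with_context(target, bundle):
    """Ratings of `target` restricted to one bundle (needs the ternary form)."""
    return sorted(
        RATING[s] for s, b, t in HAS_PROVENANCE_IN if t == target and b == bundle
    )


def ratings_without_context(target):
    """Ratings of `target` reachable through alternate alone."""
    return sorted(RATING[s] for s, t in ALTERNATE if t == target)


print(ratings_with_context("ex:Bob", "ex:run1"))  # ['good']
print(ratings_without_context("ex:Bob"))          # ['bad', 'good']
```

With alternate alone, both ratings attach to ex:Bob and nothing says which run each one refers to.
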
>>>
>>> Simon seems to suggest we could have a specializationOf over bundles.  I think that is too coarse a granularity,
>>> and it wouldn't help in this case.
>>>
>>> Luc
>>>
>>> On 05/29/2012 09:40 AM, Miles, Simon wrote:
>>>        
>>>> I tend to agree with Graham and Jim.
>>>>
>>>> The hasProvenanceIn relation is not a description of provenance, it is about locations of provenance data. It seems unnecessary to apply the same rules as for relations describing the past.
>>>>
>>>> In particular, I'm not clear how attributes should be interpreted for hasProvenanceIn: attributes of what? If we mean metadata about the bundle pointed to, e.g. its format, I think this goes beyond provenance and would ideally be left to serialisations to appropriately address.
>>>>
>>>> Having an ID makes sense for entity(), activity(), agent() etc. because we are giving an identifier to something referred to in other PROV descriptions. I'd argue we don't need a general rule of identifying every description, because it's not obviously about provenance and any given serialisation could easily do that where required.
>>>>
>>>> I agree with Jim that it seems important to use alternateOf here: this seems like the situation that specialisationOf/alternateOf were really designed for. Linked bundles with different IDs for the same entity seem most likely where different asserters provide provenance on the same entity or one asserter provides different accounts of the provenance of an entity. Each bundle then takes a particular perspective on the entity. Where we can be more precise and say one bundle's perspective is a specialisationOf the other, that is even better.
>>>>
>>>> thanks,
>>>> Simon
>>>>
>>>> Dr Simon Miles
>>>> Senior Lecturer, Department of Informatics
>>>> Kings College London, WC2R 2LS, UK
>>>> +44 (0)20 7848 1166
>>>>
>>>> accounting for the reasons behind contractual violations:
>>>> http://eprints.dcs.kcl.ac.uk/1283/
>>>>
>>>> From: Jim McCusker [mccusj@rpi.edu]
>>>> Sent: 28 May 2012 21:59
>>>> To: Luc Moreau
>>>> Cc: public-prov-wg@w3.org
>>>> Subject: Re: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]
>>>>
>>>> Can't that be decomposed into:
>>>>
>>>> hasProvenanceIn(ex:report1,bob:bundle4)
>>>> alternateOf(alice:report1, ex:report1)
>>>>
>>>> ?
>>>>
>>>> We should focus on re-using constructs rather than implicitly re-introducing them into relations like this. Especially since the idea of a target is entirely optional, as bob and alice may already be using the same URIs.
>>>>
>>>> Jim
>>>>
>>>> On Mon, May 28, 2012 at 4:26 PM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>>>> Hi Graham,
>>>>
>>>> Like PROV-AQ, we need a target.
>>>> Example 47 illustrates the need for it:
>>>>
>>>>   hasProvenanceIn(alice:report1, bob:bundle4, ex:report1)
>>>>
>>>> In the current bundle, there is a description for alice:report1.
>>>> More provenance can be found for it in bundle bob:bundle4, under the name ex:report1.
>>>>
>>>>
>>>> The presence of attributes and id follows the pattern of other qualified relations.
>>>>
>>>> Luc
>>>>
>>>>
>>>> On 28/05/12 20:01, Provenance Working Group Issue Tracker wrote:
>>>> PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]
>>>>
>>>> http://www.w3.org/2011/prov/track/issues/385
>>>>
>>>> Raised by: Graham Klyne
>>>> On product: prov-dm
>>>>
>>>> I'm raising this issue as a placeholder and for discussion.  I didn't notice the arrival of prov:hasProvenanceIn, but based on its appearance in http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120525/prov-dm.html (which AFAIK is not a currently active draft, but a proposal), it is rather over-complicated and a bit obscure.
>>>>
>>>> My sense is that, especially as this is motivated by PROV-AQ, there are just too many identifiers floating around.
>>>>
>>>> Instead of:
>>>>
>>>>    hasProvenanceIn(id, subject, bundle, target, attrs)
>>>>
>>>> Why not just:
>>>>
>>>>    hasProvenanceIn(subject, bundle)
>>>>
>>>> Where subject is based on the URI of an entity, and bundle is based on the URI of a provenance bundle with information about that entity.
>>>>
>>>> I would like to understand what real scenario justifies all the added machinery that has been included with this relation.
>>>>
>>>> -- 
>>>> Jim McCusker
>>>> Programmer Analyst
>>>> Krauthammer Lab, Pathology Informatics
>>>> Yale School of Medicine
>>>> james.mccusker@yale.edu | (203) 785-6330
>>>> http://krauthammerlab.med.yale.edu
>>>>
>>>> PhD Student
>>>> Tetherless World Constellation
>>>> Rensselaer Polytechnic Institute
>>>> mccusj@cs.rpi.edu
>>>> http://tw.rpi.edu
>>>>          
>>> -- 
>>> Professor Luc Moreau
>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>> University of Southampton          fax:   +44 23 8059 2865
>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>> United Kingdom
>>> http://www.ecs.soton.ac.uk/~lavm
>>>
>>>
>>>
>>>        
>>      
>    
Received on Wednesday, 30 May 2012 20:18:56 GMT
