Scruffy vs proper (was: PROV-ISSUE-249 (two-derivations): Why do we have 3 derivations? [prov-dm])

I think it's a mistake to think of "scruffy" and "proper" as different kinds of
provenance.  They are fundamentally the same.  Rather, if the provenance is
collected and managed under conditions that we might consider "proper", then we
can combine it freely and use the additional inferences that flow from those
conditions.

For provenance that is not collected and managed under these "proper"
conditions, we may wish to consider something akin to Guha's "lifting
rules" [1] for extracting appropriately contextualized provenance information
that can be treated as "proper".

In summary: scruffy vs proper isn't about the data model or the provenance 
itself so much as its context of collection and use.  IMO.

#g
--

[1] http://www-formal.stanford.edu/guha/


On 10/02/2012 14:11, Daniel Garijo wrote:
> I agree with Khalid too.
> Small question: Is the new version of DM going to include both scruffy and
> proper provenance, or is it going to be split into two different documents?
>
> Thanks,
> Daniel
>
> 2012/2/10 Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
>
>>
>> +1
>>
>> I think this proposal will also simplify the model.
>> Applying this proposal will also, IMO, remove some confusion by avoiding
>> talk of the granularity of the activities involved in the derivation. In
>> particular, what one observer sees as imprecise-1, because s/he believes
>> that the activity involved in the derivation is atomic, another observer
>> can see as imprecise-n, because s/he believes that the activity is
>> composite. Talking simply about precise and imprecise derivation allows us
>> to avoid this issue.
>>
>> Khalid
>>
>>
>> On 09/02/2012 23:11, Provenance Working Group Issue Tracker wrote:
>>
>>> PROV-ISSUE-249 (two-derivations): Why do we have 3 derivations? [prov-dm]
>>>
>>> http://www.w3.org/2011/prov/track/issues/249
>>>
>>> Raised by: Luc Moreau
>>> On product: prov-dm
>>>
>>> We currently have 3 derivations:
>>>
>>>
>>> A precise-1 derivation, written wasDerivedFrom(id, e2, e1, a, g2, u1,
>>> attrs)
>>> An imprecise-1 derivation, written wasDerivedFrom(id, e2, e1, t, attrs)
>>> An imprecise-n derivation, written wasDerivedFrom(id, e2, e1, t, attrs)
>>>
>>>
>>> Imprecise-1/imprecise-n are distinguished with the attribute prov:steps.
>>>
>>> Why do we need 3 derivations?
>>>
>>> I believe that imprecise-n derivation is required for the 'scruffy
>>> provenance' use case.
>>>
>>> I believe that precise-1 derivation is required for the 'proper
>>> provenance' use case: in particular, it's a requirement for
>>> provenance-based reproducibility.
>>>
>>> I don't understand why we have imprecise-1.  Why can't we just have
>>> imprecise-n and precise-1?
>>>
>>> PS. If we go with this proposal, then they could simply be called
>>> imprecise/precise, and we don't need the attribute steps.
>>>
>>> PS2. They would essentially be an unqualified and a qualified derivation
>>> (in prov-o terminology).
>>>
>>
>>
>
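
For readers who want the quoted proposal in concrete terms, here is a rough
Python sketch of collapsing the three derivations into two. It is an
illustration only, not anything from prov-dm: the names Derivation, kind_today
and simplify are hypothetical, the field names are copied from the
wasDerivedFrom notation in the issue, and the sketch assumes that prov:steps
carries the value "single" for imprecise-1 and that a record counts as
precise-1 when a, g2 and u1 are all present.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass
class Derivation:
    """One wasDerivedFrom record, using the argument names from the issue."""
    id: str
    e2: str                      # the derived entity
    e1: str                      # the entity it was derived from
    a: Optional[str] = None      # precise form only
    g2: Optional[str] = None     # precise form only
    u1: Optional[str] = None     # precise form only
    t: Optional[str] = None      # the `t` argument of the imprecise forms
    attrs: dict = field(default_factory=dict)


def kind_today(d):
    """Classify a record under the current three-way scheme."""
    if d.a is not None and d.g2 is not None and d.u1 is not None:
        return "precise-1"
    # Assumption: prov:steps == "single" marks imprecise-1;
    # any other value (or no value) is treated as imprecise-n.
    if d.attrs.get("prov:steps") == "single":
        return "imprecise-1"
    return "imprecise-n"


def simplify(d):
    """Map a record onto the proposed two-way scheme.

    Returns ("precise" | "imprecise", copy of d without prov:steps).
    The input record is left unchanged.
    """
    attrs = {k: v for k, v in d.attrs.items() if k != "prov:steps"}
    kind = "precise" if kind_today(d) == "precise-1" else "imprecise"
    return kind, replace(d, attrs=attrs)
```

Under this sketch, an imprecise-1 and an imprecise-n record for the same pair
of entities both come out as plain "imprecise", so two observers who disagree
about whether the activity is atomic or composite end up with the same record.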

Received on Friday, 10 February 2012 14:41:54 UTC