Re: prov-dm derivation: three proposals to vote on (deadline Wednesday midnight GMT) from Simon Miles on 2011-11-09 (public-prov-wg@w3.org from November 2011)

From: Simon Miles <simon.miles@kcl.ac.uk>
Date: Wed, 9 Nov 2011 20:49:07 +0000
To: Provenance Working Group WG <public-prov-wg@w3.org>
Message-ID: <CAKc1nHcMLFQoTqYuVMDTpVo22Wa8gqmudxzMCb2+rHz_ani6HA@mail.gmail.com>

Hi Paul,

I take your question to be: can't we remove dependedUpon?

This was not something I added, but I can sort of see the point of
it. If you query some provenance data, you might be looking for
connections, e.g. "find everyone who has been in contact with, or met
someone who has been in contact with, suspected terrorist T". You are
not looking only for those that T has had some effect on, but for
any leads to pursue in finding T. There are probably better examples
others can give.
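
To make that kind of "follow any connection" query concrete, here is a
minimal sketch in Python. It is purely illustrative and not part of
Prov-DM: the function name depended_upon_closure and the edge list are
hypothetical, with entity names borrowed from the primer example quoted
further down. It assumes dependedUpon is treated as transitive and
simply collects everything reachable from a starting entity.

```python
from collections import deque


def depended_upon_closure(edges, start):
    """Return every entity reachable from `start` via dependedUpon edges.

    `edges` is an iterable of (a, b) pairs meaning "a dependedUpon b".
    This is an illustrative helper, not an API defined by Prov-DM.
    """
    # Build an adjacency map: entity -> set of entities it depended upon.
    graph = {}
    for dependent, dependency in edges:
        graph.setdefault(dependent, set()).add(dependency)

    # Breadth-first search; `seen` guards against cycles and repeats.
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical assertions: (a, b) means "a dependedUpon b".
edges = [
    ("browser_page", "server_page"),
    ("server_page", "draft_page"),
    ("draft_page", "sketch"),
    ("draft_page", "banner_image"),
]

print(sorted(depended_upon_closure(edges, "browser_page")))
```

The result includes the banner image as well as the sketch, which is the
point of a browsing-style query: it returns leads, not only entities
that informed the content.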

No obviously better suggestion for wasEventuallyDerivedFrom as yet...
wasBasedOn?

Thanks,
Simon

On 9 November 2011 20:17, Paul Groth <p.t.groth@vu.nl> wrote:
> Hi Simon,
>
> Couldn't you model the case of the banner image by saying that it was
> used in the activity that generated the page. There is no concrete
> derivation there?
>
> Also, do you have a better name for wasEventuallyDerivedFrom? :-)
>
> thanks,
> Paul
>
> Simon Miles wrote:
>> Hi Luc,
>>
>> Responses interleaved.
>>
>>> We didn't have transitivity on derivation because of the constraint on attributes but it was dropped last week.
>>
>> Yes, but I thought that relaxation merely didn't constrain when
>> transitivity held, not that all derivation was transitive.
>>
>>> If you think that we need a non-transitive relation wasEventuallyDerivedFrom, can you explain why?
>>
>> I've been drafting some text for the primer on derivation that
>> includes an example:
>>
>> "When one entity's existence, content, characteristics and so on are
>> at least partly due to another entity, then we say that the former is
>> derived from the latter. For example, one document may contain
>> material copied from another, a child is derived from his/her
>> ancestors, and a page displayed in a browser is derived from the same
>> page on the web server from which it was downloaded, as well as from
>> the designer's original sketches of what the page would look like.
>>
>> There are different kinds of derivation expressible in Prov-DM.
>> Consider the case of the page in the browser above. It is derived from
>> the designer's sketch in the strictest sense, i.e. if the sketch had
>> been different so would the page. On the other hand, there are
>> entities that are part of the page's history but which did not inform
>> the content of that page, i.e. the page would have been the same even
>> if the earlier entity changed. For example, on creating the original
>> draft of the page, the designer may have included a banner image
>> saying "DRAFT - FOR REVIEW ONLY". This banner was not part of the
>> sketch, nor part of the published page downloaded to the browser, but
>> was part of the page's history, and while not affecting the browsed
>> page's content may have been a factor in its existence. Finally, in
>> some cases, we may be able to say not only that one entity was derived
>> from another, but also how it was derived, i.e. by what process
>> execution. For example, the page in the browser is derived from the
>> page on the web server because a download process sent the bytes of
>> the latter across an HTTP connection to the browser client.
>>
>> In Prov-DM terms, we say that the page in the browser was eventually
>> derived from the sketch, depended on the banner image, and was derived
>> from the page on the web server due to the download process."
>>
>> I still can't agree with Proposal 3 - dependedUpon and
>> wasEventuallyDerivedFrom seem distinct concepts and both important.
>>
>>> Why do you come back on something you had agreed upon?
>>
>> I'm not sure which agreement you are referring to?
>>
>>> If you don't make the link to the PE, how can you decide which PE underpinned the derivation?
>>
>> I don't always want to, I merely want to know from what something is
>> derived (I believe Paul said the same [1]).
>>
>> But on reconsideration, I was wrong that A wasDerivedFrom B could be
>> captured by just A wasEventuallyDerivedFrom B, P used B and A
>> wasGeneratedBy P. I think the difference is only apparent when B
>> occurs multiple times in one account of A's history - if B only
>> occurred once, then I see no need for wasDerivedFrom as only P can be
>> the underpinning of the derivation. But in the case where an account
>> contains A dependedUpon B by multiple paths, then I agree
>> wasDerivedFrom states something otherwise inexpressible.
>>
>> The Prov-O (Stian's) proposal for encoding wasDerivedFrom [2] looks
>> very like my proposed replacement, so might not resolve the ambiguous
>> situation mentioned above.
>>
>>> To me, when generating provenance in a computational context, e.g. a workflow, it's the only derivation that is grounded and can be verified.
>>
>> Sorry, I'm not clear what you mean here - "only derivation" and not what?
>>
>> thanks,
>> Simon
>>
>> [1] http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0170.html
>> [2] http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0126.html
>>
>>
>>> Professor Luc Moreau
>>> Electronics and Computer Science
>>> University of Southampton
>>> Southampton SO17 1BJ
>>> United Kingdom
>>>
>>> On 7 Nov 2011, at 17:57, "Simon Miles"<simon.miles@kcl.ac.uk>  wrote:
>>>
>>>> Hello Luc,
>>>>
>>>> +1 for Proposal 1; 0 for Proposal 2; -1 for Proposal 3
>>>>
>>>> Proposal 1 sounds fine, but in what way do Proposals 1 and 2 differ
>>>> from what exists at the moment?
>>>>
>>>> More importantly, I can't see anything in the text about
>>>> wasEventuallyDerivedFrom being transitive, or see why it would be, so
>>>> why does Proposal 3 make sense?
>>>>
>>>> With the two separate links, we are able to assert and query for an
>>>> actual connection between one entity's content and another's
>>>> (wasEventuallyDerivedFrom), while also allowing the entities involved
>>>> somewhere in an entity's history to be browsed (dependedUpon). This
>>>> seems to allow for two clear classes of use case for two common
>>>> interpretations of provenance.
>>>>
>>>> The one I still don't see the value of is wasDerivedFrom. If you can
>>>> say that A wasEventuallyDerivedFrom B, that P used B and P generated
>>>> A, then what more is there to say? If wasDerivedFrom is just a
>>>> shortcut for this information, why is it significant enough to warrant
>>>> being added to the model? Why would you assert an account where you
>>>> can say A wasDerivedFrom B, because you know about P, but you do not
>>>> say P used B and P generated A?
>>>>
>>>> From our earlier discussions, I understand the distinction of
>>>> derivation types, but wasDerivedFrom just seems a less useful and more
>>>> complex to understand version of wasEventuallyDerivedFrom.
>>>>
>>>> Thanks,
>>>> Simon
>>>>
>>>> On 7 November 2011 10:06, Luc Moreau<l.moreau@ecs.soton.ac.uk>  wrote:
>>>>> Dear all,
>>>>>
>>>>> Can you express your support or not for the following proposals? We will
>>>>> confirm the outcome at the teleconference.
>>>>>
>>>>> Best regards,
>>>>> Luc
>>>>>
>>>>>
>>>>> In the interest of simplification, we would like to make the following
>>>>> proposal about derivations in prov-dm.
>>>>>
>>>>> Context: prov-dm currently contains 3 different notions of
>>>>> derivations, in particular with names that are not intuitive.  The
>>>>> constraint derivation-attributes [1] prevented derivations from being
>>>>> transitive. These constraints were removed from the prov-dm document
>>>>> last week [2].
>>>>>
>>>>>
>>>>>
>>>>> Proposal 1. Transitive derivation is expressed using 'dependedUpon'
>>>>>              between two entities.  dependedUpon can be asserted or
>>>>>              inferred.
>>>>>
>>>>> Proposal 2.  There exists a special case of derivation, where a
>>>>>               process execution is known or known to exist.  This is
>>>>>               expressed using:
>>>>>               wasDerivedFrom(e2,e1,[pe, ...])  and its compact form
>>>>>               wasDerivedFrom(e2,e1).
>>>>>
>>>>>               Furthermore, there exists an inference:
>>>>>               wasDerivedFrom(e2,e1,[pe, ...]) implies dependedUpon(e2,e1).
>>>>>
>>>>> Proposal 3.  In the current version of the document,
>>>>>              wasEventuallyDerivedFrom and dependedOn were intended to
>>>>>              express the same notion of (transitive) derivation, and
>>>>>              thus can be removed as redundant.
>>>>>
>>>>>
>>>>>
>>>>> Instead of 3 relations wasDerivedFrom, wasEventuallyDerivedFrom, and
>>>>> dependedOn, we would now only have 2 relations wasDerivedFrom and
>>>>> dependedUpon. The awkward term 'wasEventuallyDerivedFrom' is also
>>>>> abandoned.  Overall, this should contribute towards a simplification
>>>>> of the model.
>>>>>
>>>>>
>>>>> Note: the text will describe the conditions under which the binary
>>>>> form of wasDerivedFrom is transitive.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> [1] http://www.w3.org/TR/2011/WD-prov-dm-20111018/#derivation-attributes
>>>>> [2] http://www.w3.org/2011/prov/meeting/2011-11-03#resolution_5
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Dr Simon Miles
>>>> Lecturer, Department of Informatics
>>>> Kings College London, WC2R 2LS, UK
>>>> +44 (0)20 7848 1166
>>>>
>>
>>
>>
>
> --
> Dr. Paul Groth (p.t.groth@vu.nl)
> http://www.few.vu.nl/~pgroth/
> Assistant Professor
> Knowledge Representation & Reasoning Group
> Artificial Intelligence Section
> Department of Computer Science
> VU University Amsterdam
>
>



-- 
Dr Simon Miles
Lecturer, Department of Informatics
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166
Received on Wednesday, 9 November 2011 20:49:45 UTC