Re: prov-dm derivation: three proposals to vote on (deadline Wednesday midnight GMT) from Simon Miles on 2011-11-09 (public-prov-wg@w3.org from November 2011)

From: Simon Miles <simon.miles@kcl.ac.uk>
Date: Wed, 9 Nov 2011 20:05:23 +0000
To: Provenance Working Group WG <public-prov-wg@w3.org>
Message-ID: <CAKc1nHdcOKk0-iYAEUL3CWt3YLMgijy4vCPOX-xDXebk+iVh4Q@mail.gmail.com>

Hi Luc,

Responses interleaved.

> We didn't have transitivity on derivation because of the constraint on attributes but it was dropped last week.

Yes, but I understood that relaxation as merely no longer constraining
when transitivity holds, not as making all derivation transitive.

> If you think that we need a non-transitive relation wasEventuallyDerivedFrom, can you explain why?

I've been drafting some text for the primer on derivation that
includes an example:

"When one entity's existence, content, characteristics and so on are
at least partly due to another entity, then we say that the former is
derived from the latter. For example, one document may contain
material copied from another, a child is derived from his/her
ancestors, and a page displayed in a browser is derived from the same
page on the web server from which it was downloaded, as well as from
the designer's original sketches of what the page would look like.

There are different kinds of derivation expressible in Prov-DM.
Consider the case of the page in the browser above. It is derived from
the designer's sketch in the strictest sense, i.e. if the sketch had
been different so would the page. On the other hand, there are
entities that are part of the page's history but which did not inform
the content of that page, i.e. the page would have been the same even
if the earlier entity changed. For example, on creating the original
draft of the page, the designer may have included a banner image
saying "DRAFT - FOR REVIEW ONLY". This banner was not part of the
sketch, nor part of the published page downloaded to the browser, but
was part of the page's history, and while not affecting the browsed
page's content may have been a factor in its existence. Finally, in
some cases, we may be able to say not only that one entity was derived
from another, but also how it was derived, i.e. by what process
execution. For example, the page in the browser is derived from the
page on the web server because a download process sent the bytes of
the latter across an HTTP connection to the browser client.

In Prov-DM terms, we say that the page in the browser was eventually
derived from the sketch, depended on the banner image, and was derived
from the page on the web server due to the download process."
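
To make the three statements concrete, here is a rough sketch of how
they might be written down as assertions. The Assertion tuple, the
entity names and the related() helper are my own illustrative choices,
not Prov-DM syntax.

```python
from typing import NamedTuple, Optional


class Assertion(NamedTuple):
    """One provenance statement: subject <relation> obj, optionally via a process."""
    relation: str
    subject: str
    obj: str
    process: Optional[str] = None  # only meaningful for wasDerivedFrom here


# The three statements from the primer example (names are hypothetical).
assertions = [
    Assertion("wasEventuallyDerivedFrom", "browser_page", "sketch"),
    Assertion("dependedUpon", "browser_page", "banner_image"),
    Assertion("wasDerivedFrom", "browser_page", "server_page", process="download"),
]


def related(entity, relation):
    """Return the entities that `entity` is linked to by `relation`."""
    return [a.obj for a in assertions
            if a.subject == entity and a.relation == relation]
```

A query such as related("browser_page", "dependedUpon") then browses
the page's history, while related("browser_page",
"wasEventuallyDerivedFrom") asks what actually shaped its content.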

I still can't agree with Proposal 3 - dependedUpon and
wasEventuallyDerivedFrom seem distinct concepts and both important.

> Why do you come back on something you had agreed upon?

I'm not sure which agreement you are referring to?

> If you don't make the link to the PE, how can you decide which PE underpinned the derivation?

I don't always want to; I merely want to know from what something is
derived (I believe Paul said the same [1]).

But on reconsideration, I was wrong that A wasDerivedFrom B could be
captured by just A wasEventuallyDerivedFrom B, P used B and A
wasGeneratedBy P. I think the difference is only apparent when B
occurs multiple times in one account of A's history - if B only
occurred once, then I see no need for wasDerivedFrom as only P can be
the underpinning of the derivation. But in the case where an account
contains A dependedUpon B by multiple paths, then I agree
wasDerivedFrom states something otherwise inexpressible.
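
One possible reading of that multi-path case, as a small sketch. The
account below (entities A, B, C and process executions P1, P2) is
hypothetical, and pattern_matches() is just my replacement pattern
written out, not anything defined in Prov-DM.

```python
# Hypothetical account in which B reaches A by two paths:
# directly (P2 used B) and via C (P1 used B, generated C; P2 used C).
used = {("P1", "B"), ("P2", "B"), ("P2", "C")}
generated = {("P1", "C"), ("P2", "A")}
eventually_derived = {("A", "B"), ("A", "C"), ("C", "B")}


def pattern_matches(a, b):
    """Processes P such that: a wasEventuallyDerivedFrom b,
    P used b, and a wasGeneratedBy P."""
    if (a, b) not in eventually_derived:
        return set()
    return {p for (p, e) in used if e == b and (p, a) in generated}


# Two different situations an asserter might want to describe.
# Case 1: B influenced A directly through P2 as well as through C.
explicit_direct = {("A", "B", "P2"), ("A", "C", "P2"), ("C", "B", "P1")}
# Case 2: B influenced A only through C; P2's use of B did not shape A.
explicit_indirect = {("A", "C", "P2"), ("C", "B", "P1")}
```

In both cases pattern_matches("A", "B") returns {"P2"}, so the pattern
alone cannot tell them apart; only an explicit wasDerivedFrom(A,B,[P2])
assertion distinguishes the first case from the second.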

The Prov-O (Stian's) proposal for encoding wasDerivedFrom [2] looks
very like my proposed replacement, so might not resolve the ambiguous
situation mentioned above.

> To me, when generating provenance in a computational context, eg workflow, it's the only derivation that is grounded and can be verified.

Sorry, I'm not clear what you mean here - "only derivation" and not what?

thanks,
Simon

[1] http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0170.html
[2] http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0126.html


>
> Professor Luc Moreau
> Electronics and Computer Science
> University of Southampton
> Southampton SO17 1BJ
> United Kingdom
>
> On 7 Nov 2011, at 17:57, "Simon Miles" <simon.miles@kcl.ac.uk> wrote:
>
>> Hello Luc,
>>
>> +1 for Proposal 1; 0 for Proposal 2; -1 for Proposal 3
>>
>> Proposal 1 sounds fine, but in what way do Proposals 1 and 2 differ
>> from what exists at the moment?
>>
>> More importantly, I can't see anything in the text about
>> wasEventuallyDerivedFrom being transitive, or see why it would be, so
>> why does Proposal 3 make sense?
>>
>> With the two separate links, we are able to assert and query for an
>> actual connection between one entity's content and another's
>> (wasEventuallyDerivedFrom), while also allowing the entities involved
>> somewhere in an entity's history to be browsed (dependedUpon). This
>> seems to allow for two clear classes of use case for two common
>> interpretations of provenance.
>>
>> The one I still don't see the value of is wasDerivedFrom. If you can
>> say that A wasEventuallyDerivedFrom B, that P used B and P generated
>> A, then what more is there to say? If wasDerivedFrom is just a
>> shortcut for this information, why is it significant enough to warrant
>> being added to the model? Why would you assert an account where you
>> can say A wasDerivedFrom B, because you know about P, but you do not
>> say P used B and P generated A?
>>
>> From our earlier discussions, I understand the distinction of
>> derivation types, but wasDerivedFrom just seems a less useful and
>> harder to understand version of wasEventuallyDerivedFrom.
>>
>> Thanks,
>> Simon
>>
>> On 7 November 2011 10:06, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>> Dear all,
>>>
>>> Can you express your support or not for the following proposals? We will
>>> confirm the outcome at the teleconference.
>>>
>>> Best regards,
>>> Luc
>>>
>>>
>>> In the interest of simplification, we would like to make the following
>>> proposal about derivations in prov-dm.
>>>
>>> Context: prov-dm currently contains 3 different notions of
>>> derivation, in particular with names that are not intuitive.  The
>>> constraint derivation-attributes [1] prevented derivations from being
>>> transitive. These constraints were removed from the prov-dm document
>>> last week [2].
>>>
>>>
>>>
>>> Proposal 1.  Transitive derivation is expressed using 'dependedUpon'
>>>              between two entities.  dependedUpon can be asserted or
>>>              inferred.
>>>
>>> Proposal 2.  There exists a special case of derivation, where a
>>>              process execution is known or known to exist.  This is
>>>              expressed using:
>>>              wasDerivedFrom(e2,e1,[pe, ...])  and its compact form
>>>              wasDerivedFrom(e2,e1).
>>>
>>>              Furthermore, there exists an inference:
>>>              wasDerivedFrom(e2,e1,[pe, ...]) implies dependedUpon(e2,e1).
>>>
>>> Proposal 3.  In the current version of the document,
>>>              wasEventuallyDerivedFrom and dependedOn are intended to
>>>              express the same notion of (transitive) derivation, and
>>>              thus can be removed as redundant.
>>>
>>>
>>>
>>> Instead of 3 relations wasDerivedFrom, wasEventuallyDerivedFrom, and
>>> dependedOn, we would now only have 2 relations wasDerivedFrom and
>>> dependedUpon. The awkward term 'wasEventuallyDerivedFrom' is also
>>> abandoned.  Overall, this should contribute towards a simplification
>>> of the model.
>>>
>>>
>>> Note: the text will describe the conditions under which the binary
>>> form of wasDerivedFrom is transitive.
>>>
>>>
>>>
>>>
>>> [1] http://www.w3.org/TR/2011/WD-prov-dm-20111018/#derivation-attributes
>>> [2] http://www.w3.org/2011/prov/meeting/2011-11-03#resolution_5
>>>
>>>
>>>
>>
>>
>>
>> --
>> Dr Simon Miles
>> Lecturer, Department of Informatics
>> Kings College London, WC2R 2LS, UK
>> +44 (0)20 7848 1166
>>
>



-- 
Dr Simon Miles
Lecturer, Department of Informatics
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166
Received on Wednesday, 9 November 2011 20:06:02 UTC