
Re: PROV-ISSUE-198: Section 6.1 (PROV-DM as on Dec 5) [prov-dm]

From: Satya Sahoo <satya.sahoo@case.edu>
Date: Tue, 3 Jan 2012 11:10:11 -0500
Message-ID: <CAOMwk6w=dwmiDk60HhBCTAOnMvXFxBsgXBg+vd4nuoutjJvPdw@mail.gmail.com>
To: Paolo Missier <Paolo.Missier@ncl.ac.uk>
Cc: Provenance Working Group WG <public-prov-wg@w3.org>
Hi Paolo,
My responses are interleaved:

> I am strongly in favour of introducing constructs that provide a way to
> track the provenance of a data structure. There is ample evidence that
> tracking the evolution of data /along with its data container/ is useful.

I agree

> Indeed, the initial proposal was more ambitious and tried to capture
> operations on ordered trees. It was then revised "down" to a simple data
> structure.
>  I will argue that this is not at all domain-specific: the notion of a
> "data container" or data structure is not a domain, rather it's an integral
> part of what data provenance is about.

I agree that the notion of a data container is not domain-specific, but
part-whole relations between data and their containers are domain-specific
(workflow / sub-workflow / atomic process, information artifact / file /
folder, etc.).

> I felt that support for expressing the connection between data elements
> and the data structures that contain them was missing from the OPM, and at
> the time we devised extensions to deal with it.
>  So to me the question is not whether there is a place in PROV for
> constructs that track the provenance of a data structure, rather what is
> the most general data structure whose provenance we can capture in a simple
> way.

I agree that there is a need for a general data structure (that in turn may
be mapped to RDF graphs, workflows etc.). My point was that the distinction
between a generic data structure "container" and an "account" (or the need
for both) is not clear, given that we can easily add an asserter to a
container.

> In the end, we settled for a minimal structure, namely sets of key-value
> pairs, which is what the current proposal is about.

I did not understand this. The use of indexes to access elements in a
container, including addition and removal, is in my view out of scope for
the DM (and rather computer science-specific).
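
For readers following the thread, here is a minimal sketch of the kind of
key-value collection the Section 6.1 text quoted below describes: each
insertion or removal yields a new collection entity, and a record links it
to the one it came from. This is only an illustration under my own
assumptions, not PROV-DM syntax. The record names wasAddedTo_Coll and
wasRemovedFrom_Coll come from the quoted draft; the class Collection, the
functions insert, remove and select, the "selection" record label, and the
records list are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Collection:
    """A collection entity: an identifier plus a set of (key, entity id) pairs."""
    ident: str
    members: Tuple[Tuple[str, str], ...]

    def as_dict(self) -> Dict[str, str]:
        return dict(self.members)


# Hypothetical in-memory log of provenance records, stored as plain tuples.
records: List[Tuple[str, ...]] = []


def insert(c: Collection, new_id: str, key: str, entity: str) -> Collection:
    """Insertion: new collection obtained from c by adding entity under key."""
    members = c.as_dict()
    members[key] = entity
    records.append(("wasAddedTo_Coll", new_id, c.ident))
    return Collection(new_id, tuple(sorted(members.items())))


def remove(c: Collection, new_id: str, key: str) -> Collection:
    """Removal: new collection obtained from c by removing the entity at key."""
    members = c.as_dict()
    del members[key]
    records.append(("wasRemovedFrom_Coll", new_id, c.ident))
    return Collection(new_id, tuple(sorted(members.items())))


def select(c: Collection, key: str) -> str:
    """Selection: an entity was selected from c using key."""
    entity = c.as_dict()[key]
    # "selection" is a placeholder label, not a name taken from the draft.
    records.append(("selection", entity, c.ident, key))
    return entity


c1 = Collection("c1", ())
c2 = insert(c1, "c2", "k", "e")
e = select(c2, "k")
c3 = remove(c2, "c3", "k")
```

Note that the first two record kinds carry the same information as a plain
derivation between c2 and c1 plus the kind of operation, which is the
question raised in comment 2 of the issue.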



> But it's important to clarify whether there is agreement on going forward
> with it.
> If so, we are going to edit the current version based on the other issues
> that have been raised (135-139)
> --Paolo
> On 12/7/11 2:19 AM, Provenance Working Group Issue Tracker wrote:
>> PROV-ISSUE-198: Section 6.1 (PROV-DM as on Dec 5) [prov-dm]
>> http://www.w3.org/2011/prov/track/issues/198
>> Raised by: Satya Sahoo
>> On product: prov-dm
>> Hi,
>> The following are my comments for Section 6.1 of the PROV-DM (as on Dec
>> 5):
>> Section 6.1
>> 1. "The relations introduced here are all specializations of the
>> wasDerivedFrom relation, specifically precise-1 or imprecise-1. They are
>> designed to model:
>> * insertion: a collection entity c' is obtained from collection entity c,
>> by adding entity e having key k to c;
>> * removal: a collection entity c' is obtained from collection entity c,
>> by removing entity e having key k from c;
>> * selection: an entity e was selected from collection c using key k."
>> Comment: The relevance of the Collection and these related properties in
>> PROV-DM is not clear. I am not sure why indexing structures should be part
>> of the Data Model. In addition, the above list has highly domain-specific
>> methods and should be either removed completely or moved to a Best
>> Practices document if needed. For example, one can make the case for
>> modeling wasAddedTo_Agent, wasRemovedFrom_Entity, wasModifiedIn_Entity etc.
>> 2. "Record: wasAddedTo_Coll(c2,c1) (resp. wasRemovedFrom_Coll(c2,c1))
>> denotes that collection c2 is an updated version of collection c1,
>> following an insertion (resp. deletion) operation."
>> Comment: Why can't this be expressed using "wasDerivedFrom" or revision?
>> Thanks.
>> Best,
>> Satya
> --
> -----------  ~oo~  --------------
> Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org
> School of Computing Science, Newcastle University,  UK
> http://www.cs.ncl.ac.uk/people/Paolo.Missier
Received on Tuesday, 3 January 2012 16:13:14 UTC
