Re: PROV-ISSUE-409 (prov-dm-review-LC): feedback on PROV-DM document (for last call release) [prov-dm]

From: Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
Date: Mon, 18 Jun 2012 17:21:51 +0100
Message-ID: <CAANah+HmN19xAHUNU--yxaeN6sNEQUmxgz2rNqXuFnS8CEaaYA@mail.gmail.com>
To: Provenance Working Group <public-prov-wg@w3.org>
Cc: Luc Moreau <L.Moreau@ecs.soton.ac.uk>

I realized that I didn't answer all the questions that were asked by the
editors in my review. You will find them below.

Thanks, khalid

   1. Can the document be released as a WD? Yes, provided that the
   definition of contextualization is amended in light of the comments below.
   2. Can the document be released as a last call WD? Yes.
   3. Renaming wasRevisionOf to wasRevisedFrom? Either is fine with me.
   4. Primitive datatypes. Do we have to list them all? I think it would be
   good, but I wouldn't say it is mandatory.

> 2012/6/17 Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
>> Hi,
>> I read the new draft of the prov-dm. You will find my comments below.
>> Regarding the editors' question about contextualization: I am not
>> opposed to its presence in the DM, but its definition should be simplified
>> substantially (see the comments below).
>> Regards, khalid
>> -----------
>> - In the beginning of the document (PROV Family Specification), it is
>> stated that "PROV-O, the PROV ontology, an OWL-RL ontology allowing the
>> mapping of PROV to RDF". I am not sure that PROV-O is entirely OWL-RL
>> compliant. We have been using the term OWL-RL++ in PROV-O, because there are
>> minor violations of OWL-RL in a few places in the ontology.
>> - In the Table of Contents, the titles of Section 4.1 and Section 4.2 may
>> need to be detailed a bit more. As they are, they are not informative at
>> the level of the table of contents, when the reader is browsing.
>> - In the introduction, in the list that describes the components, there
>> is a mismatch between this list and the components in the table of
>> contents: according to the list in the introduction, component 2 is about
>> agents and component 3 is about derivations, whereas according to the table
>> of contents, component 2 is about derivations and component 3 is about
>> agents.
>> - Section 2 is supposed to be an overview, but it is quite long.
>> - Section 2 distinguishes between binary and expanded relations. I
>> am not sure this makes sense in the context of the DM. The distinction was
>> introduced in PROV-O because the language we are using there is not
>> expressive enough to specify n-ary relations in a natural way. This is not
>> the case for PROV-DM: PROV-N allows expressing such relations without a
>> problem. Also, reading the section on expanded relations from the point of
>> view of a reader who is not part of the working group, it seems to be a
>> source of confusion, and I don't see a real benefit from its presence in
>> the DM.
>> - In Section 2, when explaining "Usage", it is said that "Usage is the
>> beginning of utilizing an entity by an activity. Before usage, the activity
>> had not begun to utilize this entity and could not have been affected by the
>> entity." This statement does not hold when an entity is used multiple
>> times by the same activity (e.g., to feed different parameters).
>> - The discussion that follows Example 3, which explains that a car is
>> used and that another car is generated at the end of the journey, is one
>> possible interpretation, but I don't think it is the most natural one.
>> A simpler interpretation, which the reader may grasp quickly, is that the
>> driving activity used a car, and that's it. Not every activity
>> needs to generate an entity.
>> - In Section 2.1.3, first paragraph: "more trustworthy that that from a
>> lobby organization" -> "more trustworthy than that from a lobby
>> organization"
>> - In Section 2.1.3, in the statement about Delegation, it may be worth
>> specifying the scope of delegation: is the delegation valid for a
>> given activity or for all activities carried out by the agent?
>> - In example 13, "[...] but also determine who its provenance is
>> attributed to [...]". This sentence implies that an agent is always a
>> human. "who" can be replaced by "the agent" to avoid confusion.
>> - The column "Core Structures" in Table 3 is confusing. Components 1, 2
>> and 3 do not contain only core concepts.
>> - In the UML diagram in Figure 5, as well as in other UML diagrams,
>> "attributes" is defined as a field for Entity, Activity and others. Looking
>> just at the UML diagram, the reader may think that there is a field called
>> attributes!
>> - In the definition of communication, Section 5.1.5, it is stated that
>> "Communication is the exchange of an unspecified entity". Why do we require
>> that the entity be unspecified? Aren't we restricting those who may want
>> to specify the entity (or entities) exchanged between two activities?
>> I would suggest rephrasing that sentence along the following
>> lines: "Communication is the exchange of an entity that may be unspecified".
>> - I notice that Invalidation (Section 5.1.8) is not present in Figure 5.
>> - In section 5.2 (Component 2: Derivations), the first sentence in this
>> section says "The third component".
>> - I find the definition of "Primary Source" hard to follow. Can we
>> simplify it?
>> - In the definition of delegation, the activity is an optional argument.
>> What are the semantics of delegation when the activity is not specified? I
>> suspect that it means that the activity for which the delegation holds is
>> unknown. However, the reader may think that the delegation holds for all the
>> activities that are carried out by the agent in question.
>> - The first paragraph, 3rd sentence, in Section 5.4: "It comprises a
>> Bundle class and a subclass of Entity" -> "It specifies that Bundle is a
>> subclass of Entity".
>> - The first sentence in Example 40 states that "A provenance aggregator
>> could merge two bundles". The verb "merge" has strong semantics that do
>> not apply in this case. I think we could simply say "could union"?
>> - Section 5.5.3 on contextualization is difficult to follow. The third
>> paragraph in this section states that "A bundle's description provide a
>> context in which to interpret an entity in a domain-specific manner". This
>> is not reflected in the definition of bundle, which, from my understanding,
>> aggregates a number of provenance descriptions that happen (by accident) to
>> be in a bundle, e.g., a file. The notion of context and domain dependency
>> introduced in contextualization seems to assume that the provenance
>> descriptions within a bundle are domain dependent and that they
>> have been specified within a given context. The notion of context is also
>> loose, and can mean different things to different people.
>> Now, looking at example 45, it may be that the first paragraphs in
>> Section 5.5.3 are misleading, and that the purpose is to have something
>> simple. If the objective is basically to specify that a given entity e1 is
>> a specialization of another entity e2 and to be able to locate the bundle
>> in which e2 is described, then we should just do that. In other words, we
>> should use "specializationOf", and add a construct that specifies the bundle
>> in which a given entity is described, e.g., isDescribedIn(e2,bundle2)?
>> Therefore, to answer the question that the editor asked regarding
>> contextualization, I do not oppose its presence in the DM, but I think its
>> definition should be simplified substantially to reflect the way it will be
>> used in practice. I would also urge the editors to avoid using the term
>> contextualization as it is vague.
>> - In section 5.6.1, it is stated that collection is a multiset because it
>> may not be possible to verify that two distinct entity identifiers do not
>> denote the same entity. This is one reason, but not the main one.
>> Collection is a general construct, and we should allow people to construct
>> collections that contain duplicate entities with different or the same
>> identifiers.
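
To illustrate the last point about collections, here is a minimal sketch of the difference between multiset and set membership. The entity identifiers are made up for illustration, and Python's Counter merely stands in for a multiset; none of this is PROV-DM or PROV-N syntax.

```python
from collections import Counter

# Hypothetical entity identifiers. "e1" is deliberately listed twice, and
# "e1_alias" is a distinct identifier that may or may not denote the same entity.
members = ["e1", "e2", "e1", "e1_alias"]

# A multiset records every occurrence, so intentional duplicates survive.
collection = Counter(members)

# A plain set collapses the repeated identifier and loses that information.
as_set = set(members)

print(collection["e1"])          # occurrences of "e1" in the multiset
print(sum(collection.values()))  # total number of members, duplicates included
print(len(as_set))               # number of distinct identifiers only
```

Under multiset semantics the author of a collection can state duplicates on purpose, whether or not it can be verified that two identifiers denote the same entity.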
>> On 14 June 2012 12:07, Provenance Working Group Issue Tracker <
>> sysbot+tracker@w3.org> wrote:
>>> PROV-ISSUE-409 (prov-dm-review-LC): feedback on PROV-DM document (for
>>> last call release) [prov-dm]
>>> http://www.w3.org/2011/prov/track/issues/409
>>> Raised by: Luc Moreau
>>> On product: prov-dm
>>> This is the issue to collect feedback on the prov-dm document.
>>> Document to review is available from:
>>> http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120614/prov-dm.html
>>> Question for reviewers:
>>> http://www.w3.org/2011/prov/wiki/Meetings:Telecon2012.06.14
>>> Cheers,
>>> Luc
Received on Monday, 18 June 2012 16:22:22 UTC
