Re: Comment on the DM

Tracker, this is now ISSUE-274

On 24/02/2012 20:18, Khalid Belhajjame wrote:
> Hi,
> I read mainly Part-1, and briefly looked at Part-2.
> I think that the simplification is in the right direction. I think, 
> however, that part-1 can be further simplified by moving some 
> definitions and details to part-2. I will give more details on this 
> later on in the email.
> Below are the comments.
> - I think the title of part-2 is misleading, as it does not contain 
> only constraints but also definitions that are not present in part-1, 
> and revises other definitions to provide more details, e.g., Entity. 
> Therefore, I wonder if it would be better to rename part-1 and part-2. 
> I couldn’t find better titles though. I thought of “core prov-dm” for 
> part-1, and “extended prov-dm” for part 2, but that is not really what 
> the two parts are about.
> - ASN is used in part-1, but not introduced. A brief definition when 
> it is used for the first time, for example, may be good.
> - In the first paragraph of Section 2.1, it is said that “provenance of 
> Entities, that is of things in the world”. I am not sure that is the 
> case: provenance of entities is not the same as provenance of things.
> - In the same section 2.1, it is said that “The definition of agent 
> intentionally stays away from using concepts such as enabling, 
> causing, *initiating*, affecting…”. Isn’t wasStartedBy, which is 
> defined in Section , used to specify that an agent initiated 
> the execution of an activity?
> - The examples of generation and usage that are given in Section 2.2 
> are complicated. They try to give a precise definition of what 
> generation and usage are by taking time into account, e.g., “Examples 
> of generation are the *completed* creation of a file by a program”. I 
> think that at this stage it would be less confusing for the reader to 
> simply know that the creation of a file is an example of generation.
> - In Section 2.3, plan is used in the text without being introduced 
> before.
> - I have the impression that the diagram presented in Section 2.5 
> would be more useful if placed at the beginning of Section 2. Also, 
> this diagram was not clear, i.e., the quality of the image was bad, 
> when I printed it out on paper.
> - The title of Section 3.2 “The Authors View” is confusing. A reader 
> that is quickly browsing the document may think that this section 
> gives the views of the prov-dm authors about the prov-dm document :-)
> - In Section 4, first paragraph: “We revisit each concept 
> *introduction* in Section 2” -> introduced
> - In the definition of Entity in Section 4.1.1: “id: an identifier 
> identifying an entity” -> “id: an entity identifier”.
> - In the definition of Entity in Section 4.1.1: “attributes: an 
> Optional set of attribute-value pairs *representing this entity’s 
> situation in the world*” -> “characterizing the thing that the entity 
> represents”, or something along these lines.
> - In the same section, the constraint that the sets of Activities and 
> Entities are disjoint is presented; later on, in Section 4.1.2, this 
> constraint is explained further. However, the explanation is based on 
> details that are not present in part-1, but are presented later on in 
> part-2, specifically that “an entity exists in full at any point in 
> its lifetime, persists during this interval, and preserves the 
> characteristics that makes it identifiable”. I would therefore 
> suggest moving the discussion about the above constraint, i.e., that 
> entities and activities are disjoint, to the constraint document.
> - In Section Generation, it is said that “While each of the 
> components activity, time, and attributes is Optional, at least one of 
> them must be present”. I wonder if there is a straightforward way to 
> encode this constraint in the serializations of prov-dm, in 
> particular prov-o.
> - In Section Responsibility Chain, in the definition of 
> actedOnBehalfOf, it is specified that activity can be optional. We 
> need to add some details to specify the semantics of 
> actedOnBehalfOf when activity is not given as an argument: does it 
> mean that a given agent ag1 acts on behalf of another agent ag2 in 
> all the activities that ag1 is involved in?
> - Section presents derivation. If the objective is to simplify 
> part-1, then this section needs serious simplification :-) In 
> particular, there are three versions of derivation: precise-1, 
> imprecise-1 and imprecise-n. I was thinking of presenting only one, 
> e.g., imprecise, without saying that it is imprecise, and giving more 
> details about the different kinds of derivations in the constraint 
> document. Also, I think traceability, which is presented later on in 5, 
> is a first class relation, and therefore should be introduced when 
> speaking about entity-entity relations in Section 4.2.3.
> - Section on Alternate and Specialization can be moved to 
> part-2, since to grasp these relations one needs to have more details 
> about what an entity represents, which are given in part-2.
> - In Section 4.2 Relation, I think the order in which the subsections of 
> this section are presented should be rethought. In particular, I have 
> the impression that the reader would be interested to know about 
> entity-entity relations, which are probably the most important 
> relations in provenance, before getting to know what the 
> agent-activity and agent-agent relations are.
> - The table presented in Section 4.2 needs some text that explains to 
> the reader how it can be read.
> Hope these comments will be of help, khalid

Received on Wednesday, 29 February 2012 05:25:00 UTC