Re: Review of DM WD4

Hi Luc,

On 02/03/2012 16:06, Luc Moreau wrote:
> Hi Tim
>
> Paolo and I have made changes following your feedback.
> Our responses can be found below.
>
> This now completes WD4. Notes have been inserted in the document,
> which we will tackle as part of WD5.

Thanks for your efforts in incorporating my feedback on the DM.
>
> We are proposing to close ISSUE-274. Let us know if this is fine with 
> you.

Yes please, go ahead.

Khalid

> Regards,
> Luc
>
> > I was asked to review DM WD3. This email constitutes my review.
> > I have included supplemental notes that I hope the DM editors will 
> review and consider in future versions.
> > I have raised a few of the bigger issues in the tracker already.
> >
> > Regards,
> > Tim
> >
> > Goals of the review (per 
> http://www.w3.org/2011/prov/wiki/Meetings:Telecon2012.02.16#PROV-DM_Simplification):
> >
> >     • decide whether the new documents are in line with the 
> simplification objective
> >
> > +1
> >
> >
> >     • recommend whether they become the new editor's draft
> >
> > +1
> >
> >         • if not, identify blocking issues
> >         • if yes, identify potential issues to be raised against 
> these future new editor's draft
> >
> >     • decide whether ISSUE-145, ISSUE-183, ISSUE-215, ISSUE-225 and 
> ISSUE-234 (all relating to identifiers) can be closed
> >
> >
> > ------
> > http://www.w3.org/2011/prov/track/issues/145
> > qualified identifiers may not work well with named graphs
> >
> > This issue can be CLOSED. The treatment of AccountEntities (which I 
> hope will be renamed to prov:Provenance) and the section on provenance 
> of provenance does not impose a scoping of identifiers.
> > This will make it easy to implement using RDF mechanisms
> >
> >
> > ------
> > http://www.w3.org/2011/prov/track/issues/183
> > identifiers in prov-dm
> >
> >
> > The use of identifiers is no longer confusing. They identify 
> Entities, Activities, etc.
> > "Records" (a dying term) are not identified; the identifier they 
> mention identifies the Entity, Activity, Involvement, etc.
> >
> >
> > ------
> > http://www.w3.org/2011/prov/track/issues/215
> > ProvenanceOfW3CReport
> >
> > The example is good because it shows two perspectives, which makes 
> it easy to use for AccountEntity (prov:Provenance).
> > The identifiers make it a bit dry and hard to follow, but the 
> concrete aspect is MUCH more useful.
> >
> >
> > ------
> > http://www.w3.org/2011/prov/track/issues/225
> > What are the objects in the universe of discourse?
> >
> > This can be CLOSED. It is not confusing in the current writeup.
> >
> > ------
> > http://www.w3.org/2011/prov/track/issues/234
> > id identifies entity, not the record
> >
> > Can be CLOSED.
> >
> >
> >
> >
> > ------- supplemental notes --------
> >
> >
> > About notes in 
> http://www.w3.org/2011/prov/wiki/ProvDMWorkingDraft4#Design_decisions
> >
> >     • If part 3 is now separate from part 1, there is no need to 
> talk about 'Entity Record' (or whatever Record) in part 1. Instead, we 
> can just mention Entity (or whatever other concept)
>
> > +1 This is much more natural
> >
> >     • Given that Part 3 is just about ASN, and therefore is a 
> language, then we can without confusion, talk about 'Entity 
> Expression' since now these would be Expressions of the language
> >
> > +1
> >
> >     • Does this mean that we would be dropping the term record 
> entirely? What would we bundle up though?
> >
> > I would say we bundle up "expressions". One could bundle ASN 
> expressions, RDF expressions, XML expressions, etc.
> >
> >     • What about assertions? So should we still use the word?
> >
> > I would suggest the more general term "expression" in place of 
> "assertion".
> >
> >
> > ------- supplemental notes --------
> >
> > About 
> http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html
> >
> >
> > Sections entitled "Activity-Entity Relation" seem a bit unnatural. 
> Perhaps something like "Relations between Activities and Entities" 
> would be clearer.
>
> Title will change when components are introduced.
>
> >
> > The phrase "when the data it is about changes" is unclear.
>
> Updated to:
>  However, if data changes, it is challenging to express its provenance 
> precisely, like it would be for any other form of metadata.
>
> >
> > "To address this challenge, an upgrade path is proposed to enrich 
> simple provenance..." This paragraph is nice. I'd suggest including 
> "specific subject" in "qualify the subject of provenance".
>
> Done.
>
> >
> >
> > Is it okay to use ASN before it is defined? "In section 3, PROV-DM 
> is applied to a short scenario, encoded in PROV-ASN, and illustrated 
> graphically."
> >
> > "Section 4 provides the definition of PROV-DM." is a bit ambiguous. 
> Please elaborate.
>
> Section 4 provides the definition of PROV-DM constructs.
>
> >
> >
> > The following duplicates: "Activities that operate on digital 
> entities may for example move, copy, or duplicate them. Activities 
> that operate on digital entities may for example move, copy, or 
> duplicate them."
>
> Done.
>
> >
> > I propose to change the Agent definition from->to:
> > "An agent is a type of entity that can be associated to an activity, 
> to indicate that it bears some form of responsibility for the activity 
> taking place."
> > "An agent is a type of entity that bears some form of responsibility 
> for an activity taking place."
> >
>
> Yes, implemented.
>
> >
> > perhaps add the person invoking the grammar checker to the following 
> example (to illustrate the levels of responsibility):
> > "Software for checking the use of grammar in a document may be 
> defined as an agent of a document preparation activity, and at the 
> same time one can describe its provenance, including for instance the 
> vendor and the version history."
>
> This is just an example for agent, we shouldn't illustrate 
> responsibility here. This comes later.
>
> >
> > add "an" to "Generation is the completed production of a new entity 
> by activity." -> "Generation is the completed production of a new 
> entity by an activity."
>
> done
>
> >
> > reads oddly: "the activity had not begun to consume or use to this 
> entity"
>
> dropped 'to'.
>
> >
> >
> > avoid parens in a definition: "(and could not have been affected by 
> the entity)"
> >
>
> Done
> >
> > avoid "internal" in collection definition "A collection is an entity 
> that has internal structure." -> "A collection is an entity that provides 
> structure to some constituents." (or something)
>
> Yes, done.
>
> >
> >
> > shocked by naming of "AccountEntity" why not "PlanEntity" and 
> "CollectionEntity" (no, I don't want that...) I propose to rename 
> "AccountEntity" to "Provenance"
>
>
> This will be revisited as part of the overall discussion on accounts.
> So, for now, no action.
>
> Other option is to drop this subtype of entity. We don't need to 
> express provenance of provenance.
>
> >
> >
> > This sentence is long. Suggest stopping it at the first comma. "It 
> is important to reflect that there is a degree in the responsibility 
> of agents, and that is a major reason for distinguishing among all the 
> agents that have some association with an activity and determine which 
> ones are really the originators of the entity."
> > ("and that is a major reason for distinguishing" -> "There is a 
> major reason for distinguishing")
> >
>
> This paragraph was edited.
>
> >
> > Suggest removing "active" in "indicating that the agent had an 
> active role in the activity". Does RPI have an active role in the 
> writing of this email (since I'm an RPI student...)? I'd say they have 
> a role, but not an active one.
> >
>
> OK, dropped.
>
> >
> >
> > 
> http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html#section-UML 
> shows Activity wasStartedBy Agent, but Luc just said in email recently 
> that only Activity wasStartedBy Activity is the way forward. I prefer 
> Activity wasStartedBy Agent and think that some other involvement 
> should be named for the special informed involvement Activity 
> ?triggered? Activity.
> >
> >
>
> A proposal, towards WD5, will be submitted to discussion by the WG. It 
> will address that point.
>
> >
> > "ex:pub2" is a bad name - is it an activity or entity? I recommend 
> "ex:act2"
> >
>
> Done
>
> >
> > why aren't the edges labeled in the example?
>
> To be done, there is a note to that effect.
> >
> >
> > avoid term "minted" when talking about choosing a URI for a 
> Resource. "minted" is colloquial.
> >
>
> OK, generated.
> >
> > "3.3 Attribution of Provenance"  -- YES! :-)
> >
> >
> > The definition of Activity "An activity is anything that can operate 
> on entities." seems to talk about the future
> >
> >
>
> It's a general property of definitions, they don't refer to the past 
> explicitly. We think it's fine.
>
> >
> > activity(id, st, et, [ attr1=val1, ...]) does not include brackets for 
> optional constituents st and et
> >
>
> This is not a grammar, so it's not appropriate to use square brackets 
> to mark the optional nature.
> The square brackets used for [ attr1=val1, ...] are part of the syntax!
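
To illustrate the point above, here is a minimal sketch of how an activity expression could be serialised when st and et are optional. The function names (format_attrs, format_activity) are hypothetical, and the "-" placeholder for an absent time is an assumption of this sketch, not something the draft under review specifies. The square brackets around the attributes are emitted literally, since they are part of the syntax.

```python
from typing import Mapping, Optional, Union

AttrValue = Union[str, int]


def format_attrs(attrs: Mapping[str, AttrValue]) -> str:
    """Render attributes as [name=value, ...]; the brackets are literal syntax."""
    items = []
    for name, value in attrs.items():
        # Strings are quoted, other values (e.g. integers) are written as-is.
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        items.append(f"{name}={rendered}")
    return "[" + ", ".join(items) + "]"


def format_activity(
    ident: str,
    st: Optional[str] = None,
    et: Optional[str] = None,
    attrs: Optional[Mapping[str, AttrValue]] = None,
) -> str:
    """Serialise an activity; st and et are optional constituents."""
    # ASSUMPTION: "-" marks an absent optional constituent in this sketch.
    start = st if st is not None else "-"
    end = et if et is not None else "-"
    return f"activity({ident}, {start}, {end}, {format_attrs(attrs or {})})"
```

Optionality is carried by the function signature here rather than by bracket notation, so the only brackets in the output are the ones belonging to the attribute list.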
>
>
> >
> > "(This type is equivalent to a "foaf:person" [FOAF])"   --> we 
> should not bind ourselves to FOAF.
> >
> >
>
> We removed references to FOAF.
>
> >
> >
> > Please add a note to section Note to encourage people to use Account 
> / AccountEntity/ Provenance to annotate provenance assertions as a 
> better practice. When using AccountEntity, the annotated thing can be 
> described _directly_ as a single triple instead of using Notes. Notes 
> are very much "scruffy  provenance" and do not benefit from the 
> directness afforded by AccountEntity / prov:Provenance.
> >
> > :prov_1 {
> >  :simon a prov:Human;
> >         prov:hasAnnotation [
> >              a prov:Note; ex3:reputation "excellent";
> >              rdfs:comment "This is a kludge way to get indirection. 
> Use prov:Provenance instead.";
> >         ];
> > }
> >
> > :prov_2 {
> >   :simon ex3:reputation "excellent" .
> > }
> >
> > :prov_1 a prov:Provenance; prov:wasAttributedTo :first_asserter .
> > :prov_2 a prov:Provenance; prov:wasAttributedTo 
> :trust_evaluator_agent .
>
> See email discussion. I don't think we have reached agreement yet.
>
> >
> >
> > I'm starting to agree that wasGeneratedBy(id,e,a,t,attrs) should 
> become Generation(id,e,a,t,attrs)
> >
> >
>
> We feel that even if the activity is not specified, there is an 
> implied activity, so it is reasonable to keep
> the name wasGeneratedBy. Thoughts?
>
> >
> >
> > This starts to distract, I think: "While each of the components 
> activity, time, and attributes is optional, at least one of them must 
> be present."
> > Permitting degenerate cases should not be a priority. If not much 
> (or nothing) is said with an assertion, let it be.
> >
> >
>
> It's to address ISSUE-XXX that we have introduced this statement. We 
> don't feel it's a distraction.
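
For reference, the constraint being debated ("at least one of activity, time, and attributes must be present") can be sketched as a simple check. This is only an illustration under stated assumptions: the Generation class and its is_well_formed method are hypothetical names, not part of PROV-DM or of any existing library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Generation:
    """Hypothetical in-memory form of wasGeneratedBy(id, e, a, t, attrs)."""

    entity: str
    activity: Optional[str] = None
    time: Optional[str] = None
    attrs: Dict[str, object] = field(default_factory=dict)

    def is_well_formed(self) -> bool:
        # Each of activity, time, and attributes is optional on its own,
        # but at least one of them must be present.
        return (
            self.activity is not None
            or self.time is not None
            or bool(self.attrs)
        )
```

Under this check, Generation("e1", activity="a1") is accepted, while Generation("e1") alone is the degenerate case the quoted sentence rules out.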
>
> >
> >
> > remove "order" from "wasGeneratedBy(e1,a1, 2001-10-26T21:32:52, 
> [ex:port="p1", ex:order=1])" because it is distracting and encourages 
> not using PROV for things that PROV should do.
> > I think Paolo agreed to this before.
>
>
> We don't see this as distracting, it's an example, a real use case in 
> workflow.
> What is it that is being discouraged by this example?
>
> >
> >
> > both agents are responsible in Responsibility. Suggest to rename 
> "responsible" to "superior" in "responsible: an identifier for the 
> agent, on behalf of which the subordinate agent acted;" in section 
> 4.2.3.1
> >
>
> What about  deputy and superior?
>
> (PS. Oxford American suggests 'second banana' ;-)
>
>
> >
> >
> > two wasQuotedFroms in the UML diagram in section 5
>
> Should be original Source. Figure edited.
>
>
> On 22/02/2012 19:46, Timothy Lebo wrote:
>
>

Received on Monday, 5 March 2012 14:35:17 UTC