Re: Review of DM WD4

From: Luc Moreau <l.moreau@ecs.soton.ac.uk>
Date: Fri, 02 Mar 2012 17:06:28 +0100
Message-ID: <EMEW3|620f98b397f986868cc4e4f7b4ebe657o22F6b08l.moreau|ecs.soton.ac.uk|4F50F004.40900@ecs.soton.ac.uk>
To: Timothy Lebo <lebot@rpi.edu>
CC: "public-prov-wg@w3.org Group" <public-prov-wg@w3.org>

Hi Tim,

Paolo and I have made changes following your feedback.
Our responses can be found below.

This now completes WD4. Notes have been inserted in the document,
which we will tackle as part of WD5.

We are proposing to close ISSUE-274. Let us know if this is fine with you.

Regards,
Luc

 > I was asked to review DM WD3. This email constitutes my review.
 > I have included supplemental notes that I hope the DM editors will review and consider in future versions.
 > I have raised a few of the bigger issues in the tracker already.
 >
 > Regards,
 > Tim
 >
 > Goals of the review (per http://www.w3.org/2011/prov/wiki/Meetings:Telecon2012.02.16#PROV-DM_Simplification):
 >
 >      decide whether the new documents are inline with the simplification objective
 >
 > +1
 >
 >
 >      recommend whether they become the new editor's draft
 >
 > +1
 >
 >          if not, identify blocking issues
 >          if yes, identify potential issues to be raised against these future new editor's draft
 >
 >      decide whether ISSUE-145, ISSUE-183, ISSUE-215, ISSUE-225 and ISSUE-234 (all relating to identifiers) can be closed
 >
 >
 > ------
 > http://www.w3.org/2011/prov/track/issues/145
 > qualified identifiers may not work well with named graphs
 >
 > This issue can be CLOSED. The treatment of AccountEntities (which I hope will be renamed to prov:Provenance) and the section on provenance of provenance does not impose a scoping of identifiers.
 > This will make it easy to implement using RDF mechanisms
 >
 >
 > ------
 > http://www.w3.org/2011/prov/track/issues/183
 > identifiers in prov-dm
 >
 >
 > The use of identifiers is no longer confusing. They identify Entities, Activities, etc.
 > "Records" (a dying term) are not identified; the identifier they mention identifies the Entity, Activity, Involvement, etc.
 >
 >
 > ------
 > http://www.w3.org/2011/prov/track/issues/215
 > ProvenanceOfW3CReport
 >
 > The example is good because it shows two perspectives, which makes it easy to use for AccountEntity (prov:Provenance).
 > The identifiers make it a bit dry and hard to follow, but the concrete aspect is MUCH more useful.
 >
 >
 > ------
 > http://www.w3.org/2011/prov/track/issues/225
 > What are the objects in the universe of discourse?
 >
 > This can be CLOSED. It is not confusing in the current writeup.
 >
 > ------
 > http://www.w3.org/2011/prov/track/issues/234
 > id identifies entity, not the record
 >
 > Can be CLOSED.
 >
 >
 >
 >
 > ------- supplemental notes --------
 >
 >
 > About notes in http://www.w3.org/2011/prov/wiki/ProvDMWorkingDraft4#Design_decisions
 >
 >      If part 3 is now separate from part 1, there is no need to talk about 'Entity Record' (or whatever Record) in part 1. Instead, we can just mention Entity (or whatever other concept).

 > +1 This is much more natural
 >
 >      Given that Part 3 is just about ASN, and therefore is a language, then we can without confusion, talk about 'Entity Expression' since now these would be Expressions of the language
 >
 > +1
 >
 >      Does this mean that we would be dropping the term record entirely? What would we bundle up though?
 >
 > I would say we bundle up "expressions". One could bundle ASN expressions, RDF expressions, XML expressions, etc.
 >
 >      What about assertions? So should still use the word?
 >
 > I would suggest the more general term "expression" in place of "assertion".
 >
 >
 > ------- supplemental notes --------
 >
 > About http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html
 >
 >
 > Sections entitled "Activity-Entity Relation" seem a bit unnatural. Perhaps something like "Relations between Activities and Entities" would be clearer.

Title will change when components are introduced.

 >
 > The phrase "when the data it is about changes" is unclear.

Updated to:
  However, if data changes, it is challenging to express its provenance 
precisely, like it would be for any other form of metadata.

 >
 > "To address this challenge, an upgrade path is proposed to enrich simple provenance..." This paragraph is nice. I'd suggest including "specific subject" in "qualify the subject of provenance".

Done.

 >
 >
 > Is it okay to use ASN before it is defined? "In section 3, PROV-DM is applied to a short scenario, encoded in PROV-ASN, and illustrated graphically."
 >
 > "Section 4 provides the definition of PROV-DM." is a bit ambiguous. Please elaborate.

Section 4 provides the definition of PROV-DM constructs.

 >
 >
 > The following duplicates: "Activities that operate on digital entities may for example move, copy, or duplicate them. Activities that operate on digital entities may for example move, copy, or duplicate them."

Done.

 >
 > I propose to change the Agent definition from->to:
 > "An agent is a type of entity that can be associated to an activity, to indicate that it bears some form of responsibility for the activity taking place."
 > "An agent is a type of entity that bears some form of responsibility for an activity taking place."
 >

Yes, implemented.

 >
 > perhaps add the person invoking the grammar checker to the following example (to illustrate the levels of responsibility):
 > "Software for checking the use of grammar in a document may be defined as an agent of a document preparation activity, and at the same time one can describe its provenance, including for instance the vendor and the version history."

This is just an example of an agent; we shouldn't illustrate
responsibility here. That comes later.

 >
 > add "an" to "Generation is the completed production of a new entity by activity." -> "Generation is the completed production of a new entity by an activity."

done

 >
 > reads oddly: "the activity had not begun to consume or use to this entity"

dropped 'to'.

 >
 >
 > avoid parens in a definition: "(and could not have been affected by the entity)"
 >

Done
 >
 > avoid "internal" in collection definition "A collection is an entity that has internal structure." -> "A collection is an entity provides structure to some constituents." (or something)

Yes, done.

 >
 >
 > shocked by naming of "AccountEntity" why not "PlanEntity" and "CollectionEntity" (no, I don't want that...) I propose to rename "AccountEntity" to "Provenance"


This will be revisited as part of the overall discussion on accounts.
So, for now, no action.

Another option is to drop this subtype of entity. We don't need to express
provenance of provenance.

 >
 >
 > This sentence is long. Suggest stopping it at the first comma. "It is important to reflect that there is a degree in the responsibility of agents, and that is a major reason for distinguishing among all the agents that have some association with an activity and determine which ones are really the originators of the entity."
 > ("and that is a major reason for distinguishing" -> "There is a major reason for distinguishing")
 >

This paragraph was edited.

 >
 > Suggest removing "active" in "indicating that the agent had an active role in the activity". Does RPI have an active role in the writing of this email (since I'm an RPI student...)? I'd say they have a role, but not an active one.
 >

OK, dropped.

 >
 >
 > http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html#section-UML shows Activity wasStartedBy Agent, but Luc just said in email recently that only Activity wasStartedBy Activity is the way forward. I prefer Activity wasStartedBy Agent and think that some other involvement should be named for the special informed involvement Activity ?triggered? Activity.
 >
 >

A proposal, towards WD5, will be submitted for discussion by the WG. It
will address that point.

 >
 > "ex:pub2" is a bad name - is it an activity or entity? I recommend "ex:act2"
 >

Done

 >
 > why aren't the edges labeled in the example?

To be done, there is a note to that effect.
 >
 >
 > avoid term "minted" when talking about choosing a URI for a Resource. "minted" is colloquial.
 >

OK, changed to "generated".
 >
 > "3.3 Attribution of Provenance"  -- YES! :-)
 >
 >
 > The definition of Activity "An activity is anything that can operate on entities." seems to talk about the future
 >
 >

It's a general property of definitions: they don't refer to the past
explicitly. We think it's fine.

 >
 > activity(id, st, et, [ attr1=val1, ...]) does not include brackets for optional constituents st and et
 >

This is not a grammar, so it's not appropriate to use square brackets to
mark the optional nature of st and et.
The square brackets used for [ attr1=val1, ...] are part of the syntax!
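
To illustrate the distinction, here is a minimal sketch, not the normative
ASN serialization. The function name activity_expr and the use of "-" as a
stand-in for an omitted st or et are assumptions made for this example only;
the point is just that the brackets around the attribute list are emitted
literally, while optionality of st and et is handled outside the syntax.

```python
import json
from typing import Mapping, Optional


def activity_expr(
    ident: str,
    st: Optional[str] = None,
    et: Optional[str] = None,
    attrs: Optional[Mapping[str, object]] = None,
) -> str:
    """Render an expression shaped like activity(id, st, et, [ attr1=val1, ...])."""
    # Assumption for this sketch only: "-" marks an omitted optional constituent.
    start = st if st is not None else "-"
    end = et if et is not None else "-"
    # The square brackets are literal syntax, so they appear even when empty.
    pairs = ", ".join(
        f"{key}={json.dumps(value)}" for key, value in (attrs or {}).items()
    )
    return f"activity({ident}, {start}, {end}, [{pairs}])"


print(activity_expr("ex:act2", st="2012-03-02T17:06:28", attrs={"ex:port": "p1"}))
```

Here st and et are optional at the level of the function signature, whereas
the square brackets are always part of the rendered text.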


 >
 > "(This type is equivalent to a "foaf:person" [FOAF])" --> we should not bind ourselves to FOAF.
 >
 >

We removed references to FOAF.

 >
 >
 > Please add a note to section Note to encourage people to use Account / AccountEntity / Provenance to annotate provenance assertions as a better practice. When using AccountEntity, the annotated thing can be described _directly_ as a single triple instead of using Notes. Notes are very much "scruffy provenance" and do not benefit from the directness afforded by AccountEntity / prov:Provenance.
 >
 > :prov_1 {
 >  :simon a prov:Human;
 >         prov:hasAnnotation [
 >              a prov:Note; ex3:reputation "excellent";
 >              rdfs:comment "This is a kludge way to get indirection. Use prov:Provenance instead.";
 >         ];
 > }
 >
 > :prov_2 {
 >   :simon ex3:reputation "excellent" .
 > }
 >
 > :prov_1 a prov:Provenance; prov:wasAttributedTo :first_asserter .
 > :prov_2 a prov:Provenance; prov:wasAttributedTo :trust_evaluator_agent .

See email discussion. I don't think we have reached agreement yet.

 >
 >
 > I'm starting to agree that wasGeneratedBy(id,e,a,t,attrs) should become Generation(id,e,a,t,attrs)
 >
 >

We feel that even if the activity is not specified, there is an implied
activity, so it is reasonable to keep the name wasGeneratedBy. Thoughts?

 >
 >
 > This starts to distract, I think: "While each of the components activity, time, and attributes is optional, at least one of them must be present."
 > Permitting degenerate cases should not be a priority. If not much (or nothing) is said with an assertion, let it be.
 >
 >

We introduced this statement to address ISSUE-XXX. We
don't feel it's a distraction.
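
The constraint in the quoted sentence ("at least one of them must be
present") could be checked along the following lines. This is a sketch only:
the function name generation_is_well_formed is hypothetical, and treating an
empty attribute list as absent is an assumption, not something the draft
states.

```python
from typing import Mapping, Optional


def generation_is_well_formed(
    entity: str,
    activity: Optional[str] = None,
    time: Optional[str] = None,
    attrs: Optional[Mapping[str, object]] = None,
) -> bool:
    """Return True when at least one of activity, time, attrs is present."""
    # Assumption: an empty attribute list counts as absent.
    return activity is not None or time is not None or bool(attrs)


# A generation that names only the entity says nothing beyond "e1 exists".
print(generation_is_well_formed("e1"))
# Naming the activity is enough to satisfy the constraint.
print(generation_is_well_formed("e1", activity="a1"))
```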

 >
 >
 > remove "order" from "wasGeneratedBy(e1,a1, 2001-10-26T21:32:52, [ex:port="p1", ex:order=1])" because it is distracting and encourages not using PROV for things that PROV should do.
 > I think Paolo agreed to this before.


We don't see this as distracting; it's an example of a real use case in
workflows.
What is it that is being discouraged by this example?

 >
 >
 > both agents are responsible in Responsibility. Suggest to rename "responsible" to "superior" in "responsible: an identifier for the agent, on behalf of which the subordinate agent acted;" in section 4.2.3.1
 >

What about  deputy and superior?

(PS. Oxford American suggests 'second banana' ;-)


 >
 >
 > two wasQuotedFroms in the UML diagram in section 5

Should be original Source. Figure edited.


On 22/02/2012 19:46, Timothy Lebo wrote:
Received on Saturday, 3 March 2012 15:10:34 GMT
