Re: PROV-ISSUE-333 (review-prov-dm-constraints-wd5): issue to collect feedback on prov-dm-constraints wd5 [prov-dm-constraints]

From: Timothy Lebo <lebot@rpi.edu>
Date: Wed, 11 Apr 2012 12:34:51 -0400
To: Provenance Working Group <public-prov-wg@w3.org>
Message-Id: <A8A783A5-A255-4ED5-B580-FF25911A832C@rpi.edu>

Please find here:
* summary of my review
* response to your review questions
* general comments on the document

summary



Separating the events from the DM was and still is a good idea.
The need for events is mostly motivated (I have a suggestion below on how to make it more compelling).
The types of events are straightforward.
The need for event ordering is straightforward.

However, the organization and navigation of the document are disorienting and confusing.
The cause of this is a mix of things:
* The naming of the sections
* The contextual narrative around the links to constraints.
* Unclear distinctions among the types of constraints, why the distinction matters, why they are placed in different areas, and when and why one needs to use each.

Given the disorientation and confusion accumulated up to #45, I could not focus on the content and thus cannot give meaningful feedback on the content.

BLOCKER: #34, #40, #42 below.

My current vote for releasing this is -1.

After blockers #34, #40, #42 are fixed, I will need to vote +0 for public release, because I could not work through the content well enough.

editors' questions

* Can the document be released as a next public working draft? If no, what are the blocking issues?
  No. See #34, #40, #42 below.
* Is the structure of the document approved?
  No. It is disorienting and confusing.
* Can the short name of the document be confirmed (in particular, for prov-n, prov-dm-constraints, since request needs to be sent for publication)?
* If a reviewer raised some issues (closed pending review), can they be closed?
  As long as the normal process is used to close them (where the raiser is informed that it is being closed).
* Can all concept definitions be confirmed? Specifically,
  consider ISSUE-337 on agents
  consider ISSUE-223 on entities
  No. I have many issues on Accounts. But this is not a blocker for this public release.

general comments

"actities" typo

perhaps simplify:
"agents bearing responsibility for entities that were generated and actities that happened"
"agents bearing responsibility for activities that happened"
and leave the fact that activities generate entities for later.

This summary seems to depend on more knowledge than should be required:
"Structural constraints are further constraints to be satisfied by generation descriptions"
* suggest to modify so that it can stand on its own.

section 1 - intro
The order of sections 4-8 seems unnatural. Why are accounts at the very end of PROV-DM (almost not even in the document), yet the second of the five main sections in dm-constraints?

"(when entities are generated, used, or invalidated, or activities started or ended)" should be promoted to something more than an afterthought. It assumes that too much is known about the DM. Ease the reader in a bit more by piecing the DM concepts together to motivate the need for an Event.

"make their descriptions more robust." - robust isn't compelling on its own. What are the benefits of knowing and heeding the interpretation provided in -constraints?

"used in many different contexts: in a single system, across the Web, or in spatial data management, to name a few."
* the third example is a type of application and is out of place here.
* suggest to simplify to  "used in many different contexts within individual systems and across the Web."

"Hence, it is a design objective of PROV-DM to minimize the assumptions about time,"
"Hence, PROV-DM is designed to minimize assumptions about time,"

It seems like "unsynchronized clocks" should be mentioned within:
Although time is critical, we should also recognize that provenance can be used in many different contexts: in a single system, across the Web, or in spatial data management, to name a few. Hence, it is a design objective of PROV-DM to minimize the assumptions about time, so that PROV-DM can be used in varied contexts.
* this is to motivate the need to minimize assumptions about time.
* it sufficiently argues for the need for events (without the point 10 below)

This point has always seemed pedantic to me. And the current argument phrasing remains un-compelling and out of place.
Furthermore, consider two activities that started at the same time instant. Just by referring to that instant, we cannot distinguish which activity start we refer to. This is particularly relevant if we try to explain that the start of these activities had different reasons. We need to be able to refer to the start of an activity as a first class concept, so that we can talk about it and about its relation with respect to other similar starts.
* strongly recommend to drop this paragraph.


The use of "underpin" throughout this document reads like "we know better than you and we're candy coating it for you". It's rather offensive.
* recommend recasting those phrases with the much more compelling use as stated earlier in the document: "These constraints help provide an interpretation for provenance descriptions".
"Five kinds of instantaneous events underpin the PROV-DM data model."
"Five kinds of instantaneous events are used to provide an interpretation for the PROV-DM data model."

Intuitively, I order usages before generations (I use something to make something). The document does the opposite, probably because it takes an entity viewpoint while I take an activity viewpoint. As a human that participates in activities, it seems unnatural to impose the entity viewpoint.
* suggest to reorder usage before generation when listing or discussing them.

Have we settled on the naming "invalidation"?
"destruction" seems much more appropriate.
dictionary: Generation definition 4: creation.
thesaurus: Creation antonym destruction.
then the opposite of invalidation should be validation, no? It's currently "generation".

a more natural order seems to be: start, end, usage, generation, invalidation.
this is the order that a single activity would go through these events.

suggest to swap order of 2.1.1 and 2.1.2
* this lets you complete your justification for "events" before digging in to the 5 kinds that DM needs.

Sorry, I'm slow. Does "precedes is defined as the inverse of follows." mean 
"occurs at the same time as or before another"
"occurs after another" ?
I'd rather have it explicitly there, if only in parens.
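
To make the reading I am asking about concrete, here is a minimal Python sketch. It assumes that "follows" means "occurs at the same time as or after another" and that events can be compared by a numeric timestamp; both are my assumptions for illustration, not text from the document, and the function names are hypothetical.

```python
# Assumption: an event is represented only by a comparable timestamp.
# Assumption: "follows" is non-strict, i.e. "at the same time as or after".
def follows(event_time: float, other_time: float) -> bool:
    """True if the first event occurs at the same time as or after the second."""
    return event_time >= other_time


def precedes(event_time: float, other_time: float) -> bool:
    """Inverse of follows: simply swap the arguments."""
    return follows(other_time, event_time)


if __name__ == "__main__":
    print(precedes(1.0, 2.0))  # earlier event precedes later one
    print(precedes(2.0, 2.0))  # simultaneous events also "precede" under this reading
    print(precedes(3.0, 2.0))  # later event does not precede earlier one
```

Under this reading, "precedes" comes out as "occurs at the same time as or before another", which is exactly the kind of statement I would like spelled out in the text.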

This statement needs help in a handful of ways.
"This specification introduces a set of "temporal interpretation" rules allowing ordering constraints between instantaneous event to inferred from provenance descriptions."


What is being said here, and why?
PROV-DM also allows for time observations to be inserted in specific descriptions, for each recognized instantaneous event introduced in this specification. The presence of a time observation for a given instantaneous event fixes the mapping of this instantaneous event to the timeline. It can also help with the verification of associated ordering constraints (though, again, this verification is outside the scope of this specification).

quite long:
"From a provenance viewpoint, it is important to identify a "partial state" of something, i.e. something with some aspects that have been fixed, so that it becomes possible to express its provenance, and what causes that thing, with these specific aspects to be as such."
"From a provenance viewpoint, it is important to identify a "partial state" of something, i.e. something with some aspects that have been fixed, so that it becomes possible to express its provenance (i.e. what caused the thing with these specific aspects)."
* suggest rewrite above

"It is the purpose of attributes in PROV-DM to help fix some aspect of entities."
"Attributes in PROV-DM are used to fix certain aspects of entities."

What is the purpose of:
", and linking them to the very existence of entities." ?
* suggest removing it because it doesn't add anything.

The following is not open world friendly:
"An entity's attribute-value pairs are specified when the entity description is created and remain unchanged."
"An entity's attribute-value pairs are established when the entity is created and remain unchanged."
* recommend the above rephrase. Note the change from "description" to the entity itself. Yes, they are fixed, but we may not know what they are upon first description.

section 2.2
"An entity fixes some aspects of a thing and its situation in the world."
* recommend removing this statement. "Entity" was already defined adequately in the paragraph above.
What benefit do we get by bringing in the baggage of this squirming ever-changing Thing that we try to fix with Entities?

The statement in 2.2: "For each perspective, an entity may be expressed:" is VERY powerful. This should be the epicenter for the Entity v Thing and spec/alt discussions.

The paragraph seems to belong inside of the Example markup, not after it:
We do not assume that any entity is more important than any other; in fact, it is possible to describe the processing that occurred for the report to be commissioned, for individual versions to be created, for those versions to be published at the given URL, etc., each via a different entity with attribute-value pairs that fix some aspect of the report appropriately.

"Attributes are not restricted to entities, but they belong to a variety of PROV-DM objects, including activity"
"Attributes are not restricted to entities; they belong to a variety of PROV-DM objects including activity"

"for a given object, are expected"
"for a given object are expected"

"An account is a entity"
"An account is an entity"

"An account is as a container of provenance descriptions, hence its content may change over time."
"An account is a bundle of provenance descriptions whose content may change over time."



"If an account's set of descriptions changes over time, it increases monotonically with time."
"If an account's set of descriptions changes over time, it SHOULD increase monotonically with time."
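
As a rough illustration of what "increase monotonically" could mean in practice, here is a small Python sketch. It assumes an account can be observed as a sequence of snapshots, each a set of description strings; that representation and the function name are hypothetical and not taken from the document.

```python
from typing import List, Set


def grows_monotonically(snapshots: List[Set[str]]) -> bool:
    """True if every snapshot contains all descriptions of the one before it."""
    return all(
        earlier <= later  # subset test: nothing was removed
        for earlier, later in zip(snapshots, snapshots[1:])
    )


if __name__ == "__main__":
    # Hypothetical description strings, for illustration only.
    ok = [{"entity(e1)"}, {"entity(e1)", "activity(a1)"}]
    bad = [{"entity(e1)", "activity(a1)"}, {"entity(e1)"}]
    print(grows_monotonically(ok))
    print(grows_monotonically(bad))
```

With SHOULD rather than a flat assertion, a history like the second one would be discouraged rather than declared impossible.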


Same as #22 above:
"An entity's attribute-value pairs are specified when the entity description is created and remain unchanged"
-> ~~~= "An entity's attribute-value pairs are established when the entity is created and remain unchanged."

The following is unmotivated 
"In order to describe something over several intervals, it is required to create multiple entities, each with its own identifier. This allow potential dependencies between the various entities to be expressed."
* what imposed those intervals? Why can't I just use one big interval? Would (external) usages impose an interval when I don't care about the partition?
* If this is talked about somewhere else in the document, link to the discussion.

"This allow potential dependencies between the various entities to be expressed."
"This allows potential dependencies between the various entities to be expressed."

This is unclear, ungrounded:
"This allows potential dependencies between the various entities to be expressed."

By the time I'm in 3.1.2, I'm glad that the event discussions didn't clutter the DM and were separated out.

does not clearly delineate where the discussion for one constraint starts and where the next ends.


The pattern of section 3.1 is confusing. It is obvious that you're essentially redefining the terms from DM, but then it slips to the "interpretation" link.
It's not clear why this pattern is followed, and why the distinctions matter. I'm now spread across three areas to understand "Entity"?
* recommend adding a meta discourse describing this pattern in the very beginning of section 3 (reminding me if you told me already). Answer why you're pointing me to the interpretations, and how this differs from your redefinition in section 3.
e.g. "In this section, we provide elaborated definitions of the terms from [[PROV-DM]]. After each definition, a pointer to an interpretation is provided. The interpretation is useful because….."

"An activity's attribute-value pairs are expected to describe the activity's situation during its interval"
"An activity is not an entity. Indeed, an entity exists in full at any point in its lifetime, persists during this interval, and preserves the characteristics that makes it identifiable. In contrast, an activity is something that occurs, happens, unfolds, or develops through time, but is typically not identifiable by the characteristics it exhibits at any point during its duration"

The link behind "event" in:
"A generation is an instantaneous world event"
* why is the text just "event" and not "instantaneous event"?
* why is it linking to a definition of "entity generation event" within the definition of "3.1.3 Generation"
You're spreading me all over the place!

does not link to anything.
(okay, it does. but it's not clear that it _is_)
* suggest linking to the section "2.1 Time and Event" instead of the sentence defining it, since the direct link isn't giving enough context.

* recommend never to mention simply "id", as in "A generation's id"
* recommend to expand to identifier.

by 3.1.3 Generation
I was already dealing with (and postponing) "interpretations" (whose intended use was never explained to me).
Now I'm given an in-place constraint and a structural constraint (for generation).
The only time these were distinguished was in section 1, which gave a somewhat reasonable summary AS LONG AS the distinction (and how to use them) is explained further down. By this section, it hasn't become clear and I'm now confused.
* recommend adding a discussion about the distinction among these in Section 1 - as well as at the beginning of section 3 in the meta-discourse asked for in #35 above.


recommend adding a link to "interpretation:" whenever you indicate one. e.g.
Interpretation: For the interpretation of a generation, see generation-within-activity.
^^^ this would link to the definition of an interpretation constraint which would be right next to the definitions of the other constraint types. 


It appears that the link:
skips past the discussion that introduces the constraint.
* Recommend moving the anchor up to include the discussion, or to move the text below the anchor.

42) (thanks for fish) 
I'm running into a lot of navigational confusion.

When I'm in "PROV-DM Definitional Constraints and Inferences"
e.g. at http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120402/prov-dm-constraints.html#term-Generation

I'm shown:
For the interpretation of a generation, see generation-within-activity. 

which links to something in section PROV-DM Event Ordering Constraints

The naming is inconsistent and disorienting. If I go to see an interpretation, shouldn't that be in a section with "interpretation" in the title?
Or, if I go to an "event ordering constraint", shouldn't I be told that when you offer me the link (instead of "interpretation")?

As I bounce back and forth among these links, I couldn't tell you what section I'm in even if I had the outline in front of me.
The map doesn't match the road signs.

This is related to #40 above in that the distinction between the types of constraints is not clear. The multiple naming discussed in this #42 isn't helping that.


"Section section-time-event" is disorienting.
* Recommend using section numbers.

is out of order w.r.t. the figure ordering. It comes between b and c.
Also, it does not include the orienting guide of "This is illustrated by Subfigure constraint-summary (XXXX) and expressed by constraint QQQQQ."

links to the BOTTOM of the figure, making it invisible after following the link (one needs to scroll up).

On Mar 29, 2012, at 9:41 AM, Provenance Working Group Issue Tracker wrote:

> PROV-ISSUE-333 (review-prov-dm-constraints-wd5): issue to collect feedback on prov-dm-constraints wd5   [prov-dm-constraints]
> http://www.w3.org/2011/prov/track/issues/333
> Raised by: Luc Moreau
> On product: prov-dm-constraints
> When sending feedback on prov-dm-constraints (wd5), please send it under this issue or individual new issues.

Received on Wednesday, 11 April 2012 16:40:31 UTC
