- From: Paul Groth <p.t.groth@vu.nl>
- Date: Sun, 29 Jul 2012 22:14:11 +0200
- To: James Cheney <jcheney@inf.ed.ac.uk>, Luc Moreau <l.moreau@ecs.soton.ac.uk>, Paolo Missier <Paolo.Missier@ncl.ac.uk>
- Cc: Provenance Working Group <public-prov-wg@w3.org>
Hi prov-constraints editors:

This is my review of the constraints draft for last call. Sorry for the delay; I wanted to make sure that I could implement each type of constraint. I'm reviewing
http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-constraints-20120723/prov-constraints.html

First, thanks for all your hard work. The document is precise and the approach is systematic. I have more detailed comments below.

Answering the questions posed in http://lists.w3.org/Archives/Public/public-prov-wg/2012Jul/0346.html

1. Is PROV-CONSTRAINTS ready to be released as a last call working draft (modulo editorial issues and resolution to the below issues)?

Yes, but there are some major editorial things that need to be done to help implementors. Additionally, in section 6 you mention a proof in an appendix. This is technical content, so it either needs to be included or not mentioned.

2. Regarding ISSUE-346: Is the role, meaning, and intended use of each type of inference or constraint clear? (http://www.w3.org/2011/prov/track/issues/346)

I think each definition is now precise and clear, but as I will mention in my longer comments, I think some additional intuition is necessary to help implementers.

3. Regarding ISSUE-451: Are there any objections to the revision-is-alternate inference? (http://www.w3.org/2011/prov/track/issues/451)

Nope

4. Regarding ISSUE-454: Are the rules for disjointness clear and appropriate? (http://www.w3.org/2011/prov/track/issues/454)

Yes

5. Regarding ISSUE-458: Should influence (and therefore all subrelations, including communication) be irreflexive, or can it be reflexive (i.e., can wasInfluencedBy(x,x) be valid)? (http://www.w3.org/2011/prov/track/issues/458)

I think this comes down to what we think the role of the constraints is. My impression is that it is to encourage implementers to be both explicit and correct in the provenance they create.
In terms of the example given in the issue, I would expect that if an activity called itself you would want to identify that as two independent activities. Thus, I think it's irreflexive. Actually, maybe this suggests the need for a part-of relation between activities.

6. Are there any objections to closing other open issues on PROV-CONSTRAINTS? They are:
- http://www.w3.org/2011/prov/track/issues/387
- http://www.w3.org/2011/prov/track/issues/394
- http://www.w3.org/2011/prov/track/issues/452
- http://www.w3.org/2011/prov/track/issues/453

I think all these issues are addressed.

7. Are there any new issues concerning definitions, constraints, or inferences?

No

==Comments==

My approach to reviewing the constraints was to attempt to implement the constraints and inferences using semantic web technologies. You can find the beginning of the implementation at https://github.com/pgroth/prov-constraints-validator-spin .

I have satisfied myself that the specification can be implemented using SPIN RDF. However, I'm not 100% certain, which is a bit of a concern. Additionally, to get things to work I had to make sure the inferences were done in one pass, which may go against what is specified in the document.

My major concern is the lack of intuition about what valid provenance is. I would describe it as follows: valid provenance identifies partial states exactly, and those partial states are correctly ordered. I'm trying to implement the spec, but as an implementor I need to know my broad goal when implementing these constraints.

A key thing that took me a while to get is that I need to generate all qualified relations before applying the constraints. This is an important point because it's sometimes unclear what should be considered an inference and what a constraint. Concretely, in the Event Ordering Constraints, the constraints are expressed as rules whose head leads to an assertion of precedence.
But actually, the thing is that you have to assert all these precedence relations first and then check for cycles. So I guess, are these really constraints? At any rate, the notion of checking for cycles needs to be brought out more.

Overall, I think an implementor could use some examples that show the results of inference and the subsequent constraint checking, and just more intuition about what valid and invalid provenance graphs look like.

==Some comments per section==

Section 3 - I'm worried about the MUST in the compliance list: "When determining whether two PROV instances are equivalent, an application must determine whether their normal forms are equal, as specified in section 6. Normalization, Validity, and Equivalence." Does this imply that I have to implement this to be compatible with PROV-DM? I would use SHOULD…

Section 5.1 - From an RDF perspective, do I need to worry about merging? If the assumption is that I'm provided an RDF serialization to check, then no merging is necessary. I guess the question is: is merging PROV-N specific?

Section 6.1 - Why do we need to talk about a hierarchy of bundles? Isn't the point just that you want a set of provenance descriptions independent of bundles?

Minor Notes:
- "PROV objects" or "prov constructs": check the consistency on this.
- Inconsistency with naming. Do you always want to end inference names with "-inference"? See Inference 11 (derivation-generation-use) and Inference 10 (wasEndedBy-inference).

Thanks
Paul
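
Illustration of the two-phase reading described under ==Comments== (first assert every precedence relation, then check the result for cycles). This is only a rough Python sketch, not the SPIN implementation linked above. The function name find_cycle and the event names are hypothetical, and treating any cycle at all as a violation is a simplifying assumption; the draft's exact validity condition may be more refined.

```python
from collections import defaultdict


def find_cycle(precedes):
    """Return a list of events forming a cycle, or None if there is none.

    `precedes` is an iterable of (earlier, later) pairs, assumed to be the
    full set of precedence assertions produced by the inference phase.
    """
    graph = defaultdict(list)
    for earlier, later in precedes:
        graph[earlier].append(later)

    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = defaultdict(int)  # every event starts as UNSEEN

    def visit(node, path):
        state[node] = ON_PATH
        path.append(node)
        for nxt in graph.get(node, ()):
            if state[nxt] == ON_PATH:
                # Reached an event already on the current path: a cycle.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNSEEN:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for start in list(graph):
        if state[start] == UNSEEN:
            cycle = visit(start, [])
            if cycle:
                return cycle
    return None


# Hypothetical events: generation of an entity, its use, the end of an activity.
valid = [("gen_e", "use_e"), ("use_e", "end_a")]
invalid = valid + [("end_a", "gen_e")]

print(find_cycle(valid))
print(find_cycle(invalid))
```

Under this reading the ordering rules themselves never fail; they only add edges, and validity is decided by the single cycle check at the end.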
Received on Sunday, 29 July 2012 20:14:40 UTC