
Review of PROV-CONSTRAINTS issues (ISSUE-576, ISSUE-580, ISSUE-577, ISSUE-578, ISSUE-581)

From: James Cheney <jcheney@inf.ed.ac.uk>
Date: Mon, 29 Oct 2012 10:55:33 +0000
Message-Id: <B29C0B18-46A7-4644-890C-8BE325C14E41@inf.ed.ac.uk>
To: Provenance Working Group <public-prov-wg@w3.org>
Hi,

I've completed changes to address the issues raised by Antoine Zimmermann.

The main substantive change is to the definition of equivalence (sec. 6.1):

- adding explicit definition of isomorphism (invertible renaming of existential variables)
- allowing equivalence between invalid instances, with its behavior left up to the implementation, as long as the resulting relation is reflexive, symmetric, and transitive
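For concreteness, here is a minimal sketch (not from the spec) of isomorphism as an invertible renaming of existential variables. The representation of an instance as a set of statement tuples, and the brute-force search over bijections, are my illustrative assumptions only; the normative definition is in section 6.1.

```python
from itertools import permutations

def is_isomorphic(inst_a, inst_b, exvars_a, exvars_b):
    """Brute-force sketch: inst_a/inst_b are sets of statement tuples,
    exvars_a/exvars_b list the existential variables occurring in each.
    The instances are isomorphic iff some invertible renaming (bijection)
    of existential variables maps inst_a onto inst_b."""
    if len(exvars_a) != len(exvars_b):
        return False
    for image in permutations(exvars_b):
        renaming = dict(zip(exvars_a, image))
        # Rename only existential-variable tokens; constants pass through.
        renamed = {tuple(renaming.get(t, t) for t in stmt) for stmt in inst_a}
        if renamed == inst_b:
            return True
    return False
```

A real validator would use a less naive matching strategy, but the point is only that the renaming must be a bijection, so its inverse is also a renaming.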

Additional changes clarify that although we give a procedural specification that can be implemented to comply with the spec, any other implementation strategy is also allowed; we will try to clarify the underlying declarative specification in PROV-SEM.

Please let me know if you have any objection to these responses by Wednesday, October 31 (though I guess we may not be voting until next week).

--James


> ISSUE-580 (drop-syntactic-sugar-definitions)
> 
> Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
> Tracker: http://www.w3.org/2011/prov/track/issues/580
> Summary: Suggestion to drop definitions in section 4.1 since they are not needed if the semantics is defined more abstractly
> Group response:
> This is actually an issue orthogonal to the style of semantics: PROV-DM and PROV-N nowhere specify how missing arguments are to be expanded to the "PROV-DM abstract syntax" (which itself is not explicitly specified in PROV-DM). You are correct that Definition 1 (which expands short forms) is in a sense implicit in PROV-DM, which only discusses the long forms and their optional arguments, but neither PROV-DM nor PROV-N says explicitly how the PROV-N short forms are to be expanded to PROV-DM. Furthermore, Definitions 2-4 deal with special cases concerning optional/implicit parameters that are not explained anywhere else.
> We recognize that there is a certain amount of PROV-N centrism in these definitions, but since PROV-N is formally specified and the abstract syntax is not, we feel it is important to make fully clear how arbitrary PROV-N can be translated to the subset of PROV-N that corresponds to the abstract syntax of PROV-DM. This ensures that there is no room for misinterpretation among multiple readers, who may expect different conventions for expansion/implicit parameters (even if the rules we specified seem "obvious").
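As an illustration only, the expansion of a short form can be pictured as padding the missing trailing arguments with fresh existential variables. The function name, the `_:vN` variable scheme, and the flat-arity signature below are my assumptions, not part of Definition 1, which also handles identifiers and attribute lists.

```python
import itertools

_fresh = itertools.count()

def expand(name, given_args, arity):
    """Sketch of short-form expansion: pad a statement with a fresh
    existential variable for each omitted trailing optional argument,
    yielding a long form of the assumed abstract syntax."""
    fresh = [f"_:v{next(_fresh)}" for _ in range(arity - len(given_args))]
    return (name, *given_args, *fresh)
```

For example, a short form with one given argument and arity 4 expands to a 4-argument long form in which the three missing positions are distinct fresh variables.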
> References:
> Changes to the document:
> Add a note clarifying the relationship between PROV-DM "abstract syntax" and PROV-N, and why the definitions are needed for this mapping.
> Original author's acknowledgement:
> ISSUE-577 (valid-vs-consistent)
> 
> Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
> Tracker: http://www.w3.org/2011/prov/track/issues/577
> Summary: 'Valid' is used differently from its usual meaning in logic; 'consistent' would be a better term
> Group response:
> We would like to clarify that we are not attempting to define a semantics (in the sense of model theory or programming language semantics) for PROV in PROV-CONSTRAINTS. We may do this in a future version of PROV-SEM, by giving a first-order axiomatization that is sound with respect to the model theory that is in the current draft of PROV-SEM.
> PROV-CONSTRAINTS defines a subset of PROV documents, currently called "valid", by analogy with the notion of validity in other Web standards such as XML, CSS, and so on. While concepts from logic are used, it is not intended as a logic or semantics.
> We agree that it would be preferable to avoid redefining standard terminology from logic in nonstandard ways, and you are correct that "valid" means something different in logic than the sense in which it is usually used in Web standards. However, since we expect our audience to consist of implementors and not logicians, on reflection we prefer the terminology "valid"/"validation" over "consistent"/"consistency checking".
> References:
> Changes to the document:
> Clarify (sec. 1.2) that our notion of "valid" is named by analogy to other W3C standards, such as CSS and XML, and that in logical terms it corresponds to "consistency".
> Original author's acknowledgement:
> 
> ISSUE-578 (equivalence)
> 
> Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
> Tracker: http://www.w3.org/2011/prov/track/issues/578
> Summary: Use of "equivalent"; incompatibility with common uses of the term in logic/mathematics
> Group response:
> This issue was discussed within the group already, and we could not come to an agreement on how equivalence should behave on invalid instances. Therefore, we decided not to define equivalence on invalid instances.
> From a mathematical point of view, we only define equivalence as a relation over valid documents/instances, not all instances. This avoids the problem of deciding what to do with equivalence for invalid instances.
> By analogy, consider a typed programming language. The expression 2 + "foo" is not well-typed; technically one could define a notion of equivalence on such expressions, so that, for example, 2 + "foo" would be equivalent to (1 + 1) + "foo". But these ill-typed expressions are (by the definition of the language) simply not allowed. Similarly, for applications that care about validity, invalid PROV documents can be ignored, so (to us) there seems to be no negative consequence to defining equivalence only on this subset of documents, or to defining all invalid documents to be equivalent (as would follow from the logical definition of equivalence).
> However, for other applications, such as information retrieval, it is not safe to assume that an invalid instance is equivalent to "false"; we can imagine scenarios where an application wants to search for documents similar to an existing (possibly invalid) document. If the definition of equivalence considers all invalid documents equivalent, then there will be a huge number of matches that have no (intuitive) similarity to the query document.
> We also plan to augment PROV-SEM with a logical formalization that will be related to both the model theory proposed there and the procedural specification in PROV-CONSTRAINTS. For this formalization, logical equivalence will be the same as PROV-equivalence on valid instances. (For invalid instances, logical equivalence requires making all invalid instances equivalent, which we prefer not to require.)
> References:
> Changes to the document:
> explicitly define isomorphism in normative section 6.1
> specify that equivalence is an equivalence relation on *all* documents
> specify that no invalid document is equivalent to a valid one
> specify equivalence between valid documents as already done
> leave it up to implementations how (if at all) to test equivalence between different invalid documents
> defer relating PROV-equivalence to logical equivalence to PROV-SEM
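The case analysis above can be sketched as follows. The helper predicates (`is_valid`, `normal_forms_isomorphic`, `invalid_equiv`) are assumed stand-ins for the spec's validity check, isomorphism of normal forms, and the implementation-chosen relation on invalid documents; only the case structure reflects the resolution.

```python
def equivalent(doc1, doc2, is_valid, normal_forms_isomorphic, invalid_equiv):
    """Sketch of the revised equivalence (sec. 6.1):
    - two valid documents: isomorphism of normal forms, as already specified
    - one valid, one invalid: never equivalent
    - two invalid documents: left to the implementation, provided the
      chosen relation is reflexive, symmetric, and transitive."""
    v1, v2 = is_valid(doc1), is_valid(doc2)
    if v1 and v2:
        return normal_forms_isomorphic(doc1, doc2)
    if v1 != v2:
        return False
    return invalid_equiv(doc1, doc2)
```

Note that the whole relation is an equivalence relation on all documents exactly when `invalid_equiv` is itself reflexive, symmetric, and transitive on the invalid ones.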
> Original author's acknowledgement:
> 
> ISSUE-581 (avoid-specifying-algorithm)
> 
> Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
> Tracker: http://www.w3.org/2011/prov/track/issues/581
> Summary: Suggestion to avoid wording that 'almost requires' using normalization to implement constraints
> Group response:
> Saying that we *define* validity and equivalence in terms of a normalization procedure that *can* be used to check these properties does not require that all implementations explicitly perform normalization. We discussed this issue extensively, and one consequence is that the implementation criteria for the constraints document will only test the extensional behavior of validity/equivalence checks: implementations only need to classify documents as valid/invalid/equivalent etc. in the same way as the reference implementation; they do not have to "be" the reference implementation.
> However, this issue arose relatively late in the process and we did miss some places where the document gives a misleading impression that normalization is required to implement the spec.
> We acknowledge that, as written, it can be difficult to see how else one could implement the specification. You are correct that there is a simple, declarative specification, via a first-order theory, of what the normalization algorithm does; it could serve as a starting point for people with a formal background, or for those who wish to implement the specification in some other way. However, we disagree that adopting the declarative view as normative would improve the specification.
> Making the document smaller and simpler in this way would detract from its usefulness to implementors who are not already experts in computational logic. In other words, we recognize that some implementors may want to check the constraints in other ways, but we believe that the algorithm we use to specify validity and equivalence is a good default, because it sits within a well-understood formalism known from database constraints and finite first-order model theory.
> The normal forms are essentially "universal instances" in the sense of Fagin et al. 2005, and the algorithm we outline is easily seen to be in polynomial time; in contrast, simply giving a FOL theory on its own gives no guarantee of efficiency or even decidability.
> We intend to incorporate this theory and formalize the link between the procedural and declarative specifications in PROV-SEM. Although PROV-SEM will not be normative, any implementation that correctly implements the declarative specification given there will be correct.
> We will also take greater care to explain that the procedural approach to specification is just one of many possible ways to implement constraint checking (though the group as a whole feels that it is a good default approach for implementors seeking a shortest path to compliance).
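One (non-required) way to picture the normalization step is chase-style saturation under inference rules, run to a fixpoint. This sketch covers only rule application; the full procedure in the spec also handles uniqueness constraints, merging of existential variables, and constraint-violation checks. The rule representation (a function from an instance to the statements it derives) is my assumption.

```python
def normalize(instance, inference_rules):
    """Saturate an instance (a set of statement tuples) under inference
    rules until no rule derives anything new (a chase, in the database
    sense). Each rule maps the current instance to a set of statements."""
    current = set(instance)
    changed = True
    while changed:
        changed = False
        for rule in inference_rules:
            new = rule(current) - current
            if new:
                current |= new
                changed = True
    return frozenset(current)
```

Because each rule only adds statements drawn from a bounded space of terms, the loop terminates, which is the intuition behind the polynomial-time claim above.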
> References:
> Changes to the document:
> Revise all parts of the document that may currently convey the impression that the normalization algorithm is a REQUIRED implementation strategy, to ensure that it is clear that this is one approach (among possibly many) that implementations MAY employ. PROV-SEM will present a declarative specification that may serve as a better starting point for alternative implementations.
> Added a paragraph to the beginning of section 2 that specifically addresses this.
> Original author's acknowledgement:



Received on Monday, 29 October 2012 10:56:23 UTC
