- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Thu, 24 Jan 2013 15:05:42 +0000
- To: Tom De Nies <tom.denies@ugent.be>
- Cc: Provenance Working Group <public-prov-wg@w3.org>
Apologies for my slightly late review which you will find below. As usual my review ended up massive - so apologies if this reads very negative - it is more like a red-pen exercise. In fact I am quite impressed with the specification - it reads very well, and the semantics is very solid (specially compared to earlier back-and-forths on collections). Well done! See below for detailed review. On Wed, Jan 16, 2013 at 8:17 AM, Tom De Nies <tom.denies@ugent.be> wrote: > The latest editor's draft is here: > https://dvcs.w3.org/hg/prov/raw-file/default/dictionary/prov-dictionary.html#dictionary-xml-schema > Questions for reviewers > - Is the notation of Dictionary concepts clear & acceptable for you? (in > PROV-N, PROV-O and/or PROV-XML) Yes > - Are the constraints acceptable, or are they too loose/too strict? > -- In particular, can the constraint "IF derivedByRemovalFrom(d2, d1, > {"k1"}) THEN hadDictionaryMember(d1, e1, "k1") " be dropped, or do you > strongly support it? > - Is the name PROV-DICTIONARY appropriate for the document? Yes > - Can this be released as a first public working draft? No > - If not, where are the blocking issues? >From below: 1, 8b, 9, 11, 12, 15, 16, 17, 19, 20, 29, 34, 39, 41, 46, 47, 48, 49 > - If yes, are there other issues to work on? See below. These are mainly editorial/presentational. > In your review please include ISSUE-614 Done! Please find below detailed review: > 2.1 Dictionary membership > key: a key key_1 that is associated with the specified entity; 1) This should specify that the key is a *Value* as according to PROV-DM http://www.w3.org/TR/prov-dm/#term-value - as it is specified it could be interpreted as also an identifier - although the PROV-N example uses a "string value". (I know it is confusing because DM for some reason chose to call this Value rather than the more appropriate term Literal - but here in a dictionary the actual 'value' is the Entity (identifier), and the key is a (literal) Value! Therefore this should be clarified with DM hyperlink.) > Note that dictionary membership implies collection membership, but not vice versa. 2) This should link forward to inference 1 (dictionary-membership-collection-membership) 3) Also I don't quite understand this. So a prov:Dictionary kind of collection can have members that don't have keys? entity(d, [prov:type='prov:Dictionary' ]) // implies: entity(d, [prov:type='prov:Collection ]) hadDictionaryMember(d, e1, "k1") // implies: hadMember(d, e1) // But what if we also see? hadMember(d, e3) // are you saying this would NOT imply the below? hadDictionaryMember(d, e3, ?unknownKey) If so then I am a bit confused - a prov:Dictionary to be useful should be a constrained prov:Collection in which every member is associated with a key. This should be added to the Conceptual Definition of Dictionary above. If there is no such implication (of course the key is unknown until stated otherwise), I am not sure in which cases such a data type could be useful. It would be like describing an array type of collection, but where some items are allowed to not have a position. (which is quite different from saying they have an unknown position!) > > // d1 is a dictionary, with unknown content 4) Change to "with (so far) unknown content" ? Both example 1 and 2 and others - specially those that later actually do state the members. > 2.2 Dictionary insertion > d0 is the set { } > ... 5) I thought d0 was a Dictionary, not a set. s/set/dictionary/g here and in other examples. It is not simply a set of pairs, because we don't allow multiple entries for same key. > attributes: an optional set (attrs) of attribute-value pairs representing additional information about this relation. 6) Could one of the lines in example 3 perhaps include a dcterms:description or something? Just because it can get confusing with the new {("key", value)} syntax as opposed to the attribute syntax. derivedByInsertionFrom(d2, d1, {("k3", e3)}, [ dcterms:description = "An optional attribute" ] ) > 2.2 Dictionary Insertion > 2.3 Dictionary Removal > In particular, no assumptions are needed regarding the mutability of a data structure that is subject to updates 7) So from the examples given, it seems that both insertion and removal are "complete" statements - for instance after derivedByRemovalFrom(d3, d2, {"k1", "k3"}) we conclude that "k2" remains as a key in d3. I very much prefer this semantics, rather than a very earlier draft, where members could come and go out of the blue and you never really knew much. I think also this is OK as this is a specific prov:Dictionary thing rather than a general thing about prov:Collection's. Therefore 2.2 and 2.3 should clarify this and change > key-entity-set: the inserted key-entity pairs ( to > key-entity-set: all inserted key-entity pairs ( > key-set: a set of deleted keys (..) to > key-set: a set of all deleted keys (..) Inference 7 (insertion-removal-membership) makes this explicit (yay!) - so a forward reference would be good. 3. PROV-N Notation of Dictionary Concepts 8a) This section should start with a paragraph explaining that PROV-DICTIONARY specifies an extension according to PROV-N Extension chapter - http://www.w3.org/TR/prov-n/#extensibility - but using the same namespace 8b) The extension grammar should be downloadable and linked to from introduction of PROV-N notation. 9) However, to be a valid extension, the terms must be given as a QUALIFIED_NAME. With the proposed syntax in this document - the QNAMEs are in the default name space, which can be customized in PROV-N - http://www.w3.org/TR/prov-n/#expression-NamespaceDeclaration - and therefore easily overlap with other extensions. It is not explicit in PROV-N (perhaps it should be!) - but a missing default namespace declaration would make the QNAMEs invalid. > A qualified name's prefix is optional. If a prefix occurs in a qualified name, the prefix must refer to a namespace declared in a namespace declaration. In the absence of prefix, the qualified name belongs to the default namespace. (Side note: PROV-N should require that PN_PREFIX is a declared prefix in the current scope - and that if it is empty, that the default namespace has been declared) The prefix prov: is reserved in PROV-N. I therefore suggest to rename all the Dictionary concepts to qualified names using prov:* - that is: prov:hadDictionaryMember prov:derivedByInsertionFrom prov:derivedByRemovalFrom Thus we would avoid the embarrassing situation of PROV-Dictionary-N documents not being valid PROV-N documents - while documents using other extensions would be. 10) This section paragraph should mention that prov:Dictionary and prov:EmptyDictionary is declared using regular entity statements (with example). This could be inserted before 3.1. 3.1 Membership (PROV-N) > membershipExpression ::= 'hadDictionaryMember' '(' dIdentifier ',' entity ',' key ')' > key key (Non-Terminal) 11) "key" is not a defined Non-Terminal - neither here nor in PROV-N document or grammar. "dIdentifier" is similarly a new non-Terminal that must be defined here. (presumably it is an entity which is a prov:Dictionary type). Combined with issue #1 this is very confusing, and so I have made this and equivalent below a blocker. Use the style of the PROV-N document: http://www.w3.org/TR/prov-n/#expression-Entity .. here it also introduces optionalAttributeValuePairs, etc. 12) This is not valid grammar according to the EBNF section of PROV-N as it uses 'single quotes' rather than "double" http://www.w3.org/TR/prov-n/#grammar-notation Change this and following declaration to style with double quotes: > membershipExpression ::= "hadDictionaryMember" "(" dIdentifier "," entity "," key ")" 13) For some reason I was not able to copy-paste the ::== line above correctly from neither Firefox nor Chrome - the 'single quotes' disappeared! Why? CSS? Ensure quotes are regular characters. > Example 6 > > > In this example, d is a dictionary known to have e0, e1, and e2 as members, and may have other members. > entity(e0) > .. > hadDictionaryMember(d, e0, "k0") 14) I think this and following examples can be shortened - you should not explain the PROV Dictionary semantics again in this section and don't need to repeat the full examples - only the syntax of the particular statement. So change style to only: > Example 6 > // e0 is member in d under key "k0" > hadDictionaryMember(d, e0, "k0") > 3.2 Insertion > derivationByInsertionFromExpression ::= derivedByInsertionFrom ( identifier , dIdentifier , dIdentifier , { keyValuePairs } optional-attribute-values ) 15) Non-Terminals not matching table below: identifier, keyValuePairs, optional-attribute-values Change to match table. 16) Not defined: dIdentifier, keyEntitySet/keyValuePairs Remember that for keyValuePairs the first pair is not optional. 17) as the identifiers is optional, the separator should be ";" not ",". This also applies to derivedByRemovalFrom 18) Hyperlinks for prov-n terminals like 'identifier' are wrong - they go to non-existing local anchors rather than to PROV-N - like http://www.w3.org/TR/prov-n/#prod-identifier > 3.3 Removal > derivationByRemovalFromExpression ::= derivedByRemovalFrom ( identifier , dIdentifier , dIdentifier , { keySet } optional-attribute-values ) 19) optional-attribute-values -> optionalAttributeValuePairs > key-set keySet 20) Undefined terminal keySet Remember that the first key is not optional. > The following table summarizes how each constituent of a PROV-DM Membership maps to a non-terminal. > The following table summarizes how each constituent of a PROV-DM Insertion maps to a non-terminal. > The following table summarizes how each constituent of a PROV-DM Removal maps to a non-terminal. 21) PROV-DM -> PROV-DICTIONARY (or simply Dictionary) > 4. Ontological definition of dictionary Note that I have only checked the Turtle syntax here 'by hand' - they should be formally checked programmatically. > prov:hadDictionaryMember [ > a prov:KeyValuePair; > prov:key "k1"^^xsd:string > prov:value :e1 22) Fix indentation of prov:value (do NEVER use TAB character in such examples as it has inconsistent rendering) 23) Missing ; after xsd:string (invalid Turtle syntax) > A dictionary may be empty, in which case it should be described as an instance of the subclass prov:EmptyDictionary. 24) This makes sense regardless of syntax - can a similar statement be made in section 2.? > PROV-O provides two kinds of involvements 25) PROV-O --> PROV-DICTIONARY > prov:qualifiedInsertion is used to (..) > prov:qualifiedRemoval is used to specify (..) 26) Text should relate/hyperlink this to the insertion/removal sections earlier. > prov:Dictionary > back to collections classes 27) hyperlink from headers - does not work. Remove or change to link to "These terms are used" section. > :wasCreatedBy :bob; 28) Remove this as it is potentially confusing with prov:wasAttributedTo (or just change to prov:wAT) > described with properties > prov:derivedByInsertionFromop prov:qualifiedRemovalop prov:qualifiedInsertionop prov:derivedByRemovalFromop 29) Somehow the most important property, prov:hadDictionaryMember seems to be missing here! > prov:EmptyDictionary > > :d1 a prov:Dictionary; > prov:derivedByInsertionFrom :d; 30) I find it confusing that this example uses most lines to talk about a non-empty dictionary. Simplify to just show the boring: > :d a prov:EmptyDictionary . > An empty dictionary. 31) Could you add the obvious "i.e. has no members" > prov:insertedKeyValuePair [ > a prov:KeyValuePair; > prov:key "k1"^^xsd:string; > prov:value :e1; > ] 32) I know we had discussions about this before (when this was in PROV-O) as one can consider either the value or the keyvaluepair to be what is inserted. But this is getting quite verbose.. inserterkeyvaluepair keyvaluepair key and value! Puh! How about simplifying to: prov:inserted [ a prov:KeyValuePair; prov:key "k1"^^xsd:string; prov:value :e1; ] .. or possibly prov:insertedPair ? This is similar to how the insertion is a prov:Insertion and not a prov:DictionaryInsertion. I would keep prov:removedKey as it's important to know it's the *key* that was removed. > prov:Removal > removing one or more key-value pairs 33) Add ", specified by prov:removedKey" (as you don't specify the removed key/value pairs) > prov:dictionary > has domain > prov:Insertion > prov:Removal 34) This specifies that the domain is the intersection of Insertion and Removal - thus any Insertion with a prov:dictionary becomes also a Removal etc. This is surely wrong. Formally you should specify this as the domain of the union of Insertion and Removal - alternatively as a common "abstract" superclass prov:DictionaryInfluence. See PROV-O http://www.w3.org/TR/prov-o/#owl-profile As the union would take the extension out of OWL-RL, and there already is precedence for other kind of influences, I would suggest making a new prov:DictionaryInfluence which can be superclass of Insertion and Removal, and in domain of prov:dictionary. 35) Where can I find a download link for the OWL of this PROV Dictionary extension? I expected to find it in te beginning of section 4. I would also expect section 4 to have an explanation of the namespaces and how importing <http://www.w3.org/ns/prov#> one would get both PROV-O and extensions like PROV-Dictionary, while <http://www.w3.org/ns/prov-o> would give only PROV-O. I would then also expect a separate versionIRI to import if I only wanted PROV-Dictionary - like <http://www.w3.org/ns/prov-dictionary#> > prov:qualifiedInsertion > If this Dictionary prov:derivedByInsertionFrom another Dictionary :e, then it can qualify how it did perform the Insertion using prov:qualifiedInsertion [ a prov:Insertion; prov:dictionary :e; prov:inserted [a prov:KeyValuePair; prov:key "k1"^^xsd:string; prov:value :foo] ]. 36) I know I probably wrote this text - but without any indentation it is very difficult to read. Could this be simplified to some kind of mirror of prov:derivedByInsertionFrom? Same for prov:qualifiedRemoval. 37) example for qualifiedInsertion should not detail the members of our-old-baseball-team-field-positions as that is confusing and makes example large. > 5. XML Schema Dictionary > In this section, we provide the XML Schema to use dictionaries with the [PROV-XML] serialization. Ironically, you don't (only fragments) - where can I find and download the XML Schema? Note that I have not checked the syntax of XSD statements in this section - if I had the schema file I could have validated it. 38) Modify to "This section details how to describe dictionaries with the [PROV-XML] serialization". 39) Provide download link for XML schema - also include usage instructions as per imports etc. (equivalent as for OWL imports above) 40) Rename section to "PROV-XML representation of Dictionary" - similarly section 4 from "Ontological Definition of Dictionary" to "PROV-O representation of Dictionary" > 5.2 Key-Value Pair > <xs:element name="key" type="xs:String" /> 41) The specification earlier specifies key as any value/literal - so xs:string (lowercase S!) is too specific - it would not allow integers, for instance. Change to xs:anySimpleType here and in 5.5 Removal > <xs:element name="key" type="xs:String" maxOccurs="unbounded" /> > members of a dictionary are specified by listing all the key-value pairs inside a prov:DictionaryMembership element 42) Remove "all the" -- according to semantics of hadDictionaryMember > 6. Constraints Associated with Dictionary This section is very solid, good work! > These inferences and constraints need to be applied to obtain valid provenance when using dictionaries. 43) I don't think they *need to* be applied? Rephrase to "MAY be applied in order to ensure valid provenance". I think a similar kind of disclaimer as in http://www.w3.org/TR/prov-constraints/#purpose (or just linking to this) could be useful, so that it's clear that these constraints are not a MUST to use dictionaries. > Inference 1 (dictionary-membership-collection-membership) > IF hadDictionaryMember(d, e1, "k1") THEN hadMember(d, e1) 44) I would also add the inverse inference dictionary-all-members-have-keys: > Inference X (dictionary-all-members-have-keys) > > Here k is a (potentially unknown) key > > IF hadMember(d, e1) AND 'prov:Dictionary' ∈ typeOf(d) > THEN hadDictionaryMember(d, e1, k) If we don't specify this, opens for a prov:Dictionary to also contain entities which don't have a key. This could be confusing. For instance this would be valid: entity(d0, [prov:type='prov:EmptyDictionary']) entity(d1, [prov:type='prov:Dictionary']) entity(d2, [prov:type='prov:Dictionary']) derivedByInsertionFrom(d1, d0, {("k1", e1)}) // implied insertion-membership //hadDictionaryMember(d1, e1, "k1") // implied by dictionary-membership-collection-membership: //hadMember(d1, e1) derivedByRemovalFrom(d2, d1, {"k1"}) // Invalid by impossible-removal-membership: //hadDictionaryMember(d2, e1, "k1") // however this is still valid .. hadMember(d2, e1) // Unless we add: entity(d2, [prov:type='prov:EmptyDictionary']) // which implies //entity(d2, [prov:type='prov:EmptyCollection']) //which is invalid by constraint membership-empty-collection in PROV-CONSTRAINTS // However, we might now still add this confusing statement: hadMember(d1, e2) // By the PROV-DICTIONARY constraints (including the completeness constraint impossible-insertion-insertion and membership-insertion-membership) we would not here expect any other keys than "k1" in d1 - so we SHOULD be able to conclude that e2==e1 - however we can't do that unless we introduce my inference dictionary-all-members-have-keys: // dictionary-all-members-have-keys implies //hadDictionaryMember(d1, e2, k) // and by key-single-entity and membership-insertion-membership and membership-empty-collection: //e2 == e1 > IF hadDictionaryMember(d1, e, "k") and derivedByInsertionFrom(d2, d1, {("k1", e1),...,("kn", en)}) and k ∉ {k1,...,kn} THEN hadDictionaryMember(d2, e, "k") 45) Don't remove the quotes: "k" ∉ {"k1",...,"kn"} > IF derivedByRemovalFrom(d2, d1, {"k1",...,"kn"}) and derivedByInsertionFrom(d2, d1, {("k1", e1),...,("km",em)})THEN INVALID 46) This implies that "k1", etc. needs to be on both sides (Implying an intersection/containment of keys). This constraint is true no matter the key/value pairs, so change to same style as below: > Here, KV1 and KV2 are sets of key-entity pairs. > IF derivedByRemovalFrom(d2, d1, KV1) and derivedByInsertionFrom(d2, d1, KV2) THEN INVALID > 6.3 Typing > IF entity(d, [prov:type='prov:Dictionary']) THEN 'prov:Dictionary' ∈ typeOf(d) and 'prov:Collection' ∈ typeOf(d) > IF entity(d, [prov:type='prov:EmptyDictionary']) THEN 'prov:EmptyDictionary' ∈ typeOf(d) and 'prov:Dictionary' ∈ typeOf(d) 47) The second rule would not cause firing of the first rule, so expand the second rule. Also include prov:EmptyCollection and entity. > IF entity(d, [prov:type='prov:EmptyDictionary']) THEN 'prov:EmptyDictionary' ∈ typeOf(d) and 'prov:Dictionary' ∈ typeOf(d) and 'prov:Collection' ∈ typeOf(d) and 'prov:EmptyCollection' ∈ typeOf(d) 48) Both lines must also include: 'entity' ∈ typeOf(c) ..as PROV-Constraint "typing" would not fire from a typeof prov:Collection. > IF hadDictionaryMember(d, e, "k") THEN 'prov:Dictionary' ∈ typeOf(d) and 'entity' ∈ typeOf(e) > IF derivedByInsertionFrom(d2, d1, {("k1", e1)}) THEN 'prov:Dictionary' ∈ typeOf(d1) and 'prov:Dictionary' ∈ typeOf(d2) and 'entity' ∈ typeOf(e1) > IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN 'prov:Dictionary' ∈ typeOf(d1) and 'prov:Dictionary' ∈ typeOf(d2) 49) Similarly all of these should also include > AND prov:Collection' ∈ typeOf(c) AND 'entity' ∈ typeOf(c) ==== END OF REVIEW === Now relax..! :) -- Stian Soiland-Reyes, myGrid team School of Computer Science The University of Manchester
Received on Thursday, 24 January 2013 15:06:31 UTC