- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Thu, 24 Jan 2013 15:05:42 +0000
- To: Tom De Nies <tom.denies@ugent.be>
- Cc: Provenance Working Group <public-prov-wg@w3.org>
Apologies for my slightly late review which you will find below.
As usual my review ended up massive - so apologies if this reads very
negative - it is more like a red-pen exercise.
In fact I am quite impressed with the specification - it reads very
well, and the semantics is very solid (specially compared to earlier
back-and-forths on collections). Well done!
See below for detailed review.
On Wed, Jan 16, 2013 at 8:17 AM, Tom De Nies <tom.denies@ugent.be> wrote:
> The latest editor's draft is here:
> https://dvcs.w3.org/hg/prov/raw-file/default/dictionary/prov-dictionary.html#dictionary-xml-schema
> Questions for reviewers
> - Is the notation of Dictionary concepts clear & acceptable for you? (in
> PROV-N, PROV-O and/or PROV-XML)
Yes
> - Are the constraints acceptable, or are they too loose/too strict?
> -- In particular, can the constraint "IF derivedByRemovalFrom(d2, d1,
> {"k1"}) THEN hadDictionaryMember(d1, e1, "k1") " be dropped, or do you
> strongly support it?
> - Is the name PROV-DICTIONARY appropriate for the document?
Yes
> - Can this be released as a first public working draft?
No
> - If not, where are the blocking issues?
>From below: 1, 8b, 9, 11, 12, 15, 16, 17, 19, 20, 29, 34, 39, 41, 46, 47, 48, 49
> - If yes, are there other issues to work on?
See below. These are mainly editorial/presentational.
> In your review please include ISSUE-614
Done!
Please find below detailed review:
> 2.1 Dictionary membership
> key: a key key_1 that is associated with the specified entity;
1) This should specify that the key is a *Value* as according to
PROV-DM http://www.w3.org/TR/prov-dm/#term-value - as it is specified
it could be interpreted as also an identifier - although the PROV-N
example uses a "string value". (I know it is confusing because DM
for some reason chose to call this Value rather than the more
appropriate term Literal - but here in a dictionary the actual 'value'
is the Entity (identifier), and the key is a (literal) Value!
Therefore this should be clarified with DM hyperlink.)
> Note that dictionary membership implies collection membership, but not vice versa.
2) This should link forward to inference 1
(dictionary-membership-collection-membership)
3) Also I don't quite understand this. So a prov:Dictionary kind of
collection can have members that don't have keys?
entity(d, [prov:type='prov:Dictionary' ])
// implies:
entity(d, [prov:type='prov:Collection ])
hadDictionaryMember(d, e1, "k1")
// implies:
hadMember(d, e1)
// But what if we also see?
hadMember(d, e3)
// are you saying this would NOT imply the below?
hadDictionaryMember(d, e3, ?unknownKey)
If so then I am a bit confused - a prov:Dictionary to be useful should
be a constrained prov:Collection in which every member is associated
with a key. This should be added to the Conceptual Definition of
Dictionary above.
If there is no such implication (of course the key is unknown until
stated otherwise), I am not sure in which cases such a data type could
be useful. It would be like describing an array type of collection,
but where some items are allowed to not have a position. (which is
quite different from saying they have an unknown position!)
> > // d1 is a dictionary, with unknown content
4) Change to "with (so far) unknown content" ? Both example 1 and 2
and others - specially those that later actually do state the members.
> 2.2 Dictionary insertion
> d0 is the set { }
> ...
5) I thought d0 was a Dictionary, not a set. s/set/dictionary/g here
and in other examples.
It is not simply a set of pairs, because we don't allow multiple
entries for same key.
> attributes: an optional set (attrs) of attribute-value pairs representing additional information about this relation.
6) Could one of the lines in example 3 perhaps include a
dcterms:description or something? Just because it can get confusing
with the new {("key", value)} syntax as opposed to the attribute
syntax.
derivedByInsertionFrom(d2, d1, {("k3", e3)}, [ dcterms:description =
"An optional attribute" ] )
> 2.2 Dictionary Insertion
> 2.3 Dictionary Removal
> In particular, no assumptions are needed regarding the mutability of a data structure that is subject to updates
7) So from the examples given, it seems that both insertion and
removal are "complete" statements - for instance after
derivedByRemovalFrom(d3, d2, {"k1", "k3"})
we conclude that "k2" remains as a key in d3.
I very much prefer this semantics, rather than a very earlier draft,
where members could come and go out of the blue and you never really
knew much. I think also this is OK as this is a specific
prov:Dictionary thing rather than a general thing about
prov:Collection's.
Therefore 2.2 and 2.3 should clarify this and change
> key-entity-set: the inserted key-entity pairs (
to
> key-entity-set: all inserted key-entity pairs (
> key-set: a set of deleted keys (..)
to
> key-set: a set of all deleted keys (..)
Inference 7 (insertion-removal-membership) makes this explicit (yay!)
- so a forward reference would be good.
3. PROV-N Notation of Dictionary Concepts
8a) This section should start with a paragraph explaining that
PROV-DICTIONARY specifies an extension according to PROV-N Extension
chapter - http://www.w3.org/TR/prov-n/#extensibility - but using the
same namespace
8b) The extension grammar should be downloadable and linked to from
introduction of PROV-N notation.
9) However, to be a valid extension, the terms must be given as a
QUALIFIED_NAME. With the proposed syntax in this document - the
QNAMEs are in the default name space, which can be customized in
PROV-N - http://www.w3.org/TR/prov-n/#expression-NamespaceDeclaration
- and therefore easily overlap with other extensions.
It is not explicit in PROV-N (perhaps it should be!) - but a missing
default namespace declaration would make the QNAMEs invalid.
> A qualified name's prefix is optional. If a prefix occurs in a qualified name, the prefix must refer to a namespace declared in a namespace declaration. In the absence of prefix, the qualified name belongs to the default namespace.
(Side note: PROV-N should require that PN_PREFIX is a declared prefix
in the current scope - and that if it is empty, that the default
namespace has been declared)
The prefix prov: is reserved in PROV-N.
I therefore suggest to rename all the Dictionary concepts to qualified
names using prov:* - that is:
prov:hadDictionaryMember
prov:derivedByInsertionFrom
prov:derivedByRemovalFrom
Thus we would avoid the embarrassing situation of PROV-Dictionary-N
documents not being valid PROV-N documents - while documents using
other extensions would be.
10) This section paragraph should mention that prov:Dictionary and
prov:EmptyDictionary is declared using regular entity statements (with
example). This could be inserted before 3.1.
3.1 Membership (PROV-N)
> membershipExpression ::= 'hadDictionaryMember' '(' dIdentifier ',' entity ',' key ')'
> key key (Non-Terminal)
11) "key" is not a defined Non-Terminal - neither here nor in PROV-N
document or grammar.
"dIdentifier" is similarly a new non-Terminal that must be defined
here. (presumably it is an entity which is a prov:Dictionary type).
Combined with issue #1 this is very confusing, and so I have made this
and equivalent below a blocker.
Use the style of the PROV-N document:
http://www.w3.org/TR/prov-n/#expression-Entity
.. here it also introduces optionalAttributeValuePairs, etc.
12) This is not valid grammar according to the EBNF section of PROV-N
as it uses 'single quotes' rather than "double"
http://www.w3.org/TR/prov-n/#grammar-notation
Change this and following declaration to style with double quotes:
> membershipExpression ::= "hadDictionaryMember" "(" dIdentifier "," entity "," key ")"
13) For some reason I was not able to copy-paste the ::== line above
correctly from neither Firefox nor Chrome - the 'single quotes'
disappeared! Why? CSS? Ensure quotes are regular characters.
> Example 6
>
>
> In this example, d is a dictionary known to have e0, e1, and e2 as members, and may have other members.
> entity(e0)
> ..
> hadDictionaryMember(d, e0, "k0")
14) I think this and following examples can be shortened - you should
not explain the PROV Dictionary semantics again in this section and
don't need to repeat the full examples - only the syntax of the
particular statement. So change style to only:
> Example 6
> // e0 is member in d under key "k0"
> hadDictionaryMember(d, e0, "k0")
> 3.2 Insertion
> derivationByInsertionFromExpression ::= derivedByInsertionFrom ( identifier , dIdentifier , dIdentifier , { keyValuePairs } optional-attribute-values )
15) Non-Terminals not matching table below: identifier,
keyValuePairs, optional-attribute-values
Change to match table.
16) Not defined: dIdentifier, keyEntitySet/keyValuePairs
Remember that for keyValuePairs the first pair is not optional.
17) as the identifiers is optional, the separator should be ";" not
",". This also applies to derivedByRemovalFrom
18) Hyperlinks for prov-n terminals like 'identifier' are wrong - they
go to non-existing local anchors rather than to PROV-N - like
http://www.w3.org/TR/prov-n/#prod-identifier
> 3.3 Removal
> derivationByRemovalFromExpression ::= derivedByRemovalFrom ( identifier , dIdentifier , dIdentifier , { keySet } optional-attribute-values )
19) optional-attribute-values -> optionalAttributeValuePairs
> key-set keySet
20) Undefined terminal keySet
Remember that the first key is not optional.
> The following table summarizes how each constituent of a PROV-DM Membership maps to a non-terminal.
> The following table summarizes how each constituent of a PROV-DM Insertion maps to a non-terminal.
> The following table summarizes how each constituent of a PROV-DM Removal maps to a non-terminal.
21) PROV-DM -> PROV-DICTIONARY (or simply Dictionary)
> 4. Ontological definition of dictionary
Note that I have only checked the Turtle syntax here 'by hand' - they
should be formally checked programmatically.
> prov:hadDictionaryMember [
> a prov:KeyValuePair;
> prov:key "k1"^^xsd:string
> prov:value :e1
22) Fix indentation of prov:value (do NEVER use TAB character in such
examples as it has inconsistent rendering)
23) Missing ; after xsd:string (invalid Turtle syntax)
> A dictionary may be empty, in which case it should be described as an instance of the subclass prov:EmptyDictionary.
24) This makes sense regardless of syntax - can a similar statement be
made in section 2.?
> PROV-O provides two kinds of involvements
25) PROV-O --> PROV-DICTIONARY
> prov:qualifiedInsertion is used to (..)
> prov:qualifiedRemoval is used to specify (..)
26) Text should relate/hyperlink this to the insertion/removal sections earlier.
> prov:Dictionary
> back to collections classes
27) hyperlink from headers - does not work. Remove or change to link
to "These terms are used" section.
> :wasCreatedBy :bob;
28) Remove this as it is potentially confusing with
prov:wasAttributedTo (or just change to prov:wAT)
> described with properties
> prov:derivedByInsertionFromop prov:qualifiedRemovalop prov:qualifiedInsertionop prov:derivedByRemovalFromop
29) Somehow the most important property, prov:hadDictionaryMember
seems to be missing here!
> prov:EmptyDictionary
>
> :d1 a prov:Dictionary;
> prov:derivedByInsertionFrom :d;
30) I find it confusing that this example uses most lines to talk
about a non-empty dictionary. Simplify to just show the boring:
> :d a prov:EmptyDictionary .
> An empty dictionary.
31) Could you add the obvious "i.e. has no members"
> prov:insertedKeyValuePair [
> a prov:KeyValuePair;
> prov:key "k1"^^xsd:string;
> prov:value :e1;
> ]
32) I know we had discussions about this before (when this was in
PROV-O) as one can consider either the value or the keyvaluepair to be
what is inserted. But this is getting quite verbose..
inserterkeyvaluepair keyvaluepair key and value! Puh!
How about simplifying to:
prov:inserted [
a prov:KeyValuePair;
prov:key "k1"^^xsd:string;
prov:value :e1;
]
.. or possibly prov:insertedPair ?
This is similar to how the insertion is a prov:Insertion and not a
prov:DictionaryInsertion.
I would keep prov:removedKey as it's important to know it's the *key*
that was removed.
> prov:Removal
> removing one or more key-value pairs
33) Add ", specified by prov:removedKey"
(as you don't specify the removed key/value pairs)
> prov:dictionary
> has domain
> prov:Insertion
> prov:Removal
34) This specifies that the domain is the intersection of Insertion
and Removal - thus any Insertion with a prov:dictionary becomes also
a Removal etc. This is surely wrong.
Formally you should specify this as the domain of the union of
Insertion and Removal - alternatively as a common "abstract"
superclass prov:DictionaryInfluence.
See PROV-O http://www.w3.org/TR/prov-o/#owl-profile
As the union would take the extension out of OWL-RL, and there already
is precedence for other kind of influences, I would suggest making a
new prov:DictionaryInfluence which can be superclass of Insertion and
Removal, and in domain of prov:dictionary.
35) Where can I find a download link for the OWL of this PROV
Dictionary extension? I expected to find it in te beginning of section
4. I would also expect section 4 to have an explanation of the
namespaces and how importing <http://www.w3.org/ns/prov#> one would
get both PROV-O and extensions like PROV-Dictionary, while
<http://www.w3.org/ns/prov-o> would give only PROV-O. I would then
also expect a separate versionIRI to import if I only wanted
PROV-Dictionary - like <http://www.w3.org/ns/prov-dictionary#>
> prov:qualifiedInsertion
> If this Dictionary prov:derivedByInsertionFrom another Dictionary :e, then it can qualify how it did perform the Insertion using prov:qualifiedInsertion [ a prov:Insertion; prov:dictionary :e; prov:inserted [a prov:KeyValuePair; prov:key "k1"^^xsd:string; prov:value :foo] ].
36) I know I probably wrote this text - but without any indentation it
is very difficult to read. Could this be simplified to some kind of
mirror of prov:derivedByInsertionFrom? Same for prov:qualifiedRemoval.
37) example for qualifiedInsertion should not detail the members of
our-old-baseball-team-field-positions as that is confusing and makes
example large.
> 5. XML Schema Dictionary
> In this section, we provide the XML Schema to use dictionaries with the [PROV-XML] serialization.
Ironically, you don't (only fragments) - where can I find and download
the XML Schema?
Note that I have not checked the syntax of XSD statements in this
section - if I had the schema file I could have validated it.
38) Modify to "This section details how to describe dictionaries with
the [PROV-XML] serialization".
39) Provide download link for XML schema - also include usage
instructions as per imports etc. (equivalent as for OWL imports above)
40) Rename section to "PROV-XML representation of Dictionary" -
similarly section 4 from "Ontological Definition of Dictionary" to
"PROV-O representation of Dictionary"
> 5.2 Key-Value Pair
> <xs:element name="key" type="xs:String" />
41) The specification earlier specifies key as any value/literal - so
xs:string (lowercase S!) is too specific - it would not allow
integers, for instance. Change to xs:anySimpleType here and in 5.5
Removal
> <xs:element name="key" type="xs:String" maxOccurs="unbounded" />
> members of a dictionary are specified by listing all the key-value pairs inside a prov:DictionaryMembership element
42) Remove "all the" -- according to semantics of hadDictionaryMember
> 6. Constraints Associated with Dictionary
This section is very solid, good work!
> These inferences and constraints need to be applied to obtain valid provenance when using dictionaries.
43) I don't think they *need to* be applied? Rephrase to "MAY be
applied in order to ensure valid provenance".
I think a similar kind of disclaimer as in
http://www.w3.org/TR/prov-constraints/#purpose (or just linking to
this) could be useful, so that it's clear that these constraints are
not a MUST to use dictionaries.
> Inference 1 (dictionary-membership-collection-membership)
> IF hadDictionaryMember(d, e1, "k1") THEN hadMember(d, e1)
44) I would also add the inverse inference dictionary-all-members-have-keys:
> Inference X (dictionary-all-members-have-keys)
>
> Here k is a (potentially unknown) key
>
> IF hadMember(d, e1) AND 'prov:Dictionary' ∈ typeOf(d)
> THEN hadDictionaryMember(d, e1, k)
If we don't specify this, opens for a prov:Dictionary to also contain
entities which don't have a key. This could be confusing. For instance
this would be valid:
entity(d0, [prov:type='prov:EmptyDictionary'])
entity(d1, [prov:type='prov:Dictionary'])
entity(d2, [prov:type='prov:Dictionary'])
derivedByInsertionFrom(d1, d0, {("k1", e1)})
// implied insertion-membership
//hadDictionaryMember(d1, e1, "k1")
// implied by dictionary-membership-collection-membership:
//hadMember(d1, e1)
derivedByRemovalFrom(d2, d1, {"k1"})
// Invalid by impossible-removal-membership:
//hadDictionaryMember(d2, e1, "k1")
// however this is still valid ..
hadMember(d2, e1)
// Unless we add:
entity(d2, [prov:type='prov:EmptyDictionary'])
// which implies
//entity(d2, [prov:type='prov:EmptyCollection'])
//which is invalid by constraint membership-empty-collection in PROV-CONSTRAINTS
// However, we might now still add this confusing statement:
hadMember(d1, e2)
// By the PROV-DICTIONARY constraints (including the completeness
constraint impossible-insertion-insertion and
membership-insertion-membership) we would not here expect any other
keys than "k1" in d1 - so we SHOULD be able to conclude that e2==e1 -
however we can't do that unless we introduce my inference
dictionary-all-members-have-keys:
// dictionary-all-members-have-keys implies
//hadDictionaryMember(d1, e2, k)
// and by key-single-entity and membership-insertion-membership and
membership-empty-collection:
//e2 == e1
> IF hadDictionaryMember(d1, e, "k") and derivedByInsertionFrom(d2, d1, {("k1", e1),...,("kn", en)}) and k ∉ {k1,...,kn} THEN hadDictionaryMember(d2, e, "k")
45) Don't remove the quotes: "k" ∉ {"k1",...,"kn"}
> IF derivedByRemovalFrom(d2, d1, {"k1",...,"kn"}) and derivedByInsertionFrom(d2, d1, {("k1", e1),...,("km",em)})THEN INVALID
46) This implies that "k1", etc. needs to be on both sides (Implying
an intersection/containment of keys). This constraint is true no
matter the key/value pairs, so change to same style as below:
> Here, KV1 and KV2 are sets of key-entity pairs.
> IF derivedByRemovalFrom(d2, d1, KV1) and derivedByInsertionFrom(d2, d1, KV2) THEN INVALID
> 6.3 Typing
> IF entity(d, [prov:type='prov:Dictionary']) THEN 'prov:Dictionary' ∈ typeOf(d) and 'prov:Collection' ∈ typeOf(d)
> IF entity(d, [prov:type='prov:EmptyDictionary']) THEN 'prov:EmptyDictionary' ∈ typeOf(d) and 'prov:Dictionary' ∈ typeOf(d)
47) The second rule would not cause firing of the first rule, so
expand the second rule. Also include prov:EmptyCollection and entity.
> IF entity(d, [prov:type='prov:EmptyDictionary']) THEN 'prov:EmptyDictionary' ∈ typeOf(d) and 'prov:Dictionary' ∈ typeOf(d) and 'prov:Collection' ∈ typeOf(d) and 'prov:EmptyCollection' ∈ typeOf(d)
48) Both lines must also include:
'entity' ∈ typeOf(c)
..as PROV-Constraint "typing" would not fire from a typeof prov:Collection.
> IF hadDictionaryMember(d, e, "k") THEN 'prov:Dictionary' ∈ typeOf(d) and 'entity' ∈ typeOf(e)
> IF derivedByInsertionFrom(d2, d1, {("k1", e1)}) THEN 'prov:Dictionary' ∈ typeOf(d1) and 'prov:Dictionary' ∈ typeOf(d2) and 'entity' ∈ typeOf(e1)
> IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN 'prov:Dictionary' ∈ typeOf(d1) and 'prov:Dictionary' ∈ typeOf(d2)
49) Similarly all of these should also include
> AND prov:Collection' ∈ typeOf(c) AND 'entity' ∈ typeOf(c)
==== END OF REVIEW ===
Now relax..! :)
--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
Received on Thursday, 24 January 2013 15:06:31 UTC