# Re: PROV-DICTIONARY internal review for first public working draft (ISSUE-614)

From: Tom De Nies <tom.denies@ugent.be>
Date: Tue, 29 Jan 2013 12:59:21 +0100
Message-ID: <CA+=hbbepZr+sbRZbRW7GDMrnDdFTU9=Tpv6JW_ztDScKdAEB1g@mail.gmail.com>
To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Cc: Provenance Working Group <public-prov-wg@w3.org>

Hi Stian,

Thanks again for your extensive review. I responded to each issue below.
You'll see that many of the issues arose in other reviews as well, so they
were no trouble to fix.

As I see it, only one real point of discussion remains after going through
your review: (item 3 in your review) whether or not we include the
constraint
IF hadMember(d, e) and 'Dictionary' \in typeOf(d) THEN
hadDictionaryMember(d, e, "k") with k an unknown key.
I suggest we leave this out for this draft, and make it a formal issue,
which we can discuss as a group before the next draft.

> Please find below detailed review:
>
> > 2.1 Dictionary  membership
> >     key: a key key_1 that is associated with the specified entity;
>
> 1) This should specify that the key is a *Value* as according to
> PROV-DM http://www.w3.org/TR/prov-dm/#term-value - as it is specified
> it could be interpreted as also an identifier - although the PROV-N
> example uses a "string value".   (I know it is confusing because DM
> for some reason chose to call this Value rather than the more
> appropriate term Literal - but here in a dictionary the actual 'value'
> is the Entity (identifier), and the key is a (literal) Value!
> Therefore this should be clarified with DM hyperlink.)
>
Indeed, Luc and Paolo's review also mentioned this. It's fixed.

>
> > Note that dictionary membership implies collection membership, but not
> vice versa.
>
> 2) This should link forward to inference 1
> (dictionary-membership-collection-membership)
>
Done.

>
> 3) Also I don't quite understand this.  So a prov:Dictionary kind of
> collection can have members that don't have keys?
>
> entity(d, [prov:type='prov:Dictionary' ])
> // implies:
> entity(d, [prov:type='prov:Collection' ])
>
> // implies:
>
> // But what if we also see?
> // are you saying this would NOT imply the below?
>
> If so then I am a bit confused - a prov:Dictionary to be useful should
> be a constrained prov:Collection in which every member is associated
> with a key. This should be added to the Conceptual Definition of
> Dictionary above.
>
> If there is no such implication (of course the key is unknown until
> stated otherwise), I am not sure in which cases such a data type could
> be useful. It would be like describing an array type of collection,
> but where some items are allowed to not have a position.  (which is
> quite different from saying they have an unknown position!)
>
>
See explanation above, this remains an issue to be discussed.

>
> > > // d1 is a dictionary, with unknown content
>
> 4) Change to "with (so far) unknown content" ?   Both example 1 and 2
> and others - specially those that later actually do state the members.
>
>
Done.

>
> > 2.2 Dictionary insertion
> >  d0 is the set { }
> > ...
>
> 5) I thought d0 was a Dictionary, not a set.   s/set/dictionary/g here
> and in other examples.
>
> It is not simply a set of pairs, because we don't allow multiple
> entries for same key.
>
Fixed.

>
> > attributes: an optional set (attrs) of attribute-value pairs
>
> 6) Could one of the lines in example 3 perhaps include a
> dcterms:description or something? Just because it can get confusing
> with the new {("key", value)} syntax as opposed to the attribute
> syntax.
>
>
> derivedByInsertionFrom(d2, d1, {("k3", e3)}, [ dcterms:description =
> "An optional attribute" ] )
>
Done.

>
> > 2.2 Dictionary Insertion
> > 2.3 Dictionary Removal
> > In particular, no assumptions are needed regarding the mutability of a
> data structure that is subject to updates
>
> 7) So from the examples given, it seems that both insertion and
> removal are "complete" statements - for instance after
>
>   derivedByRemovalFrom(d3, d2, {"k1", "k3"})
>
> we conclude that "k2" remains as a key in d3.
>
> I very much prefer this semantics, rather than that of a much earlier draft,
> where members could come and go out of the blue and you never really
> knew much. I think also this is OK as this is a specific
> prov:Dictionary thing rather than a general thing about
> prov:Collection's.
>
>
> Therefore 2.2 and 2.3 should clarify this and change
>
> > key-entity-set: the inserted key-entity pairs (
> to
> > key-entity-set: all inserted key-entity pairs (
>
> > key-set: a set of deleted keys (..)
> to
> > key-set: a set of all deleted keys (..)
>
> Inference 7 (insertion-removal-membership) makes this explicit (yay!)
> - so a forward reference would be good.
>
>
Done.
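
As an illustration of the "complete" reading of insertion and removal from item 7, here is a minimal Python sketch. It assumes a dictionary's content can be modelled as a plain Python dict from keys to entity identifiers; the helper names `derived_by_insertion_from` and `derived_by_removal_from` are hypothetical and are not part of any PROV library.

```python
def derived_by_insertion_from(d1, pairs):
    """Content of d2, given d1 and ALL inserted key-entity pairs (hypothetical helper)."""
    d2 = dict(d1)      # d1 is left untouched; no mutability is assumed
    d2.update(pairs)   # an inserted key replaces any earlier entry for that key
    return d2


def derived_by_removal_from(d1, keys):
    """Content of d2, given d1 and ALL removed keys (hypothetical helper)."""
    removed = set(keys)
    return {k: e for k, e in d1.items() if k not in removed}


d0 = {}  # an empty dictionary
d1 = derived_by_insertion_from(d0, {"k1": "e1", "k2": "e2"})
d2 = derived_by_insertion_from(d1, {"k3": "e3"})
d3 = derived_by_removal_from(d2, {"k1", "k3"})
print(d3)  # {'k2': 'e2'}
```

Because each statement lists all inserted pairs or all removed keys, the content of d3 is fully determined once the chain reaches an empty dictionary.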

>
>
> 3. PROV-N Notation of Dictionary Concepts
>
> PROV-DICTIONARY specifies an extension according to PROV-N Extension
> chapter - http://www.w3.org/TR/prov-n/#extensibility - but using the
> same namespace
>
I have added the following text:

> The notation used for dictionaries in this document extends the standard
> PROV-N according to the principles described in the PROV-N extensibility
> chapter <http://www.w3.org/TR/prov-n/#extensibility>. However, because
> dictionaries are defined in the same namespace as the rest of PROV-N, the
> terms in this document do not have a non-empty prefix. For the remainder of
> this document, we will assume that the default namespace
> http://www.w3.org/ns/prov# is used, and thus, no prefix is specified for
> the terms associated with dictionaries.
>
Is this acceptable to you?

> introduction of PROV-N notation.
>
> We noticed that although there is a grammar folder on the mercurial
either. (or we couldn't find the link).
We suggest that we wait until PROV-N does this in the final spec, and then
conform to that when we link to it in the next draft.

> 9) However, to be a valid extension, the terms must be given as a
>         QUALIFIED_NAME.  With the proposed syntax in this document - the
> QNAMEs are in the default name space, which can be customized in
> PROV-N - http://www.w3.org/TR/prov-n/#expression-NamespaceDeclaration
> - and therefore easily overlap with other extensions.
>
> It is not explicit in PROV-N (perhaps it should be!) - but a missing
> default namespace declaration would make the QNAMEs invalid.
>
> > A qualified name's prefix is optional. If a prefix occurs in a qualified
> name, the prefix must refer to a namespace declared in a namespace
> declaration. In the absence of prefix, the qualified name belongs to the
> default namespace.
>
> (Side note: PROV-N should require that PN_PREFIX is a declared prefix
> in the current scope - and that if it is empty, that the default
> namespace has been declared)
>
>
> The prefix prov: is reserved in PROV-N.
>
> I therefore suggest to rename all the Dictionary concepts to qualified
> names using prov:* - that is:
>
> prov:derivedByInsertionFrom
> prov:derivedByRemovalFrom
>
> Thus we would avoid the embarrassing situation of PROV-Dictionary-N
> documents not being valid PROV-N documents - while documents using
> other extensions would be.
>
>
See the text added to the beginning of the PROV-N section, and our response
to 8a. Does this remain a blocking issue for you now that this text is
included? I'd like to remark that PROV-LINKS does not declare this
namespace either, and is published as a working draft.

>
> 10) This section should mention that prov:Dictionary and
> prov:EmptyDictionary are declared using regular entity statements (with
> example). This could be inserted before 3.1.
>
>
Done.

>
> 3.1 Membership  (PROV-N)
>
> > membershipExpression ::= 'hadDictionaryMember' '(' dIdentifier ','
> entity ',' key ')'
> > key   key (Non-Terminal)
>
> 11) "key" is not a defined Non-Terminal - neither here nor in PROV-N
> document or grammar.
>
> "dIdentifier" is similarly a new non-Terminal that must be defined
> here. (presumably it is an entity which is a prov:Dictionary type).
>
>
> Combined with issue #1 this is very confusing, and so I have made this
> and equivalent below a blocker.
>
>
> Use the style of the PROV-N document:
>
> http://www.w3.org/TR/prov-n/#expression-Entity
>
> .. here it also introduces optionalAttributeValuePairs, etc.
>
>
>
Fixed (after previous reviews). Could you check the expressions again? I
believe everything should be linked and defined correctly now.

>
> 12) This is not valid grammar according to the EBNF section of PROV-N
> as it uses 'single quotes' rather than "double"
>
> http://www.w3.org/TR/prov-n/#grammar-notation
>
> Change this and following declaration to style with double quotes:
>
> > membershipExpression ::= "hadDictionaryMember" "(" dIdentifier ","
> entity "," key ")"
>
Now uses double quotes.

>
> 13) For some reason I was not able to copy-paste the ::= line  above
> correctly from either Firefox or Chrome - the 'single quotes'
> disappeared! Why? CSS? Ensure quotes are regular characters

This appears to be because the quotes are added through CSS. I noticed that
in PROV-N, this is done differently now, and the expressions are pulled
from the grammar by a javascript script that runs on the page. We will fix
this when we create the grammar document.

>
> > Example 6
> >
> >
> > In this example, d is a dictionary known to have e0, e1, and e2 as
> members, and may have other members.
> >  entity(e0)
> > ..
>
>
> 14) I think this and following examples can be shortened - you should
> not explain the PROV Dictionary semantics again in this section and
> don't need to repeat the full examples - only the syntax of the
> particular statement. So change style to only:
>
> > Example 6
> > // e0 is member in d under key "k0"
>
>
Shortened.

> > 3.2 Insertion
> >  derivationByInsertionFromExpression ::= derivedByInsertionFrom (
> identifier , dIdentifier , dIdentifier , { keyValuePairs }
> optional-attribute-values )
>
> 15) Non-Terminals not matching table below: identifier,
> keyValuePairs, optional-attribute-values
> Change to match table.
>
>
Fixed (after previous reviews). Could you check the expressions again? I
believe everything should be linked and defined correctly now.

> 16) Not defined: dIdentifier, keyEntitySet/keyValuePairs
>
> Remember that for keyValuePairs the first pair is not optional.
>
Fixed (after previous reviews). Could you check the expressions again? I
believe everything should be linked and defined correctly now.

>
> 17) as the identifier is optional, the separator should be ";" not
> ",". This also applies to  derivedByRemovalFrom
>
Fixed.

>
> 18) Hyperlinks for prov-n terminals like 'identifier' are wrong - they
> go to non-existing local anchors rather than to PROV-N - like
> http://www.w3.org/TR/prov-n/#prod-identifier
>
>
Fixed (after previous reviews). Could you check the expressions again? I
believe everything should be linked and defined correctly now.

> > 3.3 Removal
> > derivationByRemovalFromExpression ::= derivedByRemovalFrom ( identifier
> , dIdentifier , dIdentifier , { keySet } optional-attribute-values )
>
> 19) optional-attribute-values -> optionalAttributeValuePairs
>
Fixed.

>
> > key-set       keySet
>
> 20) Undefined terminal keySet
>
> Remember that the first key is not optional.
>
Fixed.

>
> > The following table summarizes how each constituent of a PROV-DM
> Membership maps to a non-terminal.
> > The following table summarizes how each constituent of a PROV-DM
> Insertion maps to a non-terminal.
> > The following table summarizes how each constituent of a PROV-DM Removal
> maps to a non-terminal.
>
> 21) PROV-DM -> PROV-DICTIONARY (or simply Dictionary)
>
>
>
>
> > 4. Ontological definition of dictionary
>
> Note that I have only checked the Turtle syntax here 'by hand' - they
> should be formally checked programmatically.
>
>
This remains a todo when we publish the owl file.

>
> >       a prov:KeyValuePair;
> >       prov:key   "k1"^^xsd:string
> >          prov:value :e1
>
> 22) Fix indentation of prov:value  (NEVER use the TAB character in such
> examples as it has inconsistent rendering)
>
Fixed.

>
> 23) Missing ; after xsd:string (invalid Turtle syntax)
>
Fixed.

>
> > A dictionary may be empty, in which case it should be described as an
> instance of the subclass prov:EmptyDictionary.
>
> 24) This makes sense regardless of syntax - can a similar statement be
>
We revised this sentence after a previous review. I've added something
similar to the conceptual definition:

> Note that the complete content of a dictionary is unknown unless it can be
> traced back to an empty dictionary through a series of insertions and
> removals. If an asserter wants to explicitly state that a dictionary is
> empty, it is recommended that the prov:type prov:EmptyCollection is used.

Is this ok for you?

>
> >  PROV-O provides two kinds of involvements
>
> 25) PROV-O --> PROV-DICTIONARY
>
Done.

> >  prov:qualifiedInsertion is used to  (..)
> > prov:qualifiedRemoval is used to specify  (..)
>

> 26) Text should relate/hyperlink this to the insertion/removal sections
> earlier.
>
>
> > prov:Dictionary
>
> >  back to collections classes
>
> to "These terms are used" section.
>
Done, they point to the overview now.

>
>
> >    :wasCreatedBy    :bob;
>
> 28) Remove this as it is potentially confusing with
> prov:wasAttributedTo  (or just change to prov:wAT)
>
Done.

>
> > described with properties
> >    prov:derivedByInsertionFrom prov:qualifiedRemoval
> prov:qualifiedInsertion prov:derivedByRemovalFrom
>
> 29) Somehow the most important property, prov:hadDictionaryMember
> seems to be missing here!
>
>
Oops! Fixed.

> >  prov:EmptyDictionary
> >
> > :d1 a prov:Dictionary;
> >   prov:derivedByInsertionFrom :d;
>
> 30) I find it confusing that this example uses most lines to talk
> about a non-empty dictionary.  Simplify to just show the boring:
>
> >  :d  a prov:EmptyDictionary .
>
Done.

>
> > An empty dictionary.
>
> 31) Could you add the obvious "i.e. has no members"
>
>
Done.

> >  prov:insertedKeyValuePair [
> >         a prov:KeyValuePair;
> >         prov:key   "k1"^^xsd:string;
> >         prov:value :e1;
> >     ]
>
>
> PROV-O) as one can consider either the value or the keyvaluepair to be
> what is inserted. But this is getting quite verbose..
> inserterkeyvaluepair keyvaluepair key and value! Puh!
>
>
>  prov:inserted [
>          a prov:KeyValuePair;
>          prov:key   "k1"^^xsd:string;
>          prov:value :e1;
>       ]
>
> .. or possibly prov:insertedPair  ?
>
> This is similar to how the insertion is a prov:Insertion and not a
> prov:DictionaryInsertion.
>
> I would keep prov:removedKey as it's important to know it's the *key*
> that was removed.
>
>
Sam and I discussed this, and came to the following conclusion. While it is
true that this syntax is quite verbose, it is also the only one that is
100% clear and cannot induce confusion or discussion. This, and the fact
that the other reviewers did not see a problem with this property has made
us decide that this is the safest choice, and we will keep it. If you
consider it a serious problem, we can discuss it as an issue for the next
draft.

>
> > prov:Removal
> > removing one or more key-value pairs
>
> 33) Add ", specified by prov:removedKey"
>
> (as you don't specify the removed key/value pairs)
>
Done.

>
> > prov:dictionary
> > has domain
> >       prov:Insertion
> >       prov:Removal
>
> 34) This specifies that the domain is the intersection of Insertion
> and Removal  - thus any Insertion with a prov:dictionary becomes also
> a Removal etc. This is surely wrong.
>
>
> Formally you should specify this as the domain of the union of
> Insertion and Removal - alternatively as a common "abstract"
> superclass prov:DictionaryInfluence.
>
> See PROV-O http://www.w3.org/TR/prov-o/#owl-profile
>
> As the union would take the extension out of OWL-RL, and there already
> is precedent for other kind of influences, I would suggest making a
> new prov:DictionaryInfluence which can be superclass of Insertion and
> Removal, and in domain of prov:dictionary.
>
>
Done.
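
To make the point of item 34 concrete, here is a minimal Python sketch of the RDFS domain rule: a triple (s p o) together with (p rdfs:domain C) entails (s rdf:type C). With two separate domain statements, every subject gets both classes. The function `infer_types` and the identifier `:ins1` are hypothetical names used only for this illustration.

```python
def infer_types(triples, domains):
    """Apply the RDFS domain rule to every triple (illustrative helper)."""
    inferred = set()
    for s, p, _o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, cls))
    return inferred


# One qualified insertion that points at its dictionary.
triples = [(":ins1", "prov:dictionary", ":d1")]

# Two domain statements: the subject is typed with BOTH classes.
two_domains = {"prov:dictionary": ["prov:Insertion", "prov:Removal"]}

# A single common superclass avoids the unwanted prov:Removal typing.
one_superclass = {"prov:dictionary": ["prov:DictionaryInfluence"]}

print(sorted(infer_types(triples, two_domains)))
print(sorted(infer_types(triples, one_superclass)))
```

With the common superclass, the insertion is only inferred to be a prov:DictionaryInfluence, which is the behaviour item 34 asks for.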

>
>
> Dictionary extension? I expected to find it in the beginning of section
> 4. I would also expect section 4 to have an explanation of the
> namespaces and how importing <http://www.w3.org/ns/prov#> one would
> get both PROV-O and extensions like PROV-Dictionary, while
> <http://www.w3.org/ns/prov-o> would give only PROV-O.  I would then
> also expect a separate versionIRI to import if I only wanted
> PROV-Dictionary - like <http://www.w3.org/ns/prov-dictionary#>
>
>
As discussed on the call, we included a link to the owl file in the
document, but the file itself might still be updated. (We linked to the raw
"editor's draft" version, to allow ourselves some time to update and verify
the owl syntax.)
I've also included the text you asked for:

> The classes and properties defined in this document will be included in
> the default namespace of PROV. Users of the ontology have the option of
> importing <http://www.w3.org/ns/prov#>, which includes all extensions,
> including PROV-Dictionary, or if they wish to have only [PROV-O]
> terms, they can import <http://www.w3.org/ns/prov-o#>. Similarly,
> <http://www.w3.org/ns/prov-dictionary#> holds only the PROV-Dictionary
> (Note that this file is unfinished at the time of this working draft, and
> may be subject to change.)

Is this text acceptable to you?

>
> > prov:qualifiedInsertion
> > If this Dictionary prov:derivedByInsertionFrom another Dictionary :e,
> then it can qualify how it did perform the Insertion using
> prov:qualifiedInsertion [ a prov:Insertion; prov:dictionary :e;
> prov:inserted [a prov:KeyValuePair; prov:key "k1"^^xsd:string; prov:value
> :foo] ].
>
> 36) I know I probably wrote this text - but without any indentation it
> is very difficult to read. Could this be simplified to some kind of
> mirror of prov:derivedByInsertionFrom? Same for prov:qualifiedRemoval.
>
>
Done.

> 37) example for qualifiedInsertion should not detail the members of
> our-old-baseball-team-field-positions as that is confusing and makes
> example large.
>
Done.

>
> > 5. XML Schema Dictionary
> > In this section, we provide the XML Schema to use dictionaries with the
> [PROV-XML] serialization.
>
> Ironically, you don't (only fragments) - where can I find and download
> the XML Schema?
>

We've included the link. The XSD schema will actually be there today. Again, we
point to the editors' space, to allow corrections after publication of the
document.

>
> Note that I have not checked the syntax of XSD statements in this
> section - if I had the schema file I could have validated it.
>
>
We will validate the schema.

> 38) Modify to "This section details how to describe dictionaries with
> the [PROV-XML] serialization".
>
>
Done.

> instructions as per imports etc. (equivalent as for OWL imports above)
>
>
Done.

> 40) Rename section to "PROV-XML representation of Dictionary" -
> similarly section 4 from "Ontological Definition of Dictionary" to
> "PROV-O representation of Dictionary"
>
>
Done.

> > 5.2 Key-Value Pair
> > <xs:element name="key" type="xs:String" />
>
> 41) The specification earlier specifies key as any value/literal - so
> xs:string (lowercase S!) is too specific - it would not allow
> integers, for instance. Change to xs:anySimpleType here and in 5.5
> Removal
> > <xs:element name="key" type="xs:String" maxOccurs="unbounded" />
>
Done.

>
>
> >  members of a dictionary are specified by listing all the key-value
> pairs inside a prov:DictionaryMembership element
>
> 42) Remove "all the" -- according to semantics of  hadDictionaryMember
>
>
Done.

> > 6. Constraints Associated with Dictionary
>
> This section is very solid, good work!
>
>
> > These inferences and constraints need to be applied to obtain valid
> provenance when using dictionaries.
>
> 43) I don't think they *need to* be applied? Rephrase to "MAY be
> applied in order to ensure valid provenance".
>
> I think a similar kind of disclaimer as in
> http://www.w3.org/TR/prov-constraints/#purpose (or just linking to
> this) could be useful, so that it's clear that these constraints are
> not a MUST to use dictionaries.
>
Changed, and added reference to purpose section of constraints.

>
> >  Inference 1 (dictionary-membership-collection-membership)
>
>
> 44) I would also add the inverse inference
> dictionary-all-members-have-keys:
>
> > Inference X (dictionary-all-members-have-keys)
> >
> > Here k is a (potentially unknown) key
> >
> > IF hadMember(d, e1) AND 'prov:Dictionary' ∈ typeOf(d)
> > THEN hadDictionaryMember(d, e1, k)
>
>
>
> If we don't specify this, it opens the door for a prov:Dictionary to also contain
> entities which don't have a key. This could be confusing. For instance
> this would be valid:
>
> entity(d0, [prov:type='prov:EmptyDictionary'])
> entity(d1, [prov:type='prov:Dictionary'])
> entity(d2, [prov:type='prov:Dictionary'])
>
> derivedByInsertionFrom(d1, d0, {("k1", e1)})
> // implied insertion-membership
> // implied by dictionary-membership-collection-membership:
>
> derivedByRemovalFrom(d2, d1, {"k1"})
> // Invalid by impossible-removal-membership:
>
> // however this is still valid ..
>
> entity(d2, [prov:type='prov:EmptyDictionary'])
> // which implies
> //entity(d2, [prov:type='prov:EmptyCollection'])
> //which is invalid by constraint membership-empty-collection in
> PROV-CONSTRAINTS
>
> // However, we might now still add this confusing statement:
>
> // By the PROV-DICTIONARY constraints (including the completeness
> constraint impossible-insertion-insertion and
> membership-insertion-membership) we would not here expect any other
> keys than "k1" in d1 - so we SHOULD be able to conclude that e2==e1 -
> however we can't do that unless we introduce my inference
> dictionary-all-members-have-keys:
> // dictionary-all-members-have-keys implies
> // and by key-single-entity and membership-insertion-membership and
> membership-empty-collection:
> //e2 == e1
>
>
You make a valid point, and I think it needs to be discussed before we add
it. See response at the beginning of this email.
Since this is not a blocker, we will discuss it for the next WD.
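
For the discussion, the inference proposed in item 44 (dictionary-all-members-have-keys) could be sketched as below. This is only an illustration of the proposal, not agreed text: the function name, the tuple encoding of statements, and the `_:kN` placeholder notation for an unknown key are all assumptions made for this sketch.

```python
import itertools


def all_members_have_keys(had_member, dictionary_ids, had_dictionary_member):
    """For each hadMember(d, e) where d is a dictionary and no key is known,
    add hadDictionaryMember(d, e, k) with a placeholder for the unknown key k."""
    fresh = itertools.count(1)
    keyed = {(d, e) for d, e, _k in had_dictionary_member}
    result = set(had_dictionary_member)
    for d, e in sorted(had_member):
        if d in dictionary_ids and (d, e) not in keyed:
            result.add((d, e, f"_:k{next(fresh)}"))  # unknown key placeholder
    return result


had_member = {("d1", "e1"), ("d1", "e2"), ("c1", "e3")}
dictionary_ids = {"d1"}                      # c1 is a plain collection
had_dictionary_member = {("d1", "e1", "k1")}

print(sorted(all_members_have_keys(had_member, dictionary_ids, had_dictionary_member)))
```

Here e2 gets a placeholder key because d1 is a dictionary, while e3 is left alone because c1 is only a collection.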

> >  IF hadDictionaryMember(d1, e, "k") and derivedByInsertionFrom(d2, d1,
> {("k1", e1),...,("kn", en)}) and k ∉ {k1,...,kn} THEN
>
> 45) Don't remove the quotes: "k" ∉ {"k1",...,"kn"}
>
Done.

>
> >  IF derivedByRemovalFrom(d2, d1, {"k1",...,"kn"}) and
> derivedByInsertionFrom(d2, d1, {("k1", e1),...,("km",em)}) THEN INVALID
>
> 46) This implies that "k1", etc. needs to be on both sides (Implying
> an intersection/containment of keys). This constraint is true no
> matter the key/value pairs, so change to same style as below:
>
> > Here, KV1 and KV2 are sets of key-entity pairs.
> >  IF derivedByRemovalFrom(d2, d1, KV1) and derivedByInsertionFrom(d2, d1,
> KV2) THEN INVALID
>
Done.
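
The reworded constraint from item 46 depends only on the pair (d2, d1), not on the keys involved. A minimal Python sketch of such a check follows, assuming statements are encoded as (kind, d2, d1) tuples; the function name is hypothetical.

```python
def removal_and_insertion_conflict(statements):
    """Return the (d2, d1) pairs linked by both a removal and an insertion."""
    removals = {(d2, d1) for kind, d2, d1 in statements
                if kind == "derivedByRemovalFrom"}
    insertions = {(d2, d1) for kind, d2, d1 in statements
                  if kind == "derivedByInsertionFrom"}
    return removals & insertions


statements = [
    ("derivedByInsertionFrom", "d1", "d0"),
    ("derivedByRemovalFrom", "d2", "d1"),
    ("derivedByInsertionFrom", "d2", "d1"),  # conflicts with the removal above
]
print(removal_and_insertion_conflict(statements))  # {('d2', 'd1')}
```

Any non-empty result means the set of statements is INVALID under this constraint.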

>
>
> > 6.3 Typing
>
> > IF entity(d, [prov:type='prov:Dictionary']) THEN 'prov:Dictionary' ∈
> typeOf(d) and 'prov:Collection' ∈ typeOf(d)
> > IF entity(d, [prov:type='prov:EmptyDictionary']) THEN
> 'prov:EmptyDictionary' ∈ typeOf(d) and 'prov:Dictionary' ∈ typeOf(d)
>
> 47) The second rule would not cause firing of the first rule, so
> expand the second rule. Also include prov:EmptyCollection and entity.
>
> > IF entity(d, [prov:type='prov:EmptyDictionary']) THEN
> 'prov:EmptyDictionary' ∈ typeOf(d) and 'prov:Dictionary' ∈ typeOf(d) and
> 'prov:Collection' ∈ typeOf(d) and 'prov:EmptyCollection' ∈ typeOf(d)
>
Done.

>
> 48) Both lines must also include:
>  'entity' ∈ typeOf(c)
>
> ..as PROV-Constraint "typing" would not fire from a typeof prov:Collection.
>
>
Done.
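
Items 47 and 48 rest on the observation that one typing rule does not trigger another, so each rule has to list every implied type itself. A minimal Python sketch of that idea follows; the table `IMPLIED_TYPES` and the function `type_of` are illustrative names, and the table only covers the two rules discussed here.

```python
# Each rule spells out its full set of implied types, because the rules
# are applied once and do not chain into each other.
IMPLIED_TYPES = {
    "prov:Dictionary": {"prov:Dictionary", "prov:Collection", "entity"},
    "prov:EmptyDictionary": {
        "prov:EmptyDictionary", "prov:Dictionary",
        "prov:EmptyCollection", "prov:Collection", "entity",
    },
}


def type_of(declared_types):
    """Return the declared types plus everything the typing rules add."""
    result = set(declared_types)
    for t in declared_types:
        result |= IMPLIED_TYPES.get(t, set())
    return result


print(sorted(type_of({"prov:EmptyDictionary"})))
```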

>
> > IF hadDictionaryMember(d, e, "k") THEN 'prov:Dictionary' ∈ typeOf(d) and
> 'entity' ∈ typeOf(e)
> > IF derivedByInsertionFrom(d2, d1, {("k1", e1)}) THEN 'prov:Dictionary' ∈
> typeOf(d1) and 'prov:Dictionary' ∈ typeOf(d2) and 'entity' ∈ typeOf(e1)
> > IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN 'prov:Dictionary' ∈
> typeOf(d1) and 'prov:Dictionary' ∈ typeOf(d2)
>
> 49) Similarly all of these should also include
> > AND 'prov:Collection' ∈ typeOf(c) AND 'entity' ∈ typeOf(c)
>
Done.

>
> ==== END OF REVIEW ===
>
> Now relax..! :)
>
>
I'll try. Is this a blocking issue? ;)

Received on Tuesday, 29 January 2013 11:59:56 UTC
