Published prov-dm - entities and attributes, and a couple of editorial comments from Graham Klyne on 2011-10-21 (public-prov-wg@w3.org from October 2011)

From: Graham Klyne <GK@ninebynine.org>
Date: Fri, 21 Oct 2011 09:10:47 +0100
To: W3C provenance WG <public-prov-wg@w3.org>, Luc Moreau <l.moreau@ecs.soton.ac.uk>
CC: "Myers, Jim" <MYERSJ4@rpi.edu>
Message-ID: <4EA12907.4050602@ninebynine.org>
Referencing: http://www.w3.org/TR/2011/WD-prov-dm-20111018/

<editorial>
In "Status of this document":  "visition on" -> "vision of"? (or something else 
intended?)
</editorial>

...

Looking for statements that may not align with the discussion between Jim and 
myself about entity attributes.  I can't find the text that I previously 
understood to require a distinction between characterizing and 
non-characterizing attributes.  But I think the legacy of that approach is still 
present in that many of the model concepts seem to be defined in terms of these 
attributes, hence requiring their existence, where I think that in many cases 
the concepts can simply be asserted.

For example, section 5.3.5 Complementarity is constrained in terms of attributes 
and intervals when one could simply assert, say:

   entity(rs_l2,[location="The Mall"])
   entity(rs,[created="1870"])
   wasComplementOf(rs_l2, rs)

without worrying whether the attributes satisfy some arbitrary constraints.
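
As a rough illustration of asserting complementarity directly, here is a minimal Python sketch. The AssertionStore class and its method names are hypothetical and not part of PROV-DM; the only check made is that both identifiers have been asserted as entities.

```python
from dataclasses import dataclass, field


@dataclass
class AssertionStore:
    """Hypothetical container for asserted provenance facts."""
    entities: dict = field(default_factory=dict)    # identifier -> attribute dict
    complements: set = field(default_factory=set)   # (e2, e1) pairs

    def entity(self, ident, **attributes):
        # Attributes are optional; an empty set is acceptable.
        self.entities[ident] = dict(attributes)

    def was_complement_of(self, e2, e1):
        # Assumption: the relation is simply recorded as asserted.
        # No attribute or interval constraints are evaluated.
        for ident in (e2, e1):
            if ident not in self.entities:
                raise KeyError(f"unknown entity: {ident}")
        self.complements.add((e2, e1))


store = AssertionStore()
store.entity("rs_l2", location="The Mall")
store.entity("rs", created="1870")
store.was_complement_of("rs_l2", "rs")
```

The two entities share no attribute names, yet the relation is recorded without complaint.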


== Section 5.2.1 ==

"If an asserter wishes to characterize a thing with the same attribute-value 
pairs over several intervals, then they are required to assert multiple entity 
expressions, each with its own identifier (so as to allow potential dependencies 
between the various entity expressions to be expressed)."

Under the model that Jim and I were discussing, I think this would not be true: 
a set of attributes could apply through discontinuous intervals, so there would 
not necessarily be a fixed correspondence between entities and intervals between 
events.  The effect described here might be achieved by including start-time and 
end-time values (or other event-related values) as entity attributes.
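
A small Python sketch of this reading, under stated assumptions: the timeline data, the function names and the start/end parameters are all invented for illustration. Attributes are treated as constraints that select the moments at which a thing matches, so one attribute set can cover discontinuous intervals, and start/end values narrow it to a single interval.

```python
# Hypothetical timeline of a thing: (time, observed attribute values).
timeline = [
    (1, {"location": "The Mall"}),
    (2, {"location": "elsewhere"}),
    (3, {"location": "The Mall"}),
]


def matches(observed, constraints):
    """True if every constrained attribute has the required value."""
    return all(observed.get(name) == value for name, value in constraints.items())


def times_covered(constraints, start=None, end=None):
    """Times at which the constraints hold, optionally bounded by start/end."""
    return [
        t for t, observed in timeline
        if matches(observed, constraints)
        and (start is None or t >= start)
        and (end is None or t <= end)
    ]


# Same attribute-value pair, no time bounds: covers two separate moments.
unbounded = times_covered({"location": "The Mall"})

# Adding start/end values pins the entity to one interval.
bounded = times_covered({"location": "The Mall"}, start=3, end=3)
```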

...

"An entity assertion is about a characterized thing, whose situation in the 
world may be variant. An entity assertion is made at a particular point and is 
invariant, in the sense that its attributes are assigned a value as part of that 
assertion."

I think this statement is confusing, in that I don't think the attributes are 
*assigned* as part of the assertion.  Rather, I'd say the entity is constrained 
to the (set of instances?) for which the attributes have the values given.  Or, 
to put it another way, the entity has given values for the given attributes 
throughout its lifetime - whatever that may be.

...

== Section 5.3.2 ==

"the existence of an attribute-value pair in the entity expression identified by 
e is a pre-condition for the termination of the activity represented by the 
process execution expression identified by pe."

I think this mandatory existence of an attribute is inconsistent with the claim 
"There is no assumption that the set of attributes is complete" from section 
5.2.1 (the trivial extreme case of which being that all attributes are absent).

...

== Section 5.3.3 (and maybe 5.3.2) ==

I'm seeing a lot of use of attributes to define the notions of derivation.  This 
seems to me to be at odds with the use of attributes to constrain an entity.  As 
constraints, they are independent variables, but as used in say 
"wasDerivedFrom(e2,e1,pe,q2,q1) or wasDerivedFrom(e2,e1) holds if and only if 
the values of some attributes of the entity expression identified by e2 are 
partly or fully determined by the values of some attributes of the entity 
expression identified by e1." at least some of the attributes are dependent 
variables.

Again, if the presence of the attributes is optional - as might be the case if 
an entity is simply named by assertion - this constraint may be unsatisfiable in 
situations that are intuitively derivations.

Example: I may create a map by grabbing a copy of a weather report, extracting 
the weather map and removing the weather symbols.  That map is derived from the 
weather report, but with no further attributes indicated.

   entity(w) -- the weather report, no attributes
   entity(m) -- a map, no attributes
   wasDerivedFrom(m,w) -- should be true, but the constraints don't allow this
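
The mismatch can be sketched in Python. The function determined_by_attributes is a deliberately crude stand-in for the attribute-based definition quoted above (it only checks that both sides have some attributes to relate), and all names here are assumptions made for illustration.

```python
# Entities for the weather report (w) and the map (m), asserted with no attributes.
entities = {"w": {}, "m": {}}

# Derivations recorded purely because someone asserted them.
asserted_derivations = {("m", "w")}


def determined_by_attributes(e2, e1):
    """Crude stand-in for the 'if and only if' definition: there must be
    some attributes on both sides for any value to determine another."""
    return bool(entities[e2]) and bool(entities[e1])


def was_derived_from(e2, e1):
    """Assertion-based reading: the relation holds if it was asserted."""
    return (e2, e1) in asserted_derivations


attribute_reading = determined_by_attributes("m", "w")
asserted_reading = was_derived_from("m", "w")
```

With no attributes on either entity the attribute-based reading cannot be satisfied, while the asserted reading holds.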

...

<editorial>
Section 7.2.1

I think the term wasAttributeTo(...) should probably be wasAttributedTo(...)

(note insertion of 'd' before 'To')
</editorial>

...

Finally, I think there's something that's been missed out in the focus on using 
attributes as defining values.

My understanding is that the motivation for entities as views of resources came 
from the need to make persistent provenance assertions about aspects of dynamic 
things in the world.  Sections 2.1 and 3 discuss this motivation, but the only 
point I'm seeing where this is even approached definitively is section 5.3.5. 
But the notion that an Entity is a constrained form of some original resource is 
not addressed in the constraints (even though the motivation is discussed in the 
examples), so the motivating situation is not covered by the data model.

For me, the missing concept of an entity being a "view" of some resource remains 
the cornerstone of my understanding of how the resource/entity story hangs together, 
so it's difficult for me to discuss effectively when it's not part of the model 
(and this is part of why I struggle to provide you with comments as issues 
against the document rather than more general discussions in email).

Also, I still don't really understand why "complementarity" is important, but I 
accept that some members feel it is.

I think that if we started with the notion of "IVPof" (possibly renamed) which 
would relate entities into a DAG (with a distinguished set of entities that are 
not IVPof anything corresponding roughly to a notion of "real world resources") 
the explanation of provenance about dynamic resources would be easier to convey.
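
A minimal Python sketch of that structure, assuming "x IVPof y" is read as "x is a view of y". The entity names, the dictionary representation and the helper functions are hypothetical; the sketch only shows how the distinguished set of entities that are not IVPof anything falls out of the graph, and how acyclicity could be checked.

```python
# Hypothetical IVPof relation: each entity maps to the entities it is a view of.
ivp_of = {
    "rs_l2_2011": {"rs_l2"},
    "rs_l2": {"rs"},
    "rs": set(),
}


def roots(relation):
    """Entities that are not IVPof anything (roughly 'real world resources')."""
    return {entity for entity, parents in relation.items() if not parents}


def is_acyclic(relation):
    """Depth-first search with three colours to detect cycles."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {entity: WHITE for entity in relation}

    def visit(entity):
        colour[entity] = GREY
        for parent in relation.get(entity, ()):
            state = colour.get(parent, WHITE)
            if state == GREY:
                return False          # back edge: a cycle
            if state == WHITE and not visit(parent):
                return False
        colour[entity] = BLACK
        return True

    return all(colour[entity] != WHITE or visit(entity) for entity in list(relation))


root_entities = roots(ivp_of)
graph_is_dag = is_acyclic(ivp_of)
```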

#g
Received on Friday, 21 October 2011 08:11:32 UTC