PROV-ISSUE-100 (Entity definition): Section 5.2.1 Entity [Conceptual Model]

Raised by: Satya Sahoo
On product: Conceptual Model

My review comments for Section 5.2.1 Entity in the current version of the conceptual model document:

1. In PROV-DM, an entity expression is a representation of an identifiable characterized thing.

Issue: Since the section heading is "Entity" and the PROV-DM component is Entity, I am confused about why we are defining "Entity Expression" and not "Entity".

2. An instance of an entity expression, noted entity(id, [ attr1=val1, ...]) in PROV-ASN contains an identifier id identifying a characterized thing; contains a set of attribute-value pairs [ attr1=val1, ...], representing this characterized thing's situation in the world.

Issue: When we refer to an entity in provenance assertions (in different applications), do we use the identifier alone to refer to it, or both the identifier and the attribute-value pairs?
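
To make the question concrete, here is a minimal sketch in Python. It is not PROV-ASN and not part of the conceptual model document; the class name, the helper function, and the attribute values are all hypothetical. It only shows that "same identifier" and "same identifier plus same attribute-value pairs" are two different tests that can disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityExpression:
    """Hypothetical stand-in for entity(id, [attr1=val1, ...])."""
    identifier: str
    attributes: frozenset  # set of (attribute, value) pairs


def entity(identifier, **attrs):
    # Helper assumed for illustration only; it freezes the attribute-value pairs.
    return EntityExpression(identifier, frozenset(attrs.items()))


# Two assertions that reuse one identifier but differ in one attribute value.
e_a = entity("e1", type="File", location="/shared/example.txt")
e_b = entity("e1", type="File", location="/tmp/example.txt")

# Reference by identifier alone: the two assertions denote the same thing.
same_by_identifier = e_a.identifier == e_b.identifier

# Reference by identifier + attribute-value pairs: they do not.
same_by_identifier_and_attributes = e_a == e_b
```

Under the first reading the two assertions refer to one entity; under the second they do not, which is why the document should say which reading is intended.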

3. The assertion of an instance of an entity expression, entity(id, [ attr1=val1, ...]), states, from a given asserter's viewpoint, the existence of an identifiable characterized thing, whose situation in the world is represented by the attribute-value pairs, which remain unchanged during a characterization interval, i.e. a continuous interval between two events in the world. 

Issue: Are the terms "characterization interval" and "continuous interval" defined by time values? What do we mean by "continuous interval" between two events?

4. If an asserter wishes to characterize a thing with the same attribute-value pairs over several intervals, then they are required to assert multiple entity expressions, each with its own identifier (so as to allow potential dependencies between the various entity expressions to be expressed). 

Issue: If a thing with the same attribute-value pairs exists over several (time?) intervals, what will be the dependencies between the various entity expressions (since an entity expression = identifier + attribute-value pairs)? If they are different versions of an entity, they will have distinguishing attributes other than simple occurrence at different points in time. Further, if multiple entity identifiers are used to refer to the same entity, how do we reconcile them later?

I believe this consideration is not required and adds a layer of complexity.
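
A rough sketch of the reconciliation step that this requirement seems to imply, in Python. The tuple layout, the event labels, and the function name are assumptions made for illustration; nothing here comes from the conceptual model document. Each assertion carries its own identifier, and the only way to find identifiers that may denote the same thing is to group them by their attribute-value pairs after the fact.

```python
from collections import defaultdict

# Hypothetical assertions: (identifier, attribute-value pairs, (start_event, end_event)).
assertions = [
    ("e1", {"type": "File", "owner": "alice"}, ("ev1", "ev2")),
    ("e2", {"type": "File", "owner": "alice"}, ("ev5", "ev6")),
    ("e3", {"type": "File", "owner": "bob"}, ("ev3", "ev4")),
]


def group_by_attributes(assertions):
    """Return lists of identifiers whose attribute-value pairs are identical."""
    groups = defaultdict(list)
    for identifier, attrs, _interval in assertions:
        groups[frozenset(attrs.items())].append(identifier)
    # Only groups with more than one identifier need reconciling.
    return [ids for ids in groups.values() if len(ids) > 1]


candidates = group_by_attributes(assertions)
```

Here e1 and e2 end up as reconciliation candidates purely because their attribute-value pairs match. The grouping cannot tell whether they are the same thing in two intervals or two distinct things that happen to share attributes.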

5. A characterization interval may collapse into a single instant.

Issue: Are we referring to time values? We seem to be using terms like "characterization interval", "continuous interval", "single instant", etc. as surrogates for time. I suggest that we explicitly use "time" if these other terms are not distinguishable from time.
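
If these terms do reduce to time, the notion could be stated as plainly as the following Python sketch. This is one possible reading, not the document's definition, and the class and method names are hypothetical: a characterization interval is a pair of time values, and it "collapses into a single instant" exactly when the two values are equal.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimeInterval:
    """Hypothetical time-based reading of a characterization interval."""
    start: datetime
    end: datetime

    def __post_init__(self):
        # An interval that ends before it starts is rejected.
        if self.end < self.start:
            raise ValueError("interval must not end before it starts")

    def is_instant(self):
        # The interval has collapsed into a single instant.
        return self.start == self.end


# Arbitrary example values, chosen only for illustration.
t0 = datetime(2011, 9, 1, 12, 0)
t1 = datetime(2011, 9, 1, 13, 0)

collapsed = TimeInterval(t0, t0)
extended = TimeInterval(t0, t1)
```

If the intended meaning is instead an interval between two events that carry no time values, this reading does not apply, and the document should say so.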

