ISSUE 229 (was: Reviewing PROV-DM)

I drafted this, then put the text into the tracker, not meaning to send this
message. Please reply to the thread mentioning ISSUE 229.


On 30/01/2012 10:44, Graham Klyne wrote:
> I've started another full review of PROV-DM, so far having done up to about
> section 5.4.
> While the content is making much more sense than it did last time I reviewed it,
> I am finding some of the text to be repetitive, confusing and in some cases
> strangely phrased. I think a main goal of this document needs to be to offer an
> approachable description of the underlying data model and ASN notation that can
> be used by developers and information designers. I think the document could
> benefit from a serious round of sub-editing (without intending to change the
> substantive content).
> I also think that a refactoring of the DM concepts (without fundamentally
> changing the underlying intended semantics) could help to eliminate a lot of
> repetitive text. These comments relate to the recent "domain of discourse" vote,
> but I'm coming at this from a more holistic perspective.
> It seems to me that the domain of discourse contains the following concepts:
> Entity
> Activity
> Agent
> Event
> Plan
> Account
> in that these are the various things about which the provenance language aims to
> make assertions, and that all of these could be considered types of Entity (with
> the possible exception of Event). I think we've already established that most if
> not all of these are kinds of entity.
> If the descriptions were refactored around such a structure, I believe much of
> the repetitive description of attributes could be focused in one place. I would
> be inclined to separate attributes from the other type declarations, so we'd end
> up with primitive ASN expressions like these:
> Entity(id)
> Activity(id, start?, end?)
> Agent(id)
> Plan(id)
> Event(id, time?)
> Account(id)
> Attributes(id, [attr1=val1, attr2=val2, ...])
> Where the Attributes expression could be applied to any of the preceding
> concepts, and the description of attributes would consequently be needed only
> once. The main objection I see to this is that it would mean that, say, the ASN
> expression:
> Entity(id, [attr1=val1, attr2=val2, ...])
> would be replaced by two expressions:
> Entity(id)
> Attributes(id, [attr1=val1, attr2=val2, ...])
> I would counter this by having the ASN (but not the underlying model) allow the
> first form as a syntactic sugar for the second.
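
To make the syntactic-sugar suggestion concrete, here is a minimal sketch of how the combined form could be expanded into the two primitive expressions. The names Declaration, Attributes and desugar are hypothetical, chosen only for illustration; nothing here is taken from PROV-DM or the ASN grammar itself.

```python
from dataclasses import dataclass

# Hypothetical names throughout: an illustration of the proposal,
# not PROV-DM or the ASN as specified.


@dataclass(frozen=True)
class Declaration:
    kind: str  # e.g. "Entity", "Agent", "Plan", "Account"
    id: str


@dataclass(frozen=True)
class Attributes:
    id: str
    pairs: tuple  # tuple of (name, value) pairs


def desugar(kind, id, attrs=None):
    """Expand the sugared form kind(id, [attrs]) into primitive expressions."""
    primitives = [Declaration(kind, id)]
    if attrs:
        # The attribute list becomes a separate Attributes expression
        # that refers to the same identifier.
        primitives.append(Attributes(id, tuple(attrs)))
    return primitives


expanded = desugar("Entity", "e1", [("attr1", "val1"), ("attr2", "val2")])
```

Under this assumption the underlying model would only ever see Declaration and Attributes; the combined form would exist purely at the notation level.
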
> ...
> I also felt that the handling of Activity start and end was not consistent:
> according to the text, the times given correspond to Events. So why not have
> them *be* Events - that would mean we have a total of 6 event types rather than
> just 4, but the description of the "Lamport clock" timelines could be focused on
> the description of Event alone.
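
One possible reading of this suggestion, sketched under assumptions: an Activity refers to optional start and end Events rather than carrying bare times. The class layout, the helper activity_with_times and the "-start"/"-end" identifier scheme are all hypothetical and are not drawn from PROV-DM.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch: Activity start and end modelled as Events.


@dataclass(frozen=True)
class Event:
    id: str
    time: Optional[str] = None  # optional, mirroring Event(id, time?)


@dataclass(frozen=True)
class Activity:
    id: str
    start: Optional[Event] = None  # optional, mirroring start?
    end: Optional[Event] = None    # optional, mirroring end?


def activity_with_times(id, start_time=None, end_time=None):
    """Build an Activity whose start and end are Events, not bare times."""
    # The "-start" / "-end" identifiers are an assumption made for this example.
    start = Event(f"{id}-start", start_time) if start_time is not None else None
    end = Event(f"{id}-end", end_time) if end_time is not None else None
    return Activity(id, start, end)


a = activity_with_times("a1", start_time="2012-01-30T10:00:00Z")
```

With this shape, any ordering rules would need to be stated only for Event, which is the simplification the comment above is after.
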
> ...
> I think all of this could be done with minimal change to the underlying
> semantics, and that coupled with a significant round of sub-editing and
> reorganization of some of the text could lead to a document that is much easier
> to follow.
> #g
> --

Received on Monday, 30 January 2012 11:29:19 UTC