ISSUE 229 (was: Reviewing PROV-DM) from Graham Klyne on 2012-01-30 (public-prov-wg@w3.org from January 2012)

From: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
Date: Mon, 30 Jan 2012 11:20:12 +0000
To: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
CC: W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <4F267CEC.7090308@zoo.ox.ac.uk>

I drafted this, then put the text into the tracker, not meaning to send this 
message. Please reply to the thread mentioning ISSUE 229:
http://www.w3.org/2011/prov/track/issues/229

#g
--

On 30/01/2012 10:44, Graham Klyne wrote:
> I've started another full review of PROV-DM, so far having done up to about
> section 5.4.
>
> While the content is making much more sense than it did last time I reviewed it,
> I am finding some of the text to be repetitive, confusing and in some cases
> strangely phrased. I think a main goal of this document needs to be to offer an
> approachable description of the underlying data model and ASN notation that can
> be used by developers and information designers. I think the document could
> benefit from a serious round of sub-editing (without intending to change the
> substantive content).
>
> I also think that a refactoring of the DM concepts (without fundamentally
> changing the underlying intended semantics) could help to eliminate a lot of
> repetitive text. These comments relate to the recent "domain of discourse" vote,
> but I'm coming at this from a more holistic perspective.
>
> It seems to me that the domain of discourse contains the following concepts:
> Entity
> Activity
> Agent
> Event
> Plan
> Account
> in that these are the various things about which the provenance language aims to
> make assertions, and that all of these could be considered types of Entity (with
> the possible exception of Event). I think we've already established that most if
> not all of these are kinds of entity.
>
> If the descriptions were refactored around such a structure, I believe much of
> the repetitive description of attributes could be focused in one place. I would
> be inclined to separate attributes from the other type declarations, so we'd end
> up with primitive ASN expressions like these:
>
> Entity(id)
> Activity(id, start?, end?)
> Agent(id)
> Plan(id)
> Event(id, time?)
> Account(id)
> Attributes(id, [attr1=val1, attr2=val2, ...])
>
> Where the Attributes expression could be applied to any of the preceding
> concepts, and the description of attributes would consequently be needed only
> once. The main objection I see to this is that it would mean that, say, the ASN
> expression:
>
> Entity(id, [attr1=val1, attr2=val2, ...])
>
> would be replaced by two expressions:
>
> Entity(id)
> Attributes(id, [attr1=val1, attr2=val2, ...])
>
> I would counter this by having the ASN (but not the underlying model) allow the
> first form as a syntactic sugar for the second.
>
> ...
>
> I also felt that the handling of Activity start and end was not consistent:
> according to the text, the times given correspond to Events. So why not have
> them *be* Events - that would mean we have a total of 6 event types rather than
> just 4, but the description of the "Lamport clock" timelines could be focused on
> the description of Event alone.
>
> ...
>
> I think all of this could be done with minimal change to the underlying
> semantics, and that coupled with a significant round of sub-editing and
> reorganization of some of the text could lead to a document that is much easier
> to follow.
>
> #g
> --
>
>
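As an illustration of the syntactic-sugar idea in the quoted message, here is a
minimal Python sketch of expanding the combined form into the two primitive
expressions. The function name desugar and the string representation are
assumptions made for illustration; they are not part of PROV-DM or the ASN. The
sketch handles only the id-plus-attributes form and ignores the optional start,
end and time arguments.

```python
from typing import Dict, List, Optional

# Concept names taken from the list in the quoted message.
KINDS = {"Entity", "Activity", "Agent", "Plan", "Event", "Account"}


def desugar(kind: str, ident: str,
            attrs: Optional[Dict[str, str]] = None) -> List[str]:
    """Expand the sugared form kind(id, [attrs]) into primitive expressions.

    Hypothetical helper: returns the declaration kind(id), followed by a
    separate Attributes(id, [...]) expression when attributes are given.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown concept: {kind}")
    exprs = [f"{kind}({ident})"]
    if attrs:
        pairs = ", ".join(f"{name}={value}" for name, value in attrs.items())
        exprs.append(f"Attributes({ident}, [{pairs}])")
    return exprs


if __name__ == "__main__":
    for line in desugar("Entity", "e1", {"attr1": "val1", "attr2": "val2"}):
        print(line)
```

Because the attribute handling lives in one place, the same expansion applies
unchanged to Agent, Plan, Account and the other concepts, which is the point of
describing attributes only once.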

Received on Monday, 30 January 2012 11:29:19 UTC