Re: Proposal on PROV-DM reorganization

From: Paolo Missier <Paolo.Missier@ncl.ac.uk>
Date: Tue, 22 May 2012 06:34:55 -0700
Message-ID: <4FBB95FF.6020102@ncl.ac.uk>
To: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
CC: Luc Moreau <l.moreau@ecs.soton.ac.uk>, W3C provenance WG <public-prov-wg@w3.org>
Graham, Luc

comments inline

On 5/22/12 3:09 AM, Graham Klyne wrote:
> On 22/05/2012 09:33, Luc Moreau wrote:
>> Hi Graham,
>> Quick feedback
>> On 22/05/2012 07:16, Graham Klyne wrote:
>>> Luc,
>>> I just took a quick look. I think this is a useful improvement for someone
>>> approaching provenance. I won't repeat here all the comments I made previously
>>> since you indicate this is a work-in-progress.
>>> My main comment on your structure is that I think derivation in section 2
>>> should be "up there" with entity and activity - I would probably aim to use
>>> this section to introduce the notion of a provenance trace. Even if derivation
>>> is treated separately in section 5, for the introduction I think it's part of
>>> the entity-activity pattern. (This comment is based on an understanding that
>>> derivation is an entity-entity relation that indicates there is a chain of
>>> used/generated property pairs between the entities. But this isn't stated
>>> explicitly - am I misunderstanding something here?)
>> I think the key point about the "incremental pattern" of Derivation is that the
>> used/generated properties may not have been asserted. Also, it is not a
>> sufficient condition for derivation, but a necessary one. So, that's why for me,
>> derivation is not in the same grouping.
> I accept that they may not be asserted, but I assumed that conceptually at least
> the use/generation steps would exist in a derivation.
I am happy to leave Derivation where it is. If you move it up, it does suggest that usage/generation must be involved in
derivation. I suggest adding a forward link to sec. 5.3 for clarity.
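
A minimal sketch of the point discussed above, in Python: a derivation may be asserted without the corresponding used/generated pairs, and a used/generated chain on its own does not assert a derivation. The record layout, the entity and activity names, and both helper functions are hypothetical illustrations, not PROV-DM or PROV-N syntax.

```python
from collections import deque

# Hypothetical, simplified assertions; the names are illustrative only.
used = {("compile", "source.c")}                           # (activity, entity)
generated = {("binary", "compile"), ("log", "compile")}    # (entity, activity)
derived = {("binary", "source.c"), ("report", "binary")}   # (derived entity, source entity)


def has_usage_generation_chain(target, source, used, generated):
    """Return True if asserted used/generated pairs link target back to source."""
    frontier = deque([target])
    seen = {target}
    while frontier:
        entity = frontier.popleft()
        # Activities asserted to have generated this entity.
        activities = {a for (e, a) in generated if e == entity}
        # Entities that those activities are asserted to have used.
        for (a, e) in used:
            if a in activities:
                if e == source:
                    return True
                if e not in seen:
                    seen.add(e)
                    frontier.append(e)
    return False


def unsupported_derivations(derived, used, generated):
    """List asserted derivations that have no asserted used/generated chain."""
    return sorted(
        pair for pair in derived
        if not has_usage_generation_chain(pair[0], pair[1], used, generated)
    )
```

In this sketch the derivation of "report" from "binary" is asserted with no used/generated pairs behind it, while "log" has a chain back to "source.c" without any derivation being asserted between them.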
> I'm not understanding the second part of what you say here (which suggests that
> this might be too subtle to distinguish in this overview).
>>> I can't see any purpose served by table 2.
>> It's the only place in the introduction where we map concept names to properties
>> names. Otherwise, how can someone reading the definition of 'Derivation' know
>> for sure it's realized by property wasDerivedFrom?
> Isn't that covered in section 5?  (haven't checked).  My point is that at the
> place where it occurs, it's adding clutter to the document when what we really
> should be aiming for is to instill, as clearly and succinctly as possible, a
> basic working model for provenance in the reader's mind.
The duality between concepts and their corresponding UML constructs is going to generate confusion unless it is explained.
I think this distinction should be made in a separate sub-section before the current 2.1, rather than after fig. 1. Table 2 should
then be moved to that section.
>>> In table 3, I'd suggest dropping the tick-columns for core/extended structures
>>> - I think the section cross-references are sufficient (though I note that some
>>> link to the wrong place - but I assume that's because this is WIP). I'd also
>>> suggest including forward links to the corresponding sub-sections in section 5.
>> I dropped the extended column, since all components had it. I also fixed the
>> links and added forward links to section 5.
That works better, but I thought the interpretation of that column was "all components, including some of the core ones, also have an
extended representation". It may be worth stating so explicitly.
>>> I think your section 2 can be made into a compact and easily-assimilated
>>> overview of core provenance structure. Looking at this, I think the
>>> light-touch treatment here of the extension structures is also useful (which
>>> is back-tracking slightly on one of my earlier comments). If we go ahead with
>>> this broad structure, I'll come back later and make more detailed editorial
>>> suggestions as seems appropriate.
>> Suggestions are welcome on section 2.2. There is a tension between light
>> touch (and still useful) and too much detail. While
>> I want to remain light touch, I am not convinced it is all entirely useful or
>> understandable.
> Ack.
>>> I haven't yet looked in detail at the subsequent sections. My main structural
>>> criteria for these would be that specific entries are easily located when the
>>> document is used for reference purposes, and the document structure seems to
>>> provide that.
>>> With reference to your comments re. section 3 - I would be inclined to move it
>>> into the introduction section, but also to trim the explanation and rely more
>>> on the referenced prov-n document. A brief description of the purpose of
>>> PROV-N, a link to the specification and maybe the examples should be enough, I
>>> think.
>> I had some push back against moving this into section 1, since this document is not about
>> serialization.
> That's why I would trim it back.  The introduction covers topics such as
> document conventions, and it seems to me a reference to PROV-N is part of that.
>    As it's an external reference, a *small* amount of explanation might be
> appropriate.
The reason for having the prov-n intro there is that it comes immediately before the example where the notation is used. It would
be awkward to find this section any earlier in the document. I think it is in the right place.

More comments:

- Figs. 2 and 3 are broken.

- The use of the new box in the UML diagrams is non-standard; in fact, I don't think it has a valid interpretation. If you decide to
group classes, you want to use Packages (I no longer have a way to fix this myself). A class can belong to multiple packages (please
correct me if I remember wrongly), but you cannot have a package that "straddles" a class.


> #g
> --
>>> I (still) think the position of the example (section 4) between the overview
>>> (section 2) and the more detailed descriptions (section 5) breaks the flow of
>>> the reference material. I think this is less of a problem than it was, as the
>>> first-time developer can switch from "sequential reading mode" to "reference
>>> mode"
>> OK. I am toying with the idea of moving the second subexample (section 4.2) to
>> after the reference section, and making it use some of the extended constructs.
>> Luc
>>> #g
>>> --
>>> On 21/05/2012 22:32, Luc Moreau wrote:
>>>> Hi Graham,
>>>> I have been experimenting with section 2, and an early preview
>>>> is visible at
>>>> https://dvcs.w3.org/hg/prov/raw-file/tip/model/working-copy/wd6-prov-dm-with-core.html
>>>> Some responses to your comments.
>>>> On 21/05/12 12:15, Graham Klyne wrote:
>>>>> Hi Paul,
>>>>> Re: http://www.w3.org/2011/prov/wiki/ProvDM_ConsensusProposal
>>>>> I think this proposal is an improvement, though it goes less far than I
>>>>> personally would choose. I would still prefer a stand-alone document covering
>>>>> the core patterns, but there is apparently no appetite for that within the
>>>>> working group so I shall not push that point.
>>>>> Beyond that, here are some specific suggestions relating to your proposal:
>>>>> 1. I'd prefer to see core patterns as a separate top level section rather than
>>>>> a sub-section of the overview. I feel that would help to convey its role as a
>>>>> self-contained set of related ideas around which the other structures and
>>>>> terms can be used as needed.
>>>> I now have three subsections in section 2, respectively related to core,
>>>> extended, and components.
>>>> I feel they fit well in an overview section. Moving one or all of them to the
>>>> toplevel would lead to a proliferation
>>>> of toplevel sections, which I am not keen on.
>>>>> 2. I'd like the diagram to be at the *start* of the core patterns, not at the
>>>>> end. I believe it can provide a mental framework for a reader to relate the
>>>>> concepts as they are described in the ensuing sections. I'd also suggest the
>>>>> diagram (per current DM) be revised to be visually styled more like the one in
>>>>> the PROV-O document. (I'll help with that if asked.)
>>>> Yes, it's done.
>>>> The diagram was updated, using another tool.
>>>> Now, one can possibly improve on the diagrams, but we do not want to introduce
>>>> an ad-hoc graphical notation. We use UML for all our class diagrams.
>>>>> 3. I would not separate Entities/Activities and Derivation into separate
>>>>> sub-sections. When we talk about using provenance in applications, I note that
>>>>> we most commonly talk about a "provenance trace" - and it is the
>>>>> interconnection of entities, activities, generation and usage that gives us
>>>>> derivation, which in my perception is a central element of a provenance trace.
>>>>> Thus, I would suggest presenting these concepts together, then introducing
>>>>> agents and associated inter-relationships in a separate sub-section. I think
>>>>> this is what Tim suggested in the last teleconference.
>>>> The reason for keeping this subsection is that I want to parallel the component
>>>> structure.
>>>> If people are happy with moving component 3 before component 2 (talk about
>>>> derivations before agents),
>>>> I am happy to do so. However, I received some push back.
>>>>> 4. I'm not sure that "advanced" is the best term for features that are not
>>>>> part of the core pattern. I can live with it, but I'll also try and come up
>>>>> with some alternatives.
>>>> Now using extended.
>>>>> 5. I'm all for looking to improve modularity of the design, as you also
>>>>> mention in your proposal.
>>>> It's an important aspect of the DM and therefore has been given an overview
>>>> section in 2.3
>>>>> 6. I'm not sure that it really adds any value to mark core patterns throughout
>>>>> the document as you suggest. Once a reader has internalized the core patterns,
>>>>> I think they're pretty obvious when they occur.
>>>> The only mark up occurs in tables 3/4, section 5. I am not proposing to do it
>>>> anywhere else.
>>>> Cheers,
>>>> Luc
>>>>> #g
>>>>> --
>>>>> On 20/05/2012 11:01, Paul Groth wrote:
>>>>>> Hi All,
>>>>>> During last week's telcon [1] the chairs were tasked to come up with a
>>>>>> proposal that tried to reflect consensus on reorganization of the data
>>>>>> model. This would take into account both Graham's proposal [2] as well
>>>>>> as the WG discussion and prior agreements.
>>>>>> We've come up with the following proposal:
>>>>>> http://www.w3.org/2011/prov/wiki/ProvDM_ConsensusProposal
>>>>>> We hope this reflects a consensus with the working group and something
>>>>>> we could proceed on. Is this a good foundation to proceed?
>>>>>> Thanks
>>>>>> Paul
>>>>>> [1] http://www.w3.org/2011/prov/meeting/2012-05-17
>>>>>> [2] http://www.w3.org/2011/prov/wiki/ProvDM_Proposal_for_restructuring

-----------  ~oo~  --------------
Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org
School of Computing Science, Newcastle University,  UK
Received on Tuesday, 22 May 2012 13:35:51 GMT
