Re: Comments to the working draft 4 of DM

From: Paul Groth <p.t.groth@vu.nl>
Date: Tue, 21 Feb 2012 21:00:25 +0100
Message-ID: <C5DBF9DD-31BB-4709-9F21-B8098FD1F9D3@vu.nl>
CC: Provenance Working Group WG <public-prov-wg@w3.org>
To: Jun Zhao <jun.zhao@zoo.ox.ac.uk>
Hi Jun

Thanks for the clarifications. I think they're good comments and should be addressed. They seem primarily editorial, so I'll let the editors take them.

Luc, Paolo? :-)

Paul

On Feb 21, 2012, at 20:39, Jun Zhao <jun.zhao@zoo.ox.ac.uk> wrote:

> Hi Paul,
> 
> Sorry for the delay in my reply...
> 
> On 19/02/2012 19:56, Paul Groth wrote:
>> Hi Jun,
>> 
>> I'll let the editors respond in more detail. Thanks for the review!
>> 
>> ==Goal==
>> I believe the goal of the first document (PROV-DM Part 1) is to present
>> the terms of the data model in natural language. It is a conceptual
>> model. At least this is what I think :-) Maybe it should be said more
>> explicitly...
> 
> Ditto.
>> 
>> ==Scruffy & Proper==
>> In terms of "proper" and "scruffy" provenance here's what I believe we
>> meant by these terms at F2F2. We identified two use cases:
>> 
>> 1. The ability to use the PROV vocabulary to make provenance statements
>> about existing things on the Web. Think for example adding simple
>> provenance metadata (i.e. authorship) in a web page.
>> 2. The ability to exchange PROV information between provenance systems
>> where a static or fixed view of data is key. This is common in current
>> provenance tracking systems. Think exchanging information between
>> version control systems or two scientific workflow systems.
>> 
>> Number 1 is the "scruffy" use-case: we don't want people to have to care
>> about fixing the state of things, whereas Number 2 is the "proper"
>> use-case where being able to refer to a specific partial state is
>> important. So scruffy and proper aren't about minimal and non-minimal.
>> It's about what sort of semantics a user wants to support.
> 
> Ok, that's totally different from what I have in mind. I'll write 
> another email after thinking through whether/how we should get this 
> into the doc.
> 
>> 
>> ==Lightweight??==
>> I'm curious as to what you consider lightweight? Currently, we have 3
>> "core" classes and edges between those. I guess the Figure in Section
>> 2.5 seems fairly lightweight to me... I wonder what you think?
> 
> Yes, we have three classes and their edges in the overview section, 
> but many more in the core section. Again, what do we mean by core? 
> Section 2 is much more lightweight than sect 4, which is good. The name 
> of core is confusing, at least for me, who doesn't have all the context 
> to interpret its actual meaning.
> 
> The figure in sect 2.5 is very lightweight, but it doesn't correspond to 
> the content in section 2. In that figure, there are edges like used, 
> wasGeneratedBy, and wasAssociatedWith. Are they meant to match Use, 
> Generation, Association? This is not a precise matching. We don't have 
> an agreed definition of what we mean by data model. We are leaving 
> readers to interpret the figure and the content.
> 
> And what about Plan, which is mentioned in the content but not in the 
> figure?
> 
> And what about wasStartedBy, wasEndedBy, addedOnBehalfOf, and 
> wasDerivedFrom? They are not discussed in the content.
> 
> And there are many other inconsistencies of this kind. Am I reading a different 
> version of draft from you and Luc?
> 
>> 
>> Just a note on the goal of the prov-dm document. It is to be accessible
>> but it's not the entry point for the set of specifications. At the F2F2,
>> it was agreed that the entry point would be the Primer and then the
>> Ontology (or other serialization) and then one could drill down to the
>> data model and finally to the semantics document. So this document may
>> have more than one would want in a brief introduction.
> 
> Is this also clear in the document?
>> 
>> ==Definition Repetition==
>> Section 4 repeats many definitions, actually at my request, so that for
>> each term we have its definition. It acts as a glossary of terms.
> 
> I think section 4 is the right place for definitions, because that's 
> what that section is for. But section 2 is meant to give overviews, 
> right? Do we need that sort of formality in section 2? It just looks 
> complex and verbose, purely from a presentation point of view.
> 
> HTH,
> 
> -- Jun
>> 
>> cheers
>> Paul
>> 
>> 
>> Jun Zhao wrote:
>>> These comments are with respect to the DM working draft 4,
>>> http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html.
>>> accessed on February 17, 2011.
>>> 
>>> First of all, as a first-time reader of the DM working draft, with a
>>> very fresh pair of eyes, I would like to say well done to the group.
>>> There are a lot of very interesting ideas in the model document, clearly
>>> reflecting a lot of deep thinking about the problem domain. And I like
>>> very much the positioning of the DM as an interchange language. So well
>>> done, guys!
>>> 
>>> However, if the main goal of this new version of the working draft is to
>>> simplify what we had, particularly to enable "an upgrade path, from
>>> 'scruffy provenance' (term TBD), to 'precise provenance' (term TBD)", I
>>> am not sure this goal was achieved!
>>> 
>>> Here is what I think and why:
>>> 
>>> 1. In the introduction section, there is no introduction of
>>> 'scruffy provenance' (term TBD) or 'precise provenance' (term TBD). I
>>> think this is a key point that should be brought to the front, and which
>>> should be used to structure the rest of the document. And this is not
>>> the case atm, IMO.
>>> 
>>> 2. The Overview section: I am not sure I see much difference between
>>> this section and the section giving definitions to the 'core'. I would
>>> rather expect to see an overview of the model, for example, for the
>>> scruffy and precise level, what terms and properties we have at each
>>> level etc. I am sure Luc knows that the overview diagram needs updating,
>>> and I couldn't read the figure properly even after printing the doc with a
>>> high-resolution laser printer :)
>>> 
>>> 3. I used the terminology of "terms" and "properties", but actually I
>>> don't know what this data model is. What do we mean by "data model"? Is it a
>>> conceptual model, logical model, entity relationship model, or something
>>> else? It's not clearly stated and I am confused about what terminology I
>>> should use when referring to the model :(
>>> 
>>> 4. The Example section: Would it be a good idea to define an example up
>>> front and use it throughout the whole document? I don't find a
>>> description about an example in this section and I found it hard to
>>> follow the 'examples' given in Section 3. And in the rest of the
>>> document, examples from many different scenarios are used. I wonder
>>> whether that prevents us from simplifying the reading of the spec.
>>> 
>>> 5. Section 4, the PROV-DM Core: There is a lot of repetition with the
>>> overview section. And I wonder what we mean by "core". The core almost
>>> includes "all" the DM terms (apart from the few in section 5). My
>>> understanding of "core" would be really the essential set of DM terms
>>> that are must-haves to express the minimal provenance. IMO, the current
>>> "core" is rather inclusive, and provides constructs that can be used to
>>> support some rather complex provenance expressions.
>>> 
>>> If we can agree on the notion of "scruffy" (minimal??) and "precise"
>>> (extended??), maybe the core part can be used to correspond to the
>>> "scruffy" part, and make it lighter, more succinct, and easier and
>>> quicker to grasp and follow?
>>> 
>>> 6. There are many cross-references that don't quite work in the current
>>> working draft, like saying some terms are mentioned in the previous or
>>> another section. I didn't include these problems here because I think
>>> these were caused by the re-structuring. I could list them out once the
>>> structure gets more stable.
>>> 
>>> 7. There are also some technical points that I marked down in the
>>> review, which I didn't raise here either, because I am 'new' to the
>>> group and I don't want to re-open closed issues. What's the stage of the
>>> technical part of DM? Are there still open technical discussions?
>>> 
>>> 
>>> In my opinion, the document still needs more work on its
>>> structure and organization to make it simpler.
>>> 
>>> I think we should make better use of the notion of "scruffy"
>>> (minimal??) and "precise" (extended??), and use this to guide the
>>> restructuring of the document.
>>> 
>>> Thoughts?
>>> 
>>> HTH,
>>> 
>>> -- Jun
>>> 
>>> 
>> 
> 
Received on Tuesday, 21 February 2012 20:00:58 GMT
