Re: Comments to the working draft 4 of DM

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Tue, 21 Feb 2012 20:46:27 +0000
Message-ID: <EMEW3|9e2af48547efc93d899192b2c04ace22o1KKkr08L.Moreau|ecs.soton.ac.uk|4F4402A3.9020604@ecs.soton.ac.uk>
To: Jun Zhao <jun.zhao@zoo.ox.ac.uk>
CC: Paul Groth <p.t.groth@vu.nl>, Provenance Working Group WG <public-prov-wg@w3.org>
Hi Jun

Questions below:

On 21/02/12 19:39, Jun Zhao wrote:
> Hi Paul,
> Sorry for the delay in my reply...
> On 19/02/2012 19:56, Paul Groth wrote:
>> Hi Jun,
>> I let the editors respond in more detail. Thanks for the review!
>> ==Goal==
>> I believe the goal of the first document (PROV-DM Part 1) is to present
>> the terms of the data model in natural language. It is a conceptual
>> model. At least this is what I think :-) Maybe it should be said more
>> explicitly...
> Ditto.
>> ==Scruffy & Proper==
>> In terms of "proper" and "scruffy" provenance here's what I believe we
>> meant by these terms at F2F2. We identified two use cases:
>> 1. The ability to use the PROV vocabulary to make provenance statements
>> about existing things on the Web. Think for example adding simple
>> provenance metadata (i.e. authorship) in a web page.
>> 2. The ability to exchange PROV information between provenance systems
>> where a static or fixed view of data is key. This is common in current
>> provenance tracking systems. Think exchanging information between
>> version control systems or two scientific workflow systems.
>> Number 1 is the "scruffy" use-case: we don't want people to have to care
>> about fixing the state of things, whereas Number 2 is the "proper"
>> use-case where being able to refer to a specific partial state is
>> important. So scruffy and proper aren't about minimal and non-minimal.
>> It's about what sort of semantics a user wants to support.
> Ok, that's totally different from what I had in mind. I'll write
> another email after thinking through whether/how we should get this
> into the doc.
>> ==Lightweight??==
>> I'm curious as to what you consider lightweight? Currently, we have 3
>> "core" classes and edges between those. I guess the Figure in Section
>> 2.5 seems fairly lightweight to me... I wonder what you think?
> Yes, we have three classes and their edges in the overview section,
> but many more in the core section. Again, what do we mean by core? 
> Section 2 is much more lightweight than sect 4, which is good. The 
> name of core is confusing, at least for me, who doesn't have all the 
> context to interpret its actual meaning.
> The figure in sect 2.5 is very lightweight, but it doesn't correspond 
> to the content in section 2. In that figure, there are edges like
> used, wasGeneratedBy, wasAssociatedWith. Are they meant to match
> Use, Generation, Association? This is not a precise match. We don't
> have an agreed definition of what we mean by data model. We are 
> leaving readers to interpret the figure and the content.

There was supposed to be an introduction to figure 2.5, and it fell 
between the cracks. It has to be written.

> And what about Plan, which is mentioned in the content but not in the
> figure?

The introduction would have said that figure 2.5 is just giving the key 
elements and relations.
> And what about wasStartedBy, wasEndedBy, addedOnBehalfOf, and
> wasDerivedFrom? They are not discussed in the content.

wasStartedBy/wasEndedBy are at risk, hence not discussed.

Derivation/Responsibility correspond to wasDerivedFrom/addedOnBehalfOf

The introduction to the figure should have explained it.
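
A rough sketch of that correspondence, written as a small Python lookup table. The pairings for used, wasGeneratedBy and wasAssociatedWith are only the matching Jun guesses at above, the names EDGE_TO_CONCEPT, AT_RISK_EDGES, concept_for and undiscussed are made up for illustration, and none of this is taken from the draft itself:

```python
from typing import Iterable, List, Optional

# Edge label in the Section 2.5 figure -> concept name used in the text.
# Assumption: pairings are read off this thread, not off the PROV-DM draft.
EDGE_TO_CONCEPT = {
    "used": "Use",
    "wasGeneratedBy": "Generation",
    "wasAssociatedWith": "Association",
    "wasDerivedFrom": "Derivation",
    "addedOnBehalfOf": "Responsibility",
}

# Edges drawn in the figure but at risk, hence not discussed in the text.
AT_RISK_EDGES = {"wasStartedBy", "wasEndedBy"}


def concept_for(edge: str) -> Optional[str]:
    """Return the concept name for a figure edge, or None if there is none."""
    return EDGE_TO_CONCEPT.get(edge)


def undiscussed(edges: Iterable[str]) -> List[str]:
    """Return the figure edges that have no matching concept in the text."""
    return [edge for edge in edges if edge not in EDGE_TO_CONCEPT]
```

With such a table, listing the figure's edges and calling undiscussed() on them would surface exactly the labels that the missing introduction needs to account for.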

> And there are many other inconsistencies of this kind. Am I reading a
> different version of the draft from you and Luc?
>> Just a note on the goal of the prov-dm document. It is to be accessible
>> but it's not the entry point for the set of specifications. At the F2F2,
>> it was agreed that the entry point would be the Primer and then the
>> Ontology (or other serialization) and then one could drill down to the
>> data model and finally to the semantics document. So this document may
>> have more than one would want in a brief introduction.
> Is this also clear in the document?

Jun, just a reminder, we are asking whether this document can become an 
editor's draft!
So, it is currently a draft of a draft and in no way finished!
>> ==Definition Repetition==
>> Section 4 repeats many definitions, actually by my request, so that for
>> each term we have its definition. It acts as a glossary of terms.
> I think section 4 is the right place for definitions, because that's 
> what that section is for. But section 2 is meant to give overviews, 
> right? Do we need that sort of formality in section 2? It just looks 
> complex and verbose, purely from a presentation point of view.

What do you mean by formality? What is complex? Verbose? Each "so-called
definition" is about 2 sentences long.

> HTH,
> -- Jun
>> cheers
>> Paul
>> Jun Zhao wrote:
>>> These comments are with respect to the DM working draft 4,
>>> http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html,
>>> accessed on February 17, 2011.
>>> First of all, as this is my first time reading the DM working draft,
>>> with my very fresh pair of eyes, I would like to say well done to the
>>> group. There are a lot of very interesting ideas in the model document,
>>> clearly reflecting a lot of deep thinking about the problem domain. And
>>> I like very much the position of the DM as an interchange language. So
>>> well done, guys!
>>> However, if the main goal of this new version of the working draft is
>>> to simplify what we had, particularly to enable "an upgrade path, from
>>> 'scruffy provenance' (term TBD), to 'precise provenance' (term TBD)", I
>>> am not sure this goal was achieved!
>>> Here is what I think and why:
>>> 1. In the introduction section, there is no introduction of
>>> 'scruffy provenance' (term TBD), or 'precise provenance' (term TBD). I
>>> think this is a key point that should be brought to the front, and which
>>> should be used to structure the rest of the document. And this is not
>>> the case atm, IMO.
>>> 2. The Overview section: I am not sure I see much difference between
>>> this section and the section giving definitions to the 'core'. I would
>>> rather expect to see an overview of the model, for example, for the
>>> scruffy and precise level, what terms and properties we have at each
>>> level etc. I am sure Luc knows that the overview diagram needs an update,
>>> and I couldn't read the figure properly even after printing the doc on a
>>> high-resolution laser printer :)
>>> 3. I used the terminology of "terms" and "properties", but actually I
>>> don't know what this data model is. What do we mean by "data model"? Is
>>> it a conceptual model, logical model, entity relationship model, or
>>> something else? It's not clearly stated and I am confused about what
>>> terminology I should use when referring to the model :(
>>> 4. The Example section: Would it be a good idea to define an example up
>>> front and use it throughout the whole document? I don't find a
>>> description about an example in this section and I found it hard to
>>> follow the 'examples' given in Section 3. And in the rest of the
>>> document, examples from many different scenarios are used. I wonder
>>> whether that prevents us from simplifying the reading of the spec.
>>> 5. Section 4, the PROV-DM Core: There is a lot of repetition with the
>>> overview section. And I wonder what we mean by "core". The core almost
>>> includes "all" the DM terms (apart from the few in section 5). My
>>> understanding of "core" would be really the essential set of DM terms
>>> that are must-haves to express minimal provenance. IMO, the current
>>> "core" is rather inclusive, and provides constructs that can be used to
>>> support some rather complex provenance expressions.
>>> If we can agree on the notion of "scruffy" (minimal??) and "precise"
>>> (extended??), maybe the core part can be used to correspond to the
>>> "scruffy" part, and make it lighter, more succinct, and easier and
>>> quicker to grasp and follow?
>>> 6. There are many cross-references that don't quite work in the current
>>> working draft, like saying some terms are mentioned in the previous or
>>> another section. I didn't include these problems here because I think
>>> these were caused by the re-structuring. I could list them out once the
>>> structure gets more stable.
>>> 7. There are also some technical points that I marked down in the
>>> review, which I didn't raise here either, because I am 'new' to the
>>> group and I don't want to re-open closed issues. What's the stage of
>>> the technical part of DM? Are there still open technical discussions?
>>> In my opinion, the document still needs some more work on the
>>> structuring and organization front to simplify it.
>>> I think we should make better use of the notion of "scruffy"
>>> (minimal??) and "precise" (extended??), and use this to guide the
>>> restructuring of the document.
>>> Thoughts?
>>> HTH,
>>> -- Jun
Received on Tuesday, 21 February 2012 20:47:28 UTC