- From: Paul Groth <p.t.groth@vu.nl>
- Date: Tue, 21 Feb 2012 21:00:25 +0100
- To: Jun Zhao <jun.zhao@zoo.ox.ac.uk>
- CC: Provenance Working Group WG <public-prov-wg@w3.org>
Hi Jun

Thanks for the clarifications. I think they're good comments and should be addressed. They seem primarily editorial so I'll let the editors take them

Luc, Paolo? :-)

Paul

On Feb 21, 2012, at 20:39, Jun Zhao <jun.zhao@zoo.ox.ac.uk> wrote:

> Hi Paul,
>
> Sorry for my delay of my reply...
>
> On 19/02/2012 19:56, Paul Groth wrote:
>> Hi Jun,
>>
>> I let the editors respond in more detail. Thanks for the review!
>>
>> ==Goal==
>> I believe, The goal of the first document (PROV-DM Part 1) is to present
>> the terms of the data model in natural language. It is a conceptual
>> model. At least this is what I think :-) Maybe it should be said more
>> explicitly...
>
> Ditto.
>>
>> ==Scruffy& Proper==
>> In terms of "proper" and "scruffy" provenance here's what I believe we
>> meant by these terms at F2F2. We identified two use cases:
>>
>> 1. The ability to use the PROV vocabulary to make provenance statements
>> about existing things on the Web. Think for example adding simple
>> provenance metadata (i.e. authorship) in a web page.
>> 2. The ability to exchange PROV information between provenance systems
>> where a static or fixed view of data is key. This is common in current
>> provenance tracking systems. Think exchanging information between
>> version control systems or two scientific workflow systems.
>>
>> Number 1 is the "scruffy" use-case, we don't want people to have care
>> about fixing the state of things whereas Number 2 is the "proper"
>> use-case where being able to refer to a specific partial state is
>> important. So scruffy and proper aren't about minimal and non-minimal.
>> It's about what sort of semantics a user wants to support.
>
> Ok, that's totally different from what I have in mind. I'll write
> another email after I thinking through whether/how we should get this
> into the doc.
>
>>
>> ==Lightweight??==
>> I'm curious as to what you consider lightweight? Currently, we have 3
>> "core" classes and edges between those. I guess the Figure in Section
>> 2.5 seems fairly lightweight to me... I wonder what you think?
>
> Yes, we have three 3 classes and their edges in the overview section,
> but many more in the core section. Again, what do we mean by core?
> Section 2 is much more lightweight than sect 4, which is good. The name
> of core is confusing, at least for me, who doesn't have all the context
> to interpret its actual meaning.
>
> The figure in sect 2.5 is very lightweight, but it doesn't correspond to
> the content in section 2. In that figure, there are edges like used,
> wasGeneratedBy, wasAssociatedWith, are they meant to match to Use,
> Generation, Association? This is not a precise matching. We don't have
> an agreed definition of what we mean by data model. We are leaving
> readers to interpret the figure and the content.
>
> And what about Plan, which is mentioned in the content but not in the
> figure.
>
> And what about wasStartedBy, wasEndedBy, addedOnBehalfOf, and
> wasDerivedFrom, they are not discussed in the content.
>
> And many other such of kind of inconsistency. Am I reading a different
> version of draft from you and Luc?
>
>>
>> Just a note on the goal of the prov-dm document. It is to be accessible
>> but it's not the entry point for the set of specifications. At the F2F2,
>> it was agreed that the entry point would be the Primer and then the
>> Ontology (or other serialization) and then one could drill down to the
>> data model and finally to the semantics document. So this document may
>> have more than one would want in a brief introduction.
>
> Is this also clear in the document?
>>
>> ==Definition Repetition==
>> Section 4 repeats many definition, actually by my request, so that for
>> each term we have its definition. It acts as a glossary of terms.
>
> I think section 4 is the right place for definitions, because that's
> what that section is for. But section 2 is meant to give overviews,
> right? Do we need that sort of formality in section 2? It just looks
> complex and verbose, purely from a presentation point of view.
>
> HTH,
>
> -- Jun
>>
>> cheers
>> Paul
>>
>>
>> Jun Zhao wrote:
>>> These comments are respect to the DM working draft 4,
>>> http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/towards-wd4.html.
>>> accessed on February 17, 2011.
>>>
>>> First of all, as my first time of reading the DM working draft, with my
>>> very fresh pair of eyes, I would like to say well done to the group.
>>> There are a lot of very interesting ideas in the model document, clearly
>>> reflecting a lot of deep thinking about the problem domain. And I like
>>> very much the position of the DM as for an interchange language. So well
>>> done, guys!
>>>
>>> However, if the main goal of this new version of the working draft is to
>>> simplify what we had, particularly to enable "an upgrade path, from
>>> 'scruffy provenance' (term TBD), to 'precise provenance' (term TBD)", I
>>> am not sure this goal was achieved!
>>>
>>> Here are what I think and why:
>>>
>>> 1. In the introduction section, there is no such introduction about
>>> 'scruffy provenance' (term TBD), or 'precise provenance' (term TBD). I
>>> think this is a key that should be brought in the front, and which
>>> should be used to structure the rest of the document. And this is not
>>> the case atm, IMO.
>>>
>>> 2. The Overview section: I am not sure I see much difference between
>>> this section and the section giving definitions to the 'core'. I would
>>> rather expect to see an overview of the model, for example, for the
>>> scruffy and precise level, what terms and properties we have at each
>>> level etc. I am sure Luc knows that the overview diagram needs update
>>> and I couldn't read the figure properly even printed the doc with
>>> high-resolution laser printer:)
>>>
>>> 3. I used the terminology of "terms" and "properties", but actually I
>>> don't what this data model is. What do we mean by "data model"? Is it a
>>> conceptual model, logical model, entity relationship model, or something
>>> else? It's not clearly stated and I am confused what terminologies I
>>> should used when referring to the model:(
>>>
>>> 4. The Example section: Would it be a good idea to define an example up
>>> in the front and use it throughout the whole document? I don't find a
>>> description about an example in this section and I found it hard to
>>> follow the 'examples' given in Section 3. And in the rest of the
>>> document, examples from many different scenarios are used. I wonder
>>> whether that prevents us from simplifying the reading of the spec.
>>>
>>> 5. Section 4, the PROM-DM Core: There are a lot of repetition with the
>>> overview section. And I wonder what we mean by "core". The core almost
>>> includes "all" the DM terms (apart from the few in section 5). My
>>> understanding of "core" would be really the essential set of DM terms
>>> that are must-haves to express the minimal provenance. IMO, the current
>>> "core" is rather inclusive, and provides constructs that can be used to
>>> support some rather complex provenance expressions.
>>>
>>> If we can agree on the notion of "scruffy" (minimal??) and "precise"
>>> (extended??), maybe the core part can be used to correspond to the
>>> "scruffy" part, and make it lighter, more succinct, and easier and
>>> quicker to grasp and follow?
>>>
>>> 6. There are many cross-references that don't quite work in the current
>>> working draft, like saying some terms are mentioned in the previous or
>>> another section. I didn't include these problems here because I think
>>> these were caused by the re-structuring. I could list them out once the
>>> structure gets more stable.
>>>
>>> 7. There are also some technical points that I marked down in the
>>> review, which I didn't raise here either, because I am 'new' to the
>>> group and I don't want to re-open closed issues. What's the stage of the
>>> technical part of DM? Are there still open technical discussions?
>>>
>>>
>>> In my opinion I think the document still needs some more work on the
>>> structuring and organization front to make it simplified.
>>>
>>> I think we should make a better use of the notion of "scruffy"
>>> (minimal??) and "precise" (extended??), and use this to guide the
>>> restructuring of the document.
>>>
>>> Thoughts?
>>>
>>> HTH,
>>>
>>> -- Jun
>>>
>>>
>>
>
Received on Tuesday, 21 February 2012 20:00:58 UTC