RE: W3C Provenance Working Group Charter - another alternate version for discussion

From: Myers, Jim <MYERSJ4@rpi.edu>
Date: Thu, 18 Nov 2010 10:55:36 -0500
Message-ID: <B7376F3FB29F7E42A510EB5026D99EF2040DCE76@troy-be-ex2.win.rpi.edu>
To: "Luc Moreau" <l.moreau@ecs.soton.ac.uk>, "Paulo Pinheiro da Silva" <paulo@utep.edu>
Cc: <public-xg-prov@w3.org>
Apologies for being silent this week - hard to get coherent time here,
so some random thoughts. My take on the technical issues being raised in
the edits is that:

The basic core that was addressed by OPM is not controversial but naming
of concepts could be improved (the text changes are more focused on
making it clearer that OPM didn't invent these concepts - its value is
really as evidence that this is roughly the right scope to address, since
OPM was the set that we could get agreement on).

I do see a few places where people are suggesting stretching that scope
a bit:

Sources - the idea of an agent or mutable resource from which a resource
of interest (the thing we're documenting the provenance of) comes.
Nominally this could be dealt with by recording an agent controlling a
publication process to produce the resource and I think the question to
resolve is whether a special construct would be useful. I think the PML
folks would argue that it is since an agent-process-resource relation is
too generic to signal that being a source is special (i.e. an article
derived from the NYTimes differs in importance from the same article
being handed to you by Joe the newspaper seller, yet both are just
agent-process-resource constructs). With others in the XG group having
special constructs for publication/retrieval from a service, it seems
like consensus might be possible on this and I think having discussion
of this be part of the working group scope would be useful.
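
To make the point about genericity concrete, here is a minimal sketch in
Python. The class and field names (Provenance, agent, process, resource,
source) are hypothetical illustrations, not taken from OPM or PML; the
optional source field stands in for whatever special construct a working
group might agree on.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Provenance:
    """Generic agent-process-resource record (all field names hypothetical)."""
    agent: str
    process: str
    resource: str
    source: Optional[str] = None  # hypothetical dedicated 'source' marker


def shape(record: Provenance) -> Tuple[str, ...]:
    """Return the names of the populated fields, ignoring their values."""
    return tuple(name for name, value in vars(record).items() if value is not None)


# Without a special construct, both cases have exactly the same shape.
from_publisher = Provenance("NYTimes", "publication", "article-1")
from_seller = Provenance("Joe the newspaper seller", "hand-over", "article-1")

# With one, the publisher case can be told apart structurally.
marked = Provenance("NYTimes", "publication", "article-1", source="NYTimes")
```

Whether such a marker belongs in the core model or in a profile is exactly
the question left open above.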

Another construct that looks useful is some link between provenance and
the plan/recipe that was being followed. What that recipe is seems to
differ - a workflow template, logical rules, mathematical function,
scientific experiment protocol, a business contract, etc. - but the
basic capability to make a link between a process and the recipe again
seems like a useful and relatively non-controversial extension that a
working group could address.

A third area where it may make sense to do something would be to make a
connection to mutable resources. I think this is a hard problem in the
general case but some extension to standardize how one might link
resources to a mutable thing as versions might be something that could
be agreed to. Along the lines of the paper I sent in to IPAW this year,
I think this is an area where a working group could really get stuck,
but it's also one where many groups have some capability and we've seen
it arise in many use cases, so some capability here might broaden the
usability. I tend to think of this as a profile that connects provenance
with an existing versioning model rather than something new developed as
part of a language.

Beyond this, I think we enter the area of research/domain extensions
that showed up in the charter in the 'however the languages also have
lots of differences...' part. (Other than wordsmithing - to try to make
it clearer that these differences are not a problem for reaching a
standard but are instead a good way to delineate the scope of provenance
that seems to have settled down and be done in common ways versus the
set of advanced features where researchers are still experimenting,
trying to discover what aspects of provenance provide the most value - I
don't think I've seen other concrete technical suggestions for more

The last thing I see is continuing wordsmithing to make it clear that
OPM is not the only (or first) provenance language while also
acknowledging that the XG group found it useful as evidence for what
aspects of provenance were ready for standardization. I suspect that we
could continue to edit this aspect forever (if Yolanda let us) - it will
be important that we all let go of the text when we can live with it
versus when we are really happy with it. I've started and stopped editing a
couple of times this week to try and come up with text that would move
this aspect of things forward, but have not succeeded. 


From: public-xg-prov-request@w3.org
[mailto:public-xg-prov-request@w3.org] On Behalf Of Luc Moreau
Sent: Wednesday, November 17, 2010 5:27 PM
To: Paulo Pinheiro da Silva
Cc: public-xg-prov@w3.org
Subject: Re: W3C Provenance Working Group Charter - another alternate
version for discussion


Thanks for editing the draft charter and sending it to the group.

Discussions with Satya have indicated that the *Name of the Provenance
Language* will be controversial. I suggest we don't focus on this issue,
and we acknowledge the XG will identify its name. I agree with your
proposal of naming it XG, or FOO, NPL or something neutral.

However, all the feedback I have heard from people involved in
standardization activities is that we have to have a clear scope. By
indicating OPM, we meant not just a name, but a precise list of
provenance concepts.
To avoid any ambiguity, I attach this list of terms. I will argue that
each term in this list has a fairly precise meaning. I also acknowledge
that we can revisit the terminology, if appropriate.

Your proposal is however vague about its starting point. A quick grep
over pml-p indicates:

grep 'owl:Class ' pml-provenance.owl  | wc

      32      64    1466

grep 'Property ' pml-provenance.owl  | grep -v onProperty | wc

      52     104    3018


Are you telling us the starting point is 80+ concepts?
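
For readers without the file to hand: the first column of each wc output
is a line count, so the figure above is 32 + 52 = 84 matching lines, not
necessarily 84 distinct concepts. A minimal Python sketch of the same two
filters follows; the function name and the sample fragment are made up
for illustration and are not taken from pml-provenance.owl.

```python
from typing import Tuple


def count_declarations(owl_text: str) -> Tuple[int, int]:
    """Mirror the two grep pipelines: count matching lines, not concepts."""
    lines = owl_text.splitlines()
    # grep 'owl:Class '  (note the trailing space in the pattern)
    classes = sum(1 for line in lines if "owl:Class " in line)
    # grep 'Property ' | grep -v onProperty
    properties = sum(
        1 for line in lines if "Property " in line and "onProperty" not in line
    )
    return classes, properties


# Hypothetical fragment, for illustration only.
SAMPLE = """\
<owl:Class rdf:about="#Source">
<owl:Class rdf:about="#Agent">
<owl:ObjectProperty rdf:about="#hasSource">
<owl:onProperty rdf:resource="#hasSource"/>
<owl:DatatypeProperty rdf:about="#hasName">
</owl:Class>
"""

class_lines, property_lines = count_declarations(SAMPLE)
```

Like the grep version, this is a rough text match: a closing tag such as
`</owl:Class>` is skipped only because the pattern requires a trailing
space.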

Your document also indicates "The Working Group has an aggressive
timetable based on the premise that it builds on existing work once we
have a clear understanding of the boundaries of the new model." So,
you are explicitly leaving the scoping activity to the XG. I feel this
is not the right approach. It is up to us to scope this model, in the
charter definition.  TBL's suggestion was to list the terms to take into

A few further points.
a. While I am in favour of a graphical notation to illustrate provenance
concepts, I think it is dangerous to 
promise a full graphical language. Experience in OPM is that beyond
nodes and edges, the rest is very textual,
and overall is not very visual beyond toy examples.  So, by all means,
graphical illustration, but not a full
graphical language.

b. I am strongly in favour of a definition of a language in plain
English, independently of any representation language.
It's part of the "accessibility agenda". We should be able to describe
the provenance language without referring to an OWL ontology.

c.  I am keen to reach out to the non semantic web community. What about


PS I can't believe SC has connectivity problems ;-)

On 17/11/2010 21:43, Paulo Pinheiro da Silva wrote: 

Dear All, 

Deborah and I had a discussion on Monday.  This discussion was in follow
up to the meeting that Jim, Deborah, and I had at RPI two weeks ago and
that was reported by Jim through an email to the group. I did an editing
pass in the original draft of the charter on Monday and Deborah took an
edit pass on top of that late Monday. The updated version of the draft
attached here is in review mode so that you can see the rationale behind
our changes (and hopefully comment on them further). 

We were hoping that Jim would be able to do an edit pass but he has
been very busy at Supercomputing 2010 and probably with challenging
connectivity. This means that the comments in this updated draft may not
necessarily reflect Jim's opinions. 

We understand that the document is going to spur some discussion but we
would like to highlight some of the principles used during our
conversation and that Deborah and I considered in our comments: 

We understand the following: 
1)    The provenance community needs to make progress soon if the
community wants the outcomes of the proposed working group to have
2)    Provenance has many dimensions, and the group has a good
understanding of some dimensions while our collective understanding of
other dimensions is still very superficial - thus the working group will
need to focus its efforts on the well-known parts of provenance - the
so-called core concepts of provenance; 
3)    No single provenance language can claim to have representation
mechanisms for all already-identified core provenance concepts and just
core provenance concepts (i.e., no language is a minimal representation
of core provenance concepts). However, we also understand that the
provenance languages discussed in the Provenance Incubator Group have
ways of representing most of these core concepts and that the proposed
working group needs to leverage all such languages in order to make
progress fast. 

Many thanks, 
Paulo (Deborah and Jim)
Received on Thursday, 18 November 2010 17:15:50 UTC