
RE: Provenance specs: have we lost sight of the goal?

From: Freimuth, Robert, Ph.D. <Freimuth.Robert@mayo.edu>
Date: Wed, 10 Oct 2012 20:03:54 +0000
To: Provenance Working Group <public-prov-wg@w3.org>
Message-ID: <76A706C559A90249BA321EE35470B85701FA5E@MSGPEXCEI12A.mfad.mfroot.org>

Hi all,

I'm somewhat new to this group, but as a newbie I can relate to the concerns raised in Graham's email.  I've been defining and working with data standards for quite a while, and I have first- and second-hand experience with working groups in both W3C and HL7.  I know how much effort it takes to create standards like PROV and I appreciate all of the time, patience, and perseverance that the members of this group have put into it.

Commenting on several aspects of the thread:

[Graham] > A common theme that has emerged is that the provenance specs are over-complicated, and that as a result many people (being non-provenance specialists) just will not use it.  I've suggested to these people that they submit last-call comments to the working group, but the general response has been along the lines of "Why should I bother?  It doesn't matter to me, I won't use it".

I struggled through the specs.  I still haven't read them all.  I did, however, take the time to record and submit my comments in the hopes that they would provide a helpful "outsider's view" of the specs.  I know it is hard to get feedback, especially from people that might not have a great interest in the topic.  I'm not sure there is much you can do about that.

The concepts behind provenance are actually relatively simple.  (The definition of those concepts and the relationships between them are another story, as my feedback suggests.)  In my opinion, PROV should include a conceptual model that defines all of the entities and relationships, without getting bogged down in formal notation or syntax.  This could provide a very concise, implementation-independent overview of the model.  The conceptual model could then be implemented using any number of technologies, each of which could have a separate (platform-specific) spec.  This approach is based on MDA.

[Paul] > One suggestion I would make is to put the information on "where to start" on our wiki page so that people know what they should read.

This would be very helpful.  It should also indicate what part(s) of the docs are normative and which are implementation-specific.  A concise platform-independent model (see above) might provide a good "where to start".

[Tom] > I believe our main goal is facilitating the interchange of provenance, not making sure every non-expert can assert his own after 10 minutes of reading.

I agree with this.  Complexity isn't a bad thing, especially when it is necessary for interoperability.  A complex spec does, however, require extra effort to make sure that it is approachable from a variety of viewpoints.

[Graham] > "It doesn't matter to me, I won't use it"

To be honest, I'm not sure if I'll use PROV.  I will try to adopt parts of it, but I doubt I'll do anything with RDF because it isn't compatible with my use case.

I work in a health care environment and I am trying to integrate provenance into a system that tracks guidelines for clinical practice and decision support.  Most of the software that would create, manage, and consume provenance information is enterprise-grade and object-oriented.  In addition, the clinical IT teams that develop and maintain the software do not have expertise in semantic web technology.  It simply would not be practical for me to design a solution that uses RDF.  This is why I created my own OO view of PROV when I reviewed the spec (see http://lists.w3.org/Archives/Public/public-prov-comments/2012Jul/0010.html).

This is, of course, also the reason why so many of the comments that I submitted about PROV concern the data model and why resolutions that use RDF/SW approaches don't necessarily work for my case (see http://www.w3.org/2011/prov/track/issues/520 for an example).  Some of the freedoms and strengths that semantic web approaches provide present problems in other technologies.
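
For readers unfamiliar with what an object-oriented view of PROV might look like, here is a minimal sketch in Python. It is not the model from the linked comments; the class names, field names, the `lineage` helper, and the guideline identifiers are all assumptions chosen for illustration. It covers only a few core PROV-DM concepts (Entity, Activity, Agent) and relations (used, wasGeneratedBy, wasDerivedFrom, wasAttributedTo, wasAssociatedWith).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    """Something that bears responsibility for an activity or entity."""
    id: str


@dataclass
class Entity:
    """A thing whose provenance is being described."""
    id: str
    attributed_to: List[Agent] = field(default_factory=list)    # wasAttributedTo
    derived_from: List["Entity"] = field(default_factory=list)  # wasDerivedFrom
    generated_by: Optional["Activity"] = None                   # wasGeneratedBy


@dataclass
class Activity:
    """Something that happens over time and acts on entities."""
    id: str
    used: List[Entity] = field(default_factory=list)            # used
    associated_with: List[Agent] = field(default_factory=list)  # wasAssociatedWith


def lineage(entity: Entity) -> List[str]:
    """Return the ids of all entities this one was transitively derived from."""
    seen: List[str] = []
    stack = list(entity.derived_from)
    while stack:
        current = stack.pop()
        if current.id not in seen:
            seen.append(current.id)
            stack.extend(current.derived_from)
    return seen


# Hypothetical example: a clinician revises a clinical guideline.
author = Agent("clinician:1")
v1 = Entity("guideline:v1", attributed_to=[author])
revision = Activity("revise:1", used=[v1], associated_with=[author])
v2 = Entity("guideline:v2", attributed_to=[author],
            derived_from=[v1], generated_by=revision)
```

In this sketch the relations are plain object references rather than triples, which is the kind of shape an enterprise OO codebase would typically expect. A real design would also have to decide how to handle identity, optional attributes, and the qualified forms of each relation.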

I recognize that my use case may not be representative, nor may it be a primary target for PROV.  Given the statements in the root of this thread, however, I thought I would throw my 2 cents in.

Thanks,
Bob



________________________________
From: public-prov-wg-request@listhub.w3.org [mailto:public-prov-wg-request@listhub.w3.org] On Behalf Of Tom De Nies
Sent: Wednesday, October 10, 2012 3:52 AM
To: public-prov-wg@w3.org
Subject: Re: Provenance specs: have we lost sight of the goal?

Hi Graham,

I think your statement of "Why should I bother?  It doesn't matter to me, I won't use it" applies to any standard out there. I'm sure that for a non-linked-data-expert the RDF specifications appear daunting, but that doesn't mean they're not widely used.
I believe our main goal is facilitating the interchange of provenance, not making sure every non-expert can assert his own after 10 minutes of reading. You don't expect to learn all of CSS in one day either. Maybe you'd be able to position some divs and change some colors after some practice, much as you'd be able to make some basic PROV assertions after reading the PROV-DM core or the primer. Also, much like there are tools to manage a page's layout for people who don't want to learn CSS, you can expect tools to manage PROV information associated with a document. These tools are created by experts, who know the spec inside and out, but the /use/ of the standard is widespread, even among non-experts.

Just my thoughts on the matter...
Enjoy the rest of your holiday!

Best regards,
Tom

2012/10/10 Paolo Missier <Paolo.Missier@ncl.ac.uk>
Hi Paul, Graham

Regarding the pathway to adoption and simplicity, I think the primer does a good job of showing how to do "simple things simply". That should help reduce adoption anxiety.
Next, I think a best practices doc with a collection of provenance patterns and possibly case studies will be key to adoption.

-Paolo




On 10/10/2012 07:27, Paul Groth wrote:
Hi Graham,

First, enjoy your holiday.

I think you make two slightly different points that I'd like to respond to,
around adoption and simplicity.

1) Adoption
I'm actually enthusiastic about adoption. We already have two software
implementations coming from outside the WG [1], [2], as well as
positive reports of usage by WG members. Once we get to
recommendations, prov will be included as part of the rdfa initial
context [3], and I already know of several other users and extensions
of prov.

Obviously, we need to do more to encourage and increase adoption. I'm
open for suggestions here. Personally, I would like to do more blog
posts showing the ease with which you can use prov, for example, in
RDFa.

2) Simplicity
As you know, the group has done a lot of work on making things simpler
and easier to access. prov-primer and prov-o are both simple. I think
prov-xml will also be easy to understand. PROV-DM is long but has a
clear organization and is not meant as the entry point for the specs.
Clearly, prov-constraints is not simple but is not aimed at the target
audience of non-provenance specialists.

So my question is: is there any way that we can get concrete criticisms
so that we can address these concerns?

One suggestion I would make is to put the information on "where to
start" on our wiki page so that people know what they should read.


I think the whole group wants to make prov a success and I think it
will be. The demand for provenance interchange is there and we have a
solid solution. Now we need to complete the specs and also make sure
that they are properly communicated.

regards,
Paul


[1] OpenRDF Auditing Repository
http://www.openrdf.org/doc/alibaba/2.0-rc5/alibaba-repository-auditing/index.html
[2] Callimachus
http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0001.html
[3] http://www.w3.org/2011/rdfa-context/rdfa-1.1

On Wed, Oct 10, 2012 at 12:04 AM, Graham Klyne <GK@ninebynine.org> wrote:
(Now that I'm on holiday, away from the day-to-day pressures of getting stuff
done, I find a little time to put down some nagging doubts I've been having
about how our work is going...)

Over the past few weeks, I have had informal discussions with a small number of
people about the provenance specifications.  A common theme that has emerged is
that the provenance specs are over-complicated, and that as a result many people
(being non-provenance specialists) just will not use it.  I've suggested to
these people that they submit last-call comments to the working group, but the
general response has been along the lines of "Why should I bother?  It doesn't
matter to me, I won't use it".

This raises for me the possibility that we are working in an "echo chamber",
hearing only the views of people who have a particular and deep interest in
provenance, but not hearing the views of a wider audience who we hope will
include and consume limited amounts of provenance information in their applications.

Maybe it's only me, and the rest of you aren't hearing this kind of comment.
But if you are I think that, as we go through the last call process, it is
appropriate to reflect and consider if what we are producing is really relevant
to the wider community we aim to serve.  Have we become too bound up with fine
distinctions that don't matter, or don't apply in the same way, to the majority
of potential provenance-generating and provenance-using applications?   Have we
sacrificed approachability and simplicity that encourages widespread take-up on
the altar of premature optimization to support particular usage scenarios?

While I think these are relevant questions, I'm not sure if and what we might do
about them.  But I also fear that what we produce may turn out to be irrelevant
in the long run.

#g
--





--
-----------  ~oo~  --------------
Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org
School of Computing Science, Newcastle University,  UK
http://www.cs.ncl.ac.uk/people/Paolo.Missier
Received on Wednesday, 10 October 2012 20:04:25 UTC
