Re: A proposed provenance wg draft charter

From: Paul Groth <pgroth@gmail.com>
Date: Sun, 24 Oct 2010 15:50:28 +0200
Message-ID: <4CC439A4.90606@gmail.com>
To: Olaf Hartig <hartig@informatik.hu-berlin.de>
CC: public-xg-prov@w3.org

Hi Olaf,

Thanks for the comments, they are really good. Some replies in-line.

Olaf Hartig wrote:
> Hello,
>
> I would support a W3C provenance WG. Thanks Paul and Luc for putting together
> the draft charter. However, I have some comments and questions regarding the
> draft:
>
> 1.) Regarding Sec.2, third bullet point "Specify how to embed provenance in
> document with RDFa ..." and regarding point (1) in Deliverable D4:
>   * Why is this only about embedding provenance in HTML documents? Provenance
> of data retrieved from the Web (e.g. from a Linked Data URI look-up interface,
> or from a SPARQL endpoint) is equally important I would say.

It shouldn't be just about RDFa. The draft is probably not clear enough
on this. We want to be able to retrieve the provenance of any web
resource through the mechanisms you mention, such as a URI look-up
interface or a SPARQL endpoint.


>
> 2.) Regarding Deliverable D4: What does "(3) how to query provenance through a
> SPARQL endpoint" mean? What do you have in mind here?
>
This would specify how to retrieve the provenance of a resource using
SPARQL. That is, given a resource, how would you write a SPARQL query
to retrieve that resource's provenance?
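
As a rough illustration of the kind of query meant here, the sketch
below builds a SPARQL query string for one resource. It is written in
Python only for convenience. The `ex:` namespace and the two property
names are made up for the example; the actual vocabulary would be
whatever the working group defines.

```python
def build_provenance_query(resource_uri: str) -> str:
    """Return a SPARQL query asking for provenance statements about one resource.

    The ex: namespace and both property names are hypothetical placeholders.
    """
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in resource_uri for ch in '<>" {}|\\^`'):
        raise ValueError("not a usable IRI: " + repr(resource_uri))
    return (
        "PREFIX ex: <http://example.org/prov#>\n"
        "SELECT ?relation ?source WHERE {\n"
        f"  <{resource_uri}> ?relation ?source .\n"
        "  FILTER(?relation = ex:wasGeneratedBy || ?relation = ex:wasDerivedFrom)\n"
        "}"
    )


if __name__ == "__main__":
    print(build_provenance_query("http://example.org/doc1"))
```

The resulting string could be sent to any SPARQL endpoint that holds
provenance statements about the resource.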


> 3.) Regarding Sec. 2.2 Out of Scope - Why is database provenance out of scope?
> This charter focuses on provenance of things on the Web as far as I understand.
> I think the W3C's understanding of (the future of) the Web is a Web of
> documents _and_ data. This includes SPARQL query services (aka endpoints) as
> an important way to provide access to data on the Web. Hence, the provenance
> of SPARQL result sets retrieved from the Web is one of the things that should
> be addressed by a W3C provenance WG; this provenance includes provenance of
> each single result in the result set. The work by Irini and colleagues is an
> important first step here. The question of SPARQL result set provenance becomes
> even more important when we consider the integration of a simple query
> federation approach (i.e. the new SERVICE clause) in the upcoming new version
> of the query language or when we think about alternative query execution
> approaches that answer queries over data from multiple sources (e.g. my link
> traversal based query execution approach).

I think what we were trying to do here is emphasize that the working
group won't specify the *mechanism* by which you go about tracking
provenance in a triple store. It's about exposing an interoperability
model. So you should be able to describe a SPARQL result set's
provenance using the model, but we won't tell you how to track it. Also,
there's a question about whether the model is the most efficient way to
represent provenance at the triple level.
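
To make the split between describing and tracking concrete, here is a
minimal sketch in Python. It assumes some store has already recorded,
by whatever internal means, which source documents each result row came
from, and it only shows the description step. The `ResultRow` type, the
blank-node naming, and the `ex:wasDerivedFrom` property are all
hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ResultRow:
    bindings: Dict[str, str]  # variable name -> bound value
    sources: List[str]        # source URIs, tracked however the store likes


def describe_result_set(rows: List[ResultRow]) -> List[Tuple[str, str, str]]:
    """Emit one provenance statement per (row, source) pair.

    Only the description is produced here; how `sources` was collected
    is deliberately left to the store.
    """
    statements = []
    for index, row in enumerate(rows):
        row_id = f"_:row{index}"  # hypothetical identifier for the row
        for source in row.sources:
            statements.append((row_id, "ex:wasDerivedFrom", source))
    return statements
```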



> 4.) Regarding Sec.2 "The Working group will keep this two-pronged approach for
> the mapping to RDF: a simple vocabulary allowing provenance to be asserted
> easily, and an ontology that extends the vocabulary with permitted inference."
> - Why? I'm not familiar with the OPM ontology and what it provides in addition
> to OPMV, but why shouldn't it be possible to satisfy both requirements (ease
> of asserting provenance and permitting inferences) with a single vocabulary?
> I would say that it requires at least some investigation whether an easy to
> use vocabulary can or can not provide for all kinds of inferencing possible
> with OWL. For instance, our Provenance Vocabulary provides support for
> inferring additional statements using some of the constructs available in
> OWL2.
>

So the OPM ontology (OPMO) supports inferences about provenance using
OWL. But some things were easier if you took some "features" out of the
simpler vocabulary. Jun can explain this better, but one example was
being able to express edges as simple RDF edges instead of using
reification. Another example is inferring account membership.

Obviously, you can put these in one file. But the distinction seemed
like a nice way to give people an easy introduction.
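
The difference between a simple RDF edge and a reified one can be
sketched with plain triples. The Python below is only an illustration:
the `http://example.org/` terms are made up and are not the actual OPMV
or OPMO terms, and the reified form is modelled loosely as a node that
stands for the edge.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for this sketch


def direct_edge(effect: str, cause: str):
    """Simple form: the edge is a single triple, easy to assert by hand."""
    return [(effect, EX + "wasDerivedFrom", cause)]


def reified_edge(edge_id: str, effect: str, cause: str, account: str):
    """Reified form: the edge becomes a node, so more can be said about it,
    for example which account it belongs to."""
    return [
        (edge_id, RDF_TYPE, EX + "WasDerivedFrom"),
        (edge_id, EX + "effect", effect),
        (edge_id, EX + "cause", cause),
        (edge_id, EX + "account", account),
    ]
```

The simple form costs one triple per edge, while the reified form costs
several but leaves room for statements about the edge itself.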


> Greetings,
> Olaf
>
>
> On Friday 15 October 2010 13:58:35 Paul Groth wrote:
>> Hi All,
>>
>> Today on the call we are scheduled to talk about preparations for the
>> final report. Luc and I feel that to write a compelling final report we
>> should be clear about exactly what the report should recommend. There
>> has been some consensus that a working group should be formed around the
>> recommendations extracted from the scenarios (
>> http://www.w3.org/2005/Incubator/prov/wiki/Recommendations_for_scenarios).
>>
>> To that end, we have prepared a draft working group charter (
>> http://users.ecs.soton.ac.uk/lavm/draft-charter.html ). We note this is
>> only *our own* proposal and we see this as a starting point for
>> discussion within the group.
>>
>> We look forward to any comments, questions, thoughts about this
>> proposal. We hope this helps the group to continue to coalesce around a
>> way forward.
>>
>> Thanks,
>> Paul and Luc
>
Received on Sunday, 24 October 2010 13:51:05 GMT