
Re: A proposed provenance wg draft charter

From: Olaf Hartig <hartig@informatik.hu-berlin.de>
Date: Mon, 25 Oct 2010 08:41:16 +0200
To: public-xg-prov@w3.org
Message-Id: <201010250841.18578.hartig@informatik.hu-berlin.de>
Hey,

On Sunday 24 October 2010 15:50:28 Paul Groth wrote:
> Hi Olaf,
> 
> Thanks for the comments. Really good. Some replies in-line
> [...]
> > 1.) Regarding Sec.2, third bullet point "Specify how to embed provenance
> > in document with RDFa ..." and regarding point (1) in Deliverable D4:
> >   * Why is this only about embedding provenance in HTML documents?
> >   Provenance of data retrieved from the Web (e.g. from a Linked Data URI
> >   look-up interface, or from a SPARQL endpoint) is equally important I
> >   would say.
> 
> It shouldn't be just about RDFa. This is probably not clear enough. We
> want to be able to retrieve provenance of any web-resource, through the
> mechanisms you mention URI look-up interface or a sparql endpoint.
 
Some comments regarding the last sentence:

* Here you speak about "retrieve provenance", but the part of the draft charter 
I referred to with my question is about embedding (at least, that's what I 
thought it was about).

* (related to the previous point) It was not my intention to mention Linked 
Data URI look-up interface and SPARQL endpoints as mechanisms to retrieve 
provenance. Instead, I wanted to suggest the following: if the WG specifies how 
to embed provenance into HTML documents then it should also specify how to 
embed provenance into the RDF graphs retrieved from a Linked Data interface 
and into the result set retrieved from a SPARQL endpoint.

* You speak about "provenance of any web-resource". I still struggle to see 
how Web resources, in general, have provenance. To me provenance is associated 
primarily with specific representations of Web resources that we retrieve from 
the Web.

> > 2.) Regarding Deliverable D4: What does "(3) how to query provenance
> > through a SPARQL endpoint" mean? What do you have in mind here?
> 
> This would specify about retrieving provenance for a resource using
> sparql. So given a resource, how would you write a sparql query to
> retrieve that resource provenance.

Are we talking about a SPARQL endpoint that exposes a dataset which explicitly 
contains provenance information? In that case it shouldn't be too difficult 
to write such queries; you only have to know which provenance vocabulary is 
being used to represent provenance information in the dataset.
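
To illustrate why this is easy once the vocabulary is known, here is a minimal 
sketch that builds such a query for a given resource. Everything specific in it 
is an assumption made for the example: the helper name build_provenance_query 
is hypothetical, and the three Dublin Core terms merely stand in for whatever 
provenance vocabulary the dataset actually uses.

```python
# Assumed stand-in for "the provenance vocabulary used in the dataset".
# These are Dublin Core terms; a real dataset may use a different vocabulary.
ASSUMED_PROVENANCE_PROPERTIES = [
    "http://purl.org/dc/terms/creator",
    "http://purl.org/dc/terms/created",
    "http://purl.org/dc/terms/source",
]

# Characters that must not appear inside <...> in a SPARQL query.
_FORBIDDEN_IRI_CHARS = set('<>"{}|^`\\ ')


def _check_iri(iri):
    """Reject strings that cannot be written as <iri> in a query."""
    if not iri or any(ch in _FORBIDDEN_IRI_CHARS for ch in iri):
        raise ValueError("not usable as an IRI in a query: %r" % (iri,))
    return iri


def build_provenance_query(resource_uri, properties=ASSUMED_PROVENANCE_PROPERTIES):
    """Build a SPARQL SELECT query (hypothetical helper).

    The query asks for every statement about resource_uri whose property
    is one of the given provenance properties.
    """
    if not properties:
        raise ValueError("at least one provenance property is required")
    _check_iri(resource_uri)
    conditions = " || ".join(
        "?property = <%s>" % _check_iri(p) for p in properties
    )
    return (
        "SELECT ?property ?value WHERE {\n"
        "  <%s> ?property ?value .\n"
        "  FILTER (%s)\n"
        "}" % (resource_uri, conditions)
    )


if __name__ == "__main__":
    # http://example.org/data/item1 is a placeholder resource URI.
    print(build_provenance_query("http://example.org/data/item1"))
```

The resulting string can be sent to the endpoint with any SPARQL client. Only 
the list of properties changes from one provenance vocabulary to another, which 
is the point made above.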

However, another, related question: what do you understand as "a resource" 
here? If it is a Web resource again (i.e. something that can be requested 
directly using a URL), then what are examples of its provenance?
 
> [...]
> > 4.) Regarding Sec.2 "The Working group will keep this two-pronged
> > approach for the mapping to RDF: a simple vocabulary allowing provenance
> > to be asserted easily, and an ontology that extends the vocabulary with
> > permitted inference." - Why? I'm not familiar with the OPM ontology and
> > what it provides in addition to OPMV, but why shouldn't it be possible
> > to satisfy both requirements (ease of asserting provenance and
> > permitting inferences) with a single vocabulary? I would say that it
> > requires at least some investigation whether an easy to use vocabulary
> > can or can not provide for all kinds of inferencing possible with OWL.
> > For instance, our Provenance Vocabulary provides support for inferring
> > additional statements using some of the constructs available in OWL2.
> 
> So the OPM ontology (OPMO) supports inferences about provenance using
> OWL. But there were some things that were easier if you took some
> "features" out of the simpler vocabulary. Jun can explain this better.
> But one example was the notion of being able to express edges as simple
> rdf edges instead of using reification. Or for example, inferring
> account membership.

Ah, I see the difference. Thanks for the clarification.

Greetings,
Olaf
Received on Monday, 25 October 2010 06:41:54 GMT
