- From: Iker Huerga <ihuerga@linkatu.net>
- Date: Thu, 12 May 2011 10:10:21 +0200
- To: public-prov-wg@w3.org
Hello Olaf, All,
> Out of curiosity I tried to describe the processing steps of the example
> using the Provenance Vocabulary [1].
Great work.
> 1.) The example does not talk about specific points in time at which the
> different processing steps happened (Hence, I omitted corresponding
> statements in my description). Shouldn't the example be extended with
> this kind of information?
In my opinion, yes it should.
> 2.) Processing step 4 says: "analyst (alice) downloads a turtle
> serialization (lcp1) ..." While I was trying to describe that fact, it
> felt strange that Alice was the agent/actor that accessed the server.
> Hence, I would say that Alice cannot download lcp1 directly, she must use
> an HTTP client software for that. Same for Bob in processing step 8.
> Should we add that to the example?
I agree with Olaf. I think the object of the prv:performedBy
property should be an HTTP agent, for instance a SPARQL endpoint in a
query scenario.
> 3.) Processing step 7 says "government (gov) publishes an update (d2) of
> data (d1) as a new Web resource (r2)". That's inconsistent with processing
> steps 1 and 3 where gov publishes a Web resource r1 with RDF data f1
> generated from d1. Question: Was it the intention that gov now publishes
> d2 directly; wouldn't it be more consistent if gov were publishing RDF
> data f2 which was obtained from d2?
I think this could be achieved with the SPARQL CREATE and INSERT
operations (both included in SPARQL 1.1 Update), by creating a new graph
and then inserting the new triples. But for this example I would modify
processing step 7 as Olaf suggests.
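
To make the CREATE/INSERT idea concrete, here is a minimal sketch in
Python that only builds the text of such an update request. The graph
URI http://example.org/r2, the resource ex:f2, and the use of
dcterms:source are assumptions chosen for illustration; they are not
part of the example under discussion, and the helper name build_update
is hypothetical.

```python
def build_update(graph_uri, triples):
    """Build a SPARQL 1.1 Update request: create a graph, then insert triples into it."""
    # Each triple is a (subject, predicate, object) tuple of already-serialized terms.
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return (
        f"CREATE GRAPH <{graph_uri}> ;\n"
        f"INSERT DATA {{\n"
        f"  GRAPH <{graph_uri}> {{\n"
        f"{body}\n"
        f"  }}\n"
        f"}}"
    )


# Hypothetical data: f2 is assumed to be RDF derived from d2.
update = build_update(
    "http://example.org/r2",
    [(
        "<http://example.org/f2>",
        "<http://purl.org/dc/terms/source>",
        "<http://example.org/d2>",
    )],
)
print(update)
```

The resulting string would still have to be sent to an endpoint that
supports SPARQL 1.1 Update; that step is left out here.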
Regarding processing step 2, I think Olaf's suggestion of making
ex:prov a Named Graph containing the provenance information is the
best option. In my opinion (and I am not a provenance expert),
provenance information shouldn't be added to the HTTP payload, since
this could cause network overhead. In the Web scenario, some agents
will request provenance information and others will not.
If the approach, as I read in the "Guide to the Provenance Vocabulary",
is to extend tools so that they publish provenance information
automatically, I would recommend that these tools generate a separate
provenance graph for each prv:DataItem. Here is an example extending
processing step 2.
With the prefixes exf1 = http://example.org/f1/ and ex = http://example.org/ :
# I really do not know whether rdf:about can be used in this context
# or not. If not, sioc:about could be used instead.
exf1:prov rdf:type dcterms:ProvenanceStatement ;
    rdf:about ex:f1 .
ex:f1 rdf:type prv:DataItem ;
    prv:createdBy [ rdf:type prv:DataCreation ;
        prv:usedData ex:d1 ;
        prv:performedBy ex:gov ] .
Thus, an agent could automatically retrieve provenance information when
necessary simply by requesting the resource's URI plus "prov", for instance.
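
As a rough sketch of that convention, an agent could derive the
provenance URI from the resource URI as follows. The "prov" suffix
follows the exf1:prov example above; the function name provenance_uri
is hypothetical, and this only computes the URI without fetching
anything.

```python
from urllib.parse import urljoin


def provenance_uri(resource_uri, suffix="prov"):
    """Return the URI obtained by appending the suffix as a new path segment."""
    # Add a trailing slash so urljoin appends the suffix instead of
    # replacing the last path segment of the resource URI.
    base = resource_uri if resource_uri.endswith("/") else resource_uri + "/"
    return urljoin(base, suffix)


print(provenance_uri("http://example.org/f1"))  # http://example.org/f1/prov
```

An agent that does not need provenance would simply never request
that second URI.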
What do you think about this approach? Or have I misunderstood something?
Best Regards.
--
Iker Huerga Sánchez
Co-Founder, LINKatu
Polo de Innovación Garaia
Goiru, 1. Edificio A, 4º Piso
20500 Arrasate - Gipuzkoa
T+34 943 712 072 F+34 943 712 223
ihuerga@linkatu.net
http://www.linkatu.net
Received on Thursday, 12 May 2011 08:13:18 UTC