- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Fri, 21 Oct 2011 17:08:50 +0100
- To: Paul Groth <p.t.groth@vu.nl>
- Cc: "public-prov-wg@w3.org" <public-prov-wg@w3.org>
On Fri, Oct 21, 2011 at 15:41, Paul Groth <p.t.groth@vu.nl> wrote:

> I want to say that the post was derived from the video.
> Here's what I naturally wrote down:
> @prefix prov: <http://www.w3.org/ns/prov-o/>.
> <http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/>
> prov:wasDerivedFrom
> <http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html>.
> This implies that both the post and the youtube video are of type
> prov:Entity.
> But that seems wrong because they are not characterized things. They could
> change. Or is the url enough of a characterization?

If you think the resource behind the URIs might change (as most can), you should provide some attributes to help describe the entity.

I believe it COULD be valid for you to use the "real" URIs here, as your simple account does not cover the earlier or later versions of the two resources. You should however then include some attributes to help merge with other accounts which might have a different view, as a minimum a timestamp or a description of the content.

We don't really have a generic timestamp feature in PROV, but you can say when an entity was generated:

<http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html>
    prov:wasGeneratedAt [ time:inXSDDateTime "2011-10-17T18:25:00Z" ] .

<http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/>
    prov:wasGeneratedAt [ time:inXSDDateTime "2011-10-17T18:30:00Z" ] .

<http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/>
    prov:wasDerivedFrom <http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html> .

(I'm not too comfortable with this approach either - because the asserter is in a way claiming that the TED talk HTML was created at 18:25, which is probably not something you as the asserter know. By PROV-DM this should be kinda-OK: the asserter is merely identifying an entity, which describes a thing in the world - which in this case is a web page.)

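To make that discomfort concrete, here is a minimal sketch of what a second, purely hypothetical account might assert about the same URI. The 2011-07-27 timestamp is invented for illustration only; the prefixes follow the draft namespace used above.

```turtle
@prefix prov: <http://www.w3.org/ns/prov-o/> .
@prefix time: <http://www.w3.org/2006/time#> .

# Hypothetical second account: same URI, but a different view of when
# the entity was generated (this timestamp is made up for illustration).
<http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html>
    prov:wasGeneratedAt [ time:inXSDDateTime "2011-07-27T00:00:00Z" ] .
```

Merged naively with the statements above, the same identifier would carry two generation times, so a consumer would need the extra attributes to work out which characterization each account has in mind.
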
Different accounts don't need to agree on their entity descriptions or provenance assertions even if they are using the same identifiers (and somehow are talking about the same things).

Of course, as pointed out by Satya, "URIs have a global scope and are interpreted consistently regardless of context" - so I should not just make up a URI like <http://thinklinks.wordpress.com/stian-stole-your-namespace> and claim that this URI shows the location of my slippers - we should both interpret this as identifying the resource "stian-stole-your-namespace" on the HTTP server reachable by the DNS name thinklinks.wordpress.com.

Approaches like the PAV ontology (http://code.google.com/p/pav-ontology/) solve the timestamp issue with an intermediary:

:doc a pav:Sourcedocument ;
    pav:retrievedFrom <http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html> ;
    pav:sourceAccessedOn "2011-10-17T18:25:00Z" .

However, here we have introduced an intermediary :doc (similar to our prov:Entity) which you still need to mint a URI for.

A different account which includes several revisions of the resource, provided by the Wordpress database, for instance, would need to identify each of these using other identifiers, such as local IDs in the RDF document:

@prefix prov: <http://www.w3.org/ns/prov-o/> .
@prefix time: <http://www.w3.org/2006/time#> .

<http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/>
    prov:wasGeneratedAt :creationTime .

:creationTime a prov:Time ;
    time:inXSDDateTime "2011-10-15T15:00:00Z" .

:blog1 a prov:Entity ;
    prov:wasGeneratedAt :creationTime ;
    # i.e. generated at same time as:
    prov:wasComplementOf <http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/> .

:tedTalk a prov:Entity ;
    # So this is not the generation time of the talk HTML - but
    # the generation time of the overlapping entity description
    # (as the author saw it and embedded its video in :blog2)
    prov:wasGeneratedAt [ time:inXSDDateTime "2011-10-17T18:25:00Z" ] ;
    prov:wasComplementOf <http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html> .

:blog2 a prov:Entity ;
    prov:wasGeneratedAt [ time:inXSDDateTime "2011-10-17T18:30:00Z" ] ;
    prov:wasComplementOf <http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/> ;
    notYetInProv:wasRevisionOf :blog1 ;
    prov:wasDerivedFrom :blog1 ;
    # Embedded the video this time
    prov:wasDerivedFrom :tedTalk .

I much prefer this approach, but it does become more verbose. It still makes <http://www.ted.com/talks/paul_bloom_the_origins_of_pleasure.html> a prov:Entity - but we don't say anything more about it because we simply don't know its provenance.

(I still believe that we need something stronger than wasComplementOf above - we know for a fact that :blog2 is fully within the timespan of <http://thinklinks.wordpress.com/2011/07/31/why-provenance-is-fundamental-for-people/> but I can't see how to express this in PROV)

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Friday, 21 October 2011 16:09:47 UTC