Re: Use case for g-snaps from Sandro Hawke on 2011-10-03 (public-rdf-prov@w3.org from October 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 03 Oct 2011 11:28:37 -0400
To: Kai Eckert <kai@informatik.uni-mannheim.de>
Cc: public-rdf-prov@w3.org
Message-ID: <1317655717.9596.7.camel@waldron>

On Mon, 2011-10-03 at 13:30 +0200, Kai Eckert wrote:
> Hi all,
> 
> On 02.10.2011 14:29, Richard Cyganiak wrote:
> >> From a provenance viewpoint, we require the thing we talk about in
> >> the provenance to be identifiable.  With URI-less g-snaps, this is
> >> going to become more challenging.
> >
> > You can assign IRIs to anything in RDF, including any concept you can
> > possibly think of. You can assign IRIs to cars and people in RDF,
> > even though RDF doesn't know anything about cars and people. Same for
> > g-snaps.
> >
> > The question is whether the RDF data model should contain built-in
> > constructs for certain things. Experience shows that built-in support
> > for *some* of the g-* things would be a win. The argument is that
> > support for g-boxes is sufficient, because that allows handling of
> > g-snaps as immutable g-boxes.
> 
> Support (only) for g-boxes makes sense from a minimalistic viewpoint 
> that embeds RDF into the existing web infrastructure. The content behind 
> a URI can change and there is (so far) no standardized way to mark a URI 
> as stable regarding the content.
> 
> However, there are best practice examples, like Wikipedia or W3C TRs, 
> that create these stable URIs to preserve old versions of the content 
> and make them accessible. This is enough for the web, but not sufficient 
> for the linked data web. Users can make sense of the content, but
> machines need explicit semantics. If such semantics for g-snaps,
> including support for naming them, were built into RDF, it would
> tremendously help applications to make sense of the data and use
> additional provenance information.
> 
> If we do not have this support, we still can create these URIs according 
> to some best practice and help applications make sense of them by
> describing these URIs, e.g. with vocabulary provided by the Provenance 
> WG -- something like prov:isStableVersion.
> 
> The problem with best practices is that there can be many of them; and 
> in applications, every best practice needs a specific implementation. We 
> have use cases where we need these identifications of stable content and
> we want to describe the provenance of this content. The complexity and
> the problems remain the same; the only question is at which level we
> solve them. I see three possibilities:
> 
> 1. Within RDF, i.e. we somehow define URIs as identifiers of g-snaps.
> 
> 2. Within Provenance, i.e. we have a vocabulary and a data model that 
> allows the identification and definition of g-snaps together with their 
> relation to various g-boxes.
> 
> 3. In-between, i.e. we uncouple the issue from the mere provenance 
> description and build provenance on something like Memento [1].
> 
> I would like to see a solid solution to this problem as close to RDF as 
> possible, as I consider this an important building block for a lot of 
> applications, not least the trust layer...
> 
> The Provenance WG is not focused on the provenance of RDF, but on the
> general description of provenance. That's IMHO not close enough to
> deliver the solution for such a fundamental requirement.
> 
> If it does not directly belong to RDF (which is reasonable), we should 
> think about the third option and nonetheless see how we can reach a 
> standardization of the identification and accessibility of g-snaps.
> 
> Cheers,
> 
> Kai
> 
> [1] http://www.mementoweb.org/

I think I agree with everything you say, but I don't see why this leads
us to need something more than constant g-boxes (instead of g-snaps), or
why this would be only a "best practice" instead of the same level of
standard as other things in RDF.  If I give you the URI of a g-box
and state that its content never changes, then how is that different (in
capability) from me giving you the URI of a graph?  As far as I can
tell, it's much better because you can potentially dereference the URI.
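
To make the comparison concrete, here is a rough sketch in Python. It is
only an illustration under my own assumptions: the GBox class, the
"constant" flag, and the example.org URI are hypothetical and not part of
RDF or of any Provenance WG vocabulary. The point is just that once a
g-box is declared constant, its URI picks out one fixed set of triples,
which is the capability a named g-snap would give you.

```python
from dataclasses import dataclass, field


@dataclass
class GBox:
    """Hypothetical model of a g-box: a URI-named container of triples."""
    uri: str
    triples: set = field(default_factory=set)
    constant: bool = False  # stands in for "its content never changes"

    def add(self, triple):
        # A constant g-box refuses any further change to its contents.
        if self.constant:
            raise ValueError(f"{self.uri} is declared constant")
        self.triples.add(triple)

    def snapshot(self):
        # The current contents as an immutable value, i.e. a g-snap.
        return frozenset(self.triples)


box = GBox("http://example.org/graphs/v1")  # hypothetical URI
box.add(("ex:a", "ex:knows", "ex:b"))
box.constant = True
snap_then = box.snapshot()

try:
    box.add(("ex:a", "ex:knows", "ex:c"))
    changed = True
except ValueError:
    changed = False

snap_now = box.snapshot()
```

In this model snap_then and snap_now are always equal, so the URI of the
constant g-box identifies the g-snap just as well as a separate name for
the g-snap would.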

  -- Sandro
Received on Monday, 3 October 2011 15:28:52 UTC