Re: Use case for g-snaps from Kai Eckert on 2011-10-03 (public-rdf-prov@w3.org from October 2011)

From: Kai Eckert <kai@informatik.uni-mannheim.de>
Date: Mon, 03 Oct 2011 13:30:09 +0200
To: public-rdf-prov@w3.org
Message-ID: <4E899CC1.1070808@informatik.uni-mannheim.de>

Hi all,

On 02.10.2011 14:29, Richard Cyganiak wrote:
>> From a provenance viewpoint, we require the thing we talk about in
>> the provenance to be identifiable.  With URI-less g-snaps, this is
>> going to become more challenging.
>
> You can assign IRIs to anything in RDF, including any concept you can
> possibly think of. You can assign IRIs to cars and people in RDF,
> even though RDF doesn't know anything about cars and people. Same for
> g-snaps.
>
> The question is whether the RDF data model should contain built-in
> constructs for certain things. Experience shows that built-in support
> for *some* of the g-* things would be a win. The argument is that
> support for g-boxes is sufficient, because that allows handling of
> g-snaps as immutable g-boxes.

Support (only) for g-boxes makes sense from a minimalist viewpoint
that embeds RDF into the existing web infrastructure. The content behind
a URI can change, and there is (so far) no standardized way to mark a URI
as having stable content.

However, there are best-practice examples, such as Wikipedia or W3C TRs,
that create such stable URIs to preserve old versions of the content
and keep them accessible. This is enough for the web, but not sufficient
for the linked data web. Human users can make sense of it, just as they
do of the content, but machines need explicit semantics. If such
semantics for g-snaps, including a way to name them, were built into
RDF, it would help applications tremendously to make sense of the data
and to use additional provenance information.

If we do not have this support, we can still create these URIs according
to some best practice and help applications make sense of them by
describing these URIs, e.g. with vocabulary provided by the Provenance
WG -- something like prov:isStableVersion.
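
To make this concrete, here is a minimal sketch of what such a
description could look like. It is only an illustration: the namespace
URI and the example URIs are made up, and prov:isStableVersion is a
placeholder name, read here (as an assumption) as a property that links
a stable URI to the changing resource it is a version of.

```python
# Hypothetical namespace; "isStableVersion" is only a placeholder name,
# not a term from any published vocabulary.
PROV = "http://example.org/prov#"


def describe_stable_version(snapshot_uri, gbox_uri):
    """Return N-Triples lines stating that snapshot_uri is a stable version of gbox_uri."""
    return [f"<{snapshot_uri}> <{PROV}isStableVersion> <{gbox_uri}> ."]


# Made-up URIs: a dated, stable URI and the changing resource behind it.
triples = describe_stable_version(
    "http://example.org/doc/2011-10-03",
    "http://example.org/doc",
)
print("\n".join(triples))
```

An application that knows this one property could then treat the subject
URI as immutable, without having to implement a site-specific convention.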

The problem with best practices is that there can be many of them, and
in applications every best practice needs a specific implementation. We
have use cases where we need to identify stable content and
want to describe the provenance of this content. The complexity and the
problems remain the same; the only question is at which level we solve
them. I see three possibilities:

1. Within RDF, i.e. we somehow define URIs as identifiers of g-snaps.

2. Within Provenance, i.e. we have a vocabulary and a data model that 
allows the identification and definition of g-snaps together with their 
relation to various g-boxes.

3. In-between, i.e. we uncouple the issue from the mere provenance 
description and build provenance on something like Memento [1].
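
As a rough sketch of what option 3 could mean for a client: in Memento,
the client asks for the state of a resource at a given time by sending
an Accept-Datetime header. The example below only builds such a request
and does not send it. The TimeGate URL and the resource URI are made up
for illustration, and the helper name is my own.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Hypothetical TimeGate endpoint; not a real service.
TIMEGATE = "http://example.org/timegate/"


def memento_request(gbox_uri, when):
    """Build (but do not send) a request for the state of gbox_uri at `when`."""
    # Accept-Datetime carries the desired point in time as an HTTP date in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(TIMEGATE + gbox_uri, headers={"Accept-Datetime": stamp})


req = memento_request(
    "http://example.org/doc",
    datetime(2011, 10, 3, 11, 30, 9, tzinfo=timezone.utc),
)
print(req.full_url)
# urllib stores header names with only the first letter capitalized.
print(req.get_header("Accept-datetime"))
```

The URI that such a request finally resolves to would be the name of the
g-snap, and the provenance description could then refer to that URI.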

I would like to see a solid solution to this problem as close to RDF as 
possible, as I consider this an important building block for a lot of 
applications, not least the trust layer...

The Provenance WG is focused not on the provenance of RDF, but on the
general description of provenance. That's IMHO not close enough to
deliver the solution for such a fundamental requirement.

If it does not directly belong to RDF (which is reasonable), we should
think about the third option and nonetheless see how we can
standardize the identification and accessibility of g-snaps.

Cheers,

Kai

[1] http://www.mementoweb.org/
-- 
Kai Eckert
Universitätsbibliothek Mannheim
Deputy Head, Digital Library Services Department
Schloss Schneckhof West / 68131 Mannheim
Tel. 0621/181-2946  Fax 0621/181-2918
Received on Monday, 3 October 2011 11:30:36 UTC