Re: Reification - whats best practice? from Frank Manola on 2004-08-27 (www-rdf-interest@w3.org from August 2004)

From: Frank Manola <fmanola@acm.org>
Date: Fri, 27 Aug 2004 10:41:36 -0400
To: Bob MacGregor <macgregor@isi.edu>
CC: www-rdf-interest@w3.org
Message-ID: <412F4820.7040409@acm.org>

Bob MacGregor wrote:

> 
> What many seem to be missing is that the notion of a "named container of 
> triples" is common to all solutions
> that are being argued.  A document of RDF triples is a subcase.  A named 
> graph is a subcase
> (or the same case).  A reified statement is the subcase where the 
> container can contain only
> one triple. 

I'm not sure people are really missing this point.  They may simply be 
recognizing that the "container" (structural) aspect isn't the main 
issue in defining contexts.  The main issue is defining the special 
semantics of contexts (or distinguishing among the various special 
semantics people seem to want to associate with them):  how statements 
about the container apply to the statements in the container, what it 
means to interpret a statement inside and outside a context, and so on. 
Keep in mind that simple provenance of statements is only one possible 
use to which the "context" idea has been put.  The general idea has been 
interpreted in many ways, not all of them consistent.  (I'm also 
doubtful about equating a container that contains a single triple with a 
reified statement, but that's a nit).

> 
> When someone says "I'm not using contexts, because we can't agree on 
> what they mean, but
> here is my named container solution", they are really using contexts -- 
> they've just settled
> on a particular case. 

Perhaps, but distinguishing among the cases (even when they look the 
same structurally) remains part of the problem.  After all, we have 
containers in RDF now.

> 
> The issue of quads is in some sense secondary.  A quad is just the 
> syntactic glue that relates
> each triple in a container to the container.  So the fundamental issue 
> relates to contexts, not quads.
>

Concur.
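
The "syntactic glue" reading of quads can be sketched in a few lines. This is only an illustration: the `ex:` names and the `group_by_container` helper are made up for the example and do not come from any RDF specification or toolkit. It shows the structural part only; it says nothing about what the containers mean.

```python
from collections import defaultdict

# Hypothetical quads: (subject, predicate, object, container name).
# The fourth element only ties a triple to a container.
quads = [
    ("ex:alice", "ex:knows", "ex:bob", "ex:graph1"),
    ("ex:bob", "ex:knows", "ex:carol", "ex:graph1"),
    ("ex:alice", "ex:age", "30", "ex:graph2"),
]


def group_by_container(quads):
    """Recover the named containers of triples that the quads encode."""
    containers = defaultdict(set)
    for s, p, o, container in quads:
        containers[container].add((s, p, o))
    return dict(containers)


containers = group_by_container(quads)
```

Here `containers["ex:graph1"]` holds two triples and `containers["ex:graph2"]` holds one. The grouping is purely mechanical, which is why the quads themselves are secondary to the question of what a context is taken to mean.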

> 
> Summing up. 
> 
> - The current RDF provides miserable support for provenance data (to 
> cite the
> most obvious use case for contexts). 

I think this can be put another way:  Even if we throw out the current 
reification vocabulary as totally unusable, current RDF provides 
precisely the same support for provenance data (and any other data 
describing statements) that it does for describing people, cars, genes, 
electrical equipment, and anything else people have used RDF for.  RDF 
provides no built-in support for associating URIs or bNodes with 
statements, but it provides no support for associating URIs or bNodes 
with people, cars, genes, or electrical equipment either.  Similarly, 
and again assuming we throw out the current reification vocabulary, RDF 
provides the same vocabulary support for describing statements that it 
does for describing people, cars, genes, or electrical equipment.  In 
all those cases, people interested in describing those things have to 
come up with a vocabulary of classes and properties, and the associated 
semantics.  One idea to promote progress in this area might be to focus 
on defining this vocabulary and its semantics, so we can see if general 
agreement can be reached.

> 
> - Named containers of triples provide a solution.  Many implementations 
> of named
> containers already exist.

Named containers of triples might provide a *structural* solution, but 
they don't necessarily define all the necessary semantics for the common 
interpretation of such containers.  I think this also assumes some 
mechanism for automatically identifying things as triples (i.e., let's 
enumerate *all* the requirements we're assuming).  Otherwise, we could 
do all this now:  we can construct URIs that denote triples, give them a 
type (say "triple"), say that those things are members of containers, 
and give the containers names (and other properties).
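
A minimal sketch of that "do it all now" approach, using ordinary triples modelled as Python tuples. The vocabulary (`ex:Triple`, `ex:memberOf`, `ex:source`) and the URIs are hypothetical names invented for this illustration, not an existing RDF vocabulary.

```python
RDF_TYPE = "rdf:type"

# Hypothetical vocabulary, assumed only for this sketch.
EX_TRIPLE = "ex:Triple"        # class of things that denote triples
EX_MEMBER_OF = "ex:memberOf"   # relates a triple URI to its container
EX_SOURCE = "ex:source"        # an example property of a container


def describe_in_container(triple_uri, container_uri):
    """Return plain triples saying triple_uri is a triple in container_uri."""
    return {
        (triple_uri, RDF_TYPE, EX_TRIPLE),
        (triple_uri, EX_MEMBER_OF, container_uri),
    }


def members(graph, container_uri):
    """List the triple URIs asserted to be members of container_uri."""
    return {s for (s, p, o) in graph
            if p == EX_MEMBER_OF and o == container_uri}


graph = set()
graph |= describe_in_container("ex:t1", "ex:c1")
graph |= describe_in_container("ex:t2", "ex:c1")
# The container gets a name (its URI) and a property of its own.
graph.add(("ex:c1", EX_SOURCE, "http://example.org/doc.rdf"))
```

Nothing in this sketch connects `ex:t1` to an actual subject, predicate, and object, and nothing says how `ex:source` on the container bears on its members. Those are the automatic-identification and semantics requirements that the structure alone leaves open.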

> 
> - Most implementers of triple stores implement some
> form of container inside their systems to indicate the source of the 
> triples, but RDF
> doesn't provide a means for them to expose that mechanism.

It seems to me that providing a means to expose underlying structure 
like this is doing things the wrong way around.  We ought to define (at 
the RDF level) the semantics of what we want, and make the 
implementations implement it, rather than forcing a provenance model to 
reflect mechanisms people have chosen for various implementation purposes.

--Frank

> 
> - The SOURCE construct in BRQL would provide a solution that is narrower 
> than,
> but consistent with, a general named container solution.
> 
> My near-term recommendation would be to pester the DAWG committee to include
> SOURCE in their spec.  It's quite hard to get traction in this area, and 
> that is currently
> our best shot.
> 
> Cheers, Bob
>
Received on Friday, 27 August 2004 14:39:08 UTC