Re: Reification - whats best practice? from Bob MacGregor on 2004-08-27 (www-rdf-interest@w3.org from August 2004)

From: Bob MacGregor <macgregor@isi.edu>
Date: Fri, 27 Aug 2004 08:08:22 -0700
To: Frank Manola <fmanola@acm.org>
CC: www-rdf-interest@w3.org
Message-ID: <412F4E66.205@isi.edu>

Hi Frank,

You make many good points; I don't like to get deeply nested, so I'll
respond just on top.

You say RDF already has containers.  True -- it's easy to create a container
of things that denote "entities", but it's MUCH less practical to create
a container of statements.  Yes, it's doable, but this is the Turing argument
all over again -- we already have assembly language, but we would like to
code in Java.

You are insisting on semantics.  RDF has almost no semantics -- graphs are
just graphs; there is no attempt to assign truth per se.  I'm pushing for
named containers, another data structure, with no built-in semantics per se
(except that the contexts I use allow for contexts within contexts, which
induces a few entailments).  Note: Pat Hayes has carefully ensured that RDF
statement reification has essentially no semantics.
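
As a rough illustration of named containers with nesting, here is a small
Python sketch. The ContextStore class, its method names, and the ex:
identifiers are hypothetical. The one entailment it models is an assumption
made for the example (a triple held in an inner context is also visible in
every enclosing context); the message above does not say which entailments
the nesting actually induces.

```python
from collections import defaultdict


class ContextStore:
    """Named containers of triples, with optional nesting (a sketch)."""

    def __init__(self):
        self.triples = defaultdict(set)  # context name -> set of triples
        self.parent = {}                 # context name -> enclosing context

    def add(self, context, triple):
        """Record a triple as a member of the named context."""
        self.triples[context].add(triple)

    def nest(self, inner, outer):
        """Declare that `inner` is contained in `outer`."""
        self.parent[inner] = outer

    def enclosing(self, context):
        """Return the chain of contexts that enclose `context`."""
        chain = []
        # Stop at the outermost context, or if a cycle was declared.
        while context in self.parent and self.parent[context] not in chain:
            context = self.parent[context]
            chain.append(context)
        return chain

    def visible_in(self, context):
        """Triples held directly in `context` plus, under the assumed
        entailment, those held in any context nested inside it."""
        result = set(self.triples.get(context, ()))
        for inner in self.parent:
            if context in self.enclosing(inner):
                result |= self.triples.get(inner, set())
        return result


# Hypothetical example data.
store = ContextStore()
t1 = ("ex:bob", "ex:worksAt", "ex:isi")
t2 = ("ex:frank", "ex:memberOf", "ex:acm")
store.add("ex:report", t1)
store.add("ex:section1", t2)
store.nest("ex:section1", "ex:report")
```

With this data, store.visible_in("ex:report") contains both triples, while
store.visible_in("ex:section1") contains only the second. Nothing here
assigns truth to a triple; the store only records membership and nesting.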

Basically, you are advocating a cerebral exercise, followed by adoption.
The problem is that it's hard to appreciate the utility of something like
contexts unless you have the option to use them (not just imagine what
it would be like).  Reified statements are a good negative example -- on
paper, they look promising, but in practice they s*ck.  Relatively few of
us have the luxury of building applications using a real context mechanism
(have you?).

I'm pushing for SOURCE in BRQL because it temporizes on the
semantics (sticking to something where the semantics is relatively simple
and well-defined) but it gets the structural aspects right (which is what I
really care about).

Cheers, Bob

Frank Manola wrote:

> Bob MacGregor wrote:
>
>>
>> What many seem to be missing is that the notion of a "named container 
>> of triples" is common to all solutions
>> that are being argued.  A document of RDF triples is a subcase.  A 
>> named graph is a subcase
>> (or the same case).  A reified statement is the subcase where the 
>> container can contain only
>> one triple. 
>
>
> I'm not sure people are really missing this point.  They may simply be 
> recognizing that the "container" (structural) aspect isn't the main 
> issue in defining contexts.  The main issue is defining the special 
> semantics of contexts (or distinguishing among the various special 
> semantics people seem to want to associate with them).  How statements 
> about the container apply to the statements in the container, what it 
> means to interpret a statement inside and outside a context, and so 
> on.  Keep in mind that simple provenance of statements is only one 
> possible use to which the "context" idea has been put.  The general 
> idea has been interpreted in many ways, not all of them consistent.  
> (I'm also doubtful about equating a container that contains a single 
> triple with a reified statement, but that's a nit).
>
>>
>> When someone says "I'm not using contexts, because we can't agree on 
>> what they mean, but
>> here is my named container solution", they are really using contexts 
>> -- they've just settled
>> on a particular case. 
>
>
> Perhaps, but distinguishing among the cases (even when they look the 
> same structurally) remains part of the problem.  After all, we have 
> containers in RDF now.
>
>>
>> The issue of quads is in some sense secondary.  A quad is just the 
>> syntactic glue that relates
>> each triple in a container to the container.  So the fundamental 
>> issue relates to contexts, not quads.
>>
>
> Concur.
>
>>
>> Summing up.
>> - The current RDF provides miserable support for provenance data (to 
>> cite the
>> most obvious use case for contexts). 
>
>
> I think this can be put another way:  Even if we throw out the current 
> reification vocabulary as totally unusable, current RDF provides 
> precisely the same support for provenance data (and any other data 
> describing statements) that it does for describing people, cars, 
> genes, electrical equipment, and anything else people have used RDF 
> for.  RDF provides no built-in support for associating URIs or bNodes 
> with statements, but it provides no support for associating URIs or 
> bNodes with people, cars, genes, or electrical equipment either.  
> Similarly, and again assuming we throw out the current reification 
> vocabulary, RDF provides the same vocabulary support for describing 
> statements that it does for describing people, cars, genes, or 
> electrical equipment.  In all those cases, people interested in 
> describing those things have to come up with a vocabulary of classes 
> and properties, and the associated semantics.  One idea to promote 
> progress in this area might be to focus on defining this vocabulary 
> and its semantics, so we can see if general agreement can be reached.
>
>>
>> - Named containers of triples provide a solution.  Many 
>> implementations of named
>> containers already exist.
>
>
> Named containers of triples might provide a *structural* solution, but 
> they don't necessarily define all the necessary semantics for the 
> common interpretation of such containers.  I think this also 
> assumes some mechanism for automatically identifying things as triples 
> (i.e., let's enumerate *all* the requirements we're assuming).  
> Otherwise, we could do all this now:  we can construct URIs that 
> denote triples, give them a type (say "triple"), say that those things 
> are members of containers, and give the containers names (and other 
> properties).
>
>>
>> - Most implementers of triple stores implement some
>> form of container inside their systems to indicate the source of the 
>> triples, but RDF
>> doesn't provide a means for them to expose that mechanism.
>
>
> It seems to me that providing a means to expose underlying structure 
> like this is doing things the wrong way around.  We ought to define 
> (at the RDF level) the semantics of what we want, and make the 
> implementations implement it, rather than forcing a provenance model 
> to reflect mechanisms people have chosen for various implementation 
> purposes.
>
> --Frank
>
>>
>> - The SOURCE construct in BRQL would provide a solution that is 
>> narrower than,
>> but consistent with, a general named container solution.
>>
>> My near-term recommendation would be to pester the DAWG committee to 
>> include
>> SOURCE in their spec.  It's quite hard to get traction in this area, 
>> and that is currently
>> our best shot.
>>
>> Cheers, Bob
>>


-- 
=====================================
Robert MacGregor, Senior Project Leader
macgregor@isi.edu
Phone: 310/448-8423
Fax:  310/822-6592
Mobile: 310/251-8488
USC Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA 90292
=====================================
Received on Friday, 27 August 2004 15:08:56 UTC