Re: Provenance as a first-class citizen


It may be true that reification is rare in some contexts, but it's 
really common in others.  At my company (BBN Technologies), we have a 
number of SemWeb applications, and reification plays a large role in 
most of them.  In some cases it is used for provenance, but in others it 
goes deeper: for instance, it is used to divide statements into 
categories.  One of the major problems we've been dealing with is the 
gross inefficiency of most reification implementations.  (Any hints 
here would be welcome.)
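
To make the inefficiency concrete, here is a rough sketch of what 
standard RDF reification costs in a plain triple store.  This is not 
our actual implementation; the example.org URIs, the "category" 
property and the reify() helper are made up for illustration.  Only 
the rdf:Statement, rdf:subject, rdf:predicate and rdf:object terms 
come from the RDF vocabulary.

```python
# Standard RDF namespace; the reification terms below are defined in it.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, triple):
    """Return the four triples that describe `triple` as an rdf:Statement.

    `statement_id` is a hypothetical URI chosen by the caller.
    """
    s, p, o = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


# Hypothetical data: every example.org URI here is a placeholder.
triple = ("http://example.org/sky", "http://example.org/color", "blue")
stmt = "http://example.org/stmt1"

store = reify(stmt, triple)
# One annotation about the statement, e.g. the category it falls into.
store.append((stmt, "http://example.org/category", "observation"))

print(len(store))  # 5 triples stored in order to annotate 1 statement
```

So a single annotated statement takes five triples (six if the 
original triple is also asserted), and reading it back means joining 
on the statement node.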

Here's a reification question for the larger group:  This thread has 
repeatedly asserted that the fourth URI in a quad (or 4-tuple) "solves" 
the reification problem.  However, as far as I know, there is no general 
agreement on what that fourth URI should represent.  Does it identify 
the source of the statement?  Does it identify the statement itself?  Or 
are there other options?  It seems to me that if quads were made part of 
the RDF standard (whether optional or not), then the standard should 
specify what the fourth URI denotes.
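
To show why the choice matters, here is a small sketch of the two 
readings side by side.  Everything in it is an assumption made for 
illustration: the ex: names, the ex:source property, and the two 
helper functions.  It is not taken from any quad store or proposal.

```python
# Hypothetical triples; the ex: names are placeholders.
SKY = ("ex:sky", "ex:color", "blue")
GRASS = ("ex:grass", "ex:color", "green")
PAGE = "http://example.org/page1"

# Reading A: the fourth URI names the SOURCE of the statement.
# Many quads can share the same fourth URI.
quads_by_source = [
    SKY + (PAGE,),
    GRASS + (PAGE,),
]

# Reading B: the fourth URI names the STATEMENT itself.
# Each quad gets its own fourth URI, and the source has to be
# recorded by a further quad about that identifier.
quads_by_statement = [
    SKY + ("ex:stmt1",),
    GRASS + ("ex:stmt2",),
    ("ex:stmt1", "ex:source", PAGE, "ex:stmt3"),
    ("ex:stmt2", "ex:source", PAGE, "ex:stmt4"),
]


def sources_a(quads, triple):
    """Reading A: the source is the fourth element itself."""
    return {g for (s, p, o, g) in quads if (s, p, o) == triple}


def sources_b(quads, triple):
    """Reading B: find the statement ids, then follow ex:source."""
    ids = {g for (s, p, o, g) in quads if (s, p, o) == triple}
    return {o for (s, p, o, g) in quads if s in ids and p == "ex:source"}


print(sources_a(quads_by_source, SKY))     # {'http://example.org/page1'}
print(sources_b(quads_by_statement, SKY))  # {'http://example.org/page1'}
```

Both readings answer "where did this statement come from?", but the 
same query has to be written differently in each, so two stores that 
disagree on the reading cannot exchange quads without a translation 
step.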



-------- Original Message  --------
From: Sandro Hawke <>
To: ben syverson <>
Subject: Re:Provenance as a first-class citizen
Date: 3/17/2006 3:56 PM

> Ben Syverson wrote:
>> On Mar 17, 2006, at 11:04 AM, Garrett Wollman wrote:
>>> I'm certain that this has been said before by people better-informed
>>> than I, but the more I look at RDF the more certain I am that basing
>>> it on triples rather than 4-tuples was a serious mistake.
>> I agree 1000%. Using triples means that by default statements are  
>> trusted and not reified. It suggests a top-down approach, rather than  
>> a bottom-up one. This is one reason that tags/keywords are more  
>> appealing to people than the SW.
> I disagree.
> RDF is based on triples because triples are an excellent single building
> block for making arbitrary statements.
> For making statements about statements -- which you're talking about --
> you need something more complex, like quads or reification, but that's
> relatively rare (even if it's very interesting).
> Publishing statements as triples makes sense.  Whatever you want your
> web page to say, just put those statements on the page.  You shouldn't
> have to put on the page a statement that those statements are on the
> page and are true.  Say "The sky is blue", not "I am now telling you
> that the sky is blue."
> For reasoning about statements, yes, of course use quads.  When I
> harvest RDF data, of course I keep track of what web pages said what.
> But I don't usually need to re-publish that harvester data; that's like
> my web browser publishing my browsing history along with the browser
> cache.  There are applications where that's useful, sure, but it's
> hardly the main way data moves around the web.
>     -- sandro

Received on Tuesday, 21 March 2006 13:35:14 UTC