Re: Provenance as a first-class citizen

* Sandro Hawke <> [2006-03-17 15:56-0500]
> Ben Syverson wrote:
> > On Mar 17, 2006, at 11:04 AM, Garrett Wollman wrote:
> > > I'm certain that this has been said before by people better-informed
> > > than I, but the more I look at RDF the more certain I am that basing
> > > it on triples rather than 4-tuples was a serious mistake.

I agree with everything you say here, except the bit about "rare", which
I'm agnostic on. Will there be more writers than readers on the Semantic
Web? Who knows :) Publishers, as you note, should just say stuff, and
not feel the need to reify at the triple level that they've said it.
Consumers should, at some level of their application, take account of
who said what. Especially when they're merging and aggregating
(something that the RDF approach directly encourages, by being so
merge-able). I've never found triple-based reification attractive; it's 
too granular, amongst other things. Publishers probably should do a few
little things in their RDF that are at the document/graph level rather
than per-triple, e.g. assert that they're the dc:creator of the RDF/XML
document, and publish some form of digital signature. Edd has a nice 
writeup of a simple PGP/GPG-based approach that folk in the FOAF
community were experimenting with: --- perhaps if some
techniques like that were more deployed, consumers of RDF would find 
more value in quadstore techniques? Particularly as quads are now being
exposed in a standard way via SPARQL...
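
To make the consumer side concrete, here is a rough Python sketch of
"take account of who said what" when merging. It is a toy, not the API
of any real quadstore: the QuadStore class, its method names, the ex:
terms and the example.org URLs are all made up for illustration. Each
triple is stored together with the document it was harvested from, so
the merged view stays plain triples while provenance is kept on the
side.

```python
from collections import defaultdict

# A triple is (subject, predicate, object); all three are plain strings here.
Triple = tuple[str, str, str]


class QuadStore:
    """Toy in-memory store: every triple remembers which sources stated it."""

    def __init__(self) -> None:
        # Maps each triple to the set of source documents that asserted it.
        self._sources: dict[Triple, set[str]] = defaultdict(set)

    def harvest(self, source: str, triples: list[Triple]) -> None:
        """Record that `source` (a hypothetical document URL) stated `triples`."""
        for triple in triples:
            self._sources[triple].add(source)

    def merged(self) -> set[Triple]:
        """The plain triple view: everything anyone said, provenance dropped."""
        return set(self._sources)

    def who_said(self, triple: Triple) -> set[str]:
        """The sources that asserted `triple` (empty set if nobody did)."""
        return set(self._sources.get(triple, set()))


# Two publishers "just say stuff"; the consumer keeps track of who said it.
store = QuadStore()
store.harvest("http://example.org/alice.rdf",
              [("ex:sky", "ex:colour", "blue")])
store.harvest("http://example.org/bob.rdf",
              [("ex:sky", "ex:colour", "blue"),
               ("ex:sky", "ex:colour", "green")])
```

The publishers never mention themselves in their data; the fourth
element is added by the harvester, at the document level rather than
per triple.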


> > 
> > I agree 1000%. Using triples means that by default statements are  
> > trusted and not reified. It suggests a top-down approach, rather than  
> > a bottom-up one. This is one reason that tags/keywords are more  
> > appealing to people than the SW.
> I disagree.
> RDF is based on triples because triples are an excellent single building
> block for making arbitrary statements.
> For making statements about statements -- which you're talking about --
> you need something more complex, like quads or reification, but that's
> relatively rare (even if it's very interesting).
> Publishing statements as triples makes sense.  Whatever you want your
> web page to say, just put those statements on the page.  You shouldn't
> have to put on the page a statement that those statements are on the
> page and are true.  Say "The sky is blue", not "I am now telling you
> that the sky is blue."
> For reasoning about statements, yes, of course use quads.  When I
> harvest RDF data, of course I keep track of what web pages said what.
> But I don't usually need to re-publish that harvester data; that's like
> my web browser publishing my browsing history along with the browser
> cache.  There are applications where that's useful, sure, but it's
> hardly the main way data moves around the web.
>     -- sandro

Received on Saturday, 18 March 2006 14:01:05 UTC