Re: Provenance as a first-class citizen from Harry Halpin on 2006-03-18 (semantic-web@w3.org from March 2006)

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Sat, 18 Mar 2006 18:19:11 +0000
To: semantic-web@w3.org
Message-ID: <441C4F1F.8000801@ibiblio.org>
Is it just me, or would the sensible thing be to get a W3C
Recommendation on using named graphs (quads) as an *optional* feature of
RDF? Then people who want to publish data at URIs (Sandro) can do that
without using quads, and people who are merging and aggregating
information can do so in a standardized manner that's already
implemented. Or have I just missed the W3C Recommendation for quads?
This would be a rather small move, but I think it would help deal with
some of the issues around provenance.
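
To make the "optional" part concrete, here is a minimal sketch of that
split: publishers emit plain triples, and an aggregator adds a fourth
slot naming the document each triple was harvested from. Everything in
it (the Quad type, harvest, who_said, merged_triples, the ex: terms and
the example.org URLs) is made up for illustration and not taken from any
spec or library.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """A triple plus an optional graph name (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    graph: Optional[str] = None  # None means a plain triple, no named graph


def harvest(store, source_url, triples):
    """Aggregator side: tag every harvested triple with its source URI."""
    for s, p, o in triples:
        store.add(Quad(s, p, o, source_url))


def who_said(store, s, p, o):
    """Return the sorted graph names (sources) that assert the triple."""
    return sorted(
        q.graph
        for q in store
        if (q.subject, q.predicate, q.obj) == (s, p, o) and q.graph is not None
    )


def merged_triples(store):
    """Drop the fourth slot to get the ordinary merged RDF graph."""
    return {(q.subject, q.predicate, q.obj) for q in store}


# Publishers just say things as triples; only the aggregator keeps quads.
# The URLs below are assumed example sources.
store = set()
harvest(store, "http://example.org/alice.rdf",
        [("ex:sky", "ex:colour", "blue")])
harvest(store, "http://example.org/bob.rdf",
        [("ex:sky", "ex:colour", "blue"), ("ex:grass", "ex:colour", "green")])

sources = who_said(store, "ex:sky", "ex:colour", "blue")
merged = merged_triples(store)
```

The publisher never has to know the fourth slot exists, and the
aggregator can still collapse everything back to triples when it does
not care who said what.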

As for reification, people can just keep ignoring it :)

It seems like this is precisely the sort of thing the W3C should do,
which is to standardize best practice as learned through experience. The
real problem would be if there were really worthwhile cases of provenance
that quads didn't catch...

                                     -harry



Dan Brickley wrote:

>* Sandro Hawke <sandro@w3.org> [2006-03-17 15:56-0500]
>>Ben Syverson wrote:
>>>On Mar 17, 2006, at 11:04 AM, Garrett Wollman wrote:
>>>>I'm certain that this has been said before by people better-informed
>>>>than I, but the more I look at RDF the more certain I am that basing
>>>>it on triples rather than 4-tuples was a serious mistake.
>
>I agree with everything you say here, except the bit about "rare", which
>I'm agnostic on. Will there be more writers than readers on the Semantic
>Web? Who knows :) Publishers, as you note, should just say stuff, and
>not feel the need to reify at the triple level that they've said it.
>Consumers should, at some level of their application, take account of
>who said what. Especially when they're merging and aggregating
>(something that the RDF approach directly encourages, by being so
>merge-able). I've never found triple-based reification attractive; it's 
>too granular, amongst other things. Publishers probably should do a few
>little things in their RDF that are at the document/graph level rather
>than per-triple, eg. assert that they're the dc:creator of the RDF/XML
>document, and publish some form of digital signature. Edd has a nice 
>writeup of a simple PGP/GPG-based approach that folk in the FOAF
>community were experimenting with: 
>http://usefulinc.com/foaf/signingFoafFiles --- perhaps if some
>techniques like that were more deployed, consumers of RDF would find 
>more value in quadstore techniques? Particularly as quads are now being
>exposed in a standard way via SPARQL...
>
>Dan
>
>>>I agree 1000%. Using triples means that by default statements are  
>>>trusted and not reified. It suggests a top-down approach, rather than  
>>>a bottom-up one. This is one reason that tags/keywords are more  
>>>appealing to people than the SW.
>>I disagree.
>>
>>RDF is based on triples because triples are an excellent single building
>>block for making arbitrary statements.
>>
>>For making statements about statements -- which you're talking about --
>>you need something more complex, like quads or reification, but that's
>>relatively rare (even if it's very interesting).
>>
>>Publishing statements as triples makes sense.  Whatever you want your
>>web page to say, just put those statements on the page.  You shouldn't
>>have to put on the page a statement that those statements are on the
>>page and are true.  Say "The sky is blue", not "I am now telling you
>>that the sky is blue."
>>
>>For reasoning about statements, yes, of course use quads.  When I
>>harvest RDF data, of course I keep track of what web pages said what.
>>But I don't usually need to re-publish that harvester data; that's like
>>my web browser publishing my browsing history along with the browser
>>cache.  There are applications where that's useful, sure, but it's
>>hardly the main way data moves around the web.
>>
>>    -- sandro
Received on Saturday, 18 March 2006 18:19:29 UTC