Re: Provenance as a first-class citizen from Giovanni Tummarello on 2006-03-20 (semantic-web@w3.org from March 2006)

From: Giovanni Tummarello <g.tummarello@gmail.com>
Date: Mon, 20 Mar 2006 23:44:18 +0100
To: Harry Halpin <hhalpin@ibiblio.org>, semantic-web@w3.org
Message-ID: <441F3042.9080309@gmail.com>

Except that quads are just syntactic sugar for "reification", and so just
add complexity .. :-)

There is absolutely nothing wrong with reification.
The fact that a reified statement doesn't imply the existence of the
statement itself in the graph is a feature, not a bug.
The fact that you should keep your reasoner well behaved when making
inferences over reified triples is simply.. normal (the superman thing).
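
To make the "feature, not a bug" point concrete, here is a rough sketch in plain Python, with no RDF library involved. The graph is modelled as a set of (subject, predicate, object) tuples, and every name in it (the `ex:` terms, `reify`, `ex:stmt1`) is made up purely for illustration. Only the four rdf: reification properties come from the RDF vocabulary.

```python
# Illustrative sketch only: a "graph" is a plain set of
# (subject, predicate, object) tuples. All ex: names are hypothetical.

def reify(stmt_id, triple):
    """Return the four triples that describe `triple` as an rdf:Statement."""
    s, p, o = triple
    return {
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    }

# The claim we want to talk about without asserting it ourselves.
claim = ("ex:Superman", "ex:canFly", "ex:true")

# The graph holds only the description of the claim, plus who believes it.
graph = reify("ex:stmt1", claim)
graph.add(("ex:Lois", "ex:believes", "ex:stmt1"))

# The claim itself was never added, so a well-behaved reasoner
# has nothing to conclude about Superman from this graph alone.
claim_is_asserted = claim in graph
```

Here `claim_is_asserted` stays False: the graph says that Lois believes the statement, not that the statement holds.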

So, unless there are actual reasons (which I'd be happy to hear), W3C,
please keep RDF nice and clean as it is.

On the other hand .. syntactic support for reification *should* instead
be demanded at the access level, and in particular from the DAWG in SPARQL,
since queries are usually written by humans and that would just make a
lot of sense. But last time I asked... :-)
My 2c.

Giovanni


Harry Halpin wrote:
> Is it just me or does it seem like the sensible thing to get a W3C
> Recommendation on using named graphs (quads) as an *optional* feature of
> RDF? Then people that want to publish data to URIs (Sandro) can do that
> without using quads, and people that are merging and aggregating
> information can do so in a standardized manner that's already
> implemented? Or have I just missed the W3C recommendation for quads?
> This would be a rather small move, but I think it would help deal with
> some of the issues around provenance.
>
> As for reification, people can just keep ignoring it :)
>
> It seems like this is precisely the sort of thing the W3C should do,
> which is standardize best practice as learned through experience. The
> real problem would be if there were really worthwhile cases of provenance
> that quads didn't catch...
>
>                                      -harry
>
>
>
> Dan Brickley wrote:
>
>> * Sandro Hawke <sandro@w3.org> [2006-03-17 15:56-0500]
>>
>>> Ben Syverson wrote:
>>>
>>>> On Mar 17, 2006, at 11:04 AM, Garrett Wollman wrote:
>>>>
>>>>> I'm certain that this has been said before by people better-informed
>>>>> than I, but the more I look at RDF the more certain I am that basing
>>>>> it on triples rather than 4-tuples was a serious mistake.
>>>>>
>> I agree with everything you say here, except the bit about "rare", which
>> I'm agnostic on. Will there be more writers than readers on the Semantic
>> Web? Who knows :) Publishers, as you note, should just say stuff, and
>> not feel the need to reify at the triple level that they've said it.
>> Consumers should, at some level of their application, take account of
>> who said what. Especially when they're merging and aggregating
>> (something that the RDF approach directly encourages, by being so
>> merge-able). I've never found triple-based reification attractive; it's 
>> too granular, amongst other things. Publishers probably should do a few
>> little things in their RDF that are at the document/graph level rather
>> than per-triple, eg. assert that they're the dc:creator of the RDF/XML
>> document, and publish some form of digital signature. Edd has a nice 
>> writeup of a simple PGP/GPG-based approach that folk in the FOAF
>> community were experimenting with: 
>> http://usefulinc.com/foaf/signingFoafFiles --- perhaps if some
>> techniques like that were more deployed, consumers of RDF would find 
>> more value in quadstore techniques? Particularly as quads are now being
>> exposed in a standard way via SPARQL...
>>
>> Dan
>>
>>
>>>> I agree 1000%. Using triples means that by default statements are  
>>>> trusted and not reified. It suggests a top-down approach, rather than  
>>>> a bottom-up one. This is one reason that tags/keywords are more  
>>>> appealing to people than the SW.
>>>>
>>> I disagree.
>>>
>>> RDF is based on triples because triples are an excellent single building
>>> block for making arbitrary statements.
>>>
>>> For making statements about statements -- which you're talking about --
>>> you need something more complex, like quads or reification, but that's
>>> relatively rare (even if it's very interesting).
>>>
>>> Publishing statements as triples makes sense.  Whatever you want your
>>> web page to say, just put those statements on the page.  You shouldn't
>>> have to put on the page a statement that those statements are on the
>>> page and are true.  Say "The sky is blue", not "I am now telling you
>>> that the sky is blue."
>>>
>>> For reasoning about statements, yes, of course use quads.  When I
>>> harvest RDF data, of course I keep track of what web pages said what.
>>> But I don't usually need to re-publish that harvester data; that's like
>>> my web browser publishing my browsing history along with the browser
>>> cache.  There are applications where that's useful, sure, but it's
>>> hardly the main way data moves around the web.
>>>
>>>    -- sandro
>
Received on Monday, 20 March 2006 22:44:35 UTC