Re: why I don't like named graph IRIs in the DATASET proposal from Sandro Hawke on 2011-10-01 (public-rdf-wg@w3.org from October 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Sat, 01 Oct 2011 17:59:27 -0400
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>, "public-rdf-wg@w3.org" <public-rdf-wg@w3.org>
Message-ID: <1317506367.5766.198.camel@waldron>

On Sat, 2011-10-01 at 12:43 +0100, Richard Cyganiak wrote:
> On 30 Sep 2011, at 20:11, Sandro Hawke wrote:
> >> Named graphs are key to trust and
> >> provenance. Trust and provenance must happen at a lower level in the
> >> stack, before reasoning and inference kick in. W3C's version of the
> >> layer cake, where trust sits above reasoning, cannot work. The moment
> >> you reason with OWL over untrusted data, you [have problems].
> > 
> > I don't think we need to throw out reasoning on the fourth column.  As
> > long as we're careful about what it means -- eg: it denotes an IR which
> > may give you a Graph -- I think people are free to layer inference and
> > trust/provenance reasoning in various ways.  
> > 
> > Let's say you are using three Web data sources, S1, S2, and S3.  S1 and
> > S2 give just triples.  S3 is an ontology (perhaps a RIF document); we
> > don't really care if it's triples.   What's the problem with merging the
> > triples, doing the inference, and using the result, knowing it is no
> > more trustworthy than the least of S1, S2, and S3?  
> 
> Well, the way I see it, what happened here is that the system (on behalf of some user, I presume) decided that S1, S2 and S3 are good enough – sufficiently trustworthy – for the task at hand.
> 
> Provenance information is the basis for trust decisions. The system made the trust decision before it merged the graphs.
> 
> > Specifically, the
> > provenance of your output involves the provenance of S1, S2, S3, and the
> > reasoning steps you took. In detailing those reasoning steps, I think the identifiers for S1, S2, and S3 will be useful.
> 
> Sure. What I said was that you can't do OWL reasoning over untrusted data sources. I didn't say that you can't use graph names when recording processing steps that were taken.
> 
> > But for a later-stage provenance system to reason about S1, S2, and S3
> > is fine, I think.
> 
> I don't know what it means when you say “a provenance system reasons about XYZ”. I suppose you're not talking about OWL reasoning.

Here's an example of what I mean by reasoning about provenance and the
fourth column identifier:

        So let's say we have a concept of a semantic web home page
        for a person. We decide on the policy that if someone's home
        page says that they are a vegetarian, then we believe that they
        are a vegetarian.

That's from [1], where it's shown in N3:
        
        @forAll :x.
        {:x :homePage log:includes { :x a :Vegetarian }}=> { :x a :Vegetarian}.

(I think there might be a typo there; I can't quite parse the way
:homePage and log:includes sit together.  It looks like DanC got it
running later, slightly modified [2].)

It's not immediately obvious to me how to do this kind of stuff with
named graphs, but maybe it will come to me.  Perhaps we can show it
with SPARQL?  Query for all the people whose home pages say they are
vegetarians?
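
To make the idea concrete, here is a rough sketch of that query over a
toy dataset, assuming the fourth column is simply the IRI of the home
page and that the homePage links live in the default graph. Every name
in it (ex:homePage, ex:Vegetarian, the example.org IRIs, the
believed_vegetarians function) is made up for illustration, and the
SPARQL in the comment is only my guess at the shape such a query might
take, not something I have run.

```python
# Toy model of an RDF dataset: a default graph plus named graphs keyed
# by IRI. All IRIs and property names here are hypothetical.
#
# The policy is roughly this SPARQL pattern (a sketch, untested):
#   SELECT ?x WHERE { ?x ex:homePage ?g . GRAPH ?g { ?x a ex:Vegetarian } }

RDF_TYPE = "rdf:type"
HOME_PAGE = "ex:homePage"
VEGETARIAN = "ex:Vegetarian"

# Default graph: who has which home page.
default_graph = {
    ("ex:alice", HOME_PAGE, "http://example.org/alice"),
    ("ex:bob", HOME_PAGE, "http://example.org/bob"),
}

# Named graphs: the triples each page gives us.
named_graphs = {
    "http://example.org/alice": {("ex:alice", RDF_TYPE, VEGETARIAN)},
    # Bob's page talks about Alice, not about Bob himself.
    "http://example.org/bob": {("ex:alice", RDF_TYPE, VEGETARIAN)},
    # A third-party page makes a claim about Bob; it is not his home page.
    "http://example.org/gossip": {("ex:bob", RDF_TYPE, VEGETARIAN)},
}


def believed_vegetarians(default_graph, named_graphs):
    """Return every x whose own home page graph says x is a vegetarian."""
    result = set()
    for subject, predicate, page_iri in default_graph:
        if predicate != HOME_PAGE:
            continue
        page = named_graphs.get(page_iri, set())
        if (subject, RDF_TYPE, VEGETARIAN) in page:
            result.add(subject)
    return result


print(sorted(believed_vegetarians(default_graph, named_graphs)))
```

The point of the sketch is that the trust policy only looks inside the
graph named by the person's own home page, so the claim about Bob on
some other page is not believed.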

   -- Sandro

[1] http://www.w3.org/2000/10/swap/doc/Reach 
[2] http://dev.w3.org/cvsweb/~checkout~/2000/10/swap/test/reason/conf_reg_ex.n3?rev=1.1;content-type=text%2Fplain

Received on Saturday, 1 October 2011 21:59:31 UTC