- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sat, 1 Oct 2011 19:29:25 -0400
- To: Sandro Hawke <sandro@w3.org>
- Cc: Richard Cyganiak <richard@cyganiak.de>, Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>, "public-rdf-wg@w3.org" <public-rdf-wg@w3.org>
* Sandro Hawke <sandro@w3.org> [2011-10-01 17:59-0400]
> On Sat, 2011-10-01 at 12:43 +0100, Richard Cyganiak wrote:
> > On 30 Sep 2011, at 20:11, Sandro Hawke wrote:
> > >> Named graphs are key to trust and
> > >> provenance. Trust and provenance must happen at a lower level in the
> > >> stack, before reasoning and inference kick in. W3C's version of the
> > >> layer cake, where trust sits above reasoning, cannot work. The moment
> > >> you reason with OWL over untrusted data, you [have problems].
> > >
> > > I don't think we need to throw out reasoning on the fourth column. As
> > > long as we're careful about what it means -- eg: it denotes an IR which
> > > may give you a Graph -- I think people are free to layer inference and
> > > trust/provenance reasoning in various ways.
> > >
> > > Let's say you are using three Web data sources, S1, S2, and S3. S1 and
> > > S2 give just triples. S3 is an ontology (perhaps a RIF document); we
> > > don't really care if it's triples. What's the problem with merging the
> > > triples, doing the inference, and using the result, knowing it is no
> > > more trustworthy than the least of S1, S2, and S3?
> >
> > Well, the way I see it, what happened here is that the system (on behalf of some user, I presume) decided that S1, S2 and S3 are good enough – sufficiently trustworthy – for the task at hand.
> >
> > Provenance information is the basis for trust decisions. The system made the trust decision before it merged the graphs.
> >
> > > Specifically, the
> > > provenance of your output involves the provenance of S1, S2, S3, and the
> > > reasoning steps you took. In detailing those reasoning steps, I think the identifiers for S1, S2, and S3 will be useful.
> >
> > Sure. What I said was that you can't do OWL reasoning over untrusted data sources. I didn't say that you can't use graph names when recording processing steps that were taken.
> >
> > > But for a later-stage provenance system to reason about S1, S2, and S3
> > > is fine, I think.
> >
> > I don't know what it means when you say “a provenance system reasons about XYZ”. I suppose you're not talking about OWL reasoning.
>
> There's an example of what I mean by reasoning about provenance and the
> fourth column identifier:
>
>         So let's say we have a concept of a semantic web home page
>         for a person. We decide on the policy that if someone's home
>         page says that they are a vegetarian, then we believe that they
>         are a vegetarian.
>
> That's from [1], where it's shown in N3:
>
>         @forAll :x.
>         {:x :homePage log:includes { :x a :Vegetarian }}=> { :x a :Vegetarian}.
>
> (I think there might be a typo there; I can't quite parse the way
> :homePage and log:includes sit together. It looks like DanC got it
> running later, slightly modified [2].)

> {:x :homePage log:includes { :x a :Vegetarian }}=> { :x a :Vegetarian}.

Just to put this modified form in people's faces:
[[
@forAll WHO.
{ WHO foaf:homepage ?PG.
  ?PG log:semantics [ log:includes { WHO a Vegetarian } ]
} => { WHO a Vegetarian}.
]]
where log:semantics maps from a gtext to a gsnap and log:includes maps to all subgraphs.

> It's not immediately obvious to me how to do this kind of stuff with
> named graphs, but maybe it will come to me. Perhaps we can show it
> with SPARQL? Query for all the people whose home pages say they are
> vegetarians?

Yeah, that's how predicated trust is generally done. We can follow the example literally:
[[
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT { ?who a <Vegetarian> }
 WHERE {
  ?who foaf:homepage ?pg
  GRAPH ?pg { ?who a <Vegetarian> }
}
]]
and get
  { <Bob> a <Vegetarian> . }
or we can cut to the chase and ask for all the vegetarians by changing "CONSTRUCT { … }" to "SELECT ?who".

Perhaps more illustrative is a more general truth maintenance operation over data like:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
{ <Bob> foaf:homepage <bobzpage> .
  <Vegetarian> a <no-reason-to-lie-property> .
  <Genius> a <every-reason-to-lie-property> . }
<bobzpage> { <Bob> a <Vegetarian> .
             <Bob> a <Genius> . }

sparql -d asdf.trig -e '
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT { ?who a <Vegetarian> }
 WHERE {
  ?who foaf:homepage ?pg .
  ?selfClass a <no-reason-to-lie-property> .
  GRAPH ?pg { ?who a ?selfClass }
}'

yields the statements you trust Bob to make about himself:
  { <Bob> a <Vegetarian> . }

As to doing inference over "untrusted data": (A) I think that *all* trust is predicated, and (B) it really doesn't matter whether Bob claims to be a <Genius>, or claims to be a <Super-genius> and inferencing leads me to discover that he believes himself a <Genius>; my trust of the inferential closure is pretty much identical to my trust of his homepage. I believe the only exception to this is when you don't entirely trust the ontology 'cause it's got some ragged edges (as happens with large OWL-ified medical ontologies like SNOMED-CT).

> -- Sandro
>
> [1] http://www.w3.org/2000/10/swap/doc/Reach
> [2]
> http://dev.w3.org/cvsweb/~checkout~/2000/10/swap/test/reason/conf_reg_ex.n3?rev=1.1;content-type=text%2Fplain
>
>

-- 
-ericP
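
For readers without a SPARQL engine at hand, the truth maintenance query above can be approximated in plain Python. This is only a rough sketch under stated assumptions: the dataset is modelled as sets of string triples, the names `default_graph`, `named_graphs` and `trusted_self_claims` are hypothetical, and unlike the quoted CONSTRUCT template (which hardcodes <Vegetarian>) the sketch emits whichever class matched ?selfClass. On this data the two give the same result.

```python
# Hypothetical in-memory stand-in for the TriG data in the message.
# Each triple is a (subject, predicate, object) tuple of plain strings.
RDF_TYPE = "a"
FOAF_HOMEPAGE = "foaf:homepage"

# Default graph: who owns which page, and which classes are safe self-claims.
default_graph = {
    ("<Bob>", FOAF_HOMEPAGE, "<bobzpage>"),
    ("<Vegetarian>", RDF_TYPE, "<no-reason-to-lie-property>"),
    ("<Genius>", RDF_TYPE, "<every-reason-to-lie-property>"),
}

# Named graphs, keyed by graph name: what each page says.
named_graphs = {
    "<bobzpage>": {
        ("<Bob>", RDF_TYPE, "<Vegetarian>"),
        ("<Bob>", RDF_TYPE, "<Genius>"),
    },
}


def trusted_self_claims(default, named,
                        trusted_marker="<no-reason-to-lie-property>"):
    """Return the type claims people make about themselves on their own
    home page, restricted to classes marked as trusted in the default graph."""
    # ?selfClass a <no-reason-to-lie-property>
    trusted_classes = {
        s for (s, p, o) in default if p == RDF_TYPE and o == trusted_marker
    }
    result = set()
    # ?who foaf:homepage ?pg
    for who, pred, page in default:
        if pred != FOAF_HOMEPAGE:
            continue
        # GRAPH ?pg { ?who a ?selfClass }
        for subj, pred2, cls in named.get(page, set()):
            if subj == who and pred2 == RDF_TYPE and cls in trusted_classes:
                result.add((who, RDF_TYPE, cls))
    return result


if __name__ == "__main__":
    for triple in sorted(trusted_self_claims(default_graph, named_graphs)):
        print(" ".join(triple), ".")
```

The trust decision lives entirely in the default graph here: moving <Genius> to the trusted marker would let Bob's second claim through without touching the page graph.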
Received on Saturday, 1 October 2011 23:29:56 UTC